5 Things to avoid when caching with Varnish
Here is some guidance to help Varnish users get the maximum performance and efficiency from their setups.
1. Don’t set your time to live (TTL) to zero
It seems counterintuitive: if you don’t want to cache a piece of content, doesn’t giving it a zero TTL make sense?
However, do not set your TTL to zero. This is really important. A zero TTL instructs Varnish to fetch content from the backend and immediately discard it, which can hamper performance significantly. If our primary goals are to serve content rapidly and to avoid overwhelming backends with stampedes of traffic, a zero TTL works against both.
One method we use to protect the origin and serve content faster is a feature called request coalescing, which collapses potentially thousands of concurrent requests for the same piece of content into a single backend request, whose response is then cached. This cannot happen if the TTL is set to zero: instead, all the requests will be handled serially, which we never want.
Instead, we want to cache the decision not to cache, creating a hit-for-miss object.
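As a sketch, this is what hit-for-miss looks like in VCL: rather than a zero TTL, the object is marked uncacheable for a short window (the `Cache-Control` condition and the 2-minute duration here are illustrative, not a recommendation):

```vcl
sub vcl_backend_response {
    if (beresp.http.Cache-Control ~ "no-store") {
        # Cache the decision not to cache: hit-for-miss.
        # For the next 2 minutes, requests for this object go
        # straight to the backend in parallel, instead of being
        # serialized behind each other.
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
        return (deliver);
    }
}
```

The key point is the combination of a positive `ttl` with `uncacheable = true`: the TTL controls how long the "don’t cache this" decision itself is remembered.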
Since we have covered this point in various other places already, we won’t spend too much time on it here. But if you’d like a fuller explanation, listen to this webinar, which covers the dangers of the zero TTL and what you should do to avoid it.
2. Don’t skip the built-in VCL
The Varnish Configuration Language (VCL) offers tremendous flexibility and lets you customize your logic. That said, it is never a good idea to ignore the built-in, ready-made VCL that Varnish ships with. It’s not only a safety net; it was also created by experts whose only job is to work with Varnish and who understand the ins and outs of its behavior. The built-in VCL sets baseline default behaviors that most setups don’t need to diverge from.
The temptation, of course, can be to do it all yourself because Varnish lets you, because you have the skills to do it, or because you aren’t aware of what the built-in VCL does. Know from the get-go: built-in VCL is perfectly adequate for conventional behavior, and custom VCL should be reserved for custom logic situations.
The webinar dives into the details of several built-in VCL routines, such as vcl_hash and vcl_hit, and what can go wrong if you change their structure; watch to find out more.
The bottom line: Extend — don’t amend — the built-in VCL. It’s there for a reason; use it.
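Extending rather than amending can be as simple as not ending your own subroutines with a `return`. A minimal sketch (the Host normalization is just an illustrative piece of custom logic):

```vcl
sub vcl_recv {
    # Custom logic only: normalize the Host header (illustrative).
    set req.http.Host = regsub(req.http.Host, ":80$", "");
    # No return (...) here, so execution falls through to the
    # built-in vcl_recv, which safely handles things like cookies,
    # Authorization headers and unusual request methods for you.
}
```

If you do `return (hash)` or similar at the end of your own `vcl_recv`, the built-in logic is skipped entirely, and you become responsible for every case it would have covered.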
3. Don’t vary on the user-agent
User-Agent is an HTTP request header that clients send to identify themselves, and it has potentially thousands of variations depending on operating system, browser and browser version, user region, device, and so on. A User-Agent string can therefore contain a lot of different information, depending on what each browser includes.
An incorrect assumption: that everything you store in cache will look the same to every user. This is not always the case, and this is where variations come into play. The Vary response header tells an HTTP cache which request headers a response depends on, so that a separate variation is cached per value. For example, you can cache the home page once, while stating that you need a variation per language, based on the Accept-Language request header.
You definitely want to sanitize your user-agent string to reduce variations (and keep your cache hit rates high). Out of the box, the DeviceAtlas VMOD, which is available in Varnish Enterprise, will do this for you.
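If you are on open source Varnish without DeviceAtlas, a manual sketch of the same idea looks like this (the device-class regex and the `X-Device` header name are illustrative assumptions, not a complete classifier):

```vcl
sub vcl_recv {
    # Collapse thousands of User-Agent strings into a handful of
    # device classes before the cache lookup (illustrative).
    if (req.http.User-Agent ~ "(?i)mobile|android|iphone") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
    # Have the backend send "Vary: X-Device" instead of
    # "Vary: User-Agent", keeping the number of cached
    # variations small and the hit rate high.
}
```

The principle is the same whatever tool you use: never vary on the raw User-Agent string; vary on a small, normalized value derived from it.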
4. Don’t use Varnish’s built-in file storage
A number of years ago we started testing what could be done with file-based storage, and we found big limitations: in the long run, file storage kills performance, and it isn’t persistent, so you lose everything when you have to restart your cache.
File storage, in a nutshell, then, offers limited performance, is prone to fragmentation and does not persist across restarts.
We’re all about performance, right? So we launched the Massive Storage Engine (MSE), which is a feature available in Varnish Enterprise.
With MSE you get the benefits of both disk and RAM, plus a persistent cache: MSE combines disk storage with a RAM layer whose speed you can leverage. Content stored on disk persists across cache restarts, server power outages, unexpected server failures/reboots, maintenance, and so on.
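As a rough sketch, MSE is configured in its own file and selected with the `-s` storage flag when starting `varnishd` (for example, `-s mse,/var/lib/mse/mse.conf`). The exact syntax depends on your MSE version, and all ids, paths and sizes below are illustrative:

```
env: {
    id = "mse";
    memcache_size = "10G";               # the fast RAM layer
    books = ( {
        id = "book1";
        directory = "/var/lib/mse/book1"; # metadata database
        database_size = "1G";
        stores = ( {
            id = "store1";
            filename = "/var/lib/mse/store1.dat";
            size = "100G";               # persistent on-disk store
        } );
    } );
};
```

Consult the Varnish Enterprise documentation for the authoritative configuration reference for your version.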
5. Don’t reinvent the wheel — use VMODs
In much the same way as you shouldn’t ignore or mess with built-in VCL because some of the basics are already in place for you, you also should take a look at what Varnish modules are available to you before you go reinventing the wheel.
When we say “reinvent the wheel” here, we mean: check what has already been created in terms of packaged VMODs. Varnish modules (VMODs) are written in C and expose an API that is usable from VCL. Some use cases require a lot of logic, and it’s not a great idea to DIY it with inline C: that is next to impossible to test and very difficult to maintain. It makes much more sense to look at the packaged VMODs created by Varnish Software to expand Varnish’s capabilities and functionality, and to address specific use cases and industry needs.
Which VMODs get created is often influenced by a customer-specific use case, but they frequently end up meeting a widespread need, so a VMOD can be reused either within the Varnish Enterprise product or, quite often, in the open source community. There’s a large and growing library of official, packaged VMODs, which get some attention in the webinar.
Definitely do not try to do things from scratch if there’s already a VMOD there to meet your needs!
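To see how little ceremony a VMOD involves, here is a sketch using vmod_std, which ships with open source Varnish: two common normalizations, no inline C required.

```vcl
import std;

sub vcl_recv {
    # Sort query-string parameters into a stable order, so
    # /page?a=1&b=2 and /page?b=2&a=1 hit the same cache object.
    set req.url = std.querysort(req.url);

    # Hostnames are case-insensitive; lowercase the Host header
    # so EXAMPLE.com and example.com don't create duplicates.
    set req.http.Host = std.tolower(req.http.Host);
}
```

One `import` line, and the module’s functions are available anywhere in your VCL.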
Now that we’ve covered the 5 Varnish don’ts, read up on the 5 things to do in Varnish!