Cache Invalidation Magic with Wildcards

The two hardest things in computer science are: naming things, off-by-one errors, and cache invalidation. This post is about the last of these: specifically, how to invalidate your users’ browser cache when your static files have changed.

I found out about this one weird trick (doctors hate him) a few years ago but I think it bears repeating because it’s quite neat.

The Problem

You have a web application that is versioned and is able to generate its static URL (the root URL for all static assets) — something like /static/. Throughout this post I’ll use the term URL even when referring to just a path like /static/; in your mind you can prepend a hostname if you want — the principle remains the same.

The web server software you’re using wants to serve all static content from a certain directory, say /var/www/mysite/static/.

Your deployment system (although awesome) always puts static files in the same directory, and that directory doesn’t include the application version number in its path.

When you release a new version of your application, you want to be sure people get the latest version of your assets.

The Solution

Since your application is versioned it can generate a static URL that includes this version number, for example, /static-3.14/. However, a directory with the version number in its name will not exist in the file system (remember, your assets are always deployed to the same place). So unless you are able to deploy your assets to a variable path (and it’s possible if your deployment system supports variables) you need some way of translating the dynamic URL to the static path on disk. And if you could do it all without changing your web server configuration every time your application changed that would be great.
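
As a minimal sketch of the application side, here is one way a versioned static URL could be generated. The names APP_VERSION and static_url are hypothetical, chosen for illustration; your framework probably has its own setting or helper for this.

```python
# Hypothetical: in a real application this would come from your
# package metadata or build system rather than a hard-coded string.
APP_VERSION = "3.14"


def static_url(asset_path, version=APP_VERSION):
    """Build a versioned URL such as /static-3.14/main.css."""
    # Strip any leading slash so we never produce a double slash.
    return f"/static-{version}/{asset_path.lstrip('/')}"


print(static_url("main.css"))
print(static_url("/css/site.css", version="4x5df"))
```

Because the version is part of the URL itself, every release changes every asset URL without touching the files on disk.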

Wildcard Redirects

I’m going to show you a way you can use versioned static URLs that automatically point to the right place on the disk, using wildcard redirects. The idea is that any URL of the form /static-[anything]/asset.ext always loads the file at /var/www/mysite/static/asset.ext. No configuration changes necessary. (Strictly speaking, these are internal rewrites rather than HTTP redirects: the web server maps the URL to the file itself, and the browser never sees a different URL.)

By using a wildcard redirect our application version can change to anything, our static assets can keep going into the same directory, and browsers will always invalidate their cache for new versions of the asset, because the URL has changed.

I will give configuration examples for Nginx and Apache (using mod_rewrite).

Nginx

Using nginx, it is as simple as adding a rewrite line like so (inside a server block):

rewrite    ^/static-[^/]+/(.*)$    /static/$1 last;

The regular expression matches paths starting with /static-, followed by one or more characters that are not a forward slash, so the version can be in any format (2, 3.4, 4x5df or anything else; all will match). The part in parentheses captures the filename or path after the version segment; it is appended to /static/ as $1, and the result should then match a file on disk.
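
If you want to check which paths the pattern accepts before touching your server, the same expression can be tried out in Python. This is only an illustration of the pattern's behaviour, not of nginx itself; the function name rewrite is made up for this sketch.

```python
import re

# Same pattern as the nginx rewrite line above.
STATIC_REWRITE = re.compile(r"^/static-[^/]+/(.*)$")


def rewrite(path):
    """Return the rewritten path, or the path unchanged if it does not match."""
    return STATIC_REWRITE.sub(r"/static/\1", path)


for path in ("/static-2/main.css",
             "/static-3.4/css/site.css",
             "/static-4x5df/img/logo.png",
             "/static/main.css",      # no version: left alone
             "/static-/main.css"):    # empty version: left alone
    print(path, "->", rewrite(path))
```

Note that the version segment must contain at least one character, so /static-/main.css is not rewritten.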

Apache with mod_rewrite

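Assuming mod_rewrite is installed and enabled, you can add the rule in a .htaccess file or inside the server configuration. In a .htaccess file the path is matched without its leading slash, so the pattern makes that slash optional with ^/? and works in either place. Assuming you have the RewriteEngine On line already, just add something like this:

RewriteRule    ^/?static-[^/]+/(.*)$    /static/$1    [NC,L]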

This regular expression works the same way as the nginx one detailed above, mapping every versioned URL for a static file to the same file on disk.

Other Web Servers

This technique should work on any web server that supports regular expressions in redirects. I won’t include any more examples but the concept is the same.

How It Works

To give a full rundown on how this works:

  1. Your static files are always deployed to the same location (/var/www/mysite/static).
  2. Your web application can use its version to generate a static URL, for example /static-1/.
  3. A browser requests the main CSS file from your site, referenced in your HTML with the relative URL /static-1/main.css.
  4. The web server uses a wildcard match to redirect /static-1/main.css to /static/main.css, and loads /var/www/mysite/static/main.css.
  5. You release version 2 of the web application. Assets are still deployed to /var/www/mysite/static but static URLs are now generated like /static-2/.
  6. When the same browser visits your site, your HTML now includes links to /static-2/main.css. The browser sees this as a new file, so it downloads it again.
  7. The same wildcard redirect also works for /static-2/main.css, redirecting to /static/main.css, and loading /var/www/mysite/static/main.css.
  8. You did not need to make any configuration changes alongside your application release. Regardless of the version number your application specifies, the browser will load the correct file.
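
The steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a web server: asset_url and file_on_disk are hypothetical names standing in for your application's URL generation and the server's wildcard rule.

```python
import re
from pathlib import PurePosixPath

# Step 1: assets always live in the same directory.
STATIC_ROOT = PurePosixPath("/var/www/mysite/static")
WILDCARD = re.compile(r"^/static-[^/]+/(.*)$")


def asset_url(version, name):
    """Steps 2 and 5: the application builds a versioned URL."""
    return f"/static-{version}/{name}"


def file_on_disk(url_path):
    """Steps 4 and 7: any versioned URL maps to the same file."""
    match = WILDCARD.match(url_path)
    if match is None:
        raise ValueError(f"not a versioned static URL: {url_path}")
    # Drop leading slashes so the result stays under STATIC_ROOT.
    return STATIC_ROOT / match.group(1).lstrip("/")


v1 = asset_url("1", "main.css")
v2 = asset_url("2", "main.css")
print(v1, "->", file_on_disk(v1))
print(v2, "->", file_on_disk(v2))
```

The two URLs differ, so the browser treats them as separate cache entries, while both resolve to the same file in the deployment directory.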

Conclusion

There are a few ways that you can solve the cache-invalidation problem but I am particularly fond of this one because it fits nicely into many different systems and doesn’t require much setup. It’s not perfect: if you only want to cache-break some of your assets then this doesn’t work, and you would have to use separate URLs for cache/no-cache assets. But generally, I just think it’s pretty cool and thought I’d share it.
