Cache Invalidation with Cloudflare, WordPress, Varnish and HTTP PURGE

While a cache can help WordPress scale, it brings with it one of the hardest problems in computer science: cache invalidation. When a new post is published, the cached homepage has to be invalidated so that it refreshes.

When using Varnish, there is a really nice WordPress plugin called Varnish HTTP Purge. Under the covers, when a new post or comment is published it issues an HTTP PURGE request to break the cache.

Unfortunately, if you have Cloudflare in front of your domain, it will attempt to process the PURGE request itself and fail with a 403. After all, you don’t want the entire world being able to break your cache.

$ curl -XPURGE http://blog.benhall.me.uk
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>cloudflare-nginx</center>
</body>
</html>

My solution was to add an /etc/hosts entry on the web server that points the domain at the server’s local IP address. When an HTTP request is issued to the domain from the web server, it skips Cloudflare and goes straight to the Varnish instance, allowing the cache to be broken and solving the problem.
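The hosts-file workaround could be scripted roughly as follows. This is only a minimal sketch, not the exact steps used on this server: the `LOCAL_IP` default, the helper names `hosts_entry` and `add_hosts_entry`, and the idea of guarding against duplicate entries are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical values: replace with the IP Varnish is reachable on locally
# and with your own domain.
LOCAL_IP="${LOCAL_IP:-127.0.0.1}"
DOMAIN="${DOMAIN:-blog.benhall.me.uk}"

# Print a hosts-file line mapping a domain to an IP address.
hosts_entry() {
    printf '%s %s\n' "$1" "$2"
}

# Append the mapping to the given hosts file, unless the domain is
# already mentioned there (avoids duplicate lines on repeated runs).
add_hosts_entry() {
    hosts_file="$1"
    if ! grep -qF -- "$DOMAIN" "$hosts_file" 2>/dev/null; then
        hosts_entry "$LOCAL_IP" "$DOMAIN" >> "$hosts_file"
    fi
}

# Usage on the web server (as root), followed by a manual purge test:
#   add_hosts_entry /etc/hosts
#   curl -X PURGE "http://$DOMAIN/"
```

With the entry in place, the same curl command that returned a 403 through Cloudflare should reach Varnish directly, provided the Varnish configuration accepts PURGE requests from that address.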

Scaling WordPress with Varnish and Docker

In my previous post I discussed how my blog is hosted. While it’s a great configuration, it runs on a small instance and the WordPress cache plugins offer only limited value. Andrew Martin showed me his blitz.io stats and they put mine to shame. We agreed that the answer was to add Varnish, an HTTP accelerator designed for content-heavy dynamic web sites, to the stack.

My aim was to have a Varnish instance running between the Nginx container, which routes all incoming requests to the server, and my WordPress container. With a carefully crafted Varnish configuration file in place, I use the following command to bring up the container:

docker run -d --name blog_benhall_varnish-2 \
   --link blog_benhall-2:wordpress \
   -e VIRTUAL_HOST=blog.benhall.me.uk \
   -e VARNISH_BACKEND_PORT=80 \
   -e VARNISH_BACKEND_HOST=wordpress \
   benhall/docker-varnish

The VIRTUAL_HOST environment variable is used by Nginx Proxy. The Docker link allows Varnish and WordPress to communicate; my WordPress container is called blog_benhall-2. VARNISH_BACKEND_PORT defines the port WordPress listens on inside its container. VARNISH_BACKEND_HOST defines the internal hostname, which is the alias set when creating the Docker link between the containers.

When a request comes into the Varnish container, it is either answered immediately from the cache or proxied to the WordPress container, with the response cached on the way back out.
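One rough way to see which of the two paths a request took is to look at the response headers. The sketch below assumes Varnish’s default behaviour of sending an X-Varnish header with two transaction IDs on a cache hit and one on a miss; a custom Varnish configuration or an intermediate proxy may strip or change that header, and the helper name `cache_status` is made up for this example.

```shell
#!/bin/sh
# Classify a response as HIT, MISS or UNKNOWN from its headers (read on stdin).
# Assumption: the default X-Varnish header is present and unmodified.
#   two IDs  -> served from the cache
#   one ID   -> fetched from the backend
cache_status() {
    awk '
        tolower($1) == "x-varnish:" {
            found = 1
            print (NF >= 3 ? "HIT" : "MISS")
            exit
        }
        END { if (!found) print "UNKNOWN" }
    '
}

# Usage: request only the headers and classify them.
#   curl -sI http://blog.benhall.me.uk/ | cache_status
```

Running the curl line twice in a row against a cacheable page would typically show MISS followed by HIT, although the exact result depends on the Varnish configuration in use.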

Thanks to Nginx Proxy I didn’t have to change any configuration, as it simply reconfigured itself as new containers were introduced. The setup really is a thing of beauty that can now scale. I can use the same docker-varnish image to cache other containers in the future.

The Dockerfile and configuration can be found on GitHub.

The Docker image has been uploaded to my Docker Hub account.