Caching Minified files with Nginx

A quick introduction to Minify

Minify is a popular PHP5 library best described by the Minify site itself:

Minify is a PHP5 app that helps you follow several of Yahoo!’s Rules for High Performance Web Sites.
It combines multiple CSS or Javascript files, removes unnecessary whitespace and comments, and serves them with gzip encoding and optimal client-side cache headers.

The uses of Minify are twofold:

  1. Speed up websites by combining and minifying files
  2. Help eliminate the problems caused by stale CSS and JavaScript files cached in users’ browsers.

Minify is fairly easy to set up and can dramatically increase the performance of websites, but, as mentioned in the FAQs, because each request is served by PHP it can actually slow your site down (for example, if your site receives a lot of traffic or you are on a feeble shared server).

A simple solution to this problem, suggested in the Minify FAQs, is to serve your Minified files through a reverse proxy.

An even quicker introduction to Nginx

Nginx is a popular high-performance web server and reverse proxy server. I’m not going to try and summarise what it does here – if you are reading this, the chances are you already know.

We have used Nginx for a while in a standard way, running Nginx in front of Apache. This approach is easy to configure and is also easy to revert to a pure Apache setup. We served all static content (css, js, images etc.) directly from Nginx and passed requests for dynamic content to a backend (PHP on an Apache server) using the Nginx HttpProxyModule.

Although our minified CSS and JavaScript rarely change, with this setup a request for either is still a PHP request. We can avoid this by using the Nginx HttpProxyModule to cache responses from the backend, giving a faster response and reducing the load on the backend server.
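
In outline, and with placeholder names and addresses rather than our real configuration (which follows in the next section), the caching setup boils down to two pieces: a cache zone declared at the http level, and a location block that both proxies requests to the backend and stores the responses it gets back:

http {
    # Declare a cache: where it lives on disk and a 10 MB zone for the keys.
    proxy_cache_path /var/cache/nginx keys_zone=example_zone:10m;

    server {
        location /min/ {
            proxy_pass  http://127.0.0.1:8080;   # hand the request to the Apache backend
            proxy_cache example_zone;            # ...and keep a copy of the response
        }
    }
}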

Nginx configuration

This is a simple Nginx configuration taken from a development server running Ubuntu. For ease of maintenance, the configuration is split across several files using sites-available and sites-enabled directories (Debian/Ubuntu style). In this case Nginx is listening on Port 80 and Apache is listening on Port 8080.

nginx.conf

The main file is nginx.conf, which then includes all the live virtual hosts that are in the sites-enabled directory (often organised as symbolic links to the actual files in the sites-available directory).

The whitespace in the config files doesn’t matter, so it makes sense to organise your .conf files in whatever way you find most readable.

worker_processes  1;

events {
    worker_connections  1024;
}

http {
    include mime.types;
    default_type application/octet-stream;

    log_format  new_log
    '$remote_addr - $remote_user [$time_local] $request '
    '"$status" $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';

    # Proxy cache and temp configuration.
    proxy_cache_path 	/var/www/nginx_cache levels=1:2
			keys_zone=main:10m
			max_size=1g inactive=30m;
    proxy_temp_path 	/var/www/nginx_temp;

    sendfile on;

    include /etc/nginx/sites-enabled/*; 
}

The key directives here are the lines that set up the proxy cache, which allow us to cache the results of scripts returned by the backend server (see documentation).

    # Proxy cache and temp configuration.
    proxy_cache_path 	/var/www/nginx_cache levels=1:2
			keys_zone=main:10m
			max_size=1g inactive=30m;
    proxy_temp_path 	/var/www/nginx_temp;

proxy_cache_path – where on the filesystem the cached files will be stored
levels – defines the directory structure used to store files within the cache directory
keys_zone – the name and size of the shared memory zone that holds the cache keys (here a 10 MB zone called main)
max_size – the maximum size of the cache on disk
inactive – cached files that are not requested within this period are removed from the cache
proxy_temp_path – where responses from the backend are written as temporary files before being moved into the cache

example.conf

This is a simple example of one of the virtual host .conf files that are pulled in by the include /etc/nginx/sites-enabled/*; directive in the nginx.conf file.

server {
    listen 80;
    access_log /var/log/nginx/www.example.com.access.log;
    error_log /var/log/nginx/www.example.com.error.log;
    root /var/www/www.example.com/;
    index index.php index.html;
    server_name www.example.com;

    # send appropriate headers to enable browser caching for static files
    # static files are identified by file extension
    location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
        access_log off;
        expires 30d;
    }

    # Set the proxy cache key
    set $cache_key $scheme$host$uri$is_args$args;

    location ~ /min/ {
        # Set proxy headers for the passthrough
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://192.168.1.5:8080;
        proxy_cache main;
        proxy_cache_key $cache_key;
        proxy_cache_valid 30m; # 200, 301 and 302 will be cached.
        # Fallback to stale cache on certain errors.
        # 503 is deliberately missing, if we're down for maintenance
        # we want the page to display.
        proxy_cache_use_stale error
                              timeout
                              invalid_header
                              http_500
                              http_502
                              http_504
                              http_404;
    }
    # proxy any other requests back to the Apache server listening on Port 8080
    location / {
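        # more_clear_headers is provided by the third-party headers-more module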
        more_clear_headers 'Content-Length' 'Transfer-Encoding';
        proxy_cache_bypass 1;
        proxy_no_cache 1;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_pass http://192.168.1.5:8080;
    }

}

The virtual host .conf file above contains three different location blocks, each with a different role.

The first location block catches requests for static files and serves them directly from the file system – in this case the Apache server is not accessed at all. Unfortunately, requests to Minify are all PHP requests rather than requests for static files, so these cache headers are not sent and every time a minified file is requested it will be passed back to Apache.

The role of the third location block is to send any request that hasn’t already been dealt with back to the Apache server.

The role of the second location block is to deal with any requests to Minify. We have set a cache timeout of 30 minutes, and any requests within that time period will be served from the Nginx cache – if there is no match in the cache, the request will be passed back to Apache. If you are using minify_group, when the underlying JavaScript and CSS files change, the timestamp on the minified URL will change and therefore no matching cache file will exist, so the Nginx cache will update.
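
As a rough illustration (the group name and timestamps below are invented), because the cache key set earlier includes $is_args$args, a Minify URL carrying a new timestamp produces a new key and therefore a cache miss:

# Given:  set $cache_key $scheme$host$uri$is_args$args;
#
# GET /min/?g=js&1300000000
#   -> key "httpwww.example.com/min/?g=js&1300000000"   (served from the Nginx cache once stored)
#
# GET /min/?g=js&1300086400   (source files changed, Minify emits a new timestamp)
#   -> key "httpwww.example.com/min/?g=js&1300086400"   (no match, passed back to Apache and cached)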

How do I know if it’s working?

The easiest way to see that this is working is if you have Apache mod_status enabled. You will know it works because you will no longer see any requests to /min/? reaching Apache.
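
Another option, not part of the setup above but built into Nginx, is to expose the cache status as a response header and check it in your browser’s developer tools, along these lines inside the /min/ location block:

location ~ /min/ {
    # ... proxy_* and proxy_cache directives as above ...

    # Reports MISS on the first request and HIT once the response is served from the cache.
    add_header X-Cache-Status $upstream_cache_status;
}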