Part VII: Varnish

In front of your web server you can place a reverse proxy, also known as an application cache, to take load off the whole server by serving content directly from memory and out through the network card. Varnish is a very fast reverse proxy that can scale to a very high number of requests per second.

Varnish can also be used as a load balancer to distribute requests across a number of backends, but here we only use Varnish with a single default backend. I have written earlier about Varnish and Drupal 7, where more than one backend is defined.

The first step is to configure Apache to listen on another port (8080) and configure Varnish to listen on port 80, so that it captures all HTTP requests. Varnish's basic configuration is located in /etc/default/varnish, and the following lines should be usable for most configurations. They allocate 512 MB of memory to Varnish; if you have more memory to spare, Varnish will make good use of it.

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/drupal.vcl \
             -u varnish -g varnish \
             -S /etc/varnish/secret \
             -p thread_pool_add_delay=2 \
             -p thread_pools=4 \
             -p thread_pool_min=2 \
             -p thread_pool_max=4000 \
             -p session_linger=50 \
             -p sess_workspace=262144 \
             -s malloc,512m"

Varnish itself is configured in VCL (Varnish Configuration Language), whose syntax looks a lot like C. The configuration is compiled into binary code when Varnish starts, which makes Varnish even faster. The code below should give you a good starting point for caching your Drupal site and should be placed in /etc/varnish/drupal.vcl, as defined by the configuration above.

# Define backend(s).
backend default {
  .host = "127.0.0.1";
  .port = "8080";
  .probe = {
    .timeout = 2s;
    .interval = 30s;
    .window = 10;
    .threshold = 2;
    .request =
      "GET /status.php HTTP/1.1"
      "Host: www.example.com"
      "Connection: close"
      "Accept-Encoding: text/html";
  }
}

# Respond to incoming requests.
sub vcl_recv {
  
  # Make sure that the client IP is forwarded to the backend.
  if (req.restarts == 0) {
    if (req.http.x-forwarded-for) {
      set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    }
    else {
      set req.http.X-Forwarded-For = client.ip;
    }
  }

  # Do not cache these paths.
  if (req.url ~ "^/status\.php$" ||
    req.url ~ "^/update\.php$" ||
    req.url ~ "^/admin/build/features" ||
    req.url ~ "^/info/.*$" ||
    req.url ~ "^/flag/.*$" ||
    req.url ~ "^.*/ajax/.*$" ||
    req.url ~ "^.*/ahah/.*$") {
    return (pass);
  }

  # Pipe these paths directly to Apache for streaming.
  if (req.url ~ "^/admin/content/backup_migrate/export") {
    return (pipe);
  }

  # Deal with GET and HEAD requests only, everything else gets through
  if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
  }

  # Allow the backend to serve up stale content if it is responding slowly.
  set req.grace = 6h;

  # Use anonymous, cached pages if all backends are down.
  if (!req.backend.healthy) {
    unset req.http.Cookie;
  }

  # Always cache the following file types for all users.
  if (req.url ~ "(?i)\.(png|gif|jpeg|jpg|ico|swf|css|js|html|htm)(\?[\w\d=.-]+)?$") {
    unset req.http.Cookie;
  }

  # Remove all cookies that Drupal doesn't need to know about. ANY remaining
  # cookie will cause the request to pass-through to Apache. For the most part
  # we always set the NO_CACHE cookie after any POST request, disabling the
  # Varnish cache temporarily. The session cookie allows all authenticated users
  # to pass through as long as they're logged in.
  if (req.http.Cookie) {
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(SESS[a-z0-9]+|NO_CACHE)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

    if (req.http.Cookie == "") {
      # If there are no remaining cookies, remove the cookie header. If there
      # aren't any cookie headers, Varnish's default behaviour will be to cache
      # the page.
      unset req.http.Cookie;
    }
    else {
      # If there are any cookies left (a session or NO_CACHE cookie), do not
      # cache the page. Pass it on to Apache directly.
      return (pass);
    }
  }
  # Handle compression correctly. By consolidating compression headers into
  # a consistent format, we can reduce the size of the cache and get more hits.
  # @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
  if (req.http.Accept-Encoding) {
    if (req.http.Accept-Encoding ~ "gzip") {
      # If the browser supports it, we'll use gzip.
      set req.http.Accept-Encoding = "gzip";
    }
    else if (req.http.Accept-Encoding ~ "deflate") {
      # Next, try deflate if it is supported.
      set req.http.Accept-Encoding = "deflate";
    }
    else {
      # Unknown algorithm. Remove it and send unencoded.
      unset req.http.Accept-Encoding;
    }
  }

  # If we get this far, look the request up in the cache.
  return (lookup);
}

# Code determining what to do when serving items from the Apache servers.
sub vcl_fetch {
  # Varnish determined the object was not cacheable
  if (beresp.ttl <= 0s) {
    set beresp.http.X-Cacheable = "NO: Not Cacheable";

  # You don't wish to cache content for logged in users
  } elsif (req.http.Cookie ~ "(UserID|_session)") {
    set beresp.http.X-Cacheable = "NO: Got Session";
    return(hit_for_pass);
 
  # You are respecting the Cache-Control=private header from the backend
  } elsif (beresp.http.Cache-Control ~ "private") {
    set beresp.http.X-Cacheable = "NO: Cache-Control=private";
    return(hit_for_pass);
 
  # Varnish determined the object was cacheable
  } else {
    set beresp.http.X-Cacheable = "YES";
  }

  # Allow items to be stale if needed.
  set beresp.grace = 6h;

}

sub vcl_deliver {
  // Debugging code that should be removed in production.
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }

  # Unset drupal cache and PHP info
  set resp.http.X-Powered-By = "Awesomeness and Open source";
  unset resp.http.X-Drupal-Cache;
}
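
The chain of regsuball() calls in vcl_recv is the hardest part of the VCL to read, so here is a minimal Python sketch of the same substitutions that lets you try cookie headers outside Varnish. The function names strip_cookies and decision are my own and not part of Varnish, and Python's re module stands in for Varnish's regex engine, so treat it as an illustration rather than an exact replica.

```python
import re


def strip_cookies(cookie_header: str) -> str:
    """Return only the cookies the VCL keeps (SESS* and NO_CACHE)."""
    # Prefix a ";" so every cookie, including the first, starts with one.
    cookie = ";" + cookie_header
    # Normalise "; " separators to a bare ";".
    cookie = re.sub(r"; +", ";", cookie)
    # Mark the cookies to keep by putting a space back after their ";".
    cookie = re.sub(r";(SESS[a-z0-9]+|NO_CACHE)=", r"; \1=", cookie)
    # Drop every cookie whose ";" is not followed by a space.
    cookie = re.sub(r";[^ ][^;]*", "", cookie)
    # Trim leftover separators at both ends.
    cookie = re.sub(r"^[; ]+|[; ]+$", "", cookie)
    return cookie


def decision(cookie_header: str) -> str:
    """Mirror the VCL branch: any surviving cookie means pass."""
    return "pass" if strip_cookies(cookie_header) else "lookup"


print(strip_cookies("has_js=1; SESSabc123=xyz; _ga=GA1"))  # SESSabc123=xyz
print(decision("has_js=1; _ga=GA1"))                       # lookup
```

The trick is the space: cookies worth keeping get "; " in front of them, everything that still starts with a bare ";" is deleted, and whatever remains decides between pass and lookup.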

Status script

As you may have noticed, the backend definition above refers to a status.php file on the server. Varnish uses it to check whether the backend is healthy by testing the database and Memcache operations. The script is shown below and can easily be extended to check for more things.

<?php

register_shutdown_function('status_shutdown');
function status_shutdown() {
  exit();
}

// Drupal bootstrap.
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_DATABASE);

// Build up our list of errors.
$errors = array();

// Check that the main database is active.
$uid = db_query('SELECT uid FROM {users} WHERE uid = 1')->fetchField();
if (!$uid) {
  $errors[] = 'Master database not responding.';
}

// Check that all memcache instances are running on this server.
if (isset($conf['cache_default_class']) && $conf['cache_default_class'] == 'MemCacheDrupal') {
  foreach ($conf['memcache_servers'] as $address => $bin) {
    list($ip, $port) = explode(':', $address);
    $memcache = new Memcache();
    if (!$memcache->addServer($ip, $port, false)) {
      $errors[] = 'Memcache bin <em>' . $bin . '</em> at address ' . $address . ' is not available.';
    }
    else {
      if (!$memcache->set('status_string', 'A simple test string')) {
        $errors[] = 'Memcache bin <em>' . $bin . '</em> at address ' . $address . ' is not available.';
      }
    }
  }
}

// Check that the files directory is operating properly.
if ($test = tempnam(variable_get('file_directory_path', conf_path() .'/files'), 'status_check_')) {
  if (!unlink($test)) {
    $errors[] = 'Could not delete newly created files in the files directory.';
  }
}
else {
  $errors[] = 'Could not create temporary file in the files directory.';
}

// Print all errors.
if ($errors) {
  $errors[] = 'Errors on this server will cause it to be removed from the load balancer.';
  header('HTTP/1.1 500 Internal Server Error');
  print implode("<br />\n", $errors);
}
else {
  // Split up this message, to prevent the remote chance of monitoring software
  // reading the source code if mod_php fails and then matching the string.
  print 'CONGRATULATIONS' . ' 200';
}

// Exit immediately, note the shutdown function registered at the top of the file.
exit();
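
To see how the probe settings in the backend definition turn the answers from this script into a healthy or sick verdict, here is a rough Python model of .window and .threshold. It assumes the backend counts as healthy when at least threshold of the last window probes succeeded. The class name ProbeWindow is hypothetical, and the sketch starts with an empty history, which is a simplification of how Varnish seeds a new backend.

```python
from collections import deque


class ProbeWindow:
    """Toy model of a Varnish backend probe (window=10, threshold=2)."""

    def __init__(self, window: int = 10, threshold: int = 2) -> None:
        # Only the most recent `window` probe results are remembered.
        self.results = deque(maxlen=window)
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        """Store the outcome of one probe (True means status.php returned 200)."""
        self.results.append(ok)

    @property
    def healthy(self) -> bool:
        """Healthy when enough of the remembered probes succeeded."""
        return sum(self.results) >= self.threshold


probe = ProbeWindow()
for _ in range(10):
    probe.record(True)      # backend answers normally

failures = 0
while probe.healthy:
    probe.record(False)     # backend stops answering
    failures += 1
print(failures)             # 9
```

Under this model, a backend that suddenly dies is only marked sick after nine failed probes, which at a 30 second interval is about four and a half minutes. If that is too slow for your setup, a higher threshold or a shorter interval would tighten it.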
