Drupal: Performance and Scalability

Since I work on Drupal sites from time to time, and have built some very large ones, I keep an eye on the Drupal "stack" and how it changes. But I haven't revisited it for a while.

A few months ago I posted an article about a Drupal stack that I had tested with a big media company to handle 2.5 billion page views a month. It took a lot of servers and some sophisticated modifications to Drupal.

It's still a solid architecture, but there are some new entries I wanted to evaluate. I also deployed that one on Joyent, so I was curious what I might be able to do with other cloud vendors. This time I picked Rackspace's Cloud Servers, since they were kind enough to comp nScaled, Inc. a little free time to test things out and demo to clients. So, I've been beating up on them pretty good. I'm impressed, and considering what I did to some of their servers yesterday, I'm surprised I didn't get a cease and desist!

Here is the Drupal Stack I built and configured.

Drupal Software Stack

So, once I got everything installed, I load tested it using httperf 0.9.0. I was able to generate and serve approximately 150 req/sec over the internet from another cloud provider's servers in a different state. Now, I wasn't going for super scientific load testing here, just a reasonable look at what it might be able to do. I also installed the APC statistics page. During the run, it looked like this:

APC Stats Page for Drupal Test

As you can see it was pumping out a little over 800 requests / second from the cache.

I configured AuthCache, which is a fork of the Drupal CacheRouter project, to use memcached as its data store. It was also pretty busy. The output from a little memcached status script I used showed the following:

Test:/src# ruby stats.rb
STAT pid 3823
STAT uptime 1238
STAT time 1238884956
STAT version 1.2.2
STAT pointer_size 64
STAT rusage_user 0.428026
STAT rusage_system 2.076129
STAT curr_items 30
STAT total_items 108
STAT bytes 198842
STAT curr_connections 151
STAT total_connections 152
STAT connection_structures 152
STAT cmd_get 135821
STAT cmd_set 129
STAT get_hits 135780
STAT get_misses 41
STAT evictions 0
STAT bytes_read 5123354
STAT bytes_written 313147950
STAT limit_maxbytes 67108864
STAT threads 1
END
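To put a number on "pretty busy", here is a minimal Python sketch that parses stats lines like the ones above and works out the cache hit ratio. The function names are my own for illustration, and the sample text is a subset of the output above rather than a live memcached connection.

```python
def parse_stats(text):
    """Parse 'STAT name value' lines into a dict of name -> value strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the shell prompt line, the END marker, and anything malformed.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Fraction of get commands that were answered from the cache."""
    gets = int(stats["cmd_get"])
    return int(stats["get_hits"]) / gets if gets else 0.0


# Subset of the stats output shown above.
SAMPLE = """\
STAT bytes 198842
STAT cmd_get 135821
STAT cmd_set 129
STAT get_hits 135780
STAT get_misses 41
STAT limit_maxbytes 67108864
END
"""

stats = parse_stats(SAMPLE)
memory_used = int(stats["bytes"]) / int(stats["limit_maxbytes"])

print(f"hit ratio:   {hit_ratio(stats):.4%}")
print(f"memory used: {memory_used:.2%}")
```

With these numbers, almost every get is a hit (41 misses out of 135,821 gets) and the cache is using well under 1% of its 64MB limit, so memcached itself had plenty of headroom during the run.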

Oh, I also used munin to keep track of what was going on at the machine level.  I can't show all that output here, but you can see that network traffic was certainly pumping with this graph.  That big spike there was when I was pushing the full rate.  I wasn't actually able to go beyond this level pretty much no matter what I threw at the system.

 

Maybe that's a hard limit. I'm not sure yet. I really thought I would be able to get more out of the system.

It's worth mentioning that I also integrated varnish as a reverse proxy. After going through a ton of different VCL files, I might have found one that works a little bit. But it turns out that Drupal is very unfriendly to reverse proxy caches out of the box. In practical terms, if you want to use a local reverse proxy like varnish, squid, or perlbal, you will need to do some hacking on Drupal to make it more cache friendly. It also means that CDN services might have a difficult time working with Drupal unless you modify Drupal itself. I really think this is an area ripe for improvement in Drupal 7. Hopefully cacheability is being addressed in that release.
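To make "cache unfriendly" a bit more concrete, here is a rough Python sketch of the kind of checks a reverse proxy applies before storing a response. The rules are deliberately simplified and are not varnish's actual logic, and the header values are hypothetical examples of what a cookie-setting, no-cache page looks like, not captures from my test.

```python
def is_cacheable(request_headers, response_headers):
    """Return (cacheable, reason) under a deliberately simplified rule set."""
    # Header names are case-insensitive, so normalize them first.
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}

    if "cookie" in req:
        return False, "request carries a cookie"
    if "set-cookie" in resp:
        return False, "response sets a cookie"

    # Reduce 'max-age=300, public' to the directive names {'max-age', 'public'}.
    directives = {
        part.strip().split("=")[0].lower()
        for part in resp.get("cache-control", "").split(",")
        if part.strip()
    }
    blocked = sorted(directives & {"no-cache", "no-store", "private"})
    if blocked:
        return False, "cache-control forbids it: " + ", ".join(blocked)

    return True, "ok"


# Hypothetical anonymous page view: a session cookie plus no-cache headers.
dynamic_page = {
    "Set-Cookie": "SESSexample=abc123; path=/",
    "Cache-Control": "no-cache, must-revalidate",
}
# Hypothetical response after making the application cache friendly.
friendly_page = {"Cache-Control": "public, max-age=300"}

print(is_cacheable({}, dynamic_page))
print(is_cacheable({}, friendly_page))
print(is_cacheable({"Cookie": "SESSexample=abc123"}, friendly_page))
```

The last call is the annoying case: even a cache friendly response gets bypassed once the browser starts sending a cookie back, which is why so much VCL tweaking ends up being about cookies.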

This entire test was performed on a 2GB slice on Rackspace's Cloud Servers. I started with a 512MB slice, but it needed a bit more RAM to cram all that into it.

The level of performance I got (about 150 req/sec, sustained around the clock for a 30-day month) would translate to a maximum of about 388 million page views a month, assuming you didn't do a bunch of weird stuff to Drupal that makes it slower. This would fit quite well as built in a 1GB slice. So, that would cost, in instance fees, $43.80 per month plus whatever bandwidth you happen to use. I'd say that's quite a bargain.
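The arithmetic behind that estimate is simple enough to sketch in Python. It assumes every request is one page view and that the 150 req/sec rate holds all day, every day, which is a best case. The cost-per-million figure is my own derived number and covers instance fees only, not bandwidth.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def monthly_page_views(req_per_sec, days=30):
    """Best-case page views per month at a constant request rate."""
    return req_per_sec * SECONDS_PER_DAY * days


views = monthly_page_views(150)
instance_fee = 43.80  # dollars per month for the 1GB slice, from the text above

# Instance cost per million page views, ignoring bandwidth charges.
cost_per_million = instance_fee / (views / 1_000_000)

print(f"{views:,} page views per month")
print(f"${cost_per_million:.4f} per million page views")
```

That works out to 388,800,000 page views, or a bit over 11 cents in instance fees per million.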

TO DO:

 

  1. See if I can get it to work with Cloud Files somehow. That would be sweet. But there is that nagging issue of having to hack up Drupal, which comes with other problems.
  2. If I can't get it to work with Cloud Files, try to go hybrid and use the Drupal Cloud Front module that integrates with Amazon CloudFront. That could be interesting.
  3. Keep tweaking on the varnish VCL and see if I get something I like better.  Not really happy with it.
  4. Scale it and keep testing to see how far it can go this way.

 

I'm certain that by breaking the stack apart across more machines and using a variety of other scaling tricks I could make this system scale quite well. I might play with that a bit later when I have more time. This was fun for a little weekend project, though.