
ZeroMQ Info Roundup

I've been reading what I could find on ZeroMQ, and it has grabbed my attention for various reasons.  I've gathered the links I found most interesting into one article for future reference, since things were a bit spread out.

ZeroMQ Introductory Video (great starting point)

http://www.youtube.com/watch?v=_JCBphyciAs

Paper on Messaging and Multi-Threading

http://www.zeromq.org/whitepapers:multithreading-magic

A Rather Ambitious Project with very little information (but I like the ambition)

http://prodatalab.wikidot.com/

Streaming Video Example

http://www.zeromq.org/code:examples-camera

DripDrop - DripDrop encapsulates both zmqmachine and eventmachine. It provides some sane default messaging choices, using BERT (a binary, JSON-like serialization format) and JSON for serialization. While the zmqmachine and eventmachine APIs can be somewhat convoluted, the goal here is to smooth over the bumps and make them play together nicely.

http://blog.andrewvc.com/next-steps-for-dripdrop-zeromq-distributed-wo

http://github.com/andrewvc/dripdrop

Mongrel2 application server - Mongrel2 is an application, language, and network architecture agnostic web server that focuses on web applications using modern browser technologies.  It uses ZeroMQ heavily.

http://mongrel2.org/doc/tip/docs/manual/book.wiki#x1-760005.6

http://mongrel2.org/doc/tip/docs/manual/book.wiki#x1-640005.2

Async Worker Process article on a good blog that discusses ZeroMQ

http://kfsone.wordpress.com/2010/07/21/asyncworker-parallelism-with-zeromq/

Interesting Projects around ZeroMQ from the creators of ZeroMQ

http://github.com/imatix/zdevices

http://github.com/imatix/zfl

http://github.com/imatix/zguide

Several Blog Entries from Around the Web regarding ZeroMQ

ZeroMQ and Scalability

http://kfsone.wordpress.com/2010/07/21/zeromq-and-scalability/

ZeroMQ: An Introduction

http://nichol.as/zeromq-an-introduction

Basic ZeroMQ example in Ruby

http://willj.net/2010/08/01/basic-zero-mq-ruby-example/

ZeroMQ Tutorial

http://github.com/andrewvc/learn-ruby-zeromq

0MQ: A new approach to messaging

http://lwn.net/Articles/370307/

Gluster 3.1 GA Release

Over the past couple of months I have been taking a really close look at GlusterFS for potential use on a virtualization project.  Today I saw the notice that version 3.1 was released.  That's good news.  They call it a scale-out NAS platform, which it is, but it's also a bit more than that.

I had the chance to speak at length with Anand Babu (AB) Periasamy and a few members of his team at VMworld recently about 3.1 prior to release, and it was genuinely interesting and exciting. I've been following the Gluster project for years, and it really just seems to keep getting better and better.  Not only that, they seem pretty passionate about what they do, which is always a good thing.

Of particular interest in 3.1 is that you are now supposed to be able to add nodes to and remove nodes from the cluster without impacting the applications using the cluster at all.  This is CRITICAL and was a major barrier to adoption previously, when you actually had to restart the cluster to expand it.

One of the things that can be challenging is large scale file sharing to many, and sometimes varying numbers of, application servers in large scale web environments.  I could see GlusterFS 3.1 being very useful in this scenario.  One recently published example of this is the way that Acquia uses GlusterFS for scaling Drupal.  

Of course, other options exist, such as Swift from OpenStack, MongoDB w/ GridFS, Riak (perhaps in smaller file size scenarios), and perhaps Ceph, which was just released.  The file / storage space is hot right now with change and even *gasp* innovation.  It is pretty exciting, and more choice over the last few years has been a very good thing.

I suspect I'll be writing more about this in the future, assuming I can get some of the testing I want to do completed.  As usual, my lab in my secret lair is underpowered and overutilized. *sigh*

Building for the Cloud is Building for Scalability

Even today, with all the hoorah about cloud computing, there are very few applications that can really call themselves cloud computing applications outside of the sciences, and even there I suspect there aren't that many.  So, I began trying to piece together a list of the features that I think a cloud computing application must have.  Without them, it's not really a cloud computing application; at best it is on its way to being one.  What I realized quickly was that being a cloud application is often about three things.

  • Scalability
  • Availability 
  • Costs 

Part of what prompted me to write this is that I have installed and scaled countless websites and applications over the years.  I've done Drupal, WordPress, Clearspace, ModX, Expression Engine, Custom CMS applications, X Cart, Magento, Django, and every language you can think of from Ruby to PHP to Java to Python.  Thing is, they all have a very serious problem, and it baffles me that it doesn't appear to be addressed well yet.  Every single one of these applications was NOT built for the cloud.  They just do not scale well without serious work and complexity.  I did not say they do not scale; in most cases, one way or another, they almost certainly can be made to scale.  For example, I've built Drupal sites that could handle 1.5 to 2.0 billion page views per month.  But it was forced and very complex all in all.

But what is important to understand is that very, very few applications can natively and easily scale to what we call cloud scale or web scale out of the box.  They simply cannot.  And scaling that way is what it takes if you want your application to be able to use all the benefits of cloud computing, such as:

  • Automated provisioning and scale - Using tools like Chef and Puppet to code your infrastructure
  • Rapid Deployment - You should be able to deploy fast and often
  • Lower Cost to Market - You should be able to get up and running for less money in infrastructure than ever before (less capital expenditures on your balance sheet)
  • High server to admin ratios and reduced operational costs over time - You can do more with less and it's all about quality over quantity
  • Client satisfaction - Happy clients are clients that never even know your service grew 10x last quarter, they just know it works
  • Shareholder/Partner Satisfaction - sleep better, make more money, run a better business

Here are some items that I think are must-haves if you are going to build a true cloud computing application. They are in no particular order, but all are important.  The takeaway from all this for me is that building something to scale is very much about building something for the cloud.

  • Be Distributed - It must be able to do whatever it does over a dispersed network of nodes (servers/services) nearly as easily as it does on a single server
  • Be Asynchronous - Think message-based, event-driven frameworks
  • Be Monitorable (aka Be Instrumented) - Just as your developers should implement unit tests, they should implement monitoring instrumentation at the application level
  • Be Self-healing and Resilient - The system should be able to handle faults and route around them when needed. A degraded state is often just fine so long as transactions can keep flowing to some degree.
  • Be multi-provider - Do not put all your eggs in one basket (see my post on configuration management and deployment automation w/ Chef for example)
  • Be multi-site - Your site should be active-active to some degree across multiple geographic locations.  There is no longer any reason not to do this.
  • Be able to Scale Out - Designing a distributed, stateless and asynchronous application will go a long way here
  • Be Stateless - This one is not always possible, for many reasons, but it is worth doing, or at least worth minimizing state in clever ways. Think hard about this one and don't abuse state.
  • Use the Cloud for your Infrastructure - (note that I did not say virtualize only) - 'Nuff Said.
  • Be able to rollback anything - If you deploy it, you should be able to un-deploy it just as easily.  Oh database schema changes, how I love thee.
  • Test, test, test - Make sure your test process is solid.
  • Be highly available (N+1) - If you need 5 servers to provide the capacity you need, then you should have at least 6 servers at any given time. This is a simpler version of the more complex capacity planning that you can do on a per server/service basis, and it's very, very important.
  • Use the right technology for the job - If you need a relational database, by all means use one. If you only need a key/value store, then by all means do NOT use a relational database.  If you need data persistence to disk and complete data durability, then don't use memcached (not picking on memcached of course.. great tech there).  Use other technologies, properly configured, to give you what you need.  Put in non-technical terms, don't use a spoon to dig the foundation for a house.  That's just silly.
  • Build in failsafes (aka circuit breakers) - When something breaks, like access from the application server to the database, the application should be able to handle it gracefully.  Build this in!
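
To make the circuit breaker item a little more concrete, here is a minimal sketch in Python. Everything named here (CircuitBreaker, CircuitOpenError, query_database, fetch_page, and the threshold and cool-down values) is my own illustrative assumption rather than something from a particular library. The idea is simply that after a few consecutive failures the application stops hammering the broken dependency, fails fast for a while, and then lets one trial call through to see if things have recovered.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call because the circuit is open."""


class CircuitBreaker:
    """Hypothetical, single-threaded circuit breaker for illustration only."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a retry
        self.clock = clock                # injectable so it can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast without touching the dependency.
                raise CircuitOpenError("dependency marked as down")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success: close the circuit and forget past failures.
        self.failures = 0
        self.opened_at = None
        return result


def query_database():
    # Stand-in for a real database call that is currently failing.
    raise ConnectionError("database unreachable")


def fetch_page(breaker):
    # Degrade gracefully instead of surfacing the failure to the user.
    try:
        return breaker.call(query_database)
    except (CircuitOpenError, ConnectionError):
        return "cached or degraded response"
```

A real application would keep one breaker per dependency and would also need to think about thread safety, which this sketch ignores. The point is that the degraded response keeps transactions flowing while the database recovers.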

I'm sure this list can be tuned and improved.  I'm sure I will do so.  What I'm after is laying down the rules for building truly cloud native applications.  There has been some phenomenal work in this area over the years but it's only just beginning to be better understood and move more into the mainstream.  This is both frustrating and exciting!