I'm sure there are some very good answers to the "Why Bother?" portion of this post's title. But this post is more or less a response to Scaling out MySQL from Nati Shalom's blog. The argument is essentially that you should augment the relational database layer with an IMDG (In Memory Data Grid) for transactional activities and use the relational DB as a back-end persistent data store. It is also a nice rundown of the various things one might have to do to enable a MySQL relational database layer to scale and continue to perform as load increases to insane levels where vertical scaling becomes impossible or cost prohibitive.
In reading that post I just could not stop thinking about all the hoops we jump through to get around the fact that current implementations of relational databases just do not seem to be able to provide the performance and scale that successful modern web applications demand.
Using in memory data grids like Coherence or in memory distributed cache technology like memcached gives me the scalability and performance I need to handle modern web application transaction loads on the systems I design. I use them for a couple of reasons.
1. Protect the database from meltdown
2. Enable shared access to data across horizontally scalable clusters of machines
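To make the first reason concrete, here is a minimal sketch of the cache-aside pattern: reads check the cache first and only fall through to the database on a miss. Everything named here is an assumption for illustration. The `CacheAsideStore` class, the `users` table, and the `db_reads` counter are hypothetical, a plain Python dict stands in for a memcached client, and an in-memory SQLite database stands in for MySQL. It is not how any particular production system is built.

```python
import sqlite3


class CacheAsideStore:
    """Hypothetical helper: check the cache first, fall back to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # assumption: a dict standing in for a memcached client
        self.db_reads = 0  # counts reads that actually reach the database

    def get_user(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            # Cache hit: the database is never touched.
            return self.cache[key]
        # Cache miss: query the database, then populate the cache.
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        name = row[0] if row else None
        if name is not None:
            self.cache[key] = name
        return name

    def set_user(self, user_id, name):
        # Write to the database, then invalidate the cached copy so the
        # next read reloads the fresh value.
        self.conn.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        self.conn.commit()
        self.cache.pop(f"user:{user_id}", None)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    store = CacheAsideStore(conn)
    store.set_user(1, "alice")
    for _ in range(100):
        store.get_user(1)
    # Number of the 100 reads that reached the database.
    return store.db_reads
```

In a real deployment the dict would be replaced by a shared cache that every web server in the cluster talks to, which is what covers the second reason as well. The sketch also skips expiry and concurrent writers, both of which matter under real load.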
I have considered that the work being done on columnar databases like Vertica might be interesting to apply to web applications but I have not had a chance to really dig into that idea.
So, because of the limitations of my primary permanent relational data store, I am forced to take the transactions out of the database. That makes me ask, over and over again, why I need the relational database at all when I often don't use or need referential integrity (I see DBAs shivering everywhere when I say that). I really think that things like Mnesia, CouchDB, SimpleDB, HBase, Bigtable, and other technologies along those lines are coming in fast and furious to replace the relational database in its entirety as the persistent data store. This is especially true if you need to do heavy-lifting data mining of the data store or fancy things like Rackspace's log parsing with Hadoop or the NYT creating 11 million PDFs in 24 hours.
Resources:
Scaling Out MySQL by Nati Shalom
http://natishalom.typepad.com/nati_shaloms_blog/2008/03/scaling-out-mys.html
Vertica
http://www.vertica.com/
Memcached
http://www.productionscale.com/display/Search?searchQuery=memcached&moduleId=1481658
http://www.danga.com/memcached/
Oracle Coherence
http://www.oracle.com/technology/products/coherence/index.html