Types of Partitioning for Database Scalability

Yesterday I attended a good talk at ZendCon 2007 by Eli White.  One of the better pieces of take away was his explanation of the three different types of partitioning for your data at the database layer.  These are just my notes from the talk.  This is not great literature.
The three types mentioned were:

  1. Horizontal
  2. Vertical
  3. Application Level

Horizontal Partitioning

The example given would be something like splitting large tables in two (like 50% rows in 1DB, 50% in a second)

Several types of Horizontal Partitioning schemes were mentioned.

Range Based - partition on some range of items
Date Based - partition based on date
Interlaced - hash them out based on some modulus (lots of buckets)
User Based - someusers on one partition others on another

Another example called out was to possibly keep a last 30 days table, and everything table somewhere for when you need it.

Vertical Partitioning - Moving Various columns of your table into different tables

Various types:
  - move rarely used columns into aux tables
  - move often empty columns into aux tables
  - move columns that are not used in where clauses

Application Level Partitioning (I also call this functional partitioning sometimes)
  - move single tables to specific servers
  - move groups of related tables together to allow joining
  - move entire functions (like search) to their own tables/databases

Making choices on where to partition has a lot to do with how and where you want to break joins.  So, consider this carefully.

Automate the understanding of your partitions(shards) or you'll be killing yourself w/ complexity.  Write code to keep up with it or even create a specilized database to store the relationship data.