Thursday, April 23, 2009

Highly Available Storage (without high prices)

One of the most interesting themes I have been following at this year's MySQL Users Conference is how to create highly available storage volumes without spending a million dollars on SAN or NAS infrastructure from companies like EMC, Network Appliance, or IBM.

At least 3 options exist that I was not aware of before:

Amazon Elastic Block Store: as part of Amazon's EC2 web services, you can attach a virtual block-level device to your EC2 instance. You can either put a typical Linux filesystem on the device and access it with standard file system calls, or do raw I/O against the device with no filesystem at all. The data is stored in Amazon's cloud and is thus relatively highly available. As with all Amazon services, you only pay for what you use. I was quoted performance numbers around 100 MB/sec, which seems quite reasonable. For the moment you can only mount the storage on one instance at a time, but you could set up NFS between instances if you really wanted to.

Rackspace Virtualized Storage: I talked briefly with some guys from Rackspace, and they said they have a service backed by a Network Appliance NAS farm that gives hosting clients access to NetApp volumes on a rental basis. This sounds pretty cool in that you can have NetApp storage space without actually buying the hardware. NetApps are usually highly available, so you don't have to worry, as you do with commodity Linux boxes, that they may go down at any time. However, when I went to the Rackspace site I couldn't find the exact offering they were talking about, so this option needs some more research.

DRBD: DRBD seems to be a fairly popular product that I had not heard about until now. It gives you a volume on one machine that appears to be a standard filesystem volume but is actually replicated by DRBD to another machine. There seem to be a few modes: one makes your fsync calls block until all data is flushed to both the local disk and the remote disk, another blocks only until the data is in memory on the other machine but not yet flushed to disk, and so on. Choosing a mode, finding out the exact behavior of write and fsync under each mode, and comparing the relative performance of these combinations will be important details (hopefully with no devil in there). At their booth they were quoting throughput of around 100 MB/sec, which looked very similar to the I/O throughput you would get from a commodity box, but again this all depends on your config.
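
One rough way to compare write and fsync behavior across these options is to time them yourself on each mounted volume. The sketch below is only an illustration, not a DRBD or Amazon tool: the function name measure_fsync_writes and its parameters are my own invention, and the temporary directory stands in for whatever mount point you actually want to test. It writes fixed-size blocks, calls fsync at a chosen interval, and reports overall throughput plus the time spent in each fsync.

```python
import os
import tempfile
import time


def measure_fsync_writes(path, block_size=1024 * 1024, blocks=16, sync_every=1):
    """Write `blocks` blocks of `block_size` bytes to `path`, calling fsync
    after every `sync_every` blocks.

    Returns (bytes_written, elapsed_seconds, fsync_latencies).
    All names and defaults here are hypothetical, chosen for illustration.
    """
    payload = b"\0" * block_size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for i in range(1, blocks + 1):
            # os.write may write fewer bytes than requested, so loop
            # until the whole block has been handed to the kernel.
            written = 0
            while written < block_size:
                written += os.write(fd, payload[written:])
            if i % sync_every == 0:
                # Time only the fsync call: this is where a synchronous
                # replication mode would be expected to show up.
                t0 = time.perf_counter()
                os.fsync(fd)
                latencies.append(time.perf_counter() - t0)
        if blocks % sync_every != 0:
            # Flush any trailing blocks so the elapsed time covers
            # durable writes only.
            t0 = time.perf_counter()
            os.fsync(fd)
            latencies.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return block_size * blocks, elapsed, latencies


if __name__ == "__main__":
    # Assumption: in practice, replace the temporary directory with a
    # directory on the volume under test (EBS, NetApp, DRBD, local disk).
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "probe.bin")
        total, elapsed, lat = measure_fsync_writes(target)
        print("throughput: %.1f MB/sec" % (total / elapsed / 1e6))
        print("worst fsync: %.2f ms" % (max(lat) * 1000))
```

Running the same script against a plain local disk and against a replicated volume, with the same block size and sync interval, should give a first impression of what each mode costs. A small probe like this is easily skewed by caches and by the size of the test file, so treat the numbers as a sanity check and not as a benchmark.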

More info is available in these links: