BLOBs have performance limits due to the way they are currently handled by mysql. First I will give you a little background info, then describe the problem, and potential solutions.
BLOB is an acronym for Binary Large Object. It is a datatype available in Mysql in 4 different sizes. There are also 4 text data types that are almost the same as BLOBs accept they contain string information instead of binary information. Text also has different comparison characteristics than BLOBs since it is a string oriented type. BLOBs can be used to store JPEG files or any other large binary data. BLOB data types come in 4 sizes: tiny, regular, medium, and long which have size limitations respectively of 25 bytes, 64 KB, 16 MB, and 4GB.
Ok, so here is the issue: when a mysql storage engine returns a BLOB from disk to mysql it has to provide a contiguous buffer where the data is stored and a pointer to that buffer. In other words "The entire BLOB has to fit into memory at once!!" even for a 4GB long blob. Now imaging you have 100 different 4GB long BLOB files (let's imagine movie files) and there are select statements being run simultaneously to retrieve all 100 files. Your storage engine would need to allocate 400 GB of memory to put all those BLOBs into. Clearly, this is not scalable since boxes today cap out around 64GB of ram.
So what is the solution? Conceptually, the storage engine needs to give the data back to the user in chunks. Mysql could have some fixed buffers, say 100 MB large and the storage engine could pump data into the buffers as the last chunk was sent to the client over the network. Seems pretty straight forward, but it does require changes in mysql server and maybe mysql client as well as all of the storage engines. I wonder if this is already on Mysql's roadmap (anyone know, please leave a comment).
One storage engine did try work around the problem. The PBXT storage engine has created a streaming BLOB infrastructure that seems to be pretty generic. The problem is it requires changes to the mysql client and server in addition to the storage engine and those changes don't seem to be integrated into the main line mysql development branch as far as I can tell.
Any comments or additional ideas on the subject please post a comment. Thanks,
Logs and more logs, who has time to read them ?
5 weeks ago