[CentOS] Question about optimal filesystem with many small files.

Mon Jul 13 15:02:21 UTC 2009
JohnS <jses27 at gmail.com>

On Mon, 2009-07-13 at 05:49 +0000, oooooooooooo ooooooooooooo wrote:

> >It is 1024 chars long. Witch want still help.
> I'm usng mysam and according to: http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
> "The maximum key length is 1000 bytes. This can also be changed by changing the source and recompiling. For the case of a key longer than 250 bytes, a larger key block size than the default of 1024 bytes is used. "
> 
> >I would not store images in either one
> as your SELECT LIKE and Random will kill it. 
> 
> Well, I think that this can be avoided, using just searches in teh key fields should not give these issues. Does somebody have experience storing a large amount of medium (1KB-150KB) blob objects in mysql?

True

An option would be to encode them to Base64 on INSERT but if you Index
all of you BLOBS on INSERT really there should be no problem. Besides
150Kb is not a big for a BLOB. Consider 20MB to 100MB with multiple
joins on MSSQL, 64Bit although. Apparently size is based on the maximum
amount of memory the client has. VARBLOB apparently has no limit per
docs. As doing this on MySQL I can not relate to. I can on DB2 and
MSSQL. I can say you can rival the 32Bit MSSQL performance by at least
15 percent. I can only say that I have experiance with raw DB
predictions in Graphing. Edge and Adjacency Modeling on MySQL.

What I see slowing you down is the TQSL and SPROCS. The dll for the md5
I posted earlier will scale to 1000s of inserts at the time. If speed is
really your essence then use RAW Partitions for the DB and RAM. Use the
MySQL Connector or the ODBC or you will hit size limits on INSERT and
SELECT.

> >However I have not a clue that this is even doable in MySQL.
> 
> In mysql there is already a MD5 funtion: http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_md5

Yes, I was informed that a call from a SPROC to "md5()" would do the
trick and take the load of the client. At least that was my intent of
the idea to balance the load. That is if this is client/server.

I do wonder about your memory allocation and disk. It is all about the
DB design. Think about a Genealogy DB. Where do you end design? You
don't. Where does predictions end? They don't.

John