[CentOS] Disk usage for small files in ext3 in CentOS 5

Wed Mar 11 21:29:06 UTC 2009
Filipe Brandenburger <filbranden at gmail.com>

Hello,

I noticed something unusual today.

If I run "du" on a small file (a couple of bytes) in CentOS 5, it
reports that the file is using 8 KB, while I was expecting 4 KB, which
is the block size I'm using.

I tried this on several CentOS 5 machines, both x86_64 and i386:

$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K	test.txt

If I do the same on a CentOS 4 machine:

$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r--  1 filbranden filbranden 5 Mar 11 17:25 test.txt
$ du -h test.txt
4.0K	test.txt
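
A minimal sketch for looking at the same numbers without du's rounding,
assuming GNU stat is available on both releases. The function name
usage_report and the temporary directory are only illustrative. It prints
the apparent size next to the allocated bytes, which is the figure du is
built from:

```shell
#!/bin/sh
# Print apparent size and allocated bytes for one file.
# GNU stat format sequences: %s = size in bytes, %b = blocks allocated,
# %B = size in bytes of each block counted by %b.
usage_report() {
    file=$1
    apparent=$(stat -c '%s' "$file")
    blocks=$(stat -c '%b' "$file")
    unit=$(stat -c '%B' "$file")
    echo "$file apparent=$apparent allocated=$((blocks * unit))"
}

# Demo on a throwaway copy of the 5-byte test file used above.
tmpdir=$(mktemp -d)
echo test > "$tmpdir/test.txt"
usage_report "$tmpdir/test.txt"
rm -rf "$tmpdir"
```

Running this on a CentOS 4 and a CentOS 5 machine should show whether the
extra 4 KB appears in the block count itself or only in how du reports it.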

On all machines I tested, both CentOS 4 and CentOS 5:

# tune2fs -l /dev/xxxxx
...
Block size:               4096
Fragment size:            4096

I could not find any differences that would explain the behaviour.
Have you seen this before? Can you reproduce it on your systems? Do
you know how to get the CentOS 4 behaviour?
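
Block size and fragment size are only two of the fields tune2fs prints.
A rough sketch for comparing a few more of them between the two releases
follows. The helper name fs_params, the field selection and the output
file names are my own assumptions, not a list of known culprits:

```shell
#!/bin/sh
# Keep a handful of layout-related fields from `tune2fs -l` output (stdin).
# The field list is a guess at what might matter; extend it as needed.
fs_params() {
    grep -E '^(Block size|Fragment size|Inode size|Inode count|Block count|Filesystem features):'
}

# Usage, as root on each machine (/dev/xxxxx is the placeholder used above):
#   tune2fs -l /dev/xxxxx | fs_params > centos5.txt
# then copy both files to one machine and compare:
#   diff centos4.txt centos5.txt
```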

More to the point: I'm migrating some data from CentOS 4 to CentOS 5,
around 70GB made up of millions of small files. I would like it to
still take 70GB, not 140GB. For now, I'm working around this issue by
passing "-T small" to mke2fs, but I'm not sure it will have the effect
I want, and I don't know what other impact (performance?) it might
have on my filesystem.
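
To check whether a test copy really ends up at double the size, something
like the following could total apparent and allocated bytes over a whole
tree. It is a sketch that assumes GNU find and awk. The function name
tree_usage and the path in the usage comment are hypothetical:

```shell
#!/bin/sh
# Sum apparent and allocated bytes of all regular files under a directory.
# GNU find -printf: %s = size in bytes, %b = allocated 512-byte blocks.
tree_usage() {
    find "$1" -type f -printf '%s %b\n' |
        awk '{ apparent += $1; allocated += $2 * 512 }
             END { printf "files=%d apparent=%.0f allocated=%.0f\n",
                          NR, apparent, allocated }'
}

# Usage (hypothetical path): run on a sample of the data on each filesystem.
#   tree_usage /path/to/sample
```

Comparing the allocated total for the same sample on the old and the new
filesystem gives a direct estimate of how much the 70GB would grow.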

Thanks,
Filipe