Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
If I do the same on a CentOS 4 machine:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:25 test.txt
$ du -h test.txt
4.0K test.txt
On all machines I tested, both CentOS 4 and CentOS 5:
# tune2fs -l /dev/xxxxx
...
Block size: 4096
Fragment size: 4096
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
More to the point: I'm migrating some data from CentOS 4 to CentOS 5, around 70GB made up of millions of small files. I would like it to still take 70GB, not 140GB. For now, I'm working around this issue by passing "-T small" to mke2fs, but I'm not sure if it's going to have the effect I want, and I'm not sure about any other impact (performance?) it might have on my filesystem.
Thanks, Filipe
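One way to check whether a tree of small files really doubles in size is to compare the sum of the file sizes with the space actually allocated on disk. The following is a minimal sketch, assuming GNU coreutils du (for --apparent-size and --block-size); the function name usage_report is made up for illustration.

```shell
#!/bin/sh
# Sketch: compare apparent size (sum of file lengths) with allocated size
# (blocks actually reserved on disk) for a directory tree.
# Assumes GNU du; usage_report is a hypothetical helper name.
usage_report() {
    dir=$1
    # Sum of file lengths, in bytes.
    apparent=$(du -s --apparent-size --block-size=1 -- "$dir" | cut -f1)
    # Space actually allocated, in bytes (rounded up to whole blocks per file).
    allocated=$(du -s --block-size=1 -- "$dir" | cut -f1)
    echo "apparent=$apparent allocated=$allocated"
}

# Example call (the path is a placeholder):
# usage_report /path/to/tree
```

Running this on the same tree on the old and the new machine shows whether the allocated figure changes while the apparent figure stays the same.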
Filipe Brandenburger wrote:
Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
<snip>
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
Strange. I can't reproduce this on an x86_64 CentOS 5 machine:
[nthierry@localhost ~]$ echo test >test.txt
[nthierry@localhost ~]$ ls -l test.txt
-rw-rw-r-- 1 nthierry nthierry 5 Mar 11 22:44 test.txt
[nthierry@localhost ~]$ du -h test.txt
4.0K test.txt
I'm pretty sure I did nothing special when making the fs.
HTH
Nicolas Thierry-Mieg wrote:
Filipe Brandenburger wrote:
Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
<snip>
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
Strange. I can't reproduce this on an x86_64 CentOS 5 machine:
[nthierry@localhost ~]$ echo test >test.txt
[nthierry@localhost ~]$ ls -l test.txt
-rw-rw-r-- 1 nthierry nthierry 5 Mar 11 22:44 test.txt
[nthierry@localhost ~]$ du -h test.txt
4.0K test.txt
I'm pretty sure I did nothing special when making the fs.
HTH
I did the same test, but my "du -h test.txt" gives "8.0K test.txt". I am running Linux mirrored drives; could it be that it is actually using 2 times 4.0K, once per drive?
Rob
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
On Wed, March 11, 2009 5:51 pm, Nicolas Thierry-Mieg wrote:
Filipe Brandenburger wrote:
Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
<snip>
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
Strange. I can't reproduce this on an x86_64 CentOS 5 machine:
[nthierry@localhost ~]$ echo test >test.txt
[nthierry@localhost ~]$ ls -l test.txt
-rw-rw-r-- 1 nthierry nthierry 5 Mar 11 22:44 test.txt
[nthierry@localhost ~]$ du -h test.txt
4.0K test.txt
I'm pretty sure I did nothing special when making the fs.
I just did it on a 32 bit machine and got 4.0K. The file system was created using default parameters.
Marko
Nicolas Thierry-Mieg wrote:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
<snip>
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
Strange. I can't reproduce this on an x86_64 CentOS 5 machine:
[nthierry@localhost ~]$ echo test >test.txt
[nthierry@localhost ~]$ ls -l test.txt
-rw-rw-r-- 1 nthierry nthierry 5 Mar 11 22:44 test.txt
[nthierry@localhost ~]$ du -h test.txt
4.0K test.txt
I'm pretty sure I did nothing special when making the fs.
Doublecheck it with:
tune2fs -l /dev/hda1 |grep 'Block size'
Substitute your partition for /dev/hda1 above.
Filipe Brandenburger wrote:
Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
Odd. I'm not seeing this on CentOS 5.2:
$ echo test >test.txt
$ ls -ls test.txt
4 -rw-rw-r-- 1 rnichols rnichols 5 Mar 11 16:57 test.txt
$ du -h test.txt
4.0K test.txt
$ stat test.txt
  File: `test.txt'
  Size: 5           Blocks: 8          IO Block: 4096   regular file
Device: 341h/833d   Inode: 4325491     Links: 1
Access: (0664/-rw-rw-r--)  Uid: (  500/rnichols)   Gid: (  500/rnichols)
Access: 2009-03-11 16:57:18.000000000 -0500
Modify: 2009-03-11 16:57:18.000000000 -0500
Change: 2009-03-11 16:57:18.000000000 -0500
$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hdb1 487397840 320075816 142902824 70% /xstore
$ su - -c "tune2fs -l /dev/hdb1" | egrep 'features|size'
Password:
Filesystem features: has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Block size: 4096
Fragment size: 4096
Inode size: 128
Everything exactly as expected.
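A note on reading the stat output above: the "Blocks" figure is counted in 512-byte units, not filesystem blocks, so "Blocks: 8" is 8 x 512 = 4096 bytes, which matches the 4.0K from du. A file that du reports as 8.0K would show "Blocks: 16". The sketch below does that multiplication; it assumes GNU coreutils stat, and allocated_bytes is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the number of bytes actually allocated to a file.
# Assumes GNU stat: %b = allocated blocks, %B = size in bytes of each
# block counted by %b (normally 512, independent of the fs block size).
allocated_bytes() {
    info=$(stat -c '%b %B' -- "$1") || return 1
    blocks=${info% *}   # text before the space
    unit=${info#* }     # text after the space
    echo $(( blocks * unit ))
}

# Demonstration on a throwaway file.
f=$(mktemp)
echo test > "$f"
echo "apparent: $(stat -c %s -- "$f") bytes, allocated: $(allocated_bytes "$f") bytes"
rm -f "$f"
```

The allocated figure depends on the filesystem, so the demonstration prints whatever the local system reports rather than a fixed value.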
On Wed, 2009-03-11 at 17:29 -0400, Filipe Brandenburger wrote:
Hello,
I noticed something unusual today.
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
I tried this on several CentOS 5 machines, both x86_64 and i386:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:24 test.txt
$ du -h test.txt
8.0K test.txt
If I do the same on a CentOS 4 machine:
$ echo test >test.txt
$ ls -l test.txt
-rw-rw-r-- 1 filbranden filbranden 5 Mar 11 17:25 test.txt
$ du -h test.txt
4.0K test.txt
On all machines I tested, both CentOS 4 and CentOS 5:
# tune2fs -l /dev/xxxxx
...
Block size: 4096
Fragment size: 4096
I could not find any differences that would explain the behaviour. Have you seen this before? Can you reproduce it on your systems? Do you know how to get the CentOS 4 behaviour?
More to the point: I'm migrating some data from CentOS 4 to CentOS 5, around 70GB made up of millions of small files. I would like it to still take 70GB, not 140GB. For now, I'm working around this issue by passing "-T small" to mke2fs, but I'm not sure if it's going to have the effect I want, and I'm not sure about any other impact (performance?) it might have on my filesystem.
I'm a gambler, so I'll bet on this. Very large disks? If so, it may be that some of the tunables specify two blocks per "fragment" or the bytes-per-inode specifies more than 4K. I've been able, in the past, to affect things like this by tuning the number of i-nodes up/down when making the file system. Generally though, I'm reducing the number as there is a lot of space that can be gained since normally there will be 1 per block, IIRC. Since my desktop FS doesn't experience that much growth, and lots of the files are large, this is safe. YMMV.
The output of the tune2fs command might give some hints.
Also, using mke2fs with the "-n" parameter will tell you what it would do if you were to (re) make the file system.
<snip sig stuff>
HTH
Hi,
On Wed, Mar 11, 2009 at 17:29, Filipe Brandenburger <filbranden@gmail.com> wrote:
If I "du" a small file (couple of bytes) in CentOS 5, it tells me the file is using 8kb, while I was expecting 4kb which is the block size I'm using.
Found it! It's not related to CentOS 4 or 5 (I found a C4 machine on which small files took 8kb of disk space and a C5 machine on which small files took 4kb). It's related to SELinux being enabled or not. Coincidentally, most of my C4 machines had SELinux disabled and most of my C5 machines have it enabled. I have now dug out some machines with the opposite configuration and verified this.
I believe if SELinux is enabled, it will use extended attributes to store the file's SELinux context (you can see it with "ls -Z", for some reason you cannot see it with "getfattr -d", I was expecting that to be possible). I guess when the file has extended attributes it will use an additional block to store them. That basically doubles the storage requirements if you have millions of tiny files...
ACLs would probably have the same effect (I did not test it though).
I wonder if there is a way to override this, for instance by mounting a filesystem and disabling extended attributes, specifying the SELinux context for all the files in the mount options or something. I know that is possible for NFS, but not for local filesystems... I'll dig in, I'll let you know if I find anything.
Thanks! Filipe
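On the getfattr question above: by default "getfattr -d" only dumps attributes in the user.* namespace, and the SELinux context lives in security.selinux, so it is filtered out unless a match pattern is given. A minimal sketch, assuming the attr package's getfattr is installed; show_all_xattrs is a made-up helper name.

```shell
#!/bin/sh
# Sketch: dump every extended attribute of a file, including the
# security.* namespace that a plain "getfattr -d" skips.
# show_all_xattrs is a hypothetical helper name.
show_all_xattrs() {
    if ! command -v getfattr >/dev/null 2>&1; then
        echo "getfattr not installed (it is part of the attr package)"
        return 0
    fi
    # "-m -" matches all attribute names; the default pattern is "^user\.".
    out=$(getfattr -d -m - -- "$1" 2>/dev/null)
    if [ -n "$out" ]; then
        echo "$out"
    else
        echo "no extended attributes visible on $1"
    fi
}

# Example call (file name taken from the test earlier in the thread):
# show_all_xattrs test.txt
```

To look at just the SELinux label, "getfattr -n security.selinux test.txt" should work as well.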
Filipe Brandenburger wrote:
Found it! It's not related to CentOS 4 or 5 (I found a C4 machine on which small files took 8kb of disk space and a C5 machine on which small files took 4kb). It's related to SELinux being enabled or not. Coincidentally, most of my C4 machines had SELinux disabled and most of my C5 machines have it enabled. I have now dug out some machines with the opposite configuration and verified this.
I believe if SELinux is enabled, it will use extended attributes to store the file's SELinux context (you can see it with "ls -Z", for some reason you cannot see it with "getfattr -d", I was expecting that to be possible). I guess when the file has extended attributes it will use an additional block to store them. That basically doubles the storage requirements if you have millions of tiny files...
It shouldn't be doing that. Was this an old filesystem, originally created without security attributes? What does tune2fs show for the inode size? On my Fedora 10 laptop, where the filesystem was originally set up with SELinux attributes, the inode size is 256 bytes and the security attributes are stored in the inode itself. On my systems without SELinux, the inode size is 128 bytes, so the penalty is the additional 128 bytes per inode, not 4K per file.
If you run debugfs on the partition and use its 'stat' command you can see where the security attributes are stored.
AFAIK if you are running SELinux there is no way to keep it out of any filesystem capable of supporting extended attributes.
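The inode size check suggested above can be scripted. This is only a sketch: it pulls the "Inode size" line out of tune2fs -l output and applies the rule of thumb from this message, namely that 128-byte inodes have no spare room for attributes while larger inodes can hold small ones in the inode itself. The function names are made up, and the sample input is taken from the tune2fs output posted earlier in the thread.

```shell
#!/bin/sh
# Sketch: read "tune2fs -l" output on stdin and print the inode size.
# inode_size and describe_inode are hypothetical helper names.
inode_size() {
    awk '/^Inode size:/ { print $3; exit }'
}

# Rule of thumb from the thread: 128-byte inodes leave no room for
# in-inode extended attributes; larger inodes can store small ones.
describe_inode() {
    if [ "$1" -gt 128 ]; then
        echo "inode size $1: small xattrs can be stored in the inode"
    else
        echo "inode size $1: xattrs need a separate block"
    fi
}

# Sample input copied from the tune2fs output quoted in this thread.
size=$(printf 'Block size: 4096\nFragment size: 4096\nInode size: 128\n' | inode_size)
describe_inode "$size"

# On a real system (as root; the device name is a placeholder):
# tune2fs -l /dev/xxxxx | inode_size
```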
Robert Nichols wrote:
Filipe Brandenburger wrote:
Found it! It's not related to CentOS 4 or 5 (I found a C4 machine on which small files took 8kb of disk space and a C5 machine on which small files took 4kb). It's related to SELinux being enabled or not. Coincidentally, most of my C4 machines had SELinux disabled and most of my C5 machines have it enabled. I have now dug out some machines with the opposite configuration and verified this.
I believe if SELinux is enabled, it will use extended attributes to store the file's SELinux context (you can see it with "ls -Z", for some reason you cannot see it with "getfattr -d", I was expecting that to be possible). I guess when the file has extended attributes it will use an additional block to store them. That basically doubles the storage requirements if you have millions of tiny files...
It shouldn't be doing that.
In any case, I can confirm that selinux is disabled on the x86_64 C5 box where I tested yesterday and reported 4kb usage. And I see the same on another similar box (4kb file, selinux disabled on x86_64 C5).
Filipe Brandenburger wrote on Wed, 11 Mar 2009 17:29:06 -0400:
$ du -h test.txt 8.0K test.txt
And what does just "du test.txt" show, i.e. without the human-readable "translation"?
Kai