[CentOS] Disk usage incorrectly reported by du

Wed Mar 19 12:33:53 UTC 2014
zGreenfelder <zgreenfelder at gmail.com>

On Wed, Mar 19, 2014 at 8:14 AM, Radu Radutiu <rradutiu at gmail.com> wrote:
> I have an ext4 filesystem for which the reported disk usage is not
> correct. I noticed the discrepancy after I rsync-ed the content to
> another filesystem and saw that the used space on the target is almost
> double the size reported on the source.
> Both machines are running the same software, with the same kernel version
> and the same coreutils version (which I later upgraded to the latest
> available version).
> Both filesystems are clean (verified with fsck.ext4).
> No sparse files.
> After further investigation I think that the problem is most likely on the
> source machine.
> Here is the du output for one directory exhibiting the problem:
>
> #du -h |grep \/51
> 201M    ./51/msg/8
> 567M    ./51/msg/9
> 237M    ./51/msg/6
> 279M    ./51/msg/0
> 174M    ./51/msg/10
> 273M    ./51/msg/2
> 341M    ./51/msg/7
> 408M    ./51/msg/4
> 222M    ./51/msg/11
> 174M    ./51/msg/5
> 238M    ./51/msg/1
> 271M    ./51/msg/3
> 3.3G    ./51/msg
> 3.3G    ./51
>
> after changing the directory and running du again I get different numbers
>
> #cd 51
> #du -h
> 306M    ./msg/8
> 676M    ./msg/9
> 351M    ./msg/6
> 338M    ./msg/0
> 347M    ./msg/10
> 394M    ./msg/2
> 480M    ./msg/7
> 544M    ./msg/4
> 407M    ./msg/11
> 312M    ./msg/5
> 326M    ./msg/1
> 377M    ./msg/3
> 4.8G    ./msg
> 4.8G    .
>
> Do you have any idea what could cause this behaviour?

So you have software creating files on machines A and B, synchronized
from A to B, and now B is using 2x the space. Was the software running
on B when you did the sync? I've seen similar things happen on all
Unix systems when running programs keep file handles open and you then
overwrite their opened files: the space held by the old copy is not
released until the last handle is closed. To fix it you have to make
the programs close and re-open their files. With well-written programs
you can do that via a signal or some other trigger mechanism; others
will need to be restarted. Often it's easier to just schedule a reboot
and restart everything rather than wade through all the individual
process shutdowns and restarts, and the time you'll spend affecting
production processes, but YMMV.
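
To illustrate the open-handle effect, here is a minimal Python sketch,
assuming a local POSIX filesystem. The function name and the payload
are made up for the example and have nothing to do with your setup. It
removes a file's name while a handle is still open and shows that the
data is still there behind the handle:

```python
import os
import tempfile


def unlinked_but_open_demo(payload: bytes = b"x" * 4096):
    """Unlink a file while keeping a handle open.

    Returns (link_count, size, data_still_readable, name_still_exists)
    as observed after the unlink but before the handle is closed.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                  # the directory entry is gone
        info = os.fstat(fd)              # the inode is still reachable via fd
        os.lseek(fd, 0, os.SEEK_SET)
        readable = os.read(fd, len(payload)) == payload
        return info.st_nlink, info.st_size, readable, os.path.exists(path)
    finally:
        os.close(fd)                     # only now can the space be released


if __name__ == "__main__":
    print(unlinked_but_open_demo())
```

Space held this way is normally visible in df rather than du, since du
only walks names that still exist. On Linux, `lsof +L1` should list
open files whose link count has dropped to zero, which is one way to
find the processes that need a restart.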



-- 
Even the Magic 8 ball has an opinion on email clients: Outlook not so good.