Just wondering if anyone else has come across an issue where files cached in memory appear to become 'corrupted' - for example, on one workstation, I've just had the issue:
# yum
Traceback (most recent call last):
  File "/usr/bin/yum", line 4, in ?
    import yum
  File "__init__.py", line 36, in ?
  File "config.py", line 34, in ?
  File "repos.py", line 29, in ?
  File "repoMDObject.py", line 18, in ?
strace'ing the process didn't show up anything obvious.
However, I ran a simple process that just grabs memory - which has the side effect of 'flushing' caches from memory.
After doing this, yum ran fine ...
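For illustration, a "memory grabber" of the kind described above might look like the sketch below. This is not the actual program that was run; the function name `grab_memory` and the chunk size are made up for the example. The idea is simply to allocate a lot of memory and write to every page, so the kernel has to back it with real RAM and, under pressure, evict clean page-cache pages to make room.

```python
import mmap

PAGE = mmap.PAGESIZE  # page size of the running system (typically 4096)


def grab_memory(total_bytes, chunk_bytes=64 * 1024 * 1024):
    """Allocate about total_bytes and touch every page of it.

    Hypothetical helper: returns the list of buffers so the caller
    decides how long the memory stays allocated.
    """
    chunks = []
    grabbed = 0
    while grabbed < total_bytes:
        size = min(chunk_bytes, total_bytes - grabbed)
        buf = bytearray(size)  # zero-filled, possibly not yet backed by RAM
        # Write one byte per page so each page is really allocated.
        for offset in range(0, size, PAGE):
            buf[offset] = 1
        chunks.append(buf)
        grabbed += size
    return chunks
```

Something like `held = grab_memory(3 * 1024**3)` followed by `del held` would grab and then release about 3GB. The amount needed to push a given file out of the cache depends on how much RAM the box has, so treat the figure as a guess.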
I've seen this issue quite a few times with various applications and on different machines - and flushing the caches in this way fixes the issue ...
OK, it could be a real memory problem, but given the number of times I see this issue, on machines that are otherwise fine, it makes me wonder if it could be a cache issue ...
All the machines are running a CentOS4.4 based kernel (I plan to move to a CentOS4.6 based kernel when it is out). The kernel is not a vanilla CentOS4.4 kernel - it includes a few small non-standard patches - but nothing that touches the VFS layer (AFAIK). The issue occurs on both i686 and x86_64 boxes. All machines have at least 4GB of memory. They also use the nVidia binary-only module.
Anyone seen issues like this?
Thanks
James Pearson
James Pearson wrote:
Just wondering if anyone else has come across an issue where files cached in memory appear to become 'corrupted' - for example, on one workstation, I've just had the issue:
There is a kernel bug for this kind of problem but it is for AMD x86_64 only: http://bugzilla.kernel.org/show_bug.cgi?id=7768 . Since you are reporting this for i686 as well, I'm not sure it's the same thing. The symptom is that an entire 4k cached file block is trashed. This was fixed upstream for 5.x: https://bugzilla.redhat.com/show_bug.cgi?id=238709. I cannot tell if a similar patch has gone into a 4.x kernel: the appropriate RedHat bug (https://bugzilla.redhat.com/show_bug.cgi?id=223238) is unfortunately private. Someone could look at the latest kernel sources and see.
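One way to check whether a single 4k block is being trashed is to hash a suspect file block by block, flush the cache, hash it again and compare. The sketch below assumes a 4096-byte block size; the helper names `block_digests` and `changed_blocks` are invented for the example and do not come from either bug report.

```python
import hashlib

BLOCK = 4096  # assumed cache block size, per the symptom described above


def block_digests(path, block_size=BLOCK):
    """Return one SHA-1 hex digest per block of the file at path."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha1(block).hexdigest())
    return digests


def changed_blocks(before, after):
    """Return the indices of blocks whose digests differ between two runs."""
    changed = [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
    # Blocks present in only one of the two runs also count as changed.
    shorter, longer = sorted((len(before), len(after)))
    changed.extend(range(shorter, longer))
    return changed
```

If the digests taken while the problem is showing differ from those taken after the cache has been flushed, and the difference is confined to one block index, that would at least match the symptom in the kernel bug.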
Dan
Thanks for the info - the machine on which I recently noticed this 'yum' issue is a Supermicro X5DA8 with 2 x 3GHz (32 bit) Xeons, so it is not the AMD x86_64 problem that the kernel bugzilla entry seems to be about ...
Anyway, I'll keep poking about ...
Thanks
James Pearson