Hi list, this problem is already known, and I'm sorry to bother you if an acceptable workaround has already been discussed here. I ran into trouble with a 'grep something /var/log*', which failed with a "Memory exhausted" message. After some digging I found that the lastlog file in /var/log/ is 1.2T in size. This seems to come from nfsnobody's uid being 4294967294 on x86_64 systems (and -1 on i386), combined with lastlog reserving space for every uid (so from 0 to ...4294967294...hum..). Since I do not use NFS at all, can I just remove nfsnobody from /etc/passwd and start with a blank lastlog file (this looks like a great way to nuke my system)? Or do I need to explicitly exclude this file from all my searches? How do you x86_64 users deal with this? It seems most people don't see this as an issue.
the system: 2.6.9-11.ELsmp #1 SMP Fri May 20 18:25:30 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux CentOS 4
Thanks, Manuel
On Tue, 2005-09-06 at 16:20 +0200, Manuel BERTRAND wrote:
> Hi list, this problem is already known, and I'm sorry to bother you if an acceptable workaround has already been discussed here. I ran into trouble with a 'grep something /var/log*', which failed with a "Memory exhausted" message. After some digging I found that the lastlog file in /var/log/ is 1.2T in size. This seems to come from nfsnobody's uid being 4294967294 on x86_64 systems (and -1 on i386), combined with lastlog reserving space for every uid (so from 0 to ...4294967294...hum..). Since I do not use NFS at all, can I just remove nfsnobody from /etc/passwd and start with a blank lastlog file (this looks like a great way to nuke my system)? Or do I need to explicitly exclude this file from all my searches? How do you x86_64 users deal with this? It seems most people don't see this as an issue.
According to https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=149407#c22 there is a fix for this in nfs-utils-1.0.6-60.EL4, but I don't think it's the correct fix (meaning that, while it may solve the symptom, it doesn't solve the root problem).
Although the file is listed as being 1.2T, it's a sparse file: that figure is its apparent size, and only the blocks that have actually been written take up space on disk. The problem we're having is that grep and many other utilities don't handle sparse files well.
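To make the sparse-file point concrete, here is a minimal Python sketch that writes a single fixed-size record at offset uid * record_size, which is how a UID-indexed lastlog ends up huge but mostly empty. The 292-byte record layout (a 32-bit time, a 32-byte line, a 256-byte host) is my assumption about glibc's struct lastlog on x86_64, and write_lastlog_entry and sizes are hypothetical helper names, not part of any real tool.

```python
import os
import struct
import tempfile

# Assumed layout of struct lastlog on x86_64 glibc:
# int32 ll_time, char ll_line[32], char ll_host[256] -> 292 bytes.
LASTLOG_RECORD = struct.Struct("=i32s256s")


def write_lastlog_entry(path, uid, when, line=b"pts/0", host=b"localhost"):
    """Write one record at offset uid * record_size (hypothetical helper)."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        # Seeking past the end and writing leaves a hole before the record.
        f.seek(uid * LASTLOG_RECORD.size)
        f.write(LASTLOG_RECORD.pack(when, line, host))


def sizes(path):
    """Return (apparent size, bytes actually allocated on disk)."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "lastlog")
        # A small uid keeps the demo safe on any filesystem.
        write_lastlog_entry(path, 65534, 1126016400)
        apparent, allocated = sizes(path)
        print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

Under the same assumption, a record for uid 4294967294 would end at 4294967295 * 292 = 1,254,130,450,140 bytes, which is consistent with the 1.2T reported for lastlog.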
There are a few ways to deal with this. I'm presenting them in order, from "most correct" to "easiest to implement". "Correct" in this case means "most directly solving the root problem," and not necessarily "most practical."
1) Redesign the kernel's vfs layer to handle sparse files automatically even when accessed by programs that don't know how to handle sparse files. I'm not sure how this might be done. At the moment, several programs (including grep and tar) attempt to read the entire 1.2TB file, instead of just the non-zero bits. I'm not sure if this is a problem with grep & tar, with open(), or with the vfs.
2) Change the format of lastlog so it doesn't use a sparse file. This isn't trivial, but I think it's probably the best solution in the long run (unless #1 happens).
3) Make grep, tar, rpm, rsync, scp, cp, and so on handle sparse files better. This probably needs to be done anyway.
4) Remove the nfs-utils package, remove /var/log/lastlog, and recreate it. This should make lastlog much smaller, unless you have a lot of users on your system.
5) Exclude lastlog from any greps you do in /var/log.
For an end user, #5 is the easiest, followed by #4.
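As a rough illustration of option #5, the following Python sketch searches the regular files directly under a log directory while skipping lastlog by name. The function name grep_logs and its skip_names parameter are hypothetical; this is a stand-in for excluding the file from grep, not a replacement for it.

```python
import os
import re
import tempfile


def grep_logs(root, pattern, skip_names=("lastlog",)):
    """Search regular files directly under root, skipping the named files."""
    regex = re.compile(pattern.encode())
    hits = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if name in skip_names or not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            # Line-by-line reading is fine for text logs; a sparse file full
            # of zero bytes has no newlines, so it would be read as one
            # enormous "line", which is why lastlog is skipped above.
            for lineno, line in enumerate(f, 1):
                if regex.search(line):
                    text = line.rstrip(b"\n").decode(errors="replace")
                    hits.append((name, lineno, text))
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "messages"), "w") as f:
            f.write("kernel: something happened\nall quiet\n")
        with open(os.path.join(d, "lastlog"), "w") as f:
            f.write("something binary-ish\n")
        for hit in grep_logs(d, "something"):
            print(hit)
```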
References:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=64891
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=138676
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=144538
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=145305
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146214
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=149407
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=156809
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163273
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=164614
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=165058
On Tue, 2005-09-06 at 14:04, David Johnston wrote:
>> I was getting trouble with a 'grep something /var/log*' which caused the "Memory exhausted" message. With some deeper search I found the lastlog file in /var/log/ to be 1.2T sized.
> 2) Change the format of lastlog so it doesn't use a sparse file. This isn't trivial, but I think it's probably the best solution in the long run (unless #1 happens).
This is painful for most ways of backing up and cloning systems too. How about gdbm format as something that would work with minimal changes?
Thanks for your replies. I will try this one, since the server is not in production yet. What would happen if it were... a simple grep eating 2 GB of RAM....
David Johnston wrote:
> 4) Remove the nfs-utils package, remove /var/log/lastlog, and recreate it. This should make lastlog much smaller, unless you have a lot of users on your system.
Les Mikesell wrote:
>> 2) Change the format of lastlog so it doesn't use a sparse file. This isn't trivial, but I think it's probably the best solution in the long run (unless #1 happens).
> This is painful for most ways of backing up and cloning systems too. How about gdbm format as something that would work with minimal changes?
I don't get it. lastlog is in dbm format, and if it is converted to gdbm it won't be a sparse file anymore?
On Tue, 2005-09-06 at 19:40, Manuel BERTRAND wrote:
>>> 2) Change the format of lastlog so it doesn't use a sparse file. This isn't trivial, but I think it's probably the best solution in the long run (unless #1 happens).
>> This is painful for most ways of backing up and cloning systems too. How about gdbm format as something that would work with minimal changes?
> I don't get it. lastlog is in dbm format, and if it is converted to gdbm it won't be a sparse file anymore?
Gdbm files aren't sparse, and gdbm provides a backwards-compatible API for old dbm and ndbm, so it should work with minimal program changes. Of course the file format is incompatible, and everything that uses it would have to be changed at once.
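To sketch why a keyed database avoids the problem, here is a small Python example that stores one login record per uid under a key instead of at a uid-derived offset. It uses Python's dbm.dumb module purely as a stand-in for gdbm (it is not gdbm and not what lastlog does today); the record layout and the helper names record_login, last_login and total_size are assumptions for illustration.

```python
import dbm.dumb
import glob
import os
import struct
import tempfile

# Same assumed record layout as a lastlog entry: time, line, host.
RECORD = struct.Struct("=i32s256s")


def record_login(db_path, uid, when, line=b"pts/0", host=b"localhost"):
    """Store the record under the uid as a key, not at an offset."""
    with dbm.dumb.open(db_path, "c") as db:
        db[str(uid).encode()] = RECORD.pack(when, line, host)


def last_login(db_path, uid):
    """Return (time, line, host) for uid, or None if no record exists."""
    with dbm.dumb.open(db_path, "r") as db:
        raw = db.get(str(uid).encode())
    if raw is None:
        return None
    when, line, host = RECORD.unpack(raw)
    return when, line.rstrip(b"\0"), host.rstrip(b"\0")


def total_size(db_path):
    """Sum the sizes of the files dbm.dumb creates next to db_path."""
    return sum(os.path.getsize(p) for p in glob.glob(db_path + ".*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        db_path = os.path.join(d, "lastlog")
        record_login(db_path, 4294967294, 1126016400)
        print(last_login(db_path, 4294967294))
        print(f"database size: {total_size(db_path)} bytes")
```

With keyed storage the size on disk grows with the number of users who have logged in, not with the largest uid, so a uid of 4294967294 costs no more than a uid of 0.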