[CentOS] gigantic memory leak in Clock Applet...

Mon Jan 7 14:08:36 UTC 2013
ken <gebser at mousecar.com>

> On Sun, Jan 06, 2013 at 06:23:20PM -0500, ken wrote:
>> On 01/06/2013 05:18 PM fred smith wrote:
>>> On Sun, Jan 06, 2013 at 02:43:09PM -0500, ken wrote:
>>>> On 01/06/2013 09:55 AM fred smith wrote:
>>>>> On Sun, Jan 06, 2013 at 06:33:07AM -0500, ken wrote:
>>>>>> Fred,
>>>>>>
>>>>>> Also running an up-to-date 5.8 but with just 2G of RAM, clock-applet
>>>>>> consumes the following:
>>>>>>
>>>>>> PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>> 4133 me  15   0 29568 3748 2944 S  0.0  0.2 190:51.33 clock-applet
>>>>>>
>>>>>> My uptime at the moment is coming on 68 days.  Over time the %CPU field
>>>>>> may flicker up to 0.3 or even 0.7, but the RES column and others are
>>>>>> steady at the numbers you see.  I should add that all Preferences which
>>>>>> we'd expect to consume more resources (e.g., display seconds, 12-hour
>>>>>> time) are on.
>>>>>>
>>>>>> Do you use evolution?
>>>>>
>>>>> no, I have never found it to my liking.
>>>>>
>>>>>>
>>>>>> KDE, Gnome, or other WM?
>>>>>
>>>>> gnome.
>>>>
>>>> I don't know what to tell you then because, like you, I use gnome but
>>>> not evolution.  So our systems-- what of them which are directly related
>>>> to clock-applet-- are much the same, yet you have a memory problem with
>>>> clock-applet which I don't.
>>>
>>> here's what top reports today (clock-applet has not been restarted since
>>> the event mentioned in my original posting):
>>>
>>>     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> 11159 fredex    16   0  263m 149m  10m S  0.3  3.8   1:36.87 clock-applet
>
> it's now up to "156m". :(
>
>>>
>>> in which I note it is now up to "149m".
>>>
>>>>
>>>> Here are some items to compare:
>>>>
>>>> # rpm -q gnome-panel
>>>> gnome-panel-2.16.1-7.el5
>>>> # ll /usr/libexec/clock-applet
>>>> -rwxr-xr-x 1 root root 88048 May 24  2008 /usr/libexec/clock-applet
>>>> # md5sum /usr/libexec/clock-applet
>>>> 9d21ca21a0e99ad26aa10e1cd5b42024  /usr/libexec/clock-applet
>>>>
>>> # rpm -q gnome-panel
>>> gnome-panel-2.16.1-7.el5
>>> # ll /usr/libexec/clock-applet
>>> -rwxr-xr-x 1 root root 88048 May 24  2008 /usr/libexec/clock-applet
>>> # md5sum /usr/libexec/clock-applet
>>> 2bc9a73a5251d1b4747ec133839412b7  /usr/libexec/clock-applet
>>>
>>> it's the same version and size as yours, but the md5sum differs. have
>>> you perhaps disabled prelink? (I don't recall that I have ever done so)
>>> It's not obvious to me what other (legitimate) event would account for
>>> the difference in checksum.
>>
>> Take a look in /etc/sysconfig/prelink.  At the top it should tell you if
>> you've got prelink on.  You should also have a file called prelink in
>> /etc/cron.daily/.
>
> yes, it's on. there's a log in /var/log/prelink from just yesterday morning.
>
> wouldn't ya think that if prelink has modified it, that the rpm -V would
> have flagged a modified checksum? Or is prelink smart enuff to tweak
> the RPM database? I have no clue.

Think about what we've already discovered.  You and I have different
results from md5sum, yet neither of our clock-applet files is flagged.
So we do have some kind of clue.  Without digging into code and running
tests etc., I'm guessing that rpm must take prelink into account when it
verifies a file, whether by amending its database whenever prelink is
invoked or by checking the file as it was before prelinking.
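
For what it's worth, prelink itself can report what a binary's checksum
was before it got prelinked, which would let us compare like with like.
A rough sketch, assuming prelink's -y (--verify) and --md5 options
behave as its man page describes; the function name show_checksums is
made up for illustration:

```shell
# show_checksums FILE
# Prints the MD5 of FILE as it sits on disk and, if prelink is available,
# the MD5 of the file's contents as they were before prelinking.
show_checksums() {
    file=$1
    printf 'on-disk:  %s\n' "$(md5sum "$file" | awk '{print $1}')"
    if command -v prelink >/dev/null 2>&1; then
        # -y (--verify) with --md5 prints the digest of the un-prelinked file
        printf 'original: %s\n' "$(prelink -y --md5 "$file" 2>/dev/null | awk '{print $1}')"
    else
        echo 'original: (prelink not found in PATH, skipped)'
    fi
}
```

Run it as root with something like "show_checksums
/usr/libexec/clock-applet".  If the two of us get the same "original"
value, then the differing md5sum output is presumably just prelink at
work.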



>> If none of that explains things, you might want to just reinstall
>> gnome-panel and see if that fixes the memory problem.
>
> might try that, though it pains me to have to resort to the sort of
> "fixes" that Windows folks think are normal: power off, power on: reinstall:
> reboot. Gah!
> :)

I know that feeling too.  But, first, you shouldn't have to reboot...
probably not, because this gnome stuff should load only at runlevel 5.
Trace out the init scripts if you want to be sure.  If so, that would
mean you only need to drop to runlevel 2 (or maybe just 3), run "rpm -e
gnome-panel; yum install gnome-panel", and then go back up with "init
5".  (rpm -e will refuse if other installed packages depend on
gnome-panel, so you may have to deal with that first.)  That's really
not very windozey.  You might also want to determine whether that
command sequence will wipe your current panel configuration and, if so,
which files to back up beforehand and restore afterward (so you wouldn't
have to set up your panel all over again).  Do Windows folks do that?
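
On the question of which files to back up: as far as I know, GNOME 2
keeps panel settings in GConf, which by default lives under
~/.gconf/apps/panel in your home directory.  That path is my assumption,
so check it on your own box first.  A cautious sketch (the function name
is invented) that just tars up whatever directory you give it:

```shell
# backup_panel_config [SRC_DIR] [DEST_TARBALL]
# Archives SRC_DIR (default: the assumed GConf panel directory) into a
# gzipped tarball and prints the tarball's path on success.
backup_panel_config() {
    src=${1:-$HOME/.gconf/apps/panel}
    dest=${2:-$HOME/panel-config-$(date +%Y%m%d).tar.gz}
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    # -C keeps the paths inside the archive relative to the parent of SRC_DIR
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")" && echo "$dest"
}
```

Restoring would then be a matter of unpacking the tarball back into
~/.gconf/apps with "tar -xzf ... -C ~/.gconf/apps", preferably while not
logged in to gnome.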

1a:  It just occurred to me that (and this is a long shot, but still a 
possibility) the problem could go away just by reloading the applet. 
Just delete it from the panel and then install it again.  Then watch its 
memory consumption to see if there's a difference.  Even if this works, 
it likely wouldn't be anything more than a temporary fix, but it might 
point to where the problem lies.  And it's real quick and easy.
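
Watching memory consumption by eye in top gets old fast, so here is a
small sketch that logs the resident size at intervals instead.  It reads
the VmRSS line from /proc/PID/status, which is Linux-specific; the
function name log_rss and its defaults are my own invention:

```shell
# log_rss PID [INTERVAL_SECONDS] [SAMPLES]
# Prints a timestamp and the resident set size (in kB) of PID, once per
# interval, taken from the VmRSS field of /proc/PID/status.
log_rss() {
    pid=$1
    interval=${2:-60}
    samples=${3:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        if [ ! -r "/proc/$pid/status" ]; then
            echo "process $pid is gone" >&2
            return 1
        fi
        rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
        echo "$(date '+%Y-%m-%d %H:%M:%S') ${rss} kB"
        i=$((i + 1))
        # No need to sleep after the final sample
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

Assuming only one clock-applet is running, something like
'log_rss "$(pgrep -x clock-applet)" 300 12' would take a sample every
five minutes for an hour.  A steadily climbing number after re-adding
the applet would tell us the reload didn't help.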

Secondly, if the fate of the universe hung from our discovering exactly 
what the origin of the memory problem was, then yeah, it would be worth 
the effort.  But it doesn't.  There are lots of other and bigger issues 
around and dwelling on this little one would give people to think that 
we have too much time on our hands and that we don't know about all that 
bigger stuff.

#C: Reinstalling might not actually fix the problem, but only make the 
current memory problem go away... or maybe not even that.  The actual 
origin of the problem could be lurking in a lot of places, e.g., with the
drive (which see below), with a small bit of the RAM where the 
executable got parked,....



>>> If I run:
>>>
>>> 	rpm -V -v gnome-panel
>>>
>>> it shows no differences at all, so I don't think the clock-applet has
>>> been damaged or hacked. (but I wonder what it shows on your system, since
>>> yours has a different md5sum.)
>>>
>>> 	........    /usr/libexec/clock-applet
>>
>> Yeah, same here.
>>
>> The clock applet would be a weird thing for somebody to hack.  But maybe
>> you're seeing an early sign of a disk problem.  Bit rot or something
>> like it could "damage" the executable.
>
> I suppose it's POSSIBLE. I'm running two identical drives in a RAID-1
> configuration, but as I understand it, RAID doesn't cover things like
> invalid/incorrect reads from a drive, it merely provides redundancy in
> case of total failure.

Correct.  RAID 1 is lots better than JBOD, but doesn't do parity.  I've 
run RAID 10 (and others) but never RAID 1, so I'm not sure about this... 
but there might be RAID tools around that allow the examination and
comparison of data on each drive separately.  Perhaps the file is 
corrupt on one drive, but okay on the other.  Failing a tool to do that, 
you may be able to take one of the disks out of the RAID, mount it 
separately as a data drive, and compare the files by hand.  Of course 
you want to first be confident that you could re-mirror the drives.
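
If you do end up with each half of the mirror mounted separately
(read-only, ideally), the by-hand comparison could look something like
this.  It is only a sketch: the function name and the mount points in
the example afterward are hypothetical, and it relies on nothing fancier
than cmp:

```shell
# compare_copies MOUNT_A MOUNT_B RELATIVE_PATH
# Compares the same file under two mount points byte for byte.
# Returns 0 if identical, 1 if they differ, 2 if either copy is missing.
compare_copies() {
    a=$1/$3
    b=$2/$3
    if [ ! -f "$a" ] || [ ! -f "$b" ]; then
        echo "missing: $3" >&2
        return 2
    fi
    if cmp -s "$a" "$b"; then
        echo "identical: $3"
    else
        echo "DIFFERENT: $3"
        return 1
    fi
}
```

For example: "compare_copies /mnt/disk1 /mnt/disk2
usr/libexec/clock-applet".  A "DIFFERENT" result would point at one of
the drives rather than at gnome-panel.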




>
> Thanks for the ideas, I'll post back to the list if anything interesting
> turns up.
>
> Fred
>>
>>
>>>>>> On 01/04/2013 05:11 PM fred smith wrote:
>>>>>>> I've discovered recently that something on my Centos 5.8 box (up to date)
>>>>>>> is hogging a ton of RAM.
>>>>>>>
>>>>>>> so a little while ago I sat and watched top for a while. it showed
>>>>>>> (sorry, I didn't take screen shots or write this down, so the numbers
>>>>>>> are a bit rough) that out of 8 gigs of swap, around 2 1/2 was in use,
>>>>>>> and all the RAM (except for the little the kernel keeps for itself)
>>>>>>> was in use (it's got 4 gigs).
>>>>>>>
>>>>>>> this might not sound bad, but there's hardly ever anything big running
>>>>>>> on this box, it's just my home desktop machine used mostly for web
>>>>>>> browsing/music/email and similar.
>>>>>>>
>>>>>>> so, watching top run for a while I could eventually make out that
>>>>>>> something had "1.6g" flashing in the "RES" column. slowing the refresh
>>>>>>> a little I saw that it was "clock applet". so I killed the clock applet
>>>>>>> and restarted it, then clock applet showed "11m" in the "RES" column,
>>>>>>> and the unused RAM was suddenly like a gig and 3/4, or so, and the
>>>>>>> swap used slowly started dropping while the free ram began being used up,
>>>>>>> as it normally should.
>>>>>>>
>>>>>>> as I continue to watch it run (10-15 mins later) I can see that clock
>>>>>>> applet is now showing 14m in the RES column, so it's still growing.
>>>>>>>
>>>>>>> Is anyone else seeing the clock applet hogging (tons of tiny leaks, I
>>>>>>> assume) RAM needlessly?
>>>>>>>
>>>>>> _______________________________________________
>>>>>> CentOS mailing list
>>>>>> CentOS at centos.org
>>>>>> http://lists.centos.org/mailman/listinfo/centos