Judging from the frequency of my messages here, one could think that I'm too stupid to upgrade a workstation to 5.2 (but the servers I've tried work without problems).
OK. The problem: I've tried to upgrade a 5.1 x86_64 workstation to 5.2. During the upgrade, immediately after (according to /var/log/messages) upgrading the two (64- and 32-bit) libgcc packages, an EXT3 error occurs and the root partition gets remounted read-only. Consequently ALL following package upgrades throw errors saying this or that can't be done because it can't write to /etc or /usr, but in the end yum says everything is OK.
(Side info: /var is on a different partition, so it is still writeable)
At reboot the machine wants a manual fsck, which throws a lot of errors. Then the machine reboots with a lot of error messages (basically because it can't find /lib64/libgcc_s.so.1, nothing a symlink can't fix). After that I try to fix things by manually upgrading the libgcc packages (otherwise yum won't run). "rpm -q" for selected packages shows that the majority of the packages are still in the 5.1 state. Trying to "yum upgrade" again fails because the machine can't find any servers (but network functionality seems OK).
So I try to reboot. And now the really strange thing happens: A simple "rpm -q rpm" says that no rpm is installed. Rebuilding the rpm-database doesn't help.
So I'm a bit stuck here with a machine that is in limbo.
My questions: what could have been the cause of this (the machine was working OK before that)? The only thing I feel a little guilty about was uninstalling the nvidia kernel driver and not rebooting, but this can't f### up the file system, can it?
Any suggestions what I can do to get the machine into a normal state again (apart from reinstalling from scratch)?
OK. I managed to beat the machine into submission. But a slight uncertainty remains.
On Wed, 25 Jun 2008 18:34:26 +0200 "BG" == Bernhard Gschaider bgschaid_lists@ice-sf.at wrote:
BG> Judging from the frequency of my messages here, one could think that I'm too stupid to upgrade a workstation to 5.2 (but the servers I've tried work without problems).
BG> OK. The problem: I've tried to upgrade a 5.1 x86_64 workstation to 5.2. During the upgrade, immediately after (according to /var/log/messages) upgrading the two (64- and 32-bit) libgcc packages, an EXT3 error occurs and the root partition gets remounted read-only. Consequently ALL following package upgrades throw errors saying this or that can't be done because it can't write to /etc or /usr, but in the end yum says everything is OK.
BG> (Side info: /var is on a different partition, so it is still writeable)
That is my problem: did that step of the "upgrade" leave the rpm database in a state that is not in tune with what is actually on the disk?
BG> At reboot the machine wants a manual fsck, which throws a lot of errors. Then the machine reboots with a lot of error messages (basically because it can't find /lib64/libgcc_s.so.1, nothing a symlink can't fix). After that I try to fix things by manually upgrading the libgcc packages (otherwise yum won't run). "rpm -q" for selected packages shows that the majority of the packages are still in the 5.1 state. Trying to "yum upgrade" again fails because the machine can't find any servers (but network functionality seems OK).
The problem was that the centos-release package was removed at the start of the upgrade (and yum needs it to determine which $releasever to use).
BG> So I try to reboot. And now the really strange thing happens: A simple "rpm -q rpm" says that no rpm is installed. Rebuilding the rpm-database doesn't help.
BG> So I'm a bit stuck here with a machine that is in limbo.
By manually reinstalling the rpm.rpm and some other packages I managed to kickstart the upgrade again (and it seems to have succeeded)
Only problem: in the second upgrade run fewer packages were listed as due to be updated (roughly half). So I'm not sure: are they marked as upgraded in the rpm database while in reality the old versions are still on the disk?
Is there a way to say: "Hey RPM, have a look whether the files in your database are really on the disk"?
BG> My questions: what could have been the cause of this (the machine was working OK before that)? The only thing I feel a little guilty about was uninstalling the nvidia kernel driver and not rebooting, but this can't f### up the file system, can it?
That question remains. But it is academic.
BG> Any suggestions what I can do to get the machine into a normal state again (apart from reinstalling from scratch)?
As I said: solved.
Thanks for listening
On Wed, 2008-06-25 at 20:27 +0200, Bernhard Gschaider wrote:
<snip>
Is there a way to say: "Hey RPM, have a look whether the files in your database are really on the disk"?
Use rpm's verify option. I forget the exact syntax: I'm sorry to have to sentence you to the rpm manpage dungeon. :-(
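For the archive, a minimal sketch of what the verify option looks like in practice. `rpm -V <package>` checks one package and `rpm -Va` checks everything installed. The function names `verify_all` and `list_missing` and the report path are made up for illustration, and the parsing assumes the usual report layout (a status column, an optional one-letter file-type marker, then the path) and paths without spaces.

```shell
# Verify every installed package against the rpm database and save the
# report. Run as root so all files can be read; this can take a while.
verify_all() {
    rpm -Va > "${1:-/tmp/rpm-verify.txt}" 2>&1
}

# Print only the paths that a saved report lists as "missing".
# Reads the named files, or stdin if none are given.
# A one-letter marker (c = config, d = doc, ...) may precede the path.
list_missing() {
    awk '$1 == "missing" { print ($2 ~ /^[cdglr]$/ ? $3 : $2) }' "$@"
}
```

Typical use would be `verify_all /tmp/rpm-verify.txt` followed by `list_missing /tmp/rpm-verify.txt`. Files the database knows about but that never made it to disk are the first thing to look for after an interrupted upgrade.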
<snip>
HTH
Bernhard Gschaider wrote:
OK. I managed to beat the machine into submission. But a slight uncertainty remains.
I'd be concerned about the initial filesystem error perhaps being hardware related. You might try a 'cat /dev/sda >/dev/null' to force a full disk read, then 'smartctl -a /dev/sda' to see if the health still looks OK.
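A rough sketch of that check as two small shell functions. The names `full_read` and `smart_ok` are invented here, and `smart_ok` only recognises the two verdict wordings I am aware of from `smartctl -H` (the ATA and the SCSI one); treat anything else as "look at the full `smartctl -a` output by hand".

```shell
# Read the whole device once so the drive gets a chance to notice and
# log unreadable sectors. Read-only, but slow on big disks; needs root.
full_read() {
    cat "$1" > /dev/null
}

# Succeeds if the `smartctl -H` output on stdin contains a passing
# verdict. Only the two known wordings are matched (an assumption).
smart_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}
```

For example: `full_read /dev/sda && smartctl -H /dev/sda | smart_ok && echo "health looks OK"`. The device name is the one from the mail; substitute your own.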
On Wed, 25 Jun 2008 14:39:22 -0400 "WLM" == William L Maltby CentOS4Bill@triad.rr.com wrote:
WLM> On Wed, 2008-06-25 at 20:27 +0200, Bernhard Gschaider wrote:
>> <snip>
>> Is there a way to say: "Hey RPM, have a look whether the files in your database are really on the disk"?
WLM> Use rpm's verify option. I forget the exact syntax: I'm sorry to have to sentence you to the rpm manpage dungeon. :-(
Thanks. I was looking for the keyword "check" in man-pages (shows you that half the art in searching is knowing "for what")
On Wed, 25 Jun 2008 13:51:16 -0500 "LM" == Les Mikesell lesmikesell@gmail.com wrote:
LM> Bernhard Gschaider wrote:
>> OK. I managed to beat the machine into submission. But a slight uncertainty remains.
LM> I'd be concerned about the initial filesystem error perhaps being hardware related. You might try a 'cat /dev/sda >/dev/null' to force a full disk read, then 'smartctl -a /dev/sda' to see if the health still looks OK.
Thanks. That's one of the things that never come up in the "System administration for half-wits"-books that I usually read
On Wed, 25 Jun 2008 23:53:16 +0200 "BG" == Bernhard Gschaider bgschaid_lists@ice-sf.at wrote:
On Wed, 25 Jun 2008 14:39:22 -0400 "WLM" == William L Maltby CentOS4Bill@triad.rr.com wrote:
WLM> On Wed, 2008-06-25 at 20:27 +0200, Bernhard Gschaider wrote:
>>> <snip>
>>> Is there a way to say: "Hey RPM, have a look whether the files in your database are really on the disk"?
WLM> Use rpm's verify option. I forget the exact syntax: I'm sorry to have to sentence you to the rpm manpage dungeon. :-(
Sorry. Stupid question again: and if I find inconsistencies, then the only way to force rpm to correct them would be something like
yum remove offendingPackage
yum install offendingPackage
or the equivalent rpm commands?
Currently the machine behaves quite strangely:
- Boots OK
- Lets users log in and most applications work
- Firefox works only for root
- yumex hangs at starting
- "man rpm" says XXX WARNING: old character encoding and/or character set
All this leads me to the conclusion that only some selected packages are corrupt (and I don't want to reinstall the machine). Would installing/repairing from DVD help?
Bernhard
BG> Thanks. I was looking for the keyword "check" in man-pages (shows you that half the art in searching is knowing "for what")
WARNING! Due to my background, I don't often read man pages like I used to. So there may be some inaccuracies or ambiguities below.
On Thu, 2008-06-26 at 11:58 +0200, Bernhard Gschaider wrote:
<snip>
Sorry. Stupid question again: and if I find inconsistencies, then the
Keep in mind that *some* inconsistencies are expected. Local config files being one good example. You must look at the codes displayed in the output, and possibly the files, to be sure it is really a discrepancy.
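To make "look at the codes" a bit more concrete, here is a small filter, offered as a sketch only. It assumes the usual verify layout: a flag string in which the third character is "5" when the checksum differs, then an optional one-letter marker ("c" for config files), then the path. The function name `changed_non_config` is made up, and paths containing spaces are not handled.

```shell
# From `rpm -V` / `rpm -Va` output on stdin, print the paths whose
# checksum differs and that are NOT marked as config files.
changed_non_config() {
    awk '
        $1 == "missing"         { next }   # missing files: separate issue
        substr($1, 3, 1) != "5" { next }   # checksum unchanged
        $2 == "c"               { next }   # config file, edits expected
        { print ($2 ~ /^[dglr]$/ ? $3 : $2) }
    '
}
```

Something like `rpm -Va | changed_non_config` then lists only the files that should normally never differ from the package, which are the interesting ones after a half-finished upgrade.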
only way to force rpm to correct them would be something like
yum remove offendingPackage
yum install offendingPackage
or the equivalent rpm-commands?
Not the only way, but probably the safest. However, that may try to also remove some dependencies, depending on the package you're trying to remove.
I seem to recall a "force" parameter that is available for rpm and yum. Although normally disparaged, this is a perfect situation for its use.
Currently the machine behaves quite strange:
- Boots OK
- Lets users log in and most applications work
- Firefox works only for root
- yumex hangs at starting
Depending on your time-frame, this may be a symptom of the load on the servers you access. Yesterday A.M. I saw *BIG* delays downloading the xml(?) files. But I use yum CLI, so I see the blood-n-guts on the screen. <BIAS> GUIs suck... in general</BIAS>
- "man rpm" says XXX WARNING: old character encoding and/or character set
All this leads me to the conclusion that only some selected packages are corrupt (and I don't want to reinstall the machine). Would installing/repairing from DVD help?
Maybe. But some of the rpms might be on your system from the update activities. Do an updatedb and then a locate .rpm. You may see some in /var/cache/yum. Subdirs under it might have what you need.
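A sketch of the same search without needing an up-to-date locate index. The function name `list_cached_rpms` is invented; /var/cache/yum is the directory named above. Whether anything is still there depends on yum's keepcache setting, so an empty result does not mean something is broken.

```shell
# List every .rpm file below a directory (default: yum's cache dir).
# find is used instead of updatedb/locate so no index is required.
list_cached_rpms() {
    find "${1:-/var/cache/yum}" -type f -name '*.rpm' | sort
}
```

Running `list_cached_rpms` with no argument searches the yum cache; pass another directory (a mounted DVD, say) to search there instead.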
Bernhard
<snip>
HTH
On Thu, 26 Jun 2008 06:51:35 -0400 "WLM" == William L Maltby CentOS4Bill@triad.rr.com wrote:
WLM> WARNING! Due to my background, I don't often read man pages like I used to. So there may be some inaccuracies or ambiguities below.
WLM> On Thu, 2008-06-26 at 11:58 +0200, Bernhard Gschaider wrote:
>> <snip>
>> Sorry. Stupid question again: and if I find inconsistencies, then the
WLM> Keep in mind that *some* inconsistencies are expected. Local config files being one good example. You must look at the codes displayed in the output, and possibly the files, to be sure it is really a discrepancy.
I know. I compared with the verify output from a working machine. For my theory ("there are different rpm packages on the disk than in the rpm database") to be right, there should be a large number of files with wrong MD5 sums. And there is only a handful for which this is the case (and they seem mostly harmless).
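Roughly how such a comparison can be done, as a sketch: save the verify report on both machines, then print the lines that show up only on the suspect one. The function name `only_in_first` and the file names in the usage line are placeholders. Lines that differ only in the flag string (for example just the mtime flag) will show up as differences too.

```shell
# Print the lines that occur only in the first report, i.e. the
# discrepancies the suspect machine has and the reference one has not.
# comm needs sorted input, so both reports are sorted into temp files.
only_in_first() {
    _a=$(mktemp) && _b=$(mktemp) || return 1
    sort "$1" > "$_a"
    sort "$2" > "$_b"
    comm -23 "$_a" "$_b"
    rm -f "$_a" "$_b"
}
```

For example: `only_in_first verify-broken.txt verify-good.txt`, where each file was produced with `rpm -Va > file 2>&1` on the respective machine.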
>> only way to force rpm to correct them would be something like
>>
>> yum remove offendingPackage
>> yum install offendingPackage
>>
>> or the equivalent rpm commands?
WLM> Not the only way, but probably the safest. However, that may try to also remove some dependencies, depending on the package you're trying to remove.
Yep. That's what I was afraid of
WLM> I seem to recall a "force" parameter that is available for rpm and yum. Although normally disparaged, this is a perfect situation for its use.
It exists in RPM, but in yum it is notoriously absent
>> Currently the machine behaves quite strange:
>> - Boots OK
>> - Lets users log in and most applications work
>> - Firefox works only for root
>> - yumex hangs at starting
WLM> Depending on your time-frame, this may be a symptom of the load on the servers you access. Yesterday A.M. I saw *BIG* delays downloading the xml(?) files. But I use yum CLI, so I see the blood-n-guts on the screen. <BIAS> GUIs suck... in general</BIAS>
yum works. The problem, according to an "strace yum", seems to be that it is polling on something, but I don't know on what: I don't get the arguments to that call because it never finishes (the last line just says "poll(").
>> - "man rpm" says XXX WARNING: old character encoding and/or character set
>>
>> All this leads me to the conclusion that only some selected packages are corrupt (and I don't want to reinstall the machine). Would installing/repairing from DVD help?
WLM> Maybe. But some of the rpms might be on your system from the update activities. Do an updatedb and then a locate .rpm. You may see some in /var/cache/yum. Subdirs under it might have what you need.
I'll try that. If it doesn't help I'll have to scratch the machine and install anew.
Thanks Bernhard
<<snip>>
All this leads me to the conclusion that only some selected packages are corrupt (and I don't want to reinstall the machine). Would installing/repairing from DVD help?
If you have packages newer than what is on the dvd, I don't think they will get replaced. You might get away with removing the packages from the rpm database only and then re-install them, but yum probably wouldn't help you here.
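An untested sketch of what "remove from the database only, then re-install" could look like with plain rpm. `--justdb` makes rpm update only its database and leave the files alone, and `--nodeps` skips the dependency check for that one step. The helper below only prints the commands so they can be reviewed before anything is run; the function name `print_reregister_cmds` and both arguments (package name, path to the .rpm file) are placeholders.

```shell
# Print (do not run) the two commands for one package: drop it from
# the rpm database without touching files, then install it again.
# $1 = package name, $2 = path to the matching .rpm file.
print_reregister_cmds() {
    printf 'rpm -e --justdb --nodeps %s\n' "$1"
    printf 'rpm -ivh %s\n' "$2"
}
```

For example, `print_reregister_cmds somepackage /path/to/somepackage.rpm` shows the pair of commands; run them as root only after checking that the .rpm really is the version you want on disk.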
On Thu, 26 Jun 2008, Bernhard Gschaider wrote:
On Wed, 25 Jun 2008 23:53:16 +0200 "BG" == Bernhard Gschaider bgschaid_lists@ice-sf.at wrote:
On Wed, 25 Jun 2008 14:39:22 -0400 "WLM" == William L Maltby CentOS4Bill@triad.rr.com wrote:
WLM> On Wed, 2008-06-25 at 20:27 +0200, Bernhard Gschaider wrote:
<snip>
Is there a way to say: "Hey RPM, have a look whether the files in your database are really on the disk"?
WLM> Use rpm's verify option. I forget the exact syntax: I'm sorry to have to sentence you to the rpm manpage dungeon. :-(
Sorry. Stupid question again: and if I find inconsistencies, then the only way to force rpm to correct them would be something like
yum remove offendingPackage
yum install offendingPackage
or the equivalent rpm-commands?
With apt-rpm you can replace a package in place from a repository. You can do this with:
apt-get install --reinstall <package-name>
This is useful if you damaged files that belonged to an installed RPM package without having to uninstall all the packages that depend on it as well.
Under the hood it is the same as:
rpm -Uhv --replacefiles --replacepkgs <file-name>
The --reinstall feature is also useful when, during CentOS QA, packages are updated with the exact same version-release, or when you want to convert a RHEL system into CentOS or the other way around.
On Wed, 2008-06-25 at 23:53 +0200, Bernhard Gschaider wrote:
<snip>
LM> I'd be concerned about the initial filesystem error perhaps being hardware related. You might try a 'cat /dev/sda >/dev/null' to force a full disk read, then 'smartctl -a /dev/sda' to see if the health still looks OK.
Thanks. That's one of the things that never come up in the "System administration for half-wits"-books that I usually read
There's your problem! You have to be a half-wit for those books. The fact that you have the incentive to read a book means you are *not* a half-wit. Ergo: you're reading the wrong books! ;-)
<snip sig stuff>