Safety

Sam Drinkard

20 Mar 2006

3:39 p.m.

My server has not yet been updated with all the goodies and is still a stock 4.2 installation. What is the consensus about remote updating? Would it be better if I were physically there to do it, or are things stable enough that I could do it remotely and then reboot? It's kind of a PITA to have to go downtown to the co-lo site, but it can be done.

Thanks..

Sam

Chris Mauritz

20 Mar

3:52 p.m.

Sam Drinkard wrote:

...

My server has not yet been updated with all the goodies and is still a stock 4.2 installation. What is the consensus about remote updating? Would it be better if I were physically there to do it, or are things stable enough that I could do it remotely and then reboot? It's kind of a PITA to have to go downtown to the co-lo site, but it can be done.

It's pretty safe to do remotely if your machine is already known to come right back up from a remote reboot. Unless there are some really critical vulnerabilities, I normally wait a few days before remote application to see if anyone else stumbles into any show stoppers. We've been remotely updating our 4.2 boxes without incident.
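One way to see what a remote update would pull in before committing to it is "yum check-update", which lists pending packages without installing anything. A minimal sketch, assuming the exit statuses documented in the yum man page (0 for nothing pending, 100 for updates available, anything else an error); the function names are made up for illustration:

```shell
#!/bin/sh
# Translate the exit status of `yum check-update` into a short message.
#   0   = nothing pending
#   100 = updates available
#   any other value = yum reported an error
describe_check_update_status() {
    case "$1" in
        0)   echo "up to date" ;;
        100) echo "updates available" ;;
        *)   echo "error" ;;
    esac
}

# Hypothetical wrapper: list pending updates, then summarise the result.
# Nothing is installed by this call.
pending_updates() {
    yum check-update
    describe_check_update_status "$?"
}
```

Running pending_updates a few days after a batch of updates is released fits the "wait and see" habit described above.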

Cheers,

Andrew Cotter

3:54 p.m.

We keep our boxes as "stock" as possible. We have not had an issue using yum to do any updates that would require being in front of the box. That is our experience.

Andrew

...

-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Sam Drinkard
Sent: Monday, March 20, 2006 10:39 AM
To: CentOS mailing list
Subject: [CentOS] Safety

My server has not yet been updated with all the goodies and is still a stock 4.2 installation. What is the consensus about remote updating? Would it be better if I were physically there to do it, or are things stable enough that I could do it remotely and then reboot? It's kind of a PITA to have to go downtown to the co-lo site, but it can be done.

Thanks..

Sam

CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos

R P Herrold

3:56 p.m.

New subject: [CentOS] Safety

On Mon, 20 Mar 2006, Sam Drinkard wrote:

...

My server has not yet been updated with all the goodies and is still a stock 4.2 installation. What is the consensus about remote updating? Would it be better if I were physically there to do it, or are things stable enough that I could do it remotely and then reboot? It's kind of a PITA to have to go downtown to the co-lo site, but it can be done.

Nothing beats a local staging server with 'unusual' hardware to test on. That said, the CentOS build team rolls out security updates at once and tests non-security updates before release (successfully dodging the current NFS issues that PNAELV 4, U3 carries comes to mind).

If the server is slimmed down, systematically hardened, and running on fairly mainline hardware, I consider it safe enough to update on the fly. The COLUG server is updated that way.

- Russ Herrold

Scott Silva

4:38 p.m.

Sam Drinkard spake the following on 3/20/2006 7:39 AM:

...

My server has not yet been updated with all the goodies and is still a stock 4.2 installation. What is the consensus about remote updating? Would it be better if I were physically there to do it, or are things stable enough that I could do it remotely and then reboot? It's kind of a PITA to have to go downtown to the co-lo site, but it can be done.

Thanks..

Sam

A remote update works 99.99% of the time, but nothing is "infallible". You can do the remote update, and run downtown only if it fails. You only need a reboot if you get a new kernel, which you will probably get. The biggest gotcha is if you are running something non-standard that an update might hose. But if that is the case, you can always remotely fix it or try to go back.

Most of the time, the risk of running without the updates outweighs the risk of updating.
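Whether a reboot is actually needed after an update can be checked by comparing the running kernel with the newest installed one. A rough sketch, assuming the stock "kernel" package (an SMP box would have to query kernel-smp instead, and the strings may not line up on other releases); both function names are hypothetical:

```shell
#!/bin/sh
# Print "yes" if the newest installed kernel differs from the running one.
#   $1 = running kernel release, e.g. the output of `uname -r`
#   $2 = newest installed kernel release
reboot_needed() {
    if [ "$1" = "$2" ]; then
        echo "no"
    else
        echo "yes"
    fi
}

# Most recently installed kernel package according to rpm, with the
# leading "kernel-" stripped so it can be compared against `uname -r`.
newest_installed_kernel() {
    rpm -q kernel --last | head -n 1 | awk '{print $1}' | sed 's/^kernel-//'
}

# Example (not run here):
#   reboot_needed "$(uname -r)" "$(newest_installed_kernel)"
```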

Peter Farrow

7:18 p.m.

The only times I have had remote updates not work were when either:

a) Someone left a floppy in the machine and it tried to boot off it when I rebooted.
b) Someone left the keyboard disconnected, or something resting on the keyboard.

I've had very few genuine issues where a machine dies after a solid yumming session...

Chris Mauritz

7:54 p.m.

Peter Farrow wrote:

...

The only times I have had remote updates not work were when either:

a) Someone left a floppy in the machine and it tried to boot off it when I rebooted.
b) Someone left the keyboard disconnected, or something resting on the keyboard.

I've had very few genuine issues where a machine dies after a solid yumming session...

I've had a few instances where:

The update buggered grub.conf or lilo.conf leaving the machine staring at its navel during the next reboot.

The update woke up kudzu (which I normally remove/deactivate) and the system wakes up expecting some keyboard input to tell it what to do with all the new wonderful devices it just found.

But the vast majority of the time, it simply works like it's supposed to. 8-)
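Before a remote reboot, the boot loader configuration can be sanity-checked by looking up which kernel the default entry points at. A minimal sketch for a GRUB legacy file, assuming the usual anaconda layout (a "default=N" line followed by "title" stanzas with one "kernel" line each); the function name is invented for this example:

```shell
#!/bin/sh
# Print the kernel path used by the default boot entry of a GRUB legacy
# configuration file.
#   $1 = path to grub.conf or menu.lst
default_kernel_path() {
    awk '
        BEGIN                { want = 0; idx = 0 }
        /^default=/          { split($0, kv, "="); want = kv[2] + 0 }
        /^title/             { idx++ }
        /^[ \t]*kernel[ \t]/ { kernels[idx - 1] = $2 }
        END                  { print kernels[want] }
    ' "$1"
}

# Example (not run here; paths assume a separate /boot partition):
#   k=$(default_kernel_path /boot/grub/grub.conf)
#   [ -f "/boot$k" ] || echo "default entry points at a missing kernel"
```

For the kudzu case, turning the service off with "chkconfig kudzu off" is the usual way to keep it from prompting at boot.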

Cheers,

Sam Drinkard

9:57 p.m.

Thanks to all of you for your responses. The update did go off without a hitch; however, I've not rebooted yet. I'll wait till tomorrow now, as after-hours trips to the POP are frowned upon unless it's an emergency. I feel pretty sure it would come back, but will err on the side of safety.

Sam

Adam Gibson

8:25 p.m.

Peter Farrow wrote:

...

The only times I have had remote updates not work were when either:

a) Someone left a floppy in the machine and it tried to boot off it when I rebooted.
b) Someone left the keyboard disconnected, or something resting on the keyboard.

The only problem I ever had was many years ago, when RedHat x.x (I don't remember which version) had a problem where sshd quit working after glibc was updated. sshd is now restarted automatically after a glibc update, so it is no longer a problem, but it just goes to show that s*** happens.

Mike

7:50 p.m.

I do remote yum updates on my servers frequently. I've only had a few problems over time, most of which had little or nothing to do with the update itself: for example, a hardware problem being discovered because of a reboot, or the reboot taking far longer than expected because the machine had been up for around a year and an fsck was forced.

Overall, CentOS w/YUM seems pretty rock solid.
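A forced fsck on the next boot can often be predicted from the ext2/ext3 superblock counters. A small sketch that reads "tune2fs -l" output and reports whether the mount count has reached its maximum; it ignores the time-based check interval, and the function name and the device in the example are assumptions:

```shell
#!/bin/sh
# Read `tune2fs -l` output on stdin and print "yes" if the mount count
# has reached the maximum mount count, "no" otherwise. A maximum of 0 or
# -1 means the mount-count check is disabled.
fsck_due_by_mount_count() {
    awk -F: '
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max = $2 + 0 }
        END { if (max > 0 && count >= max) print "yes"; else print "no" }
    '
}

# Example (not run here; the device name is an assumption):
#   tune2fs -l /dev/VG00/root | fsck_due_by_mount_count
```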

Robert Hanson

10:34 p.m.

}
} My server has not yet been updated with all the goodies and is still a
} stock 4.2 installation. What is the consensus about remote updating?
} Would it be better if I were to physically be there and do it or are
} things stable enough that I could do it remotely and then reboot. It's
} kind of a PITA to have to go downtown to the co-lo site, but can be done.
} Thanks..
}
} Sam

we have had absolutely no problems on small or large updates as long as the remote session connection is stable

i asked some time ago about remote upgrades over a possibly unstable tcp/ip connection and there were a lot of good answers.

******* ok, i found it

yum update in background

was the subject... things like use "screen" or

nohup yum -y update &

etc...
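The nohup approach can be wrapped so that the output lands in a log file that can be inspected after reconnecting. A minimal sketch; the function name and the log path in the example are made up:

```shell
#!/bin/sh
# Run a command in the background, immune to hangups, with stdout and
# stderr captured in a log file. Prints the PID of the background job.
#   $1     = log file path
#   $2...  = command and its arguments
run_detached() {
    logfile="$1"
    shift
    nohup "$@" > "$logfile" 2>&1 &
    echo "$!"
}

# Example (not run here):
#   run_detached /root/yum-update.log yum -y update
#   tail -f /root/yum-update.log
```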

anyways... i start to digress

- rh

-- Robert Hanson - Abba Communications Computer & Internet Services (509) 624-7159 - www.abbacomm.net

Peter Farrow

10:43 p.m.

You can always use:

screen
yum -y update
ctrl-A D

screen -r

to reattach later to see how it went...
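The same thing can be done non-interactively by starting screen already detached. A sketch using screen's -d -m (start detached) and -S (name the session) options; the SCREEN_BIN override exists only so the command line can be dry-run, and both it and the session name are assumptions of this example:

```shell
#!/bin/sh
# Start a command inside a detached, named screen session.
#   $1     = session name
#   $2...  = command and its arguments
# SCREEN_BIN is a hypothetical override, handy for dry runs.
start_in_screen() {
    session="$1"
    shift
    "${SCREEN_BIN:-screen}" -dmS "$session" "$@"
}

# Example (not run here):
#   start_in_screen yumupdate yum -y update
#   screen -r yumupdate     # reattach later to see how it went
```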

Gilles CHAUVIN

10:54 p.m.

Just so you know there are some unlucky people here: the first and only time I had a problem with remote updating was recently, with the 2.6.9-34 kernel. After I rebooted my server it never came back online and showed me a quite nice "kernel panic". I had to boot the previous kernel to get my server back online.

That's the first time I've encountered such a problem and it is very annoying (especially because I wanted to "play" with smartd with SATA support, and also because I don't like running servers that are not up to date).

Maybe filing a bug report would be a good idea, but I prefer to wait for U3 to see whether it fixes the bug.

Do you think it is safe to file a bug on RH's bugzilla? Since CentOS is binary compatible with RHEL, I assume I would have the same problem with RHEL4 but, unfortunately, I cannot run any tests to confirm the problem on that distro :(.

Gilles.

Maciej Żenczykowski

10:58 p.m.

Hi,

I'm pretty sure that U3 will have the same kernel. As such it would probably be best to fix this now, instead of waiting for U3 to come out. Could you tell us when the kernel panic happens? Processor? i586? i686? architecture? single? multiprocessor? ... etc ... any options with which it works (like noacpi & others).

Cheers, MaZe.

Gilles CHAUVIN

11:12 p.m.

On 3/20/06, Maciej Żenczykowski maze@cela.pl wrote:

...

I'm pretty sure that U3 will have the same kernel. As such it would probably be best to fix this now, instead of waiting for U3 to come out. Could you tell us when the kernel panic happens? Processor? i586? i686? architecture? single? multiprocessor? ... etc ... any options with which it works (like noacpi & others).

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 1
cpu MHz         : 2799.397
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl cid xtpr
bogomips        : 5521.40

$ sudo cat /boot/grub/menu.lst
Password:
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/VG00/root
#          initrd /initrd-version.img
#boot=/dev/sda1
default=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.9-34.EL)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-34.EL ro root=/dev/VG00/root
        initrd /initrd-2.6.9-34.EL.img
title CentOS (2.6.9-22.0.2.EL)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-22.0.2.EL ro root=/dev/VG00/root
        initrd /initrd-2.6.9-22.0.2.EL.img

For now, I'm running the -22.0.2 kernel since this is the last kernel known to work on that hardware. I upgraded 5 PCs. 3 of them have the exact same hardware and the exact same failure (kernel panic).

One of those servers isn't in production yet, so it would be possible to do some testing. The above information was gathered from a remote SSH session. I cannot give more details for now, and I won't be able to tell more about the problem until next Wednesday.

Gilles.

Maciej Żenczykowski

11:16 p.m.

Okay, running and showing us lspci (both -n and -v) output would probably help... not that I'm an expert on this and likely to be able to help...

Furthermore knowing _when_ it panics would also be a real boon.
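Gathering that information in one go makes it easier to attach to a list post or bug report. A small sketch that runs a list of commands and writes their output to a single report file; the function name and the report path in the example are made up:

```shell
#!/bin/sh
# Run each command given as an argument and write its output to a report
# file, one "### command" header per section. A command that fails or is
# missing is noted in the report instead of aborting the run.
#   $1     = report file path (overwritten)
#   $2...  = commands, each passed as a single quoted string
collect_report() {
    report="$1"
    shift
    : > "$report"
    for cmd in "$@"; do
        echo "### $cmd" >> "$report"
        sh -c "$cmd" >> "$report" 2>&1 || echo "(command failed or missing)" >> "$report"
    done
}

# Example (not run here):
#   collect_report /tmp/hw-report.txt "uname -r" "lspci -n" "lspci -v" "cat /proc/cpuinfo"
```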

Cheers, MaZe.

Gilles CHAUVIN

11:24 p.m.

On 3/21/06, Maciej Żenczykowski maze@cela.pl wrote:

...

Okay, running and showing us lspci (both -n and -v) output would probably help... not that I'm an expert on this and likely to be able to help...

Furthermore knowing _when_ it panics would also be a real boon.

All I can remember is that it panics during bootup, apparently before executing the /etc/rc.d/init.d/* scripts.

I will post a new message when I am able to get the missing info, but I guess I'll probably have to file a bug report if I want that bug to be fixed.

Regards, Gilles.

Scott Silva

11:33 p.m.

Gilles CHAUVIN spake the following on 3/20/2006 3:12 PM:

...

Is this a name brand server or a whitebox? It would help if you have details on what hardware is in it. Does it by any chance have an nvidia chipset? Will it boot with kernel options like noapic or noht? Maybe check for a BIOS update on the system that is not in production.

Gilles CHAUVIN

22 Mar

3:27 p.m.

On 3/21/06, Scott Silva ssilva@sgvwater.com wrote:

...

Is this a name brand server or a whitebox?

NEC Express5800/TM710

...

Maybe if you have details on what hardware is in it.

The motherboard is an ASUS P4C800-E. The error message shown during the kernel panic was, apparently, about the chipset on that mainboard.

...

Does it by any chance have a nvidia chipset?

Intel chipset.

...

Will it boot with kernel options like noapic or noht?

See below

...

Maybe a check for a bios update on the system not in production.

BIOS is the latest available on the NEC website.

After having answered all those questions :), believe it or not, on my test machine I did an upgrade to CentOS 4.3, tried to reboot and... it worked!!!

I've done the same thing on the production server and, not surprisingly, it worked too. I don't know what happened, but I guess there was a missing or not-updated package that caused the kernel panic.

The case is closed for me :).

Thanks to all the people who answered my message.

Regards, Gilles, a happy CentOS admin :).
