Hello,
I have CentOS 5.2 installed on two of the afore-mentioned blades. I've noticed that the OS started to crash lately (kernel panic) and I've been assigned the task to troubleshoot this issue.
I would like to know what is the best way of recovering the kernel dump after the OS crashes.
I know there are two software implementations that would enable me to do this: kexec, and 'crash', Red Hat's own implementation that allows you to push the dump over the network to a remote machine.
What would be the best thing to do at this point?
Regards,
-Andrei F
On Thu, Jul 16, 2009 at 5:31 PM, Andrei Ffrunzales@gmail.com wrote:
I don't have any experience debugging this kind of problem, but I have CentOS 5 running fine on the same kind of blades. I have about 25 of them and have seen no crashes. What kernel are you running? Did you upgrade the firmware on the blades to the latest version?
Regards, Tim
Tim Verhoeven wrote:
There's also the BMC's firmware, and the I/O-modules firmware etc.
It shouldn't crash. Does it run the latest kernel?
Move the disks to another blade in another BladeCenter (if you have one) and see if it crashes there, too.
It's most likely a hardware problem. Open a case with IBM.
Rainer
On Thu, Jul 16, 2009 at 11:31 AM, Andrei Ffrunzales@gmail.com wrote:
I've used kdump to troubleshoot these types of issues in the past:
http://prefetch.net/blog/index.php/2009/07/06/using-kdump-to-get-core-files-...
Once you have a core file, you can use crash and company to figure out what caused the kernel to panic.
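As a rough sketch of that last step: the helper below looks for the newest vmcore under a dump directory and prints a crash command line for it. It assumes dumps land under /var/crash, that the matching kernel-debuginfo package provides /usr/lib/debug/lib/modules/<version>/vmlinux, and that the running kernel is the same version as the one that crashed. The function name latest_vmcore and the DUMP_DIR variable are made up for this example.

```shell
#!/bin/sh
# Print the path of the most recently modified vmcore under a dump directory.
# Hypothetical helper; dump paths are assumed to contain no newlines.
latest_vmcore() {
    dir=${1:-/var/crash}
    [ -d "$dir" ] || return 1
    # ls -t sorts newest first; find runs ls only if at least one file matches
    find "$dir" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}

# DUMP_DIR is an illustrative override for trying this against a test directory.
core=$(latest_vmcore "${DUMP_DIR:-/var/crash}" || true)
if [ -n "$core" ]; then
    # Only print the command; the debuginfo vmlinux path is an assumption.
    echo "crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux $core"
else
    echo "no vmcore found"
fi
```

Inside crash, commands such as bt and log are the usual starting points for seeing where the kernel panicked.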
- Ryan -- http://prefetch.net
Hello,
Thank you Matty, I will follow that tutorial and configure my servers accordingly.
Regards,
-Andrei
On Thu, Jul 16, 2009 at 7:38 PM, Matty matty91@gmail.com wrote:
-- http://prefetch.net _______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
Hi,
I've configured my servers as described here:
http://prefetch.net/blog/index.php/2009/07/06/using-kdump-to-get-core-files-...
When I try to start the kdump service via service kdump start, I get the following warnings:
[root@lweb2 boot]# service kdump start
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.18-92.1.22.el5kdump.img
Starting kdump: [FAILED]
First of all I like the idea of automatically building an initrd image with kdump support, but I also need MPP support. Just to give you an example, this is how both machines are booting up:
title RDAC CentOS (2.6.18-92.1.22.el5) with MPP
        root (hd0,0)
        kernel /vmlinuz-2.6.18-92.1.22.el5 ro root=/dev/VolGroup00/LogVol00
        initrd /mpp-2.6.18-92.1.22.el5.img
At this point I'm wondering how to generate an initrd image with mpp & kdump support.
Also /var/log/messages gives me this:
Jul 17 11:42:27 lweb2 kdump: No crashkernel parameter specified for running kernel
Jul 17 11:42:27 lweb2 kdump: failed to start up
I assume that once the server has been rebooted with the correct kernel arguments (such as crashkernel=128M@16M) and an initrd with both MPP and kdump support, the service should start just fine.
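A minimal sketch of the first half of that, adding the argument to the GRUB kernel line. The function name add_crashkernel is made up for this example, and it works as a filter on a copy of grub.conf rather than editing the real file in place. The 128M@16M value is the one mentioned above; whether it suits a given machine is an assumption.

```shell
#!/bin/sh
# Append a crashkernel= argument to every GRUB "kernel" line that lacks one.
# Reads grub.conf text on stdin and writes the result to stdout, so the
# original file is never touched by this function.
add_crashkernel() {
    arg=${1:-crashkernel=128M@16M}
    # For lines starting with "kernel": if no crashkernel= is present,
    # replace any trailing whitespace with " <arg>".
    sed -e "/^[[:space:]]*kernel[[:space:]]/{/crashkernel=/!s|[[:space:]]*\$| $arg|;}"
}

# Possible usage (review the result before copying it back):
#   cp /boot/grub/grub.conf /boot/grub/grub.conf.bak
#   add_crashkernel < /boot/grub/grub.conf.bak > /tmp/grub.conf.new
#   diff -u /boot/grub/grub.conf.bak /tmp/grub.conf.new
```

Running the filter twice gives the same result as running it once, because lines that already carry crashkernel= are skipped.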
Regards,
-Andrei
Hi,
On Fri, Jul 17, 2009 at 11:50, Andrei Ffrunzales@gmail.com wrote:
I also need MPP support. Just to give you an example, this is how both machines are booting up:
title RDAC CentOS (2.6.18-92.1.22.el5) with MPP
        root (hd0,0)
        kernel /vmlinuz-2.6.18-92.1.22.el5 ro root=/dev/VolGroup00/LogVol00
        initrd /mpp-2.6.18-92.1.22.el5.img
Couldn't RDAC be the cause of your kernel panics?
Did you try to use the multi-path package that is built-in to CentOS? (I believe the RPM for that would be device-mapper-multipath.)
HTH, Filipe
On Fri, Jul 17, 2009 at 11:50 AM, Andrei Ffrunzales@gmail.com wrote:
I assume that once the server has been rebooted with the correct kernel arguments (such as crashkernel=128M@16M) and an initrd with both MPP and kdump support, the service should start just fine.
You are correct. The init script checks the kernel command line (/proc/cmdline) for the crashkernel line. If it's not present, it will fail to initialize.
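The same check can be reproduced by hand before starting the service. This is only a sketch of the idea, not the init script's actual code; the function name has_crashkernel and the CMDLINE_FILE variable are made up for this example.

```shell
#!/bin/sh
# Report whether a kernel command line contains a crashkernel= argument.
# Usage: has_crashkernel "<command line text>"
has_crashkernel() {
    # Pad with spaces so the argument matches at the start, middle, or end.
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Default to the running kernel's command line; CMDLINE_FILE is an
# illustrative override so the check can be tried on a saved copy.
cmdline_file=${CMDLINE_FILE:-/proc/cmdline}
if [ -r "$cmdline_file" ]; then
    if has_crashkernel "$(cat "$cmdline_file")"; then
        echo "crashkernel is set; kdump has memory reserved"
    else
        echo "crashkernel is missing; add it to the kernel line and reboot"
    fi
fi
```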
- Ryan -- http://prefetch.net
Hello,
My servers didn't crash during the weekend, which is a good sign. However, I'm still trying to get a dump in /var/crash. For this I need an initial ramdisk that is RDAC- and MPP-enabled. As I said previously, I am using LSI's drivers to access my SAN:
http://www.lsi.com/rdac/ds4000.html
The installation procedure is easy. The driver package listed above comes with a shell script that builds a custom initrd, which is then copied into /boot:
[xxx@localhost ~]$ ls -l /boot/mpp-2.6.18-128.2.1.el5.img
-rw------- 1 root root 4049795 Jul 20 15:28 /boot/mpp-2.6.18-128.2.1.el5.img
At this point I'm not sure whether it's safe to use /sbin/mkdumprd along with a custom /etc/kdump.conf that includes all the drivers in the above ramdisk image.
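One way to approach that, sketched under several assumptions: the MPP image is a gzip-compressed cpio archive like a stock CentOS 5 initrd, the installed kexec-tools accepts the extra_modules directive in /etc/kdump.conf, and the RDAC modules show up in the listing under names like mppUpper.ko and mppVhba.ko. The function name extra_modules_line is made up for this example, and whether mkdumprd then builds a working image is untested here.

```shell
#!/bin/sh
# Turn a cpio file listing (one path per line on stdin) into a kdump.conf
# "extra_modules" line naming every kernel module (*.ko) in the listing.
extra_modules_line() {
    # Strip the directory part and the .ko suffix, de-duplicate, join with spaces.
    mods=$(sed -n 's|^\(.*/\)*\([^/]*\)\.ko$|\2|p' | LC_ALL=C sort -u | tr '\n' ' ')
    if [ -n "$mods" ]; then
        echo "extra_modules ${mods% }"
    fi
}

# Possible usage against the custom image (not run here):
#   zcat /boot/mpp-2.6.18-128.2.1.el5.img | cpio -it | extra_modules_line
```

The printed line lists every module in the image, so it would need trimming to the MPP-specific ones before going into /etc/kdump.conf.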
Do you guys have any experience at all with this kind of stuff?
Regards,
-Andrei
On Fri, Jul 17, 2009 at 2:26 PM, Matty matty91@gmail.com wrote: