I just recently upgraded a box from i386 5.3 -> 5.4. The box has heartbeat packages installed from "extras":

heartbeat-pils-2.1.3-3.el5.centos
heartbeat-stonith-2.1.3-3.el5.centos
heartbeat-devel-2.1.3-3.el5.centos
heartbeat-2.1.3-3.el5.centos

The heartbeat daemon no longer starts. The init script reports success, and so do the logs:

May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: info: Enabling logging daemon
May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: info: Version 2 support: false
May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: WARN: logd is enabled but logfile/debugfile is still configured in ha.cf
May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: info: **************************
May 4 22:33:10 fc-fmcln02 heartbeat: [9344]: info: Configuration validated. Starting heartbeat 2.1.3
May 4 22:33:10 fc-fmcln02 heartbeat: [9345]: info: heartbeat: version 2.1.3
May 4 22:33:11 fc-fmcln02 heartbeat: [9345]: info: Heartbeat generation: 1208455492

However, the daemons never actually start. When I run the daemon interactively without the init script, the following error appears:

heartbeat[8818]: 2010/05/04_22:23:37 ERROR: Cannot shmget for process status: Invalid argument

Does this suggest that some libs on the system were upgraded and heartbeat is trying to use the old ones?

Does anyone have any suggestions on how to get heartbeat working again?

Thanks,

Josh
On 5/4/2010 11:39 PM, Baird, Josh wrote:
> However, the daemons never actually start. When I run the daemon interactively without the init script, the following error appears:
>
> heartbeat[8818]: 2010/05/04_22:23:37 ERROR: Cannot shmget for process status: Invalid argument
>
> Does anyone have any suggestions on how to get heartbeat working again?
Running heartbeat on CentOS 5.4 here without a problem. Just powered up my test cluster and made sure the system was up to date using yum. Heartbeat started without a problem.
Perhaps you have SELinux enabled on the system? Can you try disabling SELinux?
This may sound like a half-hearted attempt to 'repair' the issue, but try backing up your authkeys, ha.cf and haresources on each host and try removing and reinstalling the packages. At this point you have nothing to lose since the daemons will not start.
SELINUX is disabled, and I have also tried reinstalling the heartbeat related packages. No luck so far.

heartbeat[8818]: 2010/05/04_22:23:37 ERROR: Cannot shmget for process status: Invalid argument

This seems to be the issue. Any other ideas?
Thanks
________________________________
From: centos-bounces@centos.org on behalf of Ryan Manikowski
Sent: Tue 5/4/2010 11:23 PM
To: centos@centos.org
Subject: Re: [CentOS] heartbeat package in extras trouble with 5.4
> Running heartbeat on CentOS 5.4 here without a problem. Just powered up my test cluster and made sure system was up-to-date using yum. Heartbeat started without a problem.
>
> Perhaps you have SELinux enabled on the system? Can you try disabling SELinux?
>
> This may sound like a half-hearted attempt to 'repair' the issue, but try backing up your authkeys, ha.cf and haresources on each host and try removing and reinstalling the packages. At this point you have nothing to lose since the daemons will not start.
On Wed, May 05, 2010 at 07:59:16AM -0500, Baird, Josh wrote:
> SELINUX is disabled, and I have also tried reinstalling the heartbeat related packages. No luck so far.
>
> heartbeat[8818]: 2010/05/04_22:23:37 ERROR: Cannot shmget for process status: Invalid argument
>
> This seems to be the issue. Any other ideas?
Try running strace on the heartbeat process, check the errno returned by shmget(), and compare it against the ERRORS section of shmget(2). Maybe you need to set some sysctls.
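If attaching strace to heartbeat is awkward, the same call can also be repeated by hand to see whether the failure is specific to heartbeat or affects any process on the box. The following is only a rough sketch, not anything shipped with heartbeat: it uses Python's ctypes to call shmget() with IPC_PRIVATE and reports the errno name if the call fails. The helper name try_shmget and the 4096-byte test size are made up for this example, and the constants assume Linux values.

```python
import ctypes
import errno

IPC_PRIVATE = 0  # key value that always requests a brand-new segment (Linux)
IPC_RMID = 0     # shmctl command that removes a segment (Linux)


def try_shmget(size, mode=0o600):
    """Ask the kernel for a private SysV segment; return (shmid, errno_name)."""
    # Hypothetical helper for illustration; loads the C library of this process.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
    libc.shmget.restype = ctypes.c_int
    libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
    libc.shmctl.restype = ctypes.c_int

    shmid = libc.shmget(IPC_PRIVATE, size, mode)
    if shmid == -1:
        err = ctypes.get_errno()
        return None, errno.errorcode.get(err, str(err))

    libc.shmctl(shmid, IPC_RMID, None)  # remove the test segment again
    return shmid, None


if __name__ == "__main__":
    shmid, err = try_shmget(4096)
    if err is None:
        print("shmget ok, id %d" % shmid)
    else:
        print("shmget failed: %s" % err)
```

If this fails with the same errno for a tiny request, the problem is more likely a system-wide IPC limit than anything in the heartbeat packages.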
Below is a snippet of a strace:
open("/usr/lib/pils/plugins/InterfaceMgr/generic.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\6\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=7636, ...}) = 0
mmap2(NULL, 10532, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x6ef000
mmap2(0x6f1000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0x6f1000
close(3) = 0
shmget(IPC_PRIVATE, 7816, 0600) = -1 EINVAL (Invalid argument)
time(NULL) = 1273067316
open("/etc/localtime", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f4f000
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 3543
close(3) = 0
munmap(0xb7f4f000, 4096) = 0
write(2, "heartbeat[28218]: 2010/05/05_08:"..., 38) = 38
write(2, "ERROR: Cannot shmget for process"..., 58) = 58
It looks like it's trying to find an existing shared memory segment?
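On that question: per shmget(2), a key of IPC_PRIVATE always asks for a new segment rather than looking up an existing one, and EINVAL when creating a segment means the requested size is below SHMMIN or above SHMMAX. For picking the failing calls out of a longer trace, something like the sketch below may help. It assumes the usual strace line layout of `name(args) = -1 ERRNO (text)`, and the function name failed_calls is invented for this example.

```python
import re

# Matches strace lines such as:
#   shmget(IPC_PRIVATE, 7816, 0600) = -1 EINVAL (Invalid argument)
FAILED = re.compile(r"^(\w+)\((.*)\)\s+=\s+-1\s+(E[A-Z0-9]+)\s+\((.*)\)\s*$")


def failed_calls(trace_text):
    """Return (syscall, args, errno_name) for every call that returned -1."""
    results = []
    for line in trace_text.splitlines():
        match = FAILED.match(line.strip())
        if match:
            name, args, err, _message = match.groups()
            results.append((name, args, err))
    return results


# Three lines taken from the trace above.
sample = """\
close(3) = 0
shmget(IPC_PRIVATE, 7816, 0600) = -1 EINVAL (Invalid argument)
time(NULL) = 1273067316
"""

for name, args, err in failed_calls(sample):
    print("%s(%s) failed with %s" % (name, args, err))
```

On the excerpt above, the only call this reports is the shmget() one.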
From sysctl.conf:
# Controls the maximum size of a message, in bytes
kernel.msgmnb = 65536

# Controls the default maxmimum size of a mesage queue
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
$ ipcs
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 2424832    root       600        1
0x00000000 2719745    root       600        1
0x00000000 2752514    root       600        1
0x00000000 2785283    root       600        1
0x00000000 2818052    root       600        1
0x00000000 2850821    root       600        1
0x00000000 2883590    root       600        1
0x00000000 2916359    root       600        1
0x00000000 3506184    root       600        1
0x01fe101f 3014665    root       600        1
0x00000000 3407882    root       666        1
0x00000000 3080203    root       600        1
0x00000000 3112972    root       600        1
0x00000000 3145741    root       600        1
0x00000000 3178510    root       600        1
0x00000000 3211279    root       600        1
0x00000000 3244048    root       600        1
0x00000000 3276817    root       600        1
0x00000000 3309586    root       600        1
0x00000000 3342355    root       600        1
0x00000000 3440660    root       600        1
0x00000000 3473429    root       600        1

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
I'm stuck!
Josh
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Dominik Zyla
Sent: Wednesday, May 05, 2010 8:10 AM
To: centos@centos.org
Subject: Re: [CentOS] heartbeat package in extras trouble with 5.4
> Try to strace heartbeat process and check errno from shmget() and compare it against shmget(2) `ERRORS' section. Maybe you need to set some sysctls.
>
> -- Dominik Zyla
OK, so for some reason I had kernel.shmmax set to 64GB. Prior to 5.4, I'm guessing that i386 just ignored this absurd value, but now it forces the value to be 0:
root@fc-fmcln02:/var/log$ ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 0
max total shared memory (kbytes) = 0
min seg size (bytes) = 1
I set kernel.shmmax back to its default value, and this seemed to fix heartbeat's problems with allocating shared memory segments.
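One possible explanation for the zeros in the ipcs -l output, offered as an assumption rather than something confirmed in this thread: if the i386 kernel keeps these limits in a 32-bit unsigned value, both configured numbers wrap around to exactly 0. The arithmetic can be checked with a few lines; the names below are just for illustration.

```python
WORD_BITS = 32  # assumption: the limits are held in a 32-bit unsigned value on i386

shmmax = 68719476736  # kernel.shmmax from sysctl.conf (64 GB, i.e. 2**36)
shmall = 4294967296   # kernel.shmall from sysctl.conf (2**32)


def wrap(value, bits=WORD_BITS):
    """Keep only the low `bits` bits, as an unsigned overflow would."""
    return value % (1 << bits)


print(wrap(shmmax))  # 0
print(wrap(shmall))  # 0

requested = 7816  # size heartbeat passed to shmget() in the strace
print(requested > wrap(shmmax))  # True: any request exceeds a limit of 0
```

Under that assumption, a maximum segment size of 0 would make every shmget() request larger than SHMMAX, which is one of the documented causes of EINVAL in shmget(2).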
Thanks,
Josh
-----Original Message-----
From: centos-bounces@centos.org [mailto:centos-bounces@centos.org] On Behalf Of Baird, Josh
Sent: Wednesday, May 05, 2010 9:39 AM
To: CentOS mailing list
Subject: Re: [CentOS] heartbeat package in extras trouble with 5.4
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos