[CentOS] Cluster Broken Pipe error and Heartbeat configuration

lingu hicheerup at gmail.com
Wed Nov 12 13:47:30 UTC 2008


Hi,

  I am running a two-node active/passive cluster on the RHEL3U8 64-bit
operating system for my Oracle database. Both nodes are connected
to HP MSA-500 storage (SCSI, not Fibre Channel). Below are my hardware
and clumanager version details. It ran fine and stable for the
last two years, but for the past month I have been getting the
errors below in syslog, and the cluster is restarting locally.

Server Hardware: HP ProLiant DL580 G4
OS: RHEL3U8-64BIT INTEL EMT
Kernel : 2.4.21-47.EL
Storage : HP MSA-500 storage (SCSI channel)

Cluster Version:
clumanager-1.2.26.1-1
redhat-config-cluster-1.0.7-1

NODE1 ip: 20.2.135.161 (network bonding configured)
NODE2 ip: 20.2.135.162 (network bonding configured)
VIP : 20.2.135.35

Syslog errors

cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
clulockd[1996]: <warning> Potential recursive lock #0 grant to member #1, PID1962
clulockd[1996]: <warning> Denied 20.1.135.162: Broken pipe
clulockd[1996]: <err> select error: Broken pipe
clulockd[1996]: <warning> Denied 20.1.135.162: Broken pipe
clulockd[1996]: <err> select error: Broken pipe
cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!
clulockd[1996]: <warning> Denied 20.1.135.161: Broken pipe
clulockd[1996]: <err> select error: Broken pipe
clusvcmgrd[2011]: <err> Unable to obtain cluster lock: Connection timed out
cluquorumd[2100]: <err> VF: Abort: Invalid header in reply from member #0
cluquorumd[1934]: <err> __msg_send: Incomplete write to 13. Error: Connection reset by peer
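
To see at a glance which daemon is complaining and at what severity, the
excerpt above can be tallied with a small script. This is only a minimal
sketch: it assumes every line has the "daemon[pid]: <level> message" shape
quoted above, and the names tally, LINE_RE and SAMPLE are made up for
illustration, not part of clumanager.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt: "daemon[pid]: <level> message".
LINE_RE = re.compile(
    r"^(?P<daemon>\w+)\[(?P<pid>\d+)\]: <(?P<level>\w+)> (?P<msg>.*)$"
)


def tally(lines):
    """Count messages per (daemon, level); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("daemon"), match.group("level"))] += 1
    return counts


# A few lines copied from the syslog excerpt, used only as sample input.
SAMPLE = [
    "cluquorumd[1921]: <warning> Disk-TB: Detected I/O Hang!",
    "clulockd[1996]: <warning> Denied 20.1.135.162: Broken pipe",
    "clulockd[1996]: <err> select error: Broken pipe",
    "clusvcmgrd[2011]: <err> Unable to obtain cluster lock: Connection timed out",
]

if __name__ == "__main__":
    for (daemon, level), count in sorted(tally(SAMPLE).items()):
        print(f"{daemon:12s} {level:8s} {count}")
```

In practice the input would be the relevant lines from the syslog file
rather than SAMPLE; wrapped continuation lines would need to be joined
first, since the pattern only matches lines that start with a daemon name.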

 Can anyone explain what the above errors indicate and how to
troubleshoot them? After a long Google search I found the link below
from Red Hat, which matches my scenario. Can I follow it? I ask because
this is a very critical production server.

https://bugzilla.redhat.com/show_bug.cgi?id=185484


 Also, can anyone help me configure a dedicated LAN (for example eth3)
as the heartbeat (a private point-to-point crossover cable network for
cluster communications)? I don't want the heartbeat on the public LAN
because of heavy network saturation.

 For the above heartbeat configuration I did not find any suitable
document for RHEL. Can anyone provide a suitable link or tell
me what changes I have to make in my existing cluster.xml
file for this private heartbeat configuration to work?

Waiting for a reply; this is urgent for me.

Regards,
Lingu

