I am not sure whether attachments can be sent to the mailing list. There were quite a lot of replies before, but I have received nothing since the attachments were added. I will remove the attachments and send it again. Please have a look at the email below. Thanks for your help.
---
Dear All,
Thanks for all your help. I have put all the comments together below. Please have a look and see whether there is any clue to this ghost problem. I have also attached the log files: dmesg, secure, messages. Please note that there is a message in secure from when the session exited just now:
Oct 9 10:55:55 maya2012 su: pam_unix(su:session): session closed for user root
Can you trigger the error reliably by doing something network intensive, like scp or rsync a large file? I've seen similar behaviour with a bad NIC that was in the process of dying.
Yes, I tried that. I copied tens of GB of files using rsync, and it worked well.
That's very often a result of IP conflict. I'm assuming that you're connecting to an IPv4 address. If so, log in to your CentOS server and use arping to look for conflicts:
# arping -c 2 -D -I em1 <your address>
The IP address is statically assigned to my server. The network administrator has checked the address, and only this computer uses it. When I run the above command, the output is:
[root@maya2012 hwang]# arping -c 2 -D -I em1 222.200.125.5
ARPING 222.200.125.5 from 0.0.0.0 em1
Sent 2 probes (2 broadcast(s))
Received 0 response(s)
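To repeat the duplicate-address probe without reading the output by eye, the reply count can be pulled out of the last line. This is only a minimal sketch: the function names `arping_replies` and `check_conflict` are made up for illustration, and it assumes the "Received N response(s)" wording shown in the output above.

```shell
# Read arping output on stdin and print the number of replies it reports.
arping_replies() {
    sed -n 's/.*Received \([0-9][0-9]*\) response.*/\1/p'
}

# Turn the reply count of a duplicate-address probe (-D) into a verdict.
# Any reply means some other host answered for the address.
check_conflict() {
    replies=$(arping_replies)
    if [ -z "$replies" ]; then
        echo "could not find a reply count in the output"
        return 2
    fi
    if [ "$replies" -gt 0 ]; then
        echo "possible conflict: $replies response(s)"
        return 1
    fi
    echo "no conflict detected"
}

# Usage on the server, as root (interface and address taken from this thread):
#   arping -c 2 -D -I em1 222.200.125.5 | check_conflict
```

A single clean probe only shows that nobody answered at that moment, so it may be worth running it again right after a session drops.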
- Login via Mac, Windows, Linux systems from different computers.
- Modify sshd_config on the server as suggested by many posts:
TCPKeepAlive yes
ClientAliveInterval 60
TCPKeepAlive is "yes" by default. ClientAliveInterval doesn't appear to be a valid setting. Either TCPKeepAlive or ServerAliveInterval could be useful if the problem were a stateful firewall which was dropping your connection from its state table, and then resetting the connection in response to a later packet from your client.
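One way to settle whether the server actually accepted these two settings is to ask sshd for its effective configuration. The sketch below assumes OpenSSH's extended test mode (`sshd -T`), which prints the parsed configuration with lowercase keywords; the helper name `keepalive_settings` is made up for illustration. As far as I know, `ClientAliveInterval` is a documented `sshd_config` keyword on the server side, while `ServerAliveInterval` is its counterpart in the client's `ssh_config`, so it is worth checking both ends.

```shell
# Print only the keepalive-related lines from `sshd -T` style output on stdin.
keepalive_settings() {
    grep -i -E '^(tcpkeepalive|clientaliveinterval|clientalivecountmax) '
}

# Usage on the server, as root:
#   sshd -T | keepalive_settings
# If sshd rejects a keyword in sshd_config, `sshd -T` should report the
# offending line instead of printing the configuration.
```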
Since those don't help, that tends to suggest that the problem isn't an intermediate host, but the server itself. Possibly an IP conflict. Also, check the output of "dmesg" to see if there are any problems recorded with the NIC. Check the output of "ifconfig" to see if there are TX or RX errors that increase when your connections are reset.
[root@maya2012 hwang]# ifconfig
em1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 222.200.125.5 netmask 255.255.255.128 broadcast 222.200.125.127
        inet6 fe80::d6ae:52ff:fe6a:405e prefixlen 64 scopeid 0x20<link>
        ether d4:ae:52:6a:40:5e txqueuelen 1000 (Ethernet)
        RX packets 2865 bytes 396191 (386.9 KiB)
        RX errors 0 dropped 180 overruns 0 frame 0
        TX packets 510 bytes 55844 (54.5 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

em2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        ether d4:ae:52:6a:40:5f txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 0 (Local Loopback)
        RX packets 7 bytes 748 (748.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 7 bytes 748 (748.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@maya2012 hwang]# ip -s -d l l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0
    RX: bytes  packets  errors  dropped  overrun  mcast
    748        7        0       0        0        0
    TX: bytes  packets  errors  dropped  carrier  collsns
    748        7        0       0        0        0
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether d4:ae:52:6a:40:5e brd ff:ff:ff:ff:ff:ff promiscuity 0
    RX: bytes  packets  errors  dropped  overrun  mcast
    312908     2272     0       138      0        1081
    TX: bytes  packets  errors  dropped  carrier  collsns
    43946      403      0       0        0        0
3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT qlen 1000
    link/ether d4:ae:52:6a:40:5f brd ff:ff:ff:ff:ff:ff promiscuity 0
    RX: bytes  packets  errors  dropped  overrun  mcast
    0          0        0       0        0        0
    TX: bytes  packets  errors  dropped  carrier  collsns
    0          0        0       0        0        0
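The suggestion to see whether the error counters increase when a connection is reset can be checked by taking a snapshot before and after a drop. The following is a rough sketch, not a finished tool: it reads the per-interface files under /sys/class/net/<iface>/statistics, the function names are made up for illustration, and `STATS_ROOT` is a hypothetical override that exists only so the functions can be tried against a copy of those files. In the two outputs above, RX dropped on em1 is the only non-zero counter (138 in one, 180 in the other).

```shell
# Print "rx_errors rx_dropped tx_errors tx_dropped" for one interface,
# read from the kernel's per-interface statistics files.
# STATS_ROOT is a hypothetical override, used only for trying this out.
read_counters() {
    dir="${STATS_ROOT:-/sys/class/net}/$1/statistics"
    echo "$(cat "$dir/rx_errors") $(cat "$dir/rx_dropped") $(cat "$dir/tx_errors") $(cat "$dir/tx_dropped")"
}

# Given two snapshots from read_counters, print the increase per counter.
counter_delta() {
    # Split both snapshots into eight positional parameters.
    set -- $1 $2
    echo "$(( $5 - $1 )) $(( $6 - $2 )) $(( $7 - $3 )) $(( $8 - $4 ))"
}

# Usage on the server:
#   before=$(read_counters em1)
#   ... wait until an SSH session is dropped ...
#   after=$(read_counters em1)
#   counter_delta "$before" "$after"
```

If the deltas stay at zero across a dropped session, the NIC counters are probably not where the problem shows up.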
Thanks,
Hua