Hi Folks -
I'm running CentOS on a server that will eventually hold a dozen SATA drives. The server works fine, and RAID 5 is set up on groups of four SATA drives.
Today we decided to disconnect one SATA drive to simulate a failure. The box trucked on fine... a little too fine. We waited several minutes, but no problem showed up in /proc/mdstat, in /var/log/messages, or on the console.
I ran mdadm --monitor /dev/md0 and no problem was shown.
We rebooted, still without the drive, and finally mdadm --monitor reported that the array was running in a degraded state.
We then reconnected the SATA drive, but still nothing was reported, and the RAID state in /proc/mdstat did not change.
I expected the box to keep on trucking but to freak out with warnings all over the shop. What should I have expected in this case, and what should I do so that I find out remotely about evil events like a drive melting?
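
For reference, the check I have been doing by eye on /proc/mdstat can be sketched in Python as below. This is only a minimal sketch: the function name degraded_arrays is made up for illustration, and it assumes the usual /proc/mdstat layout in which each array has a summary such as "[4/3] [UUU_]" (expected/active members, with "_" marking a missing one). It can only report what the kernel has already noticed, so it would not have flagged anything during the minutes when /proc/mdstat itself showed no problem.

```python
import re

# Header line of an array, e.g. "md0 : active raid5 sda1[0] sdb1[1] ..."
ARRAY_RE = re.compile(r"^(md\S*)\s*:")
# Member summary, e.g. "[4/3] [UUU_]" (expected/active, one flag per member)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return {array_name: (expected, active, flags)} for arrays missing members.

    mdstat_text is the full contents of /proc/mdstat as a string.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            # Remember which array the following status line belongs to.
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            if active < expected or "_" in flags:
                degraded[current] = (expected, active, flags)
    return degraded
```

Something like this could be run from cron on the contents of /proc/mdstat, with mail sent whenever the returned dictionary is non-empty, though I assume mdadm has a proper way of doing this that I am missing.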
-Andy