Now I want to be able to monitor the box, specifically the RAID drives, so I'll know if one has gone bad and the RAID configuration has failed over. Does anyone have suggestions for tools to use?
We use 3Ware RAID cards, and they have a CLI util called tw_cli. We run "tw_cli info c0" every hour and write the output to a rotating file. Then from crontab we run a script that diffs the current output against the previous hour's output.
If there's a difference, it emails us.
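For anyone who wants a starting point, here is a rough sketch of that kind of check in Python. It is not our actual script. The state file path is a made-up location, and the sketch assumes tw_cli is on the PATH and that the controller is c0 as above. It keeps only the previous output instead of a rotating set of files, and it leans on cron's habit of mailing whatever a job prints rather than sending mail itself.

```python
import difflib
import shutil
import subprocess
import sys
from pathlib import Path

# Assumptions: tw_cli is on PATH and the controller is c0 (as in the post).
# The state file location is a hypothetical choice; adjust as needed.
COMMAND = ["tw_cli", "info", "c0"]
STATE_FILE = Path("/var/tmp/tw_cli_c0.last")


def diff_outputs(previous: str, current: str) -> str:
    """Return a unified diff of the two outputs, or "" if they are identical."""
    lines = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(lines)


def check(state_file: Path, current: str) -> str:
    """Diff the current output against the saved one, then save the current one."""
    previous = state_file.read_text() if state_file.exists() else None
    state_file.write_text(current)
    if previous is None:
        return ""  # first run: nothing to compare against yet
    return diff_outputs(previous, current)


def main() -> None:
    # Capture stdout and stderr so controller errors also show up in the diff.
    result = subprocess.run(COMMAND, capture_output=True, text=True)
    changes = check(STATE_FILE, result.stdout + result.stderr)
    if changes:
        # Anything printed by a cron job is mailed to the crontab's MAILTO.
        print(changes)


if __name__ == "__main__":
    if shutil.which(COMMAND[0]) is None:
        print("tw_cli not found on PATH", file=sys.stderr)
    else:
        main()
```

Run hourly from crontab with MAILTO set, this stays silent while the output is unchanged and mails the diff when something changes, for example a unit status line changing.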