This is an issue that I have been having with one of our production servers for a couple of months: out of nowhere, /var goes read-only.
No errors show up in /var/log/messages or dmesg. I checked that the filesystem's on-error behavior is set to continue, and as far as I can tell there is nothing wrong with the disks. There is also plenty of disk space.
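One way to confirm which filesystems the kernel currently has mounted read-only is to look for the "ro" option in /proc/mounts. This is a minimal sketch, assuming a POSIX shell and awk; the function name ro_mounts is made up for illustration and is not part of any tool mentioned in this thread.

```shell
# Print the mount point of every filesystem whose option list contains "ro".
# $1: a file in /proc/mounts format (defaults to /proc/mounts itself).
ro_mounts() {
    awk '{
        # Field 4 is the comma-separated option list; match "ro" exactly
        # so that other options are never mistaken for it.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

Running ro_mounts from cron and mailing any output would at least give a timestamp for when /var flips, which can then be compared against the logs.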
Here is the output of tune2fs -l /dev/sda3:
Filesystem volume name:   /var1
Last mounted on:          <not available>
Filesystem UUID:          6bb81ddf-b0f8-4c47-ae9f-b39f504c114e
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              1281696
Block count:              2560359
Reserved block count:     128017
Free blocks:              1946888
Free inodes:              1279680
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      625
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16224
Inode blocks per group:   507
Filesystem created:       Wed Jun 29 10:43:21 2005
Last mount time:          Fri Dec 1 10:30:30 2006
Last write time:          Fri Dec 1 10:30:30 2006
Mount count:              21
Maximum mount count:      -1
Last checked:             Wed Jun 29 10:43:21 2005
Check interval:           0 (<none>)
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   tea
Directory Hash Seed:      6f24f054-98af-4bd8-9af2-d40254126ac7
Journal backup:           inode blocks
Here is what is in /etc/fstab:
# This file is edited by fstab-sync - see 'man fstab-sync' for details
LABEL=/1         /              ext3    defaults                        1 1
none             /dev/pts       devpts  gid=5,mode=620                  0 0
none             /dev/shm       tmpfs   defaults                        0 0
none             /proc          proc    defaults                        0 0
LABEL=/qt1       /opt           ext3    defaults                        1 2
none             /sys           sysfs   defaults                        0 0
LABEL=/var1      /var           ext3    defaults                        1 2
LABEL=SWAP-sda5  swap           swap    defaults                        0 0
/dev/hda         /media/cdrom   auto    pamconsole,exec,noauto,managed  0 0
/dev/fd0         /media/floppy  auto    pamconsole,exec,noauto,managed  0 0
Here is the output of df -h:
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2    22G  2.0G    19G   10%  /
none        4.0G     0   4.0G    0%  /dev/shm
/dev/sdb1   269G   48G   208G   19%  /opt
/dev/sda3   9.7G  2.3G   7.0G   25%  /var
Kernel version: 2.6.9-42.0.3.ELsmp
Any help would be greatly appreciated!!
On Fri, Dec 01, 2006 at 02:07:49PM -0700, Joshua Gimer wrote:
> This is an issue that I have been having with one of our production servers for a couple of months, just out of nowhere /var goes to read only.
> There aren't any errors that show up in /var/log/messages nor dmesg, I checked to make sure that the drives on error option is set to continue and there isn't anything as far as I can tell wrong with the disks. Also there is plenty of disk space.
My suggestion is the standard one: try forcing a check on the disk.
[]s
--
Rodrigo Barbosa
"Quid quid Latine dictum sit, altum viditur"
"Be excellent to each other ..." - Bill & Ted (Wyld Stallyns)
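For reference, a forced check as suggested above could be sketched roughly like this. It assumes /var lives on /dev/sda3 (as the df output in the original post shows), that /var has an fstab entry so a bare "mount /var" works, and that everything writing to /var has been stopped first (single-user mode is the usual way). The function name force_check and the DRY_RUN variable are made up for illustration.

```shell
# Unmount a filesystem, force an e2fsck run on it, and remount it.
# With DRY_RUN set to a non-empty value, the commands are only printed.
force_check() {
    dev=$1
    mnt=$2
    run=${DRY_RUN:+echo}        # expands to "echo" only when DRY_RUN is set
    $run umount "$mnt" || return 1
    $run e2fsck -f "$dev"       # -f forces a check even if the fs is marked clean
    status=$?
    # e2fsck exit codes of 4 and above mean errors were left uncorrected
    # or the run itself failed, so only remount below that.
    if [ "$status" -lt 4 ]; then
        $run mount "$mnt"
    fi
    return "$status"
}
```

Calling it as "DRY_RUN=1 force_check /dev/sda3 /var" prints the three commands without touching the disk, which is a sensible first step on a production box.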
On Dec 1, 2006, at 16:07, Joshua Gimer wrote:
> This is an issue that I have been having with one of our production servers for a couple of months, just out of nowhere /var goes to read only.
Maybe the disk is going bad. I had the same (or a similar) thing happen to me with the root partition, which in my case also contains /var. It turned out to be a disk going bad. Check the archives of this mailing list for a thread titled "EXT3-fs error (devive dm-0) in start_transaction: Journal has aborted" from earlier this year. Run smartctl on the drive with the /var partition if it supports it.
Alfred
Thanks, that is what we thought was happening. Smartd will not start at boot; it parses the config file just fine and then fails.
Thanks for all your help!
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
On Friday 01 December 2006 15:24, Joshua Gimer wrote:
> Thanks, that is what we were thinking was happening. Smartd will not start at boot, it parses the config file just fine and then fails.
Well, unless that's different from the behavior you saw before the problems started, it doesn't indicate a bad drive any more than it indicates a drive or driver that doesn't support smartd. Smartd doesn't work on most SATA drives with the SATA driver included in the stock CentOS kernel.
When I have seen smartd working, it starts fine but puts messages in the logs about drive problems when there are actual problems.
Kevan Benson wrote:
> On Friday 01 December 2006 15:24, Joshua Gimer wrote:
>> Thanks, that is what we were thinking was happening. Smartd will not start at boot, it parses the config file just fine and then fails.
> Well, unless that's different than noted behavior before there were problems, that doesn't really indicate a bad drive any more than a drive/driver that doesn't support smartd. Smartd doesn't work on most sata drives with the sata driver included in the stock CentOS kernel.
SMART does work; the default config from Red Hat is wrong, see bugs #176835 and #187181. The output from smartctl used to be wrong. The correct command was sent by Alfred.
Use '-d ata'. -d is for device type, not debug.
Older versions of smart incorrectly said to use '-d libata'. Old versions of CentOS-4 did not support SMART on SATA, but no one would be running anything that old, would they?
If you want smartd to start at boot, edit /etc/smartd.conf and add '-d ata'.
You should probably also read the instructions, because out of the box it probably won't do much useful work.
My config file looks like this:
/dev/sda -d ata -a -m smart-errors@xxxx.com -s S/../.././02|L/../01/./04 -I 1 -I 194 -I 195
John.
On 12/4/06, John Newbigin jnewbigin@ict.swin.edu.au wrote:
> smart does work, the default config from redhat is wrong see bug #176835 and #187181. The output from smartctl used to be wrong. The correct command was sent by Alfred
> Use '-d ata'. -d is for device type, not debug.
I was referring to the "-d" flag that you pass to smartd on the command line:
-d, --debug Start smartd in debug mode
Not the configuration flag that goes into the smartd.conf configuration file. Sorry for any confusion.
Thanks, - Ryan
Greetings, Matty.
>> Well, unless that's different than noted behavior before there were problems, that doesn't really indicate a bad drive any more than a drive/driver that doesn't support smartd. Smartd doesn't work on most sata drives with the sata driver included in the stock CentOS kernel.
> smart does work, the default config from redhat is wrong see bug #176835 and #187181. The output from smartctl used to be wrong. The correct command was sent by Alfred
> Use '-d ata'. -d is for device type, not debug.
Hmm, and how about this:
[root@cappa etc]# smartctl -a /dev/sda
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: ATA ST3160812AS Version: 3.AA
SATA disks accessed via libata are not currently supported by smartmontools. When libata is given an ATA pass-thru ioctl() then an additional '-d libata' device type will be added to smartmontools.
The server has CentOS 4.4 installed, with two SATA disks attached to an onboard Intel SATA controller. CentOS 4.4 uses the libata kernel module for it. The output from smartctl speaks for itself.
On Sunday 03 December 2006 21:22, John Newbigin wrote:
> smart does work, the default config from redhat is wrong see bug #176835 and #187181. The output from smartctl used to be wrong. The correct command was sent by Alfred
> Use '-d ata'. -d is for device type, not debug.
> Older versions of smart incorrectly said to use '-d libata'. Old versions of CentOS-4 did not support smart on sata but no one would be running anything that old would they?
> If you want smartd to start at boot, edit /etc/smartd.conf and add '-d ata'
> You should probably also read the instructions because out of the box it probably won't do much useful work.
> My config file looks like this:
> /dev/sda -d ata -a -m smart-errors@xxxx.com -s S/../.././02|L/../01/./04 -I 1 -I 194 -I 195
I remember when I first started encountering this, I researched it and found that the kernel module/subsystem (libata) was noted to not support smartd, and I hadn't seen anything noting that the regular ata command set worked.
Or maybe I just went by the smartd debug error message (smartd -d), which indicates that SATA just plain isn't supported; that seems to be incorrect based on the bugs you mentioned. In any case, it looks like I can use smartd on lots of systems I thought I couldn't. Thanks for the info.
> Thanks, that is what we were thinking was happening. Smartd will not start at boot, it parses the config file just fine and then fails.
Use this command:
smartctl -d ata -t long /dev/sda
Substitute the device type and device name appropriate for your system. Then wait a while for the test to complete and run "smartctl -a /dev/sda" to see the report.
Alfred
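Once the long test above has finished, the overall health verdict can also be checked from a script. This is only a sketch: smart_passed is a name made up here, and the text it matches is the overall-health line that "smartctl -H" prints for ATA drives, which may be worded differently in other smartmontools versions.

```shell
# Read `smartctl -H` output on stdin and succeed only if the
# overall-health line reports PASSED.
smart_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Usage (needs root and a drive that the '-d ata' device type can reach):
#   smartctl -d ata -H /dev/sda | smart_passed || echo "check /dev/sda"
```

A PASSED verdict does not rule out a failing disk, so the self-test log and attribute table from "smartctl -a" are still worth reading.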
On 12/1/06, Joshua Gimer jgimer@gmail.com wrote:
> Thanks, that is what we were thinking was happening. Smartd will not start at boot, it parses the config file just fine and then fails.
When I bump into these types of issues, I like to run smartd by hand with the "-d" (debug) option. This can assist with locating problematic devices, and if the information is fed back into the smartmontools community, it can be used to update the device model information in the smartctl database (which can be printed by running smartctl with the "-P show" option).
Thanks, - Ryan