[CentOS] USB disk dropping out under light load

Thu Nov 9 07:07:29 UTC 2006
Simen Thoresen <simentt at dolphinics.no>

Hi all,

I'm running a fairly up-to-date CentOS 4 x86_64 server (still on kernel 
2.6.9-42.0.2, but apart from that fully up to date against the official 
repos) with a USB disk attached through a USB hub. The disk is a 750G 
Seagate drive in a Seagate enclosure.

I've noticed several times that after longish periods of activity, the disk 
drops out (log from the last time below). In this case, the disk activity was 
generated by two bittorrent clients (i.e. random-access r/w patterns) fed by 
a 2Mbps connection (i.e. a data rate of roughly 200kB/s). I've been able to 
use the drive for copying multi-GB file trees from the main disks (it's part 
of a backup project), so the failure below seems strange, and does not appear 
to be related to the amount of traffic.

Nov  9 01:09:45 kasse kernel: SCSI error : <12 0 0 0> return code = 0x70000
Nov  9 01:09:45 kasse kernel: end_request: I/O error, dev sdg, sector 37463
Nov  9 01:10:34 kasse kernel: SCSI error : <12 0 0 0> return code = 0x6000000
Nov  9 01:10:34 kasse kernel: end_request: I/O error, dev sdg, sector 37471
Nov  9 01:10:40 kasse kernel: SCSI error : <12 0 0 0> return code = 0x70000
Nov  9 01:10:40 kasse kernel: end_request: I/O error, dev sdg, sector 37479
Nov  9 01:13:08 kasse kernel: SCSI error : <12 0 0 0> return code = 0x70000
Nov  9 01:13:08 kasse kernel: end_request: I/O error, dev sdg, sector 37487
Nov  9 01:13:08 kasse kernel: Buffer I/O error on device sdg1, logical block 4678
Nov  9 01:13:08 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:13:08 kasse kernel: Aborting journal on device sdg1.
Nov  9 01:13:13 kasse kernel: ext3_abort called.
Nov  9 01:13:13 kasse kernel: EXT3-fs error (device sdg1): ext3_journal_start_sb: Detected aborted journal
Nov  9 01:13:13 kasse kernel: Remounting filesystem read-only
Nov  9 01:13:41 kasse kernel: EXT3-fs error (device sdg1) in start_transaction: Journal has aborted
Nov  9 01:13:41 kasse kernel: EXT3-fs error (device sdg1) in start_transaction: Journal has aborted
Nov  9 01:15:05 kasse kernel: usb 1-1.1: reset high speed USB device using address 13
Nov  9 01:15:10 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:15 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:15 kasse kernel: usb 1-1.1: device not accepting address 13, error -110
Nov  9 01:15:15 kasse kernel: usb-storage: Bus reset ended with -19
Nov  9 01:15:15 kasse kernel: scsi: Device offlined - not ready after error recovery: host 12 channel 0 id 0 lun 0
Nov  9 01:15:15 kasse kernel: SCSI error : <12 0 0 0> return code = 0x70000
Nov  9 01:15:15 kasse kernel: end_request: I/O error, dev sdg, sector 78039
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 9747
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 12288
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 151945218
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 151945219
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: __journal_remove_journal_head: freeing b_committed_data
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to offline device
Nov  9 01:15:15 kasse kernel: usb 1-1.1: USB disconnect, address 13
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to device being removed
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 1
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to device being removed
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 1025
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: scsi12 (0:0): rejecting I/O to device being removed
Nov  9 01:15:15 kasse kernel: Buffer I/O error on device sdg1, logical block 1027
Nov  9 01:15:15 kasse kernel: lost page write due to I/O error on sdg1
Nov  9 01:15:15 kasse kernel: usb 1-1.1: new high speed USB device using address 14
Nov  9 01:15:15 kasse kernel: EXT3-fs error (device sdg1) in start_transaction: Journal has aborted
Nov  9 01:15:16 kasse kernel: EXT3-fs error (device sdg1) in start_transaction: Journal has aborted
Nov  9 01:15:20 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:26 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:26 kasse kernel: usb 1-1.1: device not accepting address 14, error -110
Nov  9 01:15:26 kasse kernel: usb 1-1.1: new high speed USB device using address 15
Nov  9 01:15:31 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:33 kasse kernel: scsi12 (0:0): rejecting I/O to dead device
Nov  9 01:15:33 kasse kernel: EXT3-fs error (device sdg1): ext3_readdir: directory #2 contains a hole at offset 0
Nov  9 01:15:36 kasse kernel: usb 1-1.1: control timeout on ep0out
Nov  9 01:15:36 kasse kernel: usb 1-1.1: device not accepting address 15, error -110
Nov  9 01:15:53 kasse kernel: scsi12 (0:0): rejecting I/O to dead device
Nov  9 01:15:53 kasse kernel: EXT3-fs error (device sdg1): ext3_readdir: directory #2 contains a hole at offset 0
Nov  9 01:16:13 kasse kernel: scsi12 (0:0): rejecting I/O to dead device
Nov  9 01:16:13 kasse kernel: EXT3-fs error (device sdg1): ext3_readdir: directory #2 contains a hole at offset 0
(and multiple further lines like these until the partition is unmounted).
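
For anyone trying to read the numeric codes in a log like this, here is a 
rough Python sketch that splits a SCSI "return code" into its four bytes and 
names the two negative errno values that appear. It assumes the usual 2.6-era 
layout (status, message, host and driver byte, from low to high) and the 
constant names from include/scsi/scsi.h; the function names and the small 
errno table are made up for illustration and cover only what shows up here.

```python
import re

# Host-byte and driver-byte names as found in 2.6-era include/scsi/scsi.h.
# Only the commonly seen values are listed; anything else prints as hex.
HOST_BYTE = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}
DRIVER_BYTE = {
    0x00: "DRIVER_OK", 0x01: "DRIVER_BUSY", 0x02: "DRIVER_SOFT",
    0x03: "DRIVER_MEDIA", 0x04: "DRIVER_ERROR", 0x05: "DRIVER_INVALID",
    0x06: "DRIVER_TIMEOUT", 0x07: "DRIVER_HARD", 0x08: "DRIVER_SENSE",
}
# Linux errno values seen in this log (hard-coded so the result does not
# depend on the platform the script runs on).
LINUX_ERRNO = {19: "ENODEV", 110: "ETIMEDOUT"}

RETURN_CODE = re.compile(r"return code = (0x[0-9a-fA-F]+)")
NEG_ERRNO = re.compile(r"(?:error|ended with) -(\d+)")


def decode_scsi_result(code):
    """Split a 32-bit SCSI result into status, msg, host and driver parts."""
    host = (code >> 16) & 0xFF
    driver = (code >> 24) & 0xFF
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": HOST_BYTE.get(host, hex(host)),
        "driver": DRIVER_BYTE.get(driver, hex(driver)),
    }


def annotate(line):
    """Return a short decoded note for a log line, or None if nothing matches."""
    m = RETURN_CODE.search(line)
    if m:
        d = decode_scsi_result(int(m.group(1), 16))
        return "host=%s driver=%s" % (d["host"], d["driver"])
    m = NEG_ERRNO.search(line)
    if m:
        n = int(m.group(1))
        return LINUX_ERRNO.get(n, "errno %d" % n)
    return None


if __name__ == "__main__":
    import sys
    # Usage (hypothetical): python decode_log.py < /var/log/messages
    for raw in sys.stdin:
        note = annotate(raw)
        if note:
            print("%s    [%s]" % (raw.rstrip(), note))
```

Read this way, 0x70000 has host byte DID_ERROR and 0x6000000 has driver byte 
DRIVER_TIMEOUT, while -110 and -19 are ETIMEDOUT and ENODEV. That would be 
consistent with the device no longer answering on the bus rather than with 
bad sectors, though the codes alone do not show whether the enclosure, the 
hub or something else is at fault.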

After power-cycling the disk, it seems happy again: fsck passes without 
complaint and restarting the torrent clients works well, so the data on the 
disk is at least reasonably intact.

Has anyone here had similar experiences, or does anyone use USB disks for 
extended periods of time?

Yours,
-S

-- 
Simen Thoresen, Dolphin ICS