On Thu, July 13, 2006 9:50 am, William L. Maltby wrote:
On Wed, 2006-07-12 at 19:33 -0400, Paul wrote:
OK, I'm still trying to solve this. Though the server has been up rock steady, the errors concern me. I built this on a test box months ago, and now that I think about it, I may have originally built it on a drive from a different manufacturer, although of about the same size (20G). That may have something to do with it. What is the easiest way to get these errors taken care of? I've tried e2fsck, and also ran fsck on Vol00. Looks like I made a fine mess of things. Is there a way to fix it without reloading?
AFAIK, there is no "easiest way". From my *limited* knowledge, you have a couple different problems (maybe) and they are not identified. I'll offer some guesses and suggestions, but without my own hard-headed stubbornness in play, results are even more iffy.
CentOS? Here are some outputs:
snapshot from /var/log/messages:
Jul 12 04:03:21 hostname kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jul 12 04:03:21 hostname kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
Jul 12 04:03:21 hostname kernel: ide: failed opcode was: unknown
I've experienced these regularly on a certain brand of older drive (*really* older, probably not your situation). Maxtor IIRC. Anyway, the problem occurred mostly on cold boot or when re-spinning the drive after it slept. It apparently had a really *slow* spin up speed and timeout would occur (not handled in the protocol I guess), IIRC.
This is definitely a symptom. I wonder if LVM has anything to do with it? I'm running an "IBM-DTLA-307020" (20gig). I was previously running an "IBM-DTLA-307015" on FC1 on ext3 partitions and never had a problem.
When I find the time, I am just going to reload CentOS 4.3 on ext3 partitions, restore the data, and see how it goes.
Your post doesn't mention if this might be related. If all your log occurrences tend to indicate it happens only after long periods of inactivity, or upon cold boot, it might not be an issue. But even there, hdparm might be of some help. Also, if it does seem to be only on cold boot or after long periods of "sleeping", is it possible that a bunch of things starting at the same time are taxing the power supply? Is the PS "weak"? Remember that PSs must not only have a maximum wattage sufficient to support the maximum draw of all devices at the same time (plus a margin for safety), but that the various 5/12 volt lines are also individually limited. Different PSs have different limits on those lines, and often they are not published on the PS label. Lots of 12 or 5 volt draws at the same time (as happens in a non-sequenced start-up) might be producing an unacceptable voltage or amperage drop.
Is your PCI bus 33/66/100 MHz? Do you get messages on boot saying "assume 33MHz.... use idebus=66"? I hear it's OK to have an idebus param that is too fast, but it's a problem if your bus is faster than what the kernel thinks it is.
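If you do want to try forcing it, on a CentOS 4 box that would be a kernel parameter appended to the kernel line in /boot/grub/grub.conf. Just a sketch -- the kernel version and LVM root path below are assumed stock defaults, not taken from your system:

    # /boot/grub/grub.conf (append idebus=66 to the existing kernel line)
    kernel /vmlinuz-2.6.9-34.EL ro root=/dev/VolGroup00/LogVol00 idebus=66

You can also type it once at the grub boot prompt to test before making it permanent.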
Re-check and make sure all cables are well seated and that power is well connected. Speaking of cables, is it new or "old"? Maybe the cable has a small intermittent break? Try replacing it. Try using an 80-conductor (UDMA) cable, if you're not using one already. If the problem is only on cold boot, can you get a DC volt-meter on the power connector? If so, look for the voltages to "sag". That might tell you that you are taxing your PS. Or use the labels, do the math and calculate whether you are close to the max wattage in a worst-case scenario.
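For illustration only -- these numbers are assumptions, not readings from your box, so plug in whatever the labels on your drive, board, and PS actually say -- the back-of-the-envelope math looks something like:

    one IDE drive spinning up:  ~2.0 A on +12 V  = ~24 W peak
    CPU / board / fans:         ~6.0 A on +12 V  = ~72 W
    CD-ROM and misc:            ~1.5 A on +12 V  = ~18 W
    --------------------------------------------------------
    worst-case draw at power-on:        roughly 114 W on +12 V

If the PS label rates its +12 V line well below that kind of total, a sag at spin-up becomes a plausible suspect.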
I suggest using hdparm (*very* carefully) to see if the problem can be replicated on demand. Take the drive into various reduced-power modes and restart it and see if the problem is fairly consistent.
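Something along these lines, done at the console with nothing important running (device name assumed to be /dev/hda, as in your logs):

    hdparm -C /dev/hda        # report current power state
    hdparm -y /dev/hda        # spin the drive down into standby
    sleep 60
    dd if=/dev/hda of=/dev/null bs=512 count=1    # any read forces a spin-up
    tail /var/log/messages    # look for the dma_intr/BadCRC errors on wake-up

hdparm -Y (full sleep) is more aggressive still, but some drives need a reset to come back from it, so I'd stick with -y for testing.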
sfdisk -l:
Disk /dev/hda: 39870 cylinders, 16 heads, 63 sectors/track
Warning: The partition table looks like it was made for C/H/S=*/255/63 (instead of 39870/16/63).
For this listing I'll assume that geometry.
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/hda1   *      0+     12      13-    104391   83  Linux
/dev/hda2         13    2500    2488   19984860   8e  Linux LVM
/dev/hda3          0       -       0          0    0  Empty
/dev/hda4          0       -       0          0    0  Empty
Warning: start=63 - this looks like a partition rather than the entire disk.
Using fdisk on it is probably meaningless.
[Use the --force option if you really want this]
What does your BIOS show for this drive? It's likely here that the drive was labeled (or copied from a drive that was labeled) in another machine. The "key" for me is the "255" vs. "16". The only fix here (not important to do it though) is to get the drive properly labeled for this machine. B/u data, make sure BIOS is set correctly, fdisk (or sfdisk) it to get partitions correct.
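Before touching the label at all, dump the existing table somewhere safe; sfdisk can replay it if things go sideways. A sketch only (the file name is just an example):

    sfdisk -d /dev/hda > /root/hda-table.backup
    # and, if you ever need to put it back exactly as it was:
    # sfdisk /dev/hda < /root/hda-table.backup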
WARNING! Although this can be done "live", use sfdisk -l -uS to get starting sector numbers and make the partitions match. When you re-label at "255", some of the calculated translations internal to the drivers(?) might change (do things *still* translate to CHS on modern drives? I'll need to look into that some day. I bet not.). Also, the *desired* starting and ending sectors of the partitions are likely to change. What I'm saying is that the final partitioning will likely be "non-standard" in layout and lying in wait to bite your butt.
I would backup the data, change BIOS, sfdisk it (or fdisk or cfdisk, or any other partitioner, your choice). If system is hot, sfdisk -R will re-read the params and get them into the kernel. Then reload data (if needed). If it's "hot", single user, or run level 1, mounted "ro", of course. Careful reading of sfdisk can allow you to script and test (on another drive) parts of this.
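Roughly, the sequence I have in mind (a sketch, not a tested recipe -- do the read-only parts first, and rehearse on a scratch drive if you can):

    sfdisk -l -uS /dev/hda     # record exact start/size of each partition, in sectors
    # ... back up, fix the BIOS geometry, then rebuild the table so the
    # sector numbers match what you recorded above ...
    sfdisk -R /dev/hda         # make the running kernel re-read the new table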
I really want to try some of this, but not until I have a hot-ready standby HD to throw in if it gets hosed. I'm hosting some stuff and would like to be known for reliable 24x7 service.
Easy enough so far? >:-)
Yeah, piece of cake. Thanks for sharing your knowledge! I do need to play around with LVM more and get comfortable with it. LVM seems to be somewhere between Solaris metadevices and ZFS.
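For poking around without changing anything, I figure the read-only LVM commands are a safe place to start (the volume group name below is the stock CentOS default, VolGroup00 -- adjust to whatever mine actually turns out to be called):

    pvdisplay              # physical volumes, e.g. /dev/hda2
    vgdisplay VolGroup00   # the volume group built on it
    lvdisplay              # logical volumes (root, swap, etc.)
    pvs ; vgs ; lvs        # terse one-line summaries of the same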
sfdisk -lf
The "f" does you no good here, as you can see. It is really useful only when trying to change disk label. What would be useful (maybe) to you is "-uS".
<snip>
HTH
Bill