e2fsck with millions of files

Sean Carolan

31 Aug 2010

1:14 p.m.

I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Matthew Miller

31 Aug

1:19 p.m.

On Tue, Aug 31, 2010 at 08:14:23AM -0500, Sean Carolan wrote:

...

I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Yes -- use ext4. Otherwise, it's inevitable.

-- Matthew Miller mattdm@mattdm.org http://mattdm.org/

cliff here

1:26 p.m.

Yep, same answer here, I had RHEL4.8 on a 2.6 TB MSA, and you just leave it going over the weekend.

On Tue, Aug 31, 2010 at 9:19 AM, Matthew Miller mattdm@mattdm.org wrote:

...

On Tue, Aug 31, 2010 at 08:14:23AM -0500, Sean Carolan wrote:

...
I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Yes -- use ext4. Otherwise, it's inevitable.

-- Matthew Miller mattdm@mattdm.org http://mattdm.org/

Sean Carolan

1:45 p.m.

...

Yep, same answer here, I had RHEL4.8 on a 2.6 TB MSA, and you just leave it going over the weekend.

I kind of figured as much; we're letting ours run during the week so that hopefully the partition will be ready for weekend backup jobs. Thanks for the feedback.

Benjamin Franz

2:37 p.m.

On 08/31/2010 06:19 AM, Matthew Miller wrote:

...

On Tue, Aug 31, 2010 at 08:14:23AM -0500, Sean Carolan wrote:

...
I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Yes -- use ext4. Otherwise, it's inevitable.

To extend his comment: There is a bug in e2fsck for filesystems with many hardlinks. It could take *weeks* or longer, if it finishes at all, to run on a large filesystem with lots of hardlinks.

http://www.mail-archive.com/scientific-linux-users@listserv.fnal.gov/msg0218...

-- Benjamin Franz

Sean Carolan

4:36 p.m.

...

To extend his comment: There is a bug in e2fsck for filesystems with many hardlinks. It could take *weeks* or longer, if it finishes at all, to run on a large filesystem with lots of hardlinks.

http://www.mail-archive.com/scientific-linux-users@listserv.fnal.gov/msg0218...

Awesome. This happens to be our exact situation - this partition is used for BackupPC which heavily relies on hard links.

Ross Walker

1:44 p.m.

On Aug 31, 2010, at 9:14 AM, Sean Carolan scarolan@gmail.com wrote:

...

I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Disable fsck for that file system then google for online/delayed/background fsck script.

The script will take an LVM snapshot and fsck the snapshot and only if it finds a problem with the snapshot will it email the operator and set the file system to be fsck'ed on the next reboot. When it finishes it deletes the snapshot.

This will allow your system to come up right away. Performance will decline during this, but it's better than inaccessible.

-Ross
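The workflow Ross describes could be laid out roughly as follows. This is only a dry-run sketch that formats the commands into strings and executes nothing. The helper name `build_snapshot_fsck_plan`, the volume names, and the snapshot size are hypothetical, and this is not the script he refers to. Step 3 assumes mount-count-based checking is still enabled on the real filesystem, and emailing the operator is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PLAN_STEPS 4
#define CMD_LEN 256

/* True if snprintf wrote the whole command without truncation. */
static int fits(int n)
{
    return n >= 0 && n < CMD_LEN;
}

/* Hypothetical helper: fill `plan` with the commands a snapshot-based
 * check might run, in order. Nothing is executed. Returns 0 on success,
 * -1 on an empty name or a command that does not fit in CMD_LEN. */
static int build_snapshot_fsck_plan(const char *vg, const char *lv,
                                    const char *snap, const char *size,
                                    char plan[PLAN_STEPS][CMD_LEN])
{
    assert(vg && lv && snap && size && plan);
    if (!strlen(vg) || !strlen(lv) || !strlen(snap) || !strlen(size))
        return -1;

    /* 1. Snapshot the live volume so the check sees a frozen image. */
    if (!fits(snprintf(plan[0], CMD_LEN, "lvcreate -s -L %s -n %s /dev/%s/%s",
                       size, snap, vg, lv)))
        return -1;

    /* 2. Forced, read-only check of the snapshot (-n answers "no"). */
    if (!fits(snprintf(plan[1], CMD_LEN, "e2fsck -f -n /dev/%s/%s", vg, snap)))
        return -1;

    /* 3. Only if step 2 reports errors: raise the mount count on the real
     *    filesystem so it is checked at the next reboot (assumption: a
     *    maximum mount count is still configured for it). */
    if (!fits(snprintf(plan[2], CMD_LEN, "tune2fs -C 16000 /dev/%s/%s", vg, lv)))
        return -1;

    /* 4. Always: delete the snapshot when the check is done. */
    if (!fits(snprintf(plan[3], CMD_LEN, "lvremove -f /dev/%s/%s", vg, snap)))
        return -1;

    return 0;
}
```

A caller would print or run `plan[0]` and `plan[1]`, run `plan[2]` only when e2fsck exits with an error status, and run `plan[3]` in every case.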

m.roth@5-cent.us

2:27 p.m.

Sean Carolan wrote:

...

I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Kill it. And make sure it doesn't try to do it. There's a known bug with fsck (at least I think it was with CentOS, not *bleah* FC13). On large drives that we're doing online backups, it hits 70% and that's all she wrote: it never ends, and I need to kill it.

mark

Don Krause

5:44 p.m.

On Aug 31, 2010, at 7:27 AM, m.roth@5-cent.us wrote:

...

Sean Carolan wrote:

...
I have a large (1.5TB) partition with millions of files on it. e2fsck has been running nearly 12 hours and is still on "Checking directory structure". Any tips for speeding this along?

Kill it. And make sure it doesn't try to do it. There's a known bug with fsck (at least I think it was with CentOS, not *bleah* FC13). On large drives that we're doing online backups, it hits 70% and that's all she wrote: it never ends, and I need to kill it.
     mark

FWIW, after our backup server's 4 TB ext3 filesystem crashed, I found this gem somewhere on the internet, applied it, and was able to successfully fsck and repair the file system in less than a day.

This is in the source from e2fsprogs-1.39

e2fsprogs-1.39/lib/ext2fs

*** icount.c	2005-09-06 02:40:14.000000000 -0700
--- icount.c.new	2010-04-28 10:38:39.000000000 -0700
***************
*** 251,256 ****
--- 251,259 ----
  			range = ((float) (ino - lowval)) /
  				(highval - lowval);
  			mid = low + ((int) (range * (high-low)));
+ 			/* Trap mid due to floating point error */
+ 			if (mid > high) mid = high;
+ 			if (mid < low) mid = low;
  		}
  #endif
  		if (ino == icount->list[mid].ino) {

-- Don Krause Head Systems Geek, Waver of Deceased Chickens. Optivus Proton Therapy, Inc. P.O. Box 608 Loma Linda, California 92354 909.799.8327 Tel 909.799.8366 Fax dkrause@optivus.com www.optivus.com "This message represents the official view of the voices in my head."
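The bounds check in the patch above can be tried in isolation. The following is a minimal sketch, not the e2fsprogs function: `interpolate_mid` is a hypothetical helper that reuses the interpolation expression from the patch's context lines and then applies the same clamp. The preconditions it asserts are assumptions made for the sketch.

```c
#include <assert.h>

/* Hypothetical helper: guess the index of `ino` in a sorted table whose
 * entries at indexes `low` and `high` hold `lowval` and `highval`.
 * The guess is computed in single precision and then clamped to
 * [low, high], as the patch does. */
static int interpolate_mid(unsigned int ino, unsigned int lowval,
                           unsigned int highval, int low, int high)
{
    float range;
    int mid;

    assert(low <= high);
    if (low == high)
        return low;
    /* Assumed preconditions for this sketch. */
    assert(lowval < highval && lowval <= ino && ino <= highval);

    range = ((float) (ino - lowval)) / (highval - lowval);
    mid = low + ((int) (range * (high - low)));

    /* Trap mid due to floating point error */
    if (mid > high) mid = high;
    if (mid < low) mid = low;
    return mid;
}
```

With small tables the clamp never fires. Once `high - low` exceeds 2^24, that difference may not be exactly representable as a `float`, so the unclamped estimate can land one or more slots outside the table; the two added comparisons pull it back inside.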

Sean Carolan

7:40 p.m.

According to the release notes this bug has been fixed in version 1.40:

http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.40

E2fsprogs 1.40 (June 29, 2007)

There was a floating point precision error which could cause e2fsck to loop forever on really big filesystems with a large inode count. (Addresses Debian Bug: #411838)

What are the odds of this getting included in CentOS 5.6?
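One way a precision error of this kind can arise with large inode counts: a 32-bit `float` carries only 24 significant bits, so integers above 2^24 (16777216) are not all representable. A small sketch, assuming IEEE-754 single precision with the default round-to-nearest mode; `through_float` is a hypothetical helper, not part of e2fsprogs.

```c
#include <assert.h>

/* Convert n to a 32-bit float and back. `volatile` forces the value to
 * be stored as a real float instead of staying in a wider register. */
static unsigned int through_float(unsigned int n)
{
    volatile float f;

    /* Keep the conversion back to unsigned int well within range. */
    assert(n < (1u << 31));
    f = (float) n;
    return (unsigned int) f;
}
```

Under those assumptions, `through_float(16777216u)` returns its input unchanged, while `through_float(16777217u)` comes back as 16777216 and `through_float(16777219u)` as 16777220. An index computed through such a value can therefore be off by a slot or more.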

Laurent Wandrebeck

7:55 p.m.

On Tue, 31 Aug 2010 14:40:55 -0500 Sean Carolan scarolan@gmail.com wrote:

...

According to the release notes this bug has been fixed in version 1.40:

http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.40

E2fsprogs 1.40 (June 29, 2007)

There was a floating point precision error which could cause e2fsck to loop forever on really big filesystems with a large inode count. (Addresses Debian Bug: #411838)

What are the odds of this getting included in CentOS 5.6?

IMHO, quite high if you open the bug on RH bugzilla, with the patch. Check first if such a thing hasn't already been opened. Regards, Laurent.

Matt

8:36 p.m.

...

According to the release notes this bug has been fixed in version 1.40:

http://e2fsprogs.sourceforge.net/e2fsprogs-release.html#1.40

E2fsprogs 1.40 (June 29, 2007)

There was a floating point precision error which could cause e2fsck to loop forever on really big filesystems with a large inode count. (Addresses Debian Bug: #411838)

What are the odds of this getting included in CentOS 5.6?

I am guessing this bug is still present in CentOS 4.8?

[root@server ~]# uname -a
Linux server.XXXXXXXXXXXXXXXXXX-.net 2.6.9-78.0.13.ELsmp #1 SMP Wed Jan 14 16:12:46 EST 2009 i686 i686 i386 GNU/Linux
[root@server ~]# cat /etc/redhat-release
CentOS release 4.8 (Final)
[root@server ~]# e2fsck -V
e2fsck 1.35 (28-Feb-2004)
        Using EXT2FS Library version 1.35, 28-Feb-2004

How slow would this be with ~500K files?

Matt
