Dear All,
I tried a USB2 Maxtor OneTouch II external hard disk on a couple of my CentOS 4.2 boxes and found it initialised the SCSI subsystem OK and added device "sda". But the performance is miserable, whereas the same hardware running XP performs satisfactorily.
hdparm gives results varying from 120K/sec up to a peak of 4.75M/sec on a USB 2 machine -- still very poor by any stretch.
On a twin-CPU USB 1 machine it gives a steady 1M/sec, which is consistently slow; better than erratically slow ( :-) ).
Still dog slow. Has anyone else seen this, and would the FireWire interface be any better? (I need to get a cable to try this.)
P.
Peter Farrow wrote:
Dear All,
I tried a USB2 Maxtor OneTouch II external hard disk on a couple of my CentOS 4.2 boxes and found it initialised the SCSI subsystem OK and added device "sda". But the performance is miserable, whereas the same hardware running XP performs satisfactorily.
hdparm gives results varying from 120K/sec up to a peak of 4.75M/sec on a USB 2 machine -- still very poor by any stretch.
On a twin-CPU USB 1 machine it gives a steady 1M/sec, which is consistently slow; better than erratically slow ( :-) ).
Still dog slow. Has anyone else seen this, and would the FireWire interface be any better? (I need to get a cable to try this.)
I back up weekly to a Maxtor OneTouch (original) USB2-connected hard drive. This happens while I sleep, but it looks like last Wednesday morning it took 36 minutes to copy 19GB.
[root@mavis ~]# du -hs /media/OTOT/2005-11-30
19G     /media/OTOT/2005-11-30
[root@mavis ~]#

[rj@mavis ~]$ cat backup_progress_2005-11-30
Wed Nov 30 02:02:07 CST 2005 Removing /media/OTOT/2005-11-09
Backup to /media/OTOT/2005-11-30 Started at Wed Nov 30 02:06:13 CST 2005
Wed Nov 30 02:06:14 CST 2005 Completed: /bin
Wed Nov 30 02:06:15 CST 2005 Completed: /boot
<snip>
Wed Nov 30 02:42:14 CST 2005 Completed: /var
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb2            152206916 132579164  11896072  92% /media/OTOT
/dev/sdb2 successfully unmounted from /media/OTOT
All Finished at Wed Nov 30 02:42:14 CST 2005
Looks like that averages out to about 8.8MB/sec. I'm running an Athlon 2600+, 2GHz, 512MB on an ASUS A7N8X. I hope this helps.
Robert wrote:
Peter Farrow wrote:
Dear All,
I tried a USB2 Maxtor OneTouch II external hard disk on a couple of my CentOS 4.2 boxes and found it initialised the SCSI subsystem OK and added device "sda". But the performance is miserable, whereas the same hardware running XP performs satisfactorily.
hdparm gives results varying from 120K/sec up to a peak of 4.75M/sec on a USB 2 machine -- still very poor by any stretch.
On a twin-CPU USB 1 machine it gives a steady 1M/sec, which is consistently slow; better than erratically slow ( :-) ).
Still dog slow. Has anyone else seen this, and would the FireWire interface be any better? (I need to get a cable to try this.)
I back up weekly to a Maxtor OneTouch (original) USB2-connected hard drive. This happens while I sleep, but it looks like last Wednesday morning it took 36 minutes to copy 19GB.
[root@mavis ~]# du -hs /media/OTOT/2005-11-30
19G     /media/OTOT/2005-11-30
[root@mavis ~]#

[rj@mavis ~]$ cat backup_progress_2005-11-30
Wed Nov 30 02:02:07 CST 2005 Removing /media/OTOT/2005-11-09
Backup to /media/OTOT/2005-11-30 Started at Wed Nov 30 02:06:13 CST 2005
Wed Nov 30 02:06:14 CST 2005 Completed: /bin
Wed Nov 30 02:06:15 CST 2005 Completed: /boot
<snip>
Wed Nov 30 02:42:14 CST 2005 Completed: /var
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb2            152206916 132579164  11896072  92% /media/OTOT
/dev/sdb2 successfully unmounted from /media/OTOT
All Finished at Wed Nov 30 02:42:14 CST 2005
Looks like that averages out to about 8.8MB/sec. I'm running an Athlon 2600+, 2GHz, 512MB on an ASUS A7N8X. I hope this helps.
No, not at all. What's the setup? How is the USB configured? How's the drive configured, formatted ext3, vfat...
Syv Ritch wrote:
Robert wrote:
Peter Farrow wrote:
Dear All,
I tried a USB2 Maxtor OneTouch II external hard disk on a couple of my CentOS 4.2 boxes and found it initialised the SCSI subsystem OK and added device "sda". But the performance is miserable, whereas the same hardware running XP performs satisfactorily.
hdparm gives results varying from 120K/sec up to a peak of 4.75M/sec on a USB 2 machine -- still very poor by any stretch.
On a twin-CPU USB 1 machine it gives a steady 1M/sec, which is consistently slow; better than erratically slow ( :-) ).
Still dog slow. Has anyone else seen this, and would the FireWire interface be any better? (I need to get a cable to try this.)
I back up weekly to a Maxtor OneTouch (original) USB2-connected hard drive. This happens while I sleep, but it looks like last Wednesday morning it took 36 minutes to copy 19GB.
[root@mavis ~]# du -hs /media/OTOT/2005-11-30
19G     /media/OTOT/2005-11-30
[root@mavis ~]#

[rj@mavis ~]$ cat backup_progress_2005-11-30
Wed Nov 30 02:02:07 CST 2005 Removing /media/OTOT/2005-11-09
Backup to /media/OTOT/2005-11-30 Started at Wed Nov 30 02:06:13 CST 2005
Wed Nov 30 02:06:14 CST 2005 Completed: /bin
Wed Nov 30 02:06:15 CST 2005 Completed: /boot
<snip>
Wed Nov 30 02:42:14 CST 2005 Completed: /var
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb2            152206916 132579164  11896072  92% /media/OTOT
/dev/sdb2 successfully unmounted from /media/OTOT
All Finished at Wed Nov 30 02:42:14 CST 2005

Looks like that averages out to about 8.8MB/sec. I'm running an Athlon 2600+, 2GHz, 512MB on an ASUS A7N8X. I hope this helps.
No, not at all. What's the setup? How is the USB configured? How's the drive configured, formatted ext3, vfat...
Drive (0d49:7010) --> hub (05e3:0605) --> motherboard rear jack.
[root@mavis log]# uname -r
2.6.9-22.0.1.EL
[root@mavis log]# lsusb
Bus 003 Device 001: ID 0000:0000
Bus 002 Device 003: ID 0764:0005 Cyber Power System, Inc. Cyber Power UPS
Bus 002 Device 002: ID 045e:0040 Microsoft Corp. Wheel Mouse Optical
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 011: ID 0d49:7010 Maxtor
Bus 001 Device 007: ID 03f0:3404 Hewlett-Packard DeskJet 6122
Bus 001 Device 006: ID 1267:0103 Logic3 / SpectraVideo plc
Bus 001 Device 005: ID 05e3:0605 Genesys Logic, Inc.
Bus 001 Device 004: ID 0409:0059 NEC Corp. HighSpeed Hub
Bus 001 Device 001: ID 0000:0000
[root@mavis log]# fdisk -l /dev/sda

Disk /dev/sda: 163.9 GB, 163927556096 bytes
255 heads, 63 sectors/track, 19929 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         678     5446003+   c  W95 FAT32 (LBA)
/dev/sda2             679       19929   154633657+  83  Linux
[root@mavis log]#
for dir in bin boot etc home initrd lib lost+found misc opt root \
           sbin selinux srv tftpboot \
           usr var ; do
    find /$dir -depth -print0 | cpio --null -pmd $UD/$DT
    echo `date` Completed: /$dir >>$PF
done
Robert kerplop@sbcglobal.net wrote:
for dir in bin boot etc home initrd lib lost+found misc opt root \
           sbin selinux srv tftpboot \
           usr var ; do
    find /$dir -depth -print0 | cpio --null -pmd $UD/$DT
    echo `date` Completed: /$dir >>$PF
done
That's not a good benchmark. You're adding the overhead of inode/tree traversal and all sorts of other factors. You're easily cutting performance by 2-3x.
Since USB 2.0 EHCI is capable of only 60MBps (480Mbps) theoretical, and Intel openly admits that only 30MBps is realistic, 8.8MBps is not unreasonable for this command.
Try a "raw" dd from /dev/zero to a file that is at least 2x your memory: dd if=/dev/zero of=(some file) bs=8M count=1000
Or consider a bonnie benchmark.
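As a concrete illustration (the mount point /mnt/usbdisk and file name are hypothetical -- substitute wherever the drive is mounted), a raw sequential-write test might look something like this, timing the final sync so cached writes are counted:

    sync
    time sh -c 'dd if=/dev/zero of=/mnt/usbdisk/testfile bs=8M count=1000; sync'
    rm /mnt/usbdisk/testfile
    # for reads, hdparm's buffered-read test gives a comparable figure
    hdparm -t /dev/sda

Dividing the roughly 8000MB written by the elapsed time gives the sustained write rate in MB/sec.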
Bryan J. Smith wrote:
Robert kerplop@sbcglobal.net wrote:
for dir in bin boot etc home initrd lib lost+found misc opt root \ sbin selinux srv tftpboot \ usr var ; do find /$dir -depth -print0 | cpio --null -pmd $UD/$DT echo `date` Completed: /$dir >>$PF done
That's not a good benchmark. You're adding the overhead of inode/tree traversal and all sorts of other factors. You're easily cutting performance by 2-3x over.
Since USB 2.0 EHCI is capable of only 60MBps (480Mbps) theoretical, and Intel openly admits that only 30MBps is realistic, 8.8MBps is not unreasonable for this command.
Try a "raw" dd from /dev/zero, that is at least 2x your memory: dd if=/dev/zero of=(some file) bs=8M count=1000
Or consider a bonnie benchmark.
I wasn't complaining, Bryan, simply responding. Actually, having moved from a DAT-2 drive to the USB-connected disk, I'm happy as a pig in sh*t to be able to backup the whole thing unattended and have a reasonable expectation that the resulting wad of crud is good! (I'm still gonna burn /boot, /root, /home and /etc to DVD once a month, though.) Have a great day!
Robert kerplop@sbcglobal.net wrote:
I wasn't complaining, Bryan, simply responding.
I never assumed you were complaining. I merely pointed out that a find|cpio command isn't a good command to use for benchmarking, and 8.8MBps wasn't unreasonable for it given the additional head seeks of the command and bottlenecks of the USB interface. That's all.
Actually, having moved from a DAT-2 drive
DDS-2 drives are rather slow, about 0.8MBps natively -- some 15 years old -- and only hold 4GB natively anyway. You'd be better off with a DVD-R drive today. Even 1x DVD-R is 1.35MBps. I typically get 8x DVD-R for around 10MBps native.
Now if you're talking a modern tape backup like LTO-3, then it's hard to beat. It's literally 150x faster than old DDS-2, and can stream at rates like a 2 or even 4 disk RAID-0 set. DLT and LTO are fairly reliable, although not the ultimate (that goes to IBM's proprietary tape technology).
to the USB-connected disk, I'm happy as a pig in sh*t to be able to backup the whole thing unattended and have a reasonable expectation that the resulting wad of crud is
good!
The problem with tape backup is that it's not used appropriately. You should never backup directly to tape. You should at least buffer to disk. But in reality, why does everything have to go to tape in the first place? That's the common issue. Stuff like daily backups really don't, and even weekly backups are not always required.
At the same time, as much as fixed disk might have an advantage in near-guaranteed backups, it does present an issue when it comes to long-term retention. The disaster recovery schemes I've seen fail are the ones that were either all tape-only or all disk-only -- the combination is most effective, with disk as the immediate and primary, and tape for off-lining longer-term.
Hence the new "killer app" of Virtual Tape Libraries. http://www.samag.com/documents/sam0509a/
(I'm still gonna burn /boot, /root, /home and /etc to DVD
once
a month, though.)
Yes, an effective disaster recovery ensures you can bring up the system ASAP. Having a bootable DVD is critical, and typically why I make my root (/) partition 8GiB or less (so it fits on a 4.35GiB DVD-R when compressed).
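As a rough sketch of that monthly DVD step (paths and the burner device are hypothetical, and dvd+rw-tools' growisofs is assumed to be available), a compressed tarball of the root filesystem can be burned as a data disc:

    # archive only the root filesystem itself, skipping /proc, /sys and other mounts
    tar czf /tmp/root-backup.tar.gz --one-file-system /
    # burn the archive to a blank DVD-R
    growisofs -Z /dev/dvd -R -J /tmp/root-backup.tar.gz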
On Mon, 2005-12-05 at 12:18, Bryan J. Smith wrote:
The problem with tape backup is that it's not used appropriately. You should never backup directly to tape. You should at least buffer to disk. But in reality, why does everything have to go to tape in the first place? That's the common issue. Stuff like daily backups really don't, and even weekly backups are not always required.
My favorite for online disk backups: http://backuppc.sourceforge.net/ It uses compression and linking of duplicate files to allow holding much more data than you'd expect so you can keep some history around.
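The space savings come from that linking: identical file contents are stored once and hardlinked from every backup that contains them. In miniature (file names hypothetical):

    cp /etc/hosts copy-from-monday
    ln copy-from-monday copy-from-tuesday   # hardlink: same inode, no extra data blocks
    ls -li copy-from-*                      # both names show the same inode number
    du -sh .                                # du counts the shared content only once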
At the same time, as much as fixed disk might have an advantage in near-guaranteed backups, it does present an issue when it comes to long-term retention. The disaster recovery schemes I've seen fail are the ones that were either all tape-only or all disk-only -- the combination is most effective, with disk as the immediate and primary, and tape for off-lining longer-term.
Hence the new "killer app" of Virtual Tape Libraries. http://www.samag.com/documents/sam0509a/
The free version is to image-copy the backuppc archive filesystem to an external drive. It's cheaper than a fast tape drive, and you can plug the external disk into a laptop anywhere for immediate remote restores without needing another tape drive. The downside has been FireWire support on Linux the way I'm trying to do it, which is to periodically sync a RAID1 mirror and break it for offsite storage. I had been hoping to leave the RAID live except for a weekly swap -- and in fact this worked with the 2.4 kernel, but there I had to do some contortions with modprobe to notice the new drive, and once in a while it would take a reboot. With 2.6-based kernels the hotswap is usually noticed correctly, but with some versions leaving the RAID active causes a crash, and with others the drive is kicked out in a few hours. It does work just well enough to complete a sync when backups aren't running, so I'm putting up with the problems for now.
I suppose I could unmount the internal disk and use dd to copy for the same effect.
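For reference, the sync-and-break cycle described above might look roughly like this with md software RAID (the array and partition names are hypothetical):

    mdadm /dev/md0 --add /dev/sdc1      # attach the external disk; the kernel resyncs it
    cat /proc/mdstat                    # watch until the resync completes
    mdadm /dev/md0 --fail /dev/sdc1     # then break the mirror...
    mdadm /dev/md0 --remove /dev/sdc1   # ...and the disk can go offsite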
Les Mikesell lesmikesell@gmail.com wrote:
My favorite for online disk backups: http://backuppc.sourceforge.net/ It uses compression and linking of duplicate files to allow holding much more data than you'd expect so you can keep some history around.
Yes, I use it myself.
The free version is to image-copy the backuppc archive filesystem to an external drive. It's cheaper than a fast tape drive and you can plug the external disk into a laptop anywhere for immediate remote restores without needing another tape drive.
How many times do I have to say it, I _never_ backup directly to tape. I _always_ backup to disk. ;-> I then do _not_ commit every backup to tape, only a subset. Anyone who backs up everything -- let alone everything directly to tape -- is living in the '80s.
There is nothing more chronically stupid than direct backup to tape -- especially over a network. I don't know how many times I go into companies and they are still trying to do end-node to tape-server backups in an 8 hour window and just can't do it. I always just hit my head and say, "why don't you sync your filesystem changes to a centralized server, and then backup from that system locally to the tape -- which you can do in the middle of the day?!"
It's like an epiphany for them -- so obvious they missed it.
You should at least buffer to disk first, if not just store backups whole on disk. Then and only then should any long-term backups, retention or off-site considerations be put to tape -- locally on the same system. But not everything has to go to tape -- in fact, a _minority_, only those things that are going off-site or need to be stored longer than a few months. At the same time, don't trust backups for months on a fixed disk.
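A bare-bones sketch of that staged approach (host names, paths and the tape device are hypothetical):

    # during the day: pull only the changed files from each client onto the backup server
    rsync -a --delete client1:/home/ /backup/staging/client1/home/
    # any time afterwards: stream the local staging copy to tape, with no network involved
    tar -cf /dev/st0 -C /backup/staging client1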
The downside has been firewire support on Linux the way I'm trying to do it, which is to periodically sync a RAID1
mirror
and break it for offsite storage.
I'd rather not backup in a filesystem format. I'd rather backup to a streaming archive format which is far more recoverable when there are errors. That way errors are localized and don't affect the rest of the backup. Filesystems rely on meta-data to recover data -- streaming archive formats do not.
As far as off-line storage goes, I do not trust commodity fixed disk for more than 3 months on the shelf (if even that long). Those drives are not designed to sit inactive that long after periodic activity. The vendors have come up with removable rigid disk (RRD) to address the longevity issues, but then RRD gives up the performance and cost advantages of fixed disk.
I suppose I could unmount the internal disk and use dd to copy for the same effect.
Why not a streaming backup format like afio or ustar? Far more recoverable and errors are localized than with a traditional filesystem.
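For example, with GNU tar a ustar-format stream can be written to the external disk instead of copying the filesystem tree (paths hypothetical):

    tar --format=ustar -cf /media/external/backup-$(date +%Y-%m-%d).tar /backup/staging
    # a damaged block loses only the files stored in it; the rest of the stream stays readable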
On Mon, 2005-12-05 at 13:27, Bryan J. Smith wrote:
My favorite for online disk backups: http://backuppc.sourceforge.net/
There is nothing more chronically stupid than direct backup to tape -- especially over a network. I don't know how many times I go into companies and they are still trying to do end-node to tape-server backups in an 8 hour window and just can't do it. I always just hit my head and say, "why don't you sync your filesystem changes to a centralized server, and then backup from that system locally to the tape -- which you can do in the middle of the day?!"
It's like an epiphany for them -- so obvious they missed it.
You should at least buffer to disk first, if not just store backups whole on disk.
Amanda has done that for at least 15 years now. It's clunky to restore, but they got the backup side right from the beginning. You can tell it how much bandwidth to use on your network(s) and it will stream that much into the holding disk simultaneously from many different hosts, writing to tape in sequence as they are completely received.
Since it is almost full-auto - I still let amanda run tapes to be held offsite but I really don't ever want to use them except as a last resort, hence the offsite backuppc disk.
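The holding-disk behaviour is set in amanda.conf; a minimal sketch might look like the following (directory, sizes and bandwidth figure are hypothetical, and the exact syntax varies a little between Amanda versions):

    netusage 8000 Kbps               # total network bandwidth Amanda may use

    holdingdisk hd1 {
        directory "/dumps/amanda"    # where dumps are buffered before going to tape
        use       100 Gb             # how much of that filesystem Amanda may fill
        chunksize 1 Gb
    }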
The downside has been firewire support on Linux the way I'm trying to do it, which is to periodically sync a RAID1
mirror
and break it for offsite storage.
I'd rather not backup in a filesystem format. I'd rather backup to a streaming archive format which is far more recoverable when there are errors. That way errors are localized and don't affect the rest of the backup. Filesystems rely on meta-data to recover data -- streaming archive formats do not.
My backuppc archive probably has at least a million hardlinks, and conventional copy mechanisms take longer than practical. It's about 2 hours to sync the 250 gig partition to its RAID mirror. About 120 gigs are used, but it would take 700 or more gigs to hold it any other way, and equivalent time to copy. I do cycle through several of the offsite drives so a single disk failure won't wipe out my whole history.
As far as off-line storage, I do not trust commodity fixed disk more than 3 months on the shelf (if even that long). They are not designed to be stored that long and inactive after periodic activity. The vendors have come up with removable rigid disk (RRD) to address the longevity issues, but then RRD loses the performance or cost of fixed disk.
I suppose I could unmount the internal disk and use dd to copy for the same effect.
Why not a streaming backup format like afio or ustar? Far more recoverable and errors are localized than with a traditional filesystem.
Since the raid sync re-writes the whole disk, I don't accumulate errors on the removable drive like I might with one that is periodically connected for normal additions. In practice I've had more trouble with the on-line IDE side of the mirror and had to run full-day fscks or even sync the external copy back to a new drive after one failed.
Les Mikesell lesmikesell@gmail.com wrote:
Amanda has done that for at least 15 years now. It's clunky to restore, but they got the backup side right from the beginning. You can tell it how much bandwidth to use on your network(s) and it will stream that much into the holding disk simultaneously from many different hosts, writing to tape in sequence as they are completely received.
Buffering is better than what I normally see.
But ultimately, the disk-to-disk sync, multiple volume storage, and then "export to" (and "import from") tape functionality is what today's VTL offers.
Since it is almost full-auto - I still let amanda run tapes to be held offsite but I really don't ever want to use them except as a last resort,
But what about having Amanda not commit things to tape, and retrieve from the disk backup? Not quite, eh? ;->
hence the offsite backuppc disk.
But that's an entirely different solution. Wouldn't it be nice if the solution catered to disk-to-disk, but also let you export/import to/from tape for select backups? ;->
BTW, I've just had a lot of clients send their disks out for data recovery. So I can't condone off-line disk.
Now, if you take the disk off-site and put it in another system, that's different. As long as it is getting periodically exercised, that is good.
My backuppc archive probably has at least a million hardlinks and conventional copy mechanisms take longer than practical.
Hardlinks _can_ be stored in a stream archive format. ;->
Again, I think this is more about the lack of a good, unified open source system of disk-to-disk backup with tape export/import. Too many systems are either disk-only or tape-only (with only disk as buffer in the best case, not multi-volume/multi-backup management).
On Mon, 5 Dec 2005 at 2:04pm, Bryan J. Smith wrote
Again, I think this is more about the lack of a good, unified open source system of disk-to-disk backup with tape export/import. Too many systems are either disk-only or tape-only (with only disk as buffer in the best case, not multi-volume/multi-backup management).
Amanda can easily be run in a backup-to-disk-only mode. And it'd be trivial to manually tape some of those images for off-site storage. Doing that within amanda (i.e. backup-to-tape while also leaving images on disk) is a feature folks have talked about.
Joshua Baker-LePain jlb17@duke.edu wrote:
Amanda can easily be run in a backup-to-disk-only mode. And it'd be trivial to manually tape some of those images for off-site storage. Doing that within amanda (i.e. backup-to-tape while also leaving images on disk) is a feature folks have talked about.
Excellent!
The question is how seamless it is. Can they stream out their images into a tape archive? It should be possible.
What about importing back a stream from a tape archive into a disk image? That also should be possible.
The ability to do verification and restores -- the two most overlooked details of off-site backup -- is what makes VTL systems/approaches most lauded.
On Mon, 2005-12-05 at 16:11, Bryan J. Smith wrote:
Joshua Baker-LePain jlb17@duke.edu wrote:
Amanda can easily be run in a backup-to-disk-only mode. And it'd be trivial to manually tape some of those images for off-site storage. Doing that within amanda (i.e. backup-to-tape while also leaving images on disk) is a feature folks have talked about.
Excellent!
The question is how seamless it is. Can they stream out their images into a tape archive? It should be possible.
The tricky part is that amanda mixes up the filesystems on tape in no particular order and keeps an index online to tell you which tape(s) to insert when restoring. When it flushes the disk copy out it adjusts the index for the new location so without a patch it won't use the disk copy even if you saved one. There is a tool for rebuilding the index from the tape if necessary and you can figure it out by hand as a last resort but I don't think there is one to look at the holding disk again.
What about importing back a stream from a tape archive into a disk image? That also should be possible.
The ability to make verification and restores -- the two most overlooked details of off-site backup -- are what make VTL systems/approaches most lauded.
Amanda uses dump or tar for the backups and adds one extra block as a header to each backup. You can strip off the header with dd and restore with dump or tar alone. It was a nice system 15 years ago but hasn't had a lot of development since. It has one big flaw in that it can't split a single filesystem backup across more than one tape, even though it can do many hosts/filesystems in one run, splitting different backups within the run over different tapes. With today's big disks that's probably fatal.
On Mon, 5 Dec 2005 at 5:49pm, Les Mikesell wrote
Amanda uses dump or tar for the backups and adds one extra block as a header to each backup. You can strip off the header with dd and restore with dump or tar alone. It was a nice system 15 years ago but hasn't had a lot of development since. It has one big flaw in that it can't split a single filesystem backup across more than one tape, even though it can do many hosts/filesystems in one run, splitting different backups within the run over different tapes. With today's big disks that's probably fatal.
I hardly consider it fatal. I backup 5.5TB of space (4 FSs) to an AIT3 changer without a problem. Also, tape spanning is included in the 2.5 branch, which is officially in beta now.
On Tue, 2005-12-06 at 10:44, Joshua Baker-LePain wrote:
Amanda uses dump or tar for the backups and adds one extra block as a header to each backup. You can strip off the header with dd and restore with dump or tar alone. It was a nice system 15 years ago but hasn't had a lot of development since. It has one big flaw in that it can't split a single filesytem backup across more than one tape even though it can do many hosts/filesystems in one run splitting different backups within the run over different tapes. With todays big disks that's probably fatal.
I hardly consider it fatal. I backup 5.5TB of space (4 FSs) to an AIT3 changer without a problem. Also, tape spanning is included in the 2.5 branch, which is officially in beta now.
The really brilliant part of Amanda is the way it schedules the mix of full and incremental runs each night to fill a tape of a given size. This works nicely when you have a large number of small filesystems (relative to the tape size), but it falls down badly when a single full dump nearly fills the tape: it will start streaming some small runs that finish first, then the big one won't fit, and even during amflushes it doesn't know enough to put the big run on the tape first, so you end up with the small ones that could have been grouped into a later amflush. The tape-spanning change may help a lot with this.
On Tue, Dec 06, 2005 at 12:07:42PM -0600, Les Mikesell enlightened us:
Amanda uses dump or tar for the backups and adds one extra block as a header to each backup. You can strip off the header with dd and restore with dump or tar alone. It was a nice system 15 years ago but hasn't had a lot of development since. It has one big flaw in that it can't split a single filesytem backup across more than one tape even though it can do many hosts/filesystems in one run splitting different backups within the run over different tapes. With todays big disks that's probably fatal.
I hardly consider it fatal. I backup 5.5TB of space (4 FSs) to an AIT3 changer without a problem. Also, tape spanning is included in the 2.5 branch, which is officially in beta now.
The really brilliant part of amanda is the way it schedules the mix of full and incremental runs each night to fill a tape of a given size. This works nicely when you have a large number of small filesystems (relative to the tape size) but falls down badly when a single full nearly fills the tape because it will start streaming some small runs that are finished first, then the big one won't fit, and even during amflushes it doesn't know enough to put the big run on the tape first so you end up with the small ones that can be grouped in a later amflush. The tape spanning change may help a lot with this.
You can also adjust the dumporder parameter to coerce Amanda into doing the biggest dumps first, or the dumps that take longest to complete, etc.
On Tue, 2005-12-06 at 16:39, Matt Hyclak wrote:
The really brilliant part of amanda is the way it schedules the mix of full and incremental runs each night to fill a tape of a given size. This works nicely when you have a large number of small filesystems (relative to the tape size) but falls down badly when a single full nearly fills the tape because it will start streaming some small runs that are finished first, then the big one won't fit, and even during amflushes it doesn't know enough to put the big run on the tape first so you end up with the small ones that can be grouped in a later amflush. The tape spanning change may help a lot with this.
You can also adjust the dumporder parameter to coerce amanda into doing all the biggest dumps first, or the longest dumps to complete, etc. etc.
But if you dump to holding disk, dumporder controls starting the runs to disk. Hmmm... Maybe 'taperalgo biggestfit' is what I want, but I don't see how that can work until the bigger dumps are finished, and the small ones are going to finish first. At least it should help with flushes. I guess I should read the docs every decade or so. I don't think those options were there the last time I looked.
On Tue, Dec 06, 2005 at 05:01:28PM -0600, Les Mikesell enlightened us:
The really brilliant part of amanda is the way it schedules the mix of full and incremental runs each night to fill a tape of a given size. This works nicely when you have a large number of small filesystems (relative to the tape size) but falls down badly when a single full nearly fills the tape because it will start streaming some small runs that are finished first, then the big one won't fit, and even during amflushes it doesn't know enough to put the big run on the tape first so you end up with the small ones that can be grouped in a later amflush. The tape spanning change may help a lot with this.
You can also adjust the dumporder parameter to coerce amanda into doing all the biggest dumps first, or the longest dumps to complete, etc. etc.
But if you dump to holding disk, dumporder controls starting the runs to disk. Hmmm... Maybe 'taperalgo biggestfit' is what I want, but I don't see how that can work until the bigger dumps are finished, and the small ones are going to finish first. At least it should help with flushes. I guess I should read the docs every decade or so. I don't think those options were there the last time I looked.
Right, that's what I was talking about. If you set dumporder to some number of T's larger than "inparallel", then you will have lots of big dumps start and no little ones can start before them.
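In amanda.conf terms, that is roughly the following (values hypothetical):

    inparallel 4        # number of dumpers that run at once
    dumporder "TTTT"    # one letter per dumper; 'T' favours the biggest/longest dumps first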
Matt
Hello Members,
While extracting a zip file using Win XP Pro, something happened to Windows and it just keeps rebooting. I can't get to safe mode because it reboots after getting through some of the sys drivers/files. This is a dual-boot machine with CentOS 4. I may try to use the install disk to "repair" Windows. If that does not work, how can I recover files from the Windows partition while booted in CentOS? Can I tell CentOS about another drive and simply mount it?
TIA, David Evennou
On 12/7/05, David Evennou de@data-masters.com wrote:
Hello Members,
While extracting a zip file using Win XP Pro, something happened to Windows and it just keeps rebooting. I can't get to safe mode because it reboots after getting through some of the sys drivers/files. This is a dual-boot machine with CentOS 4. I may try to use the install disk to "repair" Windows. If that does not work, how can I recover files from the Windows partition while booted in CentOS? Can I tell CentOS about another drive and simply mount it?
TIA, David Evennou
Depends on the format of the Windows partition. Much as I like to preach about the benefits of CentOS, this is usually a task I keep a current Knoppix disk handy for. Knoppix will have NTFS support, whereas the CentOS install/rescue disk does not. There are several variations of Knoppix running around designed specifically for file/data recovery. Basic recommendation is "don't use a screwdriver as a hammer".
-- Jim Perrin System Architect - UIT Ft Gordon & US Army Signal Center
On 12/7/05, David Evennou de@data-masters.com wrote:
Hello Members,
While extracting a zip file using Win XP Pro, something happened to Windows and it just keeps rebooting. I can't get to safe mode because it reboots after getting through some of the sys drivers/files. This is a dual-boot machine with CentOS 4. I may try to use the install disk to "repair" Windows. If that does not work, how can I recover files from the Windows partition while booted in CentOS? Can I tell CentOS about another drive and simply mount it?
Generally, yes. If that's an NTFS partition, you'll need to install the NTFS driver you can find at linux-ntfs.sourceforge.net. Then you just mount:
mount -t ntfs /dev/hda2 /mnt/windows
(for example--your device is probably different, and the directory can be whatever you want).
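For recovery it may also be worth mounting read-only, e.g. (device name hypothetical):

    mkdir -p /mnt/windows
    mount -t ntfs -o ro /dev/hda1 /mnt/windows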
Knoppix, as suggested in another post, is another excellent idea.
On Mon, 5 Dec 2005 at 2:11pm, Bryan J. Smith wrote
Joshua Baker-LePain jlb17@duke.edu wrote:
Amanda can easily be run in a backup-to-disk-only mode. And it'd be trivial to manually tape some of those images for off-site storage. Doing that within amanda (i.e. backup-to-tape while also leaving images on disk) is a feature folks have talked about.
Excellent!
The question is how seamless it is. Can they stream out their images into a tape archive? It should be possible.
What about importing back a stream from a tape archive into a disk image? That also should be possible.
Again, replicating the on-disk backups onto tape would, at this point, be a manual process. Code contributions are welcome. ;)
On Mon, 2005-12-05 at 16:04, Bryan J. Smith wrote:
Since it is almost full-auto - I still let amanda run tapes to be held offsite but I really don't ever want to use them except as a last resort,
But what about having Amanda not commit things to tape, and retrieve from the disk backup? Not quite, eh? ;->
Yes, amanda can recover from the holding disk copy. Just don't change the tape and it happens that way by itself until you run out of holding space.
hence the offsite backuppc disk.
But not that's an entirely different solution. Wouldn't it be nice if the solution was catered to disk-to-disk, but also let you export/import to/from tape for select backups? ;->
BackupPC does allow archiving the latest backup of a host to tape, or to a tar image compressed and split into specified-size chunks like you might write to ISOs or DVDs, but it is sort of an afterthought and not automatic. You could easily script the command-line tool to do the same, though. But the other thing Amanda does is automatically mix the full/incremental runs among a bunch of machines to make them fit on your tape every night. I don't think anything else has that feature.
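The chunking itself is essentially what split gives you; as a rough illustration (sizes and paths hypothetical):

    # a compressed tar stream cut into DVD-sized pieces
    tar czf - /backup/host1 | split -b 4300m - host1-archive.tar.gz.part-
    # to restore, concatenate the pieces back into one stream
    cat host1-archive.tar.gz.part-* | tar xzf -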
BTW, I've just had a lot of clients send their disks out for data recovery. So I can't condone off-line disk.
Now if you take the disk off-site and put it in another system, that'd different. As long as it is getting periodically exercised, that is good.
I'm not doing archival storage, just a rotating copy of the backuppc archive where the current version has a week's worth of daily runs for most of the boxes. I rotate once a week through a 4-drive cycle, so I could restore to any day in the last month with a bit of work to retrieve the right disk. In practice I've only had to go back a few days for things someone erased accidentally and didn't notice right away -- but one nightly snapshot that overwrites the previous wouldn't have been enough.
My backuppc archive probably has at least a million hardlinks and conventional copy mechanisms take longer than practical.
Hardlinks _can_ be stored in a stream archive format. ;->
But they can't be restored in a reasonable amount of time with any tool that I've found. I've let a tar | tar and a cp -a run for at least 3 days and they weren't done even with a disk to disk copy. I assume this is due to having to search a huge table for matching inodes on every file to restore the links. I wouldn't want to wait for this to happen before being able to start the real restores out of that archive.
Again, I think this is more about the lack of a good, unified open source system of disk-to-disk backup with tape export/import. Too many systems are either disk-only or tape-only (with only disk as buffer in the best case, not multi-volume/multi-backup management).
Bacula (http://www.bacula.org/) looks promising and is probably where I would start if I didn't already have something working, but I don't think anything else can match the amount of data you can cram on a disk with backuppc and especially not while keeping the ability to use rsync as the remote client for efficiency.
Robert wrote:
Bryan J. Smith wrote:
Robert kerplop@sbcglobal.net wrote:
for dir in bin boot etc home initrd lib lost+found misc opt root \ sbin selinux srv tftpboot \ usr var ; do find /$dir -depth -print0 | cpio --null -pmd $UD/$DT echo `date` Completed: /$dir >>$PF done
That's not a good benchmark. You're adding the overhead of inode/tree traversal and all sorts of other factors. You're easily cutting performance by 2-3x over.
Since USB 2.0 EHCI is capable of only 60MBps (480Mbps) theoretical, and Intel openly admits that only 30MBps is realistic, 8.8MBps is not unreasonable for this command.
Try a "raw" dd from /dev/zero, that is at least 2x your memory: dd if=/dev/zero of=(some file) bs=8M count=1000
Or consider a bonnie benchmark.
I wasn't complaining, Bryan, simply responding. Actually, having moved from a DAT-2 drive to the USB-connected disk, I'm happy as a pig in sh*t to be able to backup the whole thing unattended and have a reasonable expectation that the resulting wad of crud is good! (I'm still gonna burn /boot, /root, /home and /etc to DVD once a month, though.) Have a great day!
19 GiB in 36 minutes, I wouldn't be complaining either :-). I'm using DDS-2 & 3, & they are MUCHO slower .... I do also backup across my network to a sorta-spare HDD on another box, but use the DAT tapes for remote storage. I might need to look into a firewire/USB disk for that at those speeds :-).
"William A. Mahaffey III" wam@HiWAAY.net wrote:
19 GiB in 36 minutes, I wouldn't be complaining either :-).
It's not bad at all.
I'm using DDS-2 & 3, & they are MUCHO slower ....
As I mentioned, DDS-2 is 15 years old and a measly 0.8MBps native. DDS-3 is about 10 years old and about 1.6MBps.
Tape is one of those commodities that really require a minimum of a $1K investment with something like VXA. And if you really can, it's best to spend $3-4K and go for the gold in something like LTO-3 with its 400GB native capacity and 80MBps transfer rate (double each with hardware compression).
Anything less really isn't worth it. Especially not at the cost of backup cartridges.
I do also backup across my network to a sorta-spare HDD on another box, but use the DAT tapes for remote storage.
Which is what many organizations should do. They should guarantee they get some sort of daily backup, which is easiest to do with disk. Especially when just doing synchronization of diffs, which drastically cuts down on network usage -- especially during the all-important "backup window."
It's also easier to restore, easier to do just about everything, when you have a full copy on random-access disk. It is also easier to back up to tape directly, locally and 24x7 -- no more backup-window constraints -- from that server. It's also easier to verify backups against the original when the backup server has a local copy -- again, at any time, 24x7, without bothering the network.
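Verification against the original then becomes a local (or at worst LAN-side) comparison; a quick sketch with rsync in checksum, dry-run mode (paths hypothetical):

    # lists any file whose content differs between the live data and the backup copy
    rsync -rcnv --delete /home/ /backup/staging/server1/home/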
When I integrated any solution, I always told the client to put in 4x the disk they needed, then another 2.5x that size (for a total of 14x) for snapshots, disk backup, etc... Ideally this is a separate system, but in the worst case, it was just a separate array. If you're spending $4K on a server with such storage, then another $1K on basic tape backup is well worth it.
I might need to look into a firewire/USB disk for that at those speeds :-).
FireWire is pretty commodity these days on at least AMD platforms. I've had far less headaches with it, as long as I'm not running "on-line" data with it. I never do it with FireWire _or_ USB for that matter.
If I need something "on-line," I still use external SCSI LVD. SAS will become my preferred favorite soon enough.
Bryan J. Smith wrote:
"William A. Mahaffey III" wam@HiWAAY.net wrote:
19 GiB in 36 minutes, I wouldn't be complaining either :-).
It's not bad at all.
I'm using DDS-2 & 3, & they are MUCHO slower ....
As I mentioned, DDS-2 is 15 years old and a measly 0.8MBps native. DDS-3 is about 10 years old and about 1.6MBps.
Tape is one of those commodities that really require a minimum of a $1K investment with something like VXA. And if you really can, it's best to spend $3-4K and go for the gold in something like LTO-3 with its 400GB native capacity and 80MBps transfer rate (double each with hardware compression).
Anything less really isn't worth it. Especially not at the cost of backup cartridges.
I do also backup across my network to a sorta-spare HDD on another box, but use the DAT tapes for remote storage.
Which is what many organizations should do. They should guarantee they get some sort of daily backup, which is easiest to do with disk. Especially when just doing synchronization of diffs, which drastically cuts down on network usage -- especially during the all-important "backup window."
It's also easier to restore, easier to do just about everything when you have a full copy on random access disk. It is also easier to backup tape, directly, locally and 24x7 -- no more backup window constraints -- from that server. It's also easier to verify backups against original, when the backup server has a local copy -- again, at any time, 24x7, not bothering the network.
When I integrated any solution, I always told the client to put in 4x the disk they needed, then another 2.5x that size (for a total of 14x) for snapshots, disk backup, etc... Ideally this is a separate system, but in the worst case, it was just a separate array. If you're spending $4K on a server with such storage, then another $1K on basic tape backup is well worth it.
I might need to look into a firewire/USB disk for that at those speeds :-).
FireWire is pretty commodity these days on at least AMD platforms. I've had far less headaches with it, as long as I'm not running "on-line" data with it. I never do it with FireWire _or_ USB for that matter.
If I need something "on-line," I still use external SCSI LVD. SAS will become my preferred favorite soon enough.
I'm thinking about adding a DDS4 DAT drive to the box that I am currently using for LAN daily backups, kinda like you suggest. Now I have a menagerie of boxen backing themselves up & in some cases LAN-backing and/or taping others. Quite the hodge-podge solution, but reliable enough ....
My backup window is ~12:00 A.M. for the major LAN backup, usually done in < 2 hours, then 4:45 A.M. to LAN-back my Win2K box onto an SGI Octane (w/ DDS3 drive), then ~6:00 A.M. to whenever-it's-done for various tapes, usually once a week, so usually no sweat there. This is a private-LAN only, not servers.
Peter Farrow wrote:
I tried a USB2 Maxtor OneTouch II external hard disk on a couple of my CentOS 4.2 boxes and found it initialised the SCSI subsystem OK and added device "sda". But the performance is miserable, whereas the same hardware running XP performs satisfactorily.
What EHCI controller (brand/model)?
I'd normally suggest you ensure you're connecting to an EHCI port with USBView (or Device Manager in Windows XP), but you've already stated that you had good performance in Windows XP.
The only other thing I can suggest, which really isn't an answer, is to use FireWire. I've never had any performance issues, period. And there's no worrying whether or not you are connected to an OHCI or EHCI port, how well the driver handles memory mapped I/O for the target side, etc...
FireWire was designed for block-transfer devices, with intelligence allowing direct device-to-device transfers. USB was designed for character devices and programmed I/O; EHCI wasn't supposed to exist (it exists more out of Intel's refusal to license IEEE1394 -- long, long story -- which has affected adoption as well).
On Mon, 2005-12-05 at 10:01, Bryan J. Smith wrote:
I'd normally suggest you ensure you're connecting to an EHCI port with USBView (or Device Manager in Windows XP), but you've already stated that you had good performance in Windows XP.
The only other thing I can suggest, which really isn't an answer, is to use FireWire. I've never had any performance issues, period. And there's no worrying whether or not you are connected to an OHCI or EHCI port, how well the driver handles memory mapped I/O for the target side, etc...
FireWire was designed for block transfer devices, with intelligence allowing direct device-to-device transfers. USB was designed for character devices and programmed I/O, EHCI wasn't supposed to exist (but exists more out of Intel's refusal to license IEEE1394 -- long, long story -- which has affected adoption as well).
What Linux kernel versions have you used with FireWire? The last 2 Fedora FC4 updates broke disk access completely. FC3 sort-of works, but when I leave a RAID1 mirror running with an IDE partition and a FireWire partition mirrored, within a few hours of activity either the machine will crash or the FireWire partition will be kicked out of the RAID. I haven't tried CentOS because you need the unsupported kernel, and I didn't have much hope for that being better than any of the Fedoras.
Les Mikesell lesmikesell@gmail.com wrote:
What Linux kernel versions have you used with firewire?
Late 2.4.2x, as well as 2.6.x -- basically RHL9/FC1/RHEL3 and FC3/RHEL4. I do have to rebuild for FireWire support in RHEL4, yes.
The last 2 fedora FC4 updates broke disk access completely.
I ain't touching FC4. ;->
FC3 sort-of works, but when I leave a RAID1 mirror running with an IDE partition and a firewire partition mirrored, within a few hours of activity either the machine will
crash
or the firewire partition will be kicked out of the RAID.
Repeat after me ... ;-> "USB and FireWire should _not_ be used as 24x7 on-line storage"
Despite Apple's prior claims, it has become more apparent than ever that FireWire is _not_ a 24x7 on-line storage solution. Do not use it as such, use it as a temporary, near-line storage solution that you plug-in and use just when you need it. I've learned that hard lesson even on Apple's own XServe platforms.
I haven't tried Centos because you need the unsupported kernel and I didn't have much hope for that being better than any of the fedoras.
I've had no problem with my disks, Digital8 and DV cams, etc... They all work great! But I don't leave the disks or camera connected for a day at a time, I plug-in, use and then I unplug when finished.
Regardless of OS -- Linux, MacOS X or Windows -- FireWire and USB are nothing but trouble when it comes to leaving them connected. They are a "temporary plug and unplug" solution AFAIAC.
If you want reliable, external storage, consider SCSI or ... better yet ... Serial Attached SCSI (SAS).
OK, here is the lowdown:

Machine 1: PII 450 Compaq (this is the same dog of a machine I just replaced in the previous thread)
Kernel: 2.6.9-22.0.1.EL
RAM: 512 Megs
USB controller: don't know exactly which, but it's Intel, and it's USB 1
Speed: 1000K/sec consistent

Machine 2: Intel 440GX+ in an SC5000 chassis, 2 x PIII 1GHz CPUs
USB controller: 00:12.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
RAM: 1024 Megs
Kernel: 2.6.9-22.0.1.ELsmp
Speed: 1000K/sec consistent

Machine 3: VIA C3 600MHz EPIA
Kernel: 2.6.9-22.0.1.EL
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
RAM: 512 Megs
Speed: 4.95M/sec consistent from a remote PuTTY shell (via SSH); erratic when hdparm is run from the text console....
I know these machines aren't bleeding edge, but the VIA C3 box is the intended target: it's silent and fanless, and smaller than the yellow pages book by a fair margin. It also has two NICs onboard, making it ideal for a home firewall (its current use).
The USB ports are arranged in two pairs, and I tried one socket in each pair; from the lspci output it looks like only one of these controllers is USB 2 -- comments invited.
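One quick check on which port the drive actually attached to is the kernel log: an EHCI (USB 2.0) attach is reported as a high-speed device, a UHCI/OHCI one as full speed. Something along these lines:

    dmesg | grep -i 'ehci\|uhci'        # which host controllers the kernel found
    dmesg | grep -i 'speed usb device'  # "new high speed USB device" = a USB 2.0 attach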
/proc/cpuinfo shows the CPU at 400MHz; not sure if this is accurate, I will check the BIOS, as it's supposed to be 600MHz.
If you're telling me 4.95Megs/sec is about tops, then I might be able to buy that...
Ideas welcome....
Pete
If you reckon
Bryan J. Smith wrote:
Les Mikesell lesmikesell@gmail.com wrote:
What Linux kernel versions have you used with firewire?
Late 2.4.2x, as well as 2.6.x -- basically RHL9/FC1/RHEL3 and FC3/RHEL4. I do have to rebuild for FireWire support in RHEL4, yes.
The last 2 fedora FC4 updates broke disk access completely.
I ain't touching FC4. ;->
FC3 sort-of works, but when I leave a RAID1 mirror running with an IDE partition and a firewire partition mirrored, within a few hours of activity either the machine will
crash
or the firewire partition will be kicked out of the RAID.
Repeat after me ... ;-> "USB and FireWire should _not_ be used as 24x7 on-line storage"
Despite Apple's prior claims, it has become more apparent than ever that FireWire is _not_ a 24x7 on-line storage solution. Do not use it as such, use it as a temporary, near-line storage solution that you plug-in and use just when you need it. I've learned that hard lesson even on Apple's own XServe platforms.
I haven't tried Centos because you need the unsupported kernel and I didn't have much hope for that being better than any of the fedoras.
I've had no problem with my disks, Digital8 and DV cams, etc... They all work great! But I don't leave the disks or camera connected for a day at a time, I plug-in, use and then I unplug when finished.
Regardless of OS -- Linux, MacOS X or Windows -- FireWire and USB are nothing but trouble when it comes to leaving them connected. They are a "temporary plug and unplug" solution AFAIAC.
If you want reliable, external storage, consider SCSI or ... better yet ... Serial Attached SCSI (SAS).