Hi folks,
Could you please look at the text below and the attachments, and comment?
Best regards,
Wojtek
---------------------------------
CentOS Wiki tip proposal: Copy/verify files from CD/DVD quickly
If you want to copy a lot of files from a CD or DVD, or verify the md5 or sha1 sums of a large number of files, keep in mind that CD/DVD drive seeks are expensive; if they are avoided as much as possible, the task can be completed faster (and more quietly).
Method 1. Copy the image to hard disk and loopback mount it.
a) Copy the image. This assumes the device is /dev/dvd and the disc is either a single-session CD or CD-RW recorded in DAO mode, or a DVD+RW recorded with growisofs:
$ dd if=/dev/dvd bs=2k of=hard_disk_directory/imagename.iso
If the media is a single-session CD or CD-RW recorded in TAO mode, there are unreadable sectors at the end of the session, so you need to determine the image size first:
$ isosize -x /dev/dvd
The output looks like this:
sector count: 346739, sector size: 2048
Note the sector count and read the image using dd:
$ dd if=/dev/dvd bs=2k count=sector_count_value of=destination_image_file_path
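The two steps above (read the size, then pass it to dd) can be chained so the sector count does not have to be copied by hand. This is only a sketch: the helper name sector_count is made up for illustration, and it assumes the isosize -x output has exactly the format shown above.

```shell
# sector_count: read "isosize -x" style output on stdin and print
# only the sector count (hypothetical helper, not part of util-linux).
sector_count() {
    sed -n 's/^sector count: \([0-9][0-9]*\),.*/\1/p'
}

# Possible usage (device and output path are placeholders):
#   count=$(isosize -x /dev/dvd | sector_count)
#   dd if=/dev/dvd bs=2k count="$count" of=destination_image_file_path
```

If the output format differs on your system, the function prints nothing, so check that the count is non-empty before running dd.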
Then become root (e.g. using su), make a mount point if needed, and mount the image:
# mkdir -p /mnt/loop/myimage
# mount -o ro,loop destination_image_file_path /mnt/loop/myimage
Now you can use the files in /mnt/loop/myimage; the CD/DVD media is no longer needed.
NOTES on loopback method:
- needs root for the loopback mount
- will work only for single-session media (or the first session of multi-session media)
- no seeks when reading CD/DVD media, but the entire media is read, so this is justified only if you need to read most of the files, or a number of files multiple times
Method 2. Access files in physical media order
You can observe that (most of the time) the inode numbers of files in an ISO9660 filesystem mounted on Linux increase as the start sector number of the file increases (within a single session). So, to minimize seek operations, access files in increasing inode number order.
Please use the attached scripts:
flist_by_inum.pl - sort file list by file inode number
mdlist_by_inum.pl - sort md5/sha1 list by file inode number
Mount the CD/DVD media; let us assume the mount point is /mnt/dvd.
If you want to copy some or all regular files to destination_directory, run:
$ cd /mnt/dvd
$ find . -type f -print0 | flist_by_inum.pl -0 | cpio -p0md destination_directory
You can of course give the find utility some files/directories instead of ., as well as any selection criteria. If you are sure that no file name contains white space or other special characters, you can drop the -0 and 0 options and use -print instead of -print0:
$ find . -type f -print | flist_by_inum.pl | cpio -pmd destination_directory
If you want to verify the SHA1 sums of the files, do the following:
$ cd /mnt/dvd
$ cat sha1sum_file | mdlist_by_inum.pl | sha1sum -wc > ~/verify.rslt 2>~/verify.msg
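For readers without the attached Perl script, the reordering of a checksum list can be approximated in plain shell. This is a sketch, not the attached mdlist_by_inum.pl: the function name sumlist_by_inode is made up, it assumes GNU stat, and it does not handle the backslash-escaped lines that md5sum/sha1sum emit for unusual file names.

```shell
# sumlist_by_inode: read an md5sum/sha1sum style list on stdin and print
# the same lines ordered by the inode number of the file each one names.
# Run it from the directory the file names are relative to.
sumlist_by_inode() {
    while IFS= read -r line; do
        rest=${line#* }    # drop the hash and the first separator
        name=${rest#?}     # drop the mode character (space or '*')
        # Files that cannot be found get inode 0, so the checker
        # still sees the line and reports the missing file.
        inode=$(stat -c %i -- "$name" 2>/dev/null) || inode=0
        printf '%s %s\n' "$inode" "$line"
    done | sort -n | sed 's/^[0-9][0-9]* //'
}
```

It would take the place of the Perl script in the pipeline, for example: cd /mnt/dvd && sumlist_by_inode < sha1sum_file | sha1sum -wc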
NOTES:
- no root privileges needed
- should work for multi-session discs (although with some seeks)
- only files needed are accessed
- method also useful for accessing files on loopback-mounted images located on DVD
Wojciech Pilorz wrote:
NOTES on loopback method:
- needs root for the loopback mount
Can be overcome with appropriate mount options in /etc/fstab or /etc/auto.misc, for example:
[summer@ns ~]$ grep loop /etc/auto.misc | head -2
S1 -fstype=iso9660,ro,nosuid,nodev,noexec,loop :/var/local/mirrors/linux/SUSE/10.0/i386/ISO/SUSE-10.0-CD-i386-GM-CD1.iso
S2 -fstype=iso9660,ro,nosuid,nodev,noexec,loop :/var/local/mirrors/linux/SUSE/10.0/i386/ISO/SUSE-10.0-CD-i386-GM-CD2.iso
If you want to copy some or all regular files to destination_directory, run
$ cd /mnt/dvd
$ find . -type f -print0 | flist_by_inum.pl -0 | cpio -p0md destination_directory
You can of course give the find utility some files/directories instead of ., as well as any selection criteria. If you are sure that no file name contains white space or other special characters, you can drop the -0 and 0 options and use -print instead of -print0:
find . -type f -print | flist_by_inum.pl | cpio -pmd destination_directory
I've not tested this, but "it should work."
find . -type f \
| while read -r f ; do echo "$(stat -c %i "$f") $f" ; done \
| sort -n
Much of this is "black majick" that few users will even think of pursuing.
I think media verification should be built directly into the burning tools, cdrecord (or whatever it is now) and growisofs, as it is in hdiutil in OS X. It's easy to find how to verify a burn if it's built into the burning tool and documented in the man page. Doing it in the GUI as at present doesn't work for those who don't use the GUI.
I can attest that seeking is a major pain. I've been having problems with DVD coasters, and no indication from growisofs that there was a problem. I tried a file-by-file comparison by md5sum but it really is too tedious.
Since most of my DVDs are full of rpms, a suitable incantation of rpm isn't too bad.
growisofs could compute an md5sum (or sha1sum) as it writes, then read back and check. As it knows how much to read, it's far more reliable than most users {c,w}ould do.
John,
Thank you for your comments.
On 3/28/07, John Summerfield debian@herakles.homelinux.org wrote:
Thank you for pointing out the /etc/fstab and /etc/auto.misc route. Still, this requires admin privileges to place the entry in fstab, or a friendly admin.
Thank you for the stat-based one-liner, nice trick; I would optimize it a bit (about 20x on my system) and remove the numbers:
find . -type f -print0 | xargs -r0 stat -c '%i %n' | sort -n | sed 's/^[0-9][0-9]* //'
This is about two times slower than my perl script, quite good! And perl is not needed!
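The stat-based pipeline can be wrapped in a small function so it is easy to reuse in place of the Perl script. This is a sketch under a few assumptions: the name list_by_inode is made up, it relies on GNU find, xargs and stat, and because the output is newline-separated it is only safe when no file name contains a newline.

```shell
# list_by_inode DIR: print the regular files under DIR, one per line,
# ordered by inode number (hypothetical helper).
list_by_inode() {
    find "$1" -type f -print0 \
        | xargs -r0 stat -c '%i %n' \
        | sort -n \
        | sed 's/^[0-9][0-9]* //'
}
```

A copy would then look like: cd /mnt/dvd && list_by_inode . | cpio -pmd destination_directory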
Having growisofs verify as it writes would detect an obviously bad recording.
But media do deteriorate with time, mishandling, etc.
Also, some media could be OK on the writer they were burnt on, but not on another drive.
When I record my files on CD/DVD, I almost always include files containing the MD5 and SHA1 sums for all other files on the media. Verifying is then very easy; it is just a good idea to sort by inode number if the files are read from CD/DVD media, and the output needs to be filtered. This also allows me to detect problems with image creation, e.g. Joliet names clashing or being truncated.
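A checksum file of this kind could be generated before burning roughly as follows. This is a sketch, not the procedure actually used here: the function name make_sums and the file name SHA1SUMS are assumptions chosen for illustration, and GNU find, sort and xargs are assumed.

```shell
# make_sums DIR: write DIR/SHA1SUMS (hypothetical file name) covering every
# other regular file below DIR, with paths relative to DIR.
make_sums() {
    (
        cd "$1" || exit 1
        # Exclude the list itself; it already exists while find is running.
        find . -type f ! -path ./SHA1SUMS -print0 \
            | sort -z \
            | xargs -r0 sha1sum > SHA1SUMS
    )
}
```

After burning and mounting the disc, the list can be checked from the mount point with sha1sum -c SHA1SUMS, ideally after reordering it by inode number as described in the tip.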
--
Cheers John
I am thinking about changing the proposed tip as follows:
- remove method 1, which is standard and rather obvious, to make the tip shorter
- add a description of sorting with stat from coreutils, as suggested by John
Thank you again,
Wojtek