I have a zip file. It is over 2Gb in size:
-rw-r--r-- 1 sweh sweh 2383956582 Mar 13 13:44 test.zip
The standard "unzip" program barfs:

  % unzip -l test.zip
  Archive:  test.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
  unzip:  cannot find zipfile directory in one of test.zip or
          test.zip.zip, and cannot find test.zip.ZIP, period.
This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Anyone have any recommendations on a unix tool that'll let me access these large files?
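For what it's worth, one cross-check that doesn't need Info-ZIP at all: Python's zipfile module has read-side ZIP64 support, so a few lines can often list archives that older unzip builds reject. A minimal sketch (the function name is mine):

```python
import zipfile

def list_zip(path):
    """Return (name, uncompressed size) for each member of the archive."""
    # zipfile understands ZIP64 end-of-central-directory records, so
    # archives over 2GB (or with more than 65535 entries) still open.
    with zipfile.ZipFile(path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]
```

Calling e.g. `list_zip("test.zip")` gives the same information as "unzip -l".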
On Sat, 2010-03-13 at 18:53 -0500, Stephen Harris wrote:
I have a zip file. It is over 2Gb in size:
-rw-r--r-- 1 sweh sweh 2383956582 Mar 13 13:44 test.zip
The standard "unzip" program barfs: [unzip error output snipped]
This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Anyone have any recommendations on a unix tool that'll let me access these large files?
Out of curiosity... is this a 64-bit version of Windows and a 32-bit version of Linux we are comparing?
Craig
On Sat, Mar 13, 2010 at 05:07:57PM -0700, Craig White wrote:
On Sat, 2010-03-13 at 18:53 -0500, Stephen Harris wrote:
I have a zip file. It is over 2Gb in size: The standard "unzip" program barfs: This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Out of curiosity... is this a 64-bit version of Windows and a 32-bit version of Linux we are comparing?
Nope; standard 32bit Windows XP vs 32bit CentOS 5
(bit-width shouldn't matter; 32-bit OSes have been able to handle large files for over a decade)
On Sat, 2010-03-13 at 19:20 -0500, Stephen Harris wrote:
On Sat, Mar 13, 2010 at 05:07:57PM -0700, Craig White wrote:
On Sat, 2010-03-13 at 18:53 -0500, Stephen Harris wrote:
I have a zip file. It is over 2Gb in size: The standard "unzip" program barfs: This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Out of curiosity... is this a 64-bit version of Windows and a 32-bit version of Linux we are comparing?
Nope; standard 32bit Windows XP vs 32bit CentOS 5
(bit-width shouldn't matter; 32-bit OSes have been able to handle large files for over a decade)
Tell that to Outlook.
Try gunzip (though I have no idea whether it will fare any better).
Craig
On Sat, Mar 13, 2010 at 05:31:35PM -0700, Craig White wrote:
On Sat, 2010-03-13 at 19:20 -0500, Stephen Harris wrote:
(bit-width shouldn't matter; 32-bit OSes have been able to handle large files for over a decade)
tell that to Outlook
That's an application, not an OS (and Outlook 2007 handles it on 32bit XP)
Try gunzip (though I have no idea whether it will fare any better).
Different file formats. gzip doesn't handle zip files, despite the similarity in names.
On Sat, 13 Mar 2010 19:20:43 -0500 Stephen Harris wrote:
Nope; standard 32bit Windows XP vs 32bit CentOS 5
(bit-width shouldn't matter; 32-bit OSes have been able to handle large files for over a decade)
The problem with x86 (32-bit) is that there are two different memory limits. The total amount of addressable memory is 4GB (2^32 bytes), but the amount of memory per process is limited to 2GB. That means that even on a 64-bit system with more than 4GB of total memory, a 32-bit process cannot access more than that, which is why a ZIP file larger than that can cause trouble on any system. I highly doubt that Windows will be able to decompress that file. Depending on the tool you use (built-in unzip tool? WinZip? WinRAR? 7-Zip?) you might be able to access the file and view its contents, but you will probably fail unzipping it.
Martin
On Sun, Mar 14, 2010 at 12:39:28AM +0000, Martin Jungowski wrote:
The total amount of addressable memory is 4GB (2^32 bytes), but the amount of memory per process is limited to 2GB. That means that even on a 64-bit system with more than 4GB of total memory, a 32-bit process cannot access more than that, which is why a ZIP file larger than that can cause trouble on any system. I highly doubt that Windows will be able to decompress that file. Depending on the tool you use (built-in unzip tool?
Why? It's not storing the whole file in memory; it's writing it out chunk-wise to disk. Unzipping a file requires very little memory regardless of its size.
Even info-zip can handle files of over 2GB, as long as the archive itself isn't over 2GB.
This is _not_ a memory issue; it's a "32-bit file offset" issue (a historical limitation on Unix from before large-file support).
As it happens, yes, the built-in XP zip program happily extracts all the files.
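The "very little memory" point is easy to demonstrate: extraction can stream a member to disk in fixed-size chunks, so peak memory is bounded by the chunk size, not the archive size. A sketch using Python's zipfile (the function name and chunk size are illustrative):

```python
import shutil
import zipfile

def extract_member(archive, member, dest, chunk=64 * 1024):
    # Stream one member out of the archive in fixed-size chunks;
    # memory use stays near `chunk` no matter how large the member is.
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out, chunk)
```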
The problem with x86 (32-bit) is that there's two different memory limits. [...] you might be able to access the file and view its contents, but you will probably fail unzipping it.
Someone must not have told Windows (XP 32bit) that then. :-)
I'm staring at a 6.26GB backup of my samba share that WinRAR has no problems reading & extracting files from. I also routinely burn ISO images of DVDs, which are around 4GB.
At Sat, 13 Mar 2010 18:53:49 -0500 CentOS mailing list centos@centos.org wrote:
I have a zip file. It is over 2Gb in size:
-rw-r--r-- 1 sweh sweh 2383956582 Mar 13 13:44 test.zip
The standard "unzip" program barfs: [unzip error output snipped]
This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Anyone have any recommendations on a unix tool that'll let me access these large files?
Random thought (total guess): what happens if you use split on the zip file and try to get Info-ZIP to think it is a multi-part archive?
On Sat, Mar 13, 2010 at 07:28:11PM -0500, Robert Heller wrote:
Random thought (total guess): what happens if you use split on the zip file and try to get Info-ZIP to think it is a multi-part archive?
The manpage says multi-part archives aren't supported, and in tests it doesn't look like it even attempts to open a 2nd part when I split the file into fragments.
Robert Heller wrote:
At Sat, 13 Mar 2010 18:53:49 -0500 CentOS mailing list centos@centos.org wrote:
Random thought (total guess): what happens if you use split on the zip file and try to get Info-ZIP to think it is a multi-part archive?
A multi-part archive is not the same as a single archive split into pieces.
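The distinction is worth spelling out: split(1) just cuts the file into raw byte ranges, so concatenating the pieces (the equivalent of `cat x* > whole.zip`) restores a working archive, whereas a true PKZIP multi-part archive has per-part structure that unzip would need to understand. A sketch of the reassembly side (function name is mine):

```python
def reassemble(parts, out_path, chunk=1 << 20):
    # Pieces from split(1) are raw byte ranges of the original file,
    # so simple concatenation restores the archive intact.
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                while True:
                    buf = f.read(chunk)
                    if not buf:
                        break
                    out.write(buf)
```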
Mike
On Sat, Mar 13, 2010 at 4:53 PM, Stephen Harris lists@spuddy.org wrote:
I have a zip file. It is over 2Gb in size:
-rw-r--r-- 1 sweh sweh 2383956582 Mar 13 13:44 test.zip
The standard "unzip" program barfs: [unzip error output snipped]
This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Windows can access it just fine.
Anyone have any recommendations on a unix tool that'll let me access these large files?
My guess is that the code has not been updated to deal with large fseeks. The FAQ at http://www.info-zip.org acknowledges that a file larger than 2 GB will have problems because of this.
http://www.info-zip.org/FAQ.html#limits
=> While the only theoretical limit on the size of an archive is given by (65,536 files x 4 GB each), realistically UnZip's random-access operation and (partial) dependence on the stored compressed-size values limits the total size to something in the neighborhood of 2 to 4 GB. This restriction may be relaxed in a future release. (On 64-bit IRIX with the native compiler, the options "-mips4 -64" or "-mips4 -64 -ipa" [for both compiling and linking] may help. "-ipa" is reported to generate incorrect code sometimes, however.) <=
My guess is that this limit is what's biting you, and that the Windows tools have code to deal with it; the unzip tool needs a rewrite to match.
On Sat, Mar 13, 2010 at 05:49:06PM -0700, Stephen John Smoogen wrote:
On Sat, Mar 13, 2010 at 4:53 PM, Stephen Harris lists@spuddy.org wrote:
I have a zip file. It is over 2Gb in size:
This is because the info-zip utilities can't handle ZIP files over 2Gb in size.
Anyone have any recommendations on a unix tool that'll let me access these large files?
My guess is that the code has not been updated to deal with large fseeks. Looking at the FAQ at http://www.info-zip.org they know that a file longer than 2 GB will have problems because of this.
Correct. I stated that in my post.
My guess is that would affect things the most. My guess is that the windows tools have code to deal with this and the unzip tool needs a rewrite to match it.
And thus my question; does anyone _know_ of a Unix tool that can access this?
On Mar 13, 2010, at 8:15 PM, Stephen Harris wrote:
And thus my question; does anyone _know_ of a Unix tool that can access this?
http://www.info-zip.org/UnZip.html
Latest release: UnZip 6.0, released 20 April 2009. New features:

• Support for PKWARE ZIP64 extensions, allowing Zip archives and Zip archive entries larger than 4 GiB and more than 65536 entries within a single Zip archive. This support is currently only available for Unix, OpenVMS and Win32/Win64.
• Support for the bzip2 compression method.
[...]
Also you can make this work with Unzip 5.x if you recompile with a particular flag: http://dbaspot.com/forums/linux-misc/173810-help-unzipping-large-2g-plus-zip...
A separate question, but: to create these I think you will want Zip 3.0.

Latest release: Zip 3.0, released 7 July 2008. New features:

• large-file support (i.e., > 2GB)
• support for more than 65536 files per archive
• multi-part archive support
• bzip2 compression support

I don't see either of these available for CentOS via "yum" or in Dag's repos. It may or may not be trivial to compile them from source on CentOS; some libraries on CentOS, I've noticed, are rather dated. In terms of time spent, you may wish to set up a live *BSD just to unpack these, if that's all you need to do. You may still end up building from source depending on what e.g. FreeBSD ships with, but the BSDs are set up with building your own software from source in mind.
Hope that helps!
Brian
On Sat, Mar 13, 2010 at 09:39:57PM -0500, Brian wrote:
And thus my question; does anyone _know_ of a Unix tool that can access
http://www.info-zip.org/UnZip.html
Latest Release New features in UnZip 6.0, released 20 April 2009:
Hmm, interesting.
However, I think I found a simpler solution. It struck me that Java "JAR" files are just zip files with manifests; I've been able to unzip jar files before, so why not the other way?
And, indeed, "jar xf largefile.zip" worked :-)
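For reference, the same escape hatch exists in Python, whose zipfile module also reads ZIP64 archives; a minimal equivalent of "jar xf" (function name is mine, and I haven't run it against the original 2GB archive):

```python
import zipfile

def unzip_all(path, dest):
    # Extract every member of the archive into dest, like `jar xf`;
    # zipfile's ZIP64 support handles central-directory offsets > 2GB.
    with zipfile.ZipFile(path) as zf:
        zf.extractall(dest)
```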