
digitalmars.D.learn - std.zip size limit of 2 GB?

reply Andre Pany <andre s-e-a-p.de> writes:
Hi,

I just noticed that std.zip throws an exception if the source 
files exceed 2 GB.
I am not sure whether this is a limitation of zip version 20 or a 
bug. Wikipedia mentions a size limit of 4 GB. Should I open an issue?

Windows 10 with x86_64 architecture.

core.exception.RangeError@std\zip.d(808): Range violation
----------------
0x00007FF7C9B1705C in d_arrayboundsp
0x00007FF7C9B301FF in @safe void std.zip.ZipArchive.putUshort(int, ushort)
0x00007FF7C9B2E634 in void[] std.zip.ZipArchive.build()

     void zipFolder(string archiveFilePath, string folderPath)
     {
         import std.zip, std.file, std.path; // std.path provides baseName

         ZipArchive zip = new ZipArchive();
         string folderName = folderPath.baseName;

         foreach(entry; dirEntries(folderPath, SpanMode.depth))
         {
             if (!entry.isFile)
                 continue;

             ArchiveMember am = new ArchiveMember();
             am.name = entry.name[folderPath.length + 1..$];
             am.expandedData(cast(ubyte[]) read(entry.name));
             zip.addMember(am);
         }

         void[] compressed_data = zip.build(); // zip.build() will throw
         write(archiveFilePath, compressed_data);
     }

Kind regards
André
Feb 15 2018
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/15/18 6:56 AM, Andre Pany wrote:
 Hi,
 
 I just noticed that std.zip will throw an exception if the source files 
 exceeds 2 GB.
 I am not sure whether this is a limitation of zip version 20 or a bug. 
 On wikipedia a
 size limit of 4 GB is mentioned. Should I open an issue?
 
 [...]
I think it's inherent in the zlib API. I haven't used all of the library, but the portion I did use (using zstream) uses uint for buffer sizes. -Steve
Feb 15 2018
parent reply Tony <tonytdominguez aol.com> writes:
On Thursday, 15 February 2018 at 18:49:55 UTC, Steven 
Schveighoffer wrote:

 I think it's inherent in the zlib API. I haven't used all of 
 the library, but the portion I did use (using zstream) uses 
 uint for buffer sizes.
Wouldn't using a uint for buffer size give a size limit of greater than 4GB? Seems like an int is in the mix somewhere.
Feb 15 2018
next sibling parent ag0aep6g <anonymous example.com> writes:
On 02/15/2018 10:20 PM, Tony wrote:
 Wouldn't using a uint for buffer size give a size limit of greater than 
 4GB? Seems like an int is in the mix somewhere.
uint gives 4, int gives 2.
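
To make the arithmetic concrete, here is a small illustrative sketch. The helper name toGiB is made up for this example; it only converts the largest value of each 32-bit type into GiB.

```d
import std.stdio : writefln;

// Hypothetical helper for illustration: converts a byte count to GiB.
double toGiB(ulong bytes)
{
    return bytes / (1024.0 * 1024.0 * 1024.0);
}

void main()
{
    // int is a signed 32-bit type, uint an unsigned 32-bit type.
    writefln("int.max  = %s bytes, about %.2f GiB", int.max, toGiB(int.max));
    writefln("uint.max = %s bytes, about %.2f GiB", uint.max, toGiB(uint.max));
}
```

So a signed 32-bit size or index tops out just below 2 GiB, and an unsigned one just below 4 GiB.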
Feb 15 2018
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/15/18 4:20 PM, Tony wrote:
 On Thursday, 15 February 2018 at 18:49:55 UTC, Steven Schveighoffer wrote:
 
 I think it's inherent in the zlib API. I haven't used all of the 
 library, but the portion I did use (using zstream) uses uint for 
 buffer sizes.
Wouldn't using a uint for buffer size give a size limit of greater than 4GB? Seems like an int is in the mix somewhere.
You meant 2GB, I think. And you are right.

I looked into it a bit, this has nothing to do (superficially) with zlib, it has to do with std.zip:

https://github.com/dlang/phobos/blob/0107a6ee09072bda9e486a12caa148dc7af7bb08/std/zip.d#L806

Really, i should be size_t in all places, I can't see why it should ever be int.

Please file an issue.

-Steve
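
A rough sketch of the idea, under stated assumptions: putUshortAt below is a hypothetical helper, not the Phobos code, and the negative-index explanation is only one plausible reading of the RangeError in the trace. It shows an offset past 2 GiB turning negative when squeezed into an int, and a little-endian 16-bit store that takes a size_t index instead.

```d
import std.stdio : writeln;

// Hypothetical helper, not the Phobos implementation: stores a 16-bit
// value in little-endian order at byte offset i, where i is a size_t.
void putUshortAt(ubyte[] data, size_t i, ushort value) @safe
{
    data[i] = cast(ubyte) (value & 0xFF);   // low byte first
    data[i + 1] = cast(ubyte) (value >> 8); // then high byte
}

void main()
{
    // An offset just past 2 GiB does not fit in an int; truncating it
    // to 32 bits yields a negative number.
    ulong offset = 2_200_000_000UL;
    int truncated = cast(int) offset;
    writeln(truncated < 0); // true

    auto buf = new ubyte[4];
    putUshortAt(buf, 2, 0xBEEF);
    writeln(buf); // [0, 0, 239, 190]
}
```

With a size_t index the same store works for any offset the array itself can have.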
Feb 15 2018
next sibling parent Andre Pany <andre s-e-a-p.de> writes:
On Thursday, 15 February 2018 at 21:57:23 UTC, Steven 
Schveighoffer wrote:
 Really, i should be size_t in all places, I can't see why it 
 should ever be int.

 Please file an issue.

 -Steve
Issue created: https://issues.dlang.org/show_bug.cgi?id=18452

Thanks for the analysis.

Kind regards
André
Feb 16 2018
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 02/15/2018 01:57 PM, Steven Schveighoffer wrote:

 Really, i should be size_t in all places
size_t or ulong? size_t would constrain 32-bit systems unless they can't handle files over 2G.

Ali
Feb 16 2018
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/16/18 5:39 PM, Ali Çehreli wrote:
 On 02/15/2018 01:57 PM, Steven Schveighoffer wrote:
 
  > Really, i should be size_t in all places
 
 size_t or ulong? size_t would constrain 32-bit systems unless they can't 
 handle files over 2G.
The code I linked to writes to an array, so it's constrained to size_t. I think the zlib library itself doesn't handle files larger than 4GB very well.

-Steve
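
As a small illustration of that constraint (a sketch only, it does not touch std.zip): array lengths and indices in D are of type size_t, whose width follows the target's pointer size, so what it prints differs between 32-bit and 64-bit builds.

```d
import std.stdio : writefln;

void main()
{
    // The length of any D array is a size_t.
    static assert(is(typeof((ubyte[]).init.length) == size_t));

    // size_t is as wide as a pointer on the target platform.
    static assert(size_t.sizeof == (void*).sizeof);

    writefln("size_t is %s bits wide, size_t.max = %s",
             size_t.sizeof * 8, size_t.max);
}
```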
Feb 16 2018
prev sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Thursday, 15 February 2018 at 11:56:04 UTC, Andre Pany wrote:
 Hi,

 I just noticed that std.zip will throw an exception if the 
 source files exceeds 2 GB.
 I am not sure whether this is a limitation of zip version 20 or 
 a bug. On wikipedia a
 size limit of 4 GB is mentioned. Should I open an issue?

 [...]
It was partially changed in this PR: https://github.com/dlang/phobos/pull/2914/files

That the put methods were left at int must have been an oversight.
Feb 15 2018