digitalmars.D - Having problems with uncompress of zip file created by std.zlib
- Lynn Allan (44/44) Aug 27 2004 I'm puzzled why the code below doesn't work. I'm attempting to use the
- Walter (68/68) Aug 28 2004 The following test program does a simple read/write of a zip file. It mi...
- Lynn Allan (109/177) Aug 28 2004
- Ben Hinkle (9/226) Aug 28 2004 It looks like a bug in std.zlib.uncompress. The code
- Lynn Allan (8/18) Aug 28 2004 Interesting ... I found the snippet you noted below in the phobos zlib c...
- Jarrett Billingsley (7/9) Aug 28 2004 interestingly, those lines of code do not appear in the original zlib
- Sean Kelly (22/29) Aug 29 2004 Typical usage of zlib is to loop on inflate until all the data has been
- Lynn Allan (8/20) Aug 30 2004 I've posted as a std.zlib.decompress bug, and appreciate Sean K's offer
I'm puzzled why the code below doesn't work. I'm attempting to use std.zlib.uncompress on a file that was created with std.zlib.compress.

In the code, there is a "version" that:
* reads Test.vpl into step01 (plain text with <crlf>'s)
* compresses into step02 and writes Test.zip
* reads Test.zip into buffer step03
* checks that the lengths of step02 and step03 are equal to confirm read+write+read
* attempts to uncompress the buffer from Test.zip into char[] buffer step04
* gets exception with message, "Error: buf error"

The same file without -version=WriteFileBeforeReading skips the 1st, 2nd, and 4th steps above to see if it works better to let the Test.zip file close.

I've tried a variety of combinations with reusing buffers, small files, big files.

Am I doing something wrong? Leaving out a step or three? Do I need to incorporate std.zip.ZipArchive and ArchiveMember for a Test.zip that only includes one file?

Lynn A.

//********************************************************
//********************************************************
import std.file;
import std.zlib;

int main (char[][] args)
{
    version(WriteFileBeforeReading) // dmd -version=WriteFileBeforeReading test.d
    {
        printf("Reached version: WriteFileBeforeReading\n");
        char[] inputStep01 = cast(char[])std.file.read("Test.vpl");
        ubyte[] compressedStep02 = cast(ubyte[])compress(inputStep01);
        printf("inputStep01 size: %d\n", inputStep01.length);
        printf("compressedStep02 size: %d\n", compressedStep02.length);
        std.file.write("Test.zip", compressedStep02);
    }
    printf("Reached past: WriteFileBeforeReading\n");

    ubyte [] compressedStep03 = cast(ubyte[])std.file.read("Test.zip");
    printf("Test.zip size: %d\n", compressedStep03.length);
    version(WriteFileBeforeReading)
    {
        assert(compressedStep02.length == compressedStep03.length);
    }

    char[] textUncompressedStep04 = cast(char[])uncompress(compressedStep03);
    printf("textUncompressedStep04 size: %d\n", textUncompressedStep04.length);

    return 0;
}
Aug 27 2004
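For comparison, the round trip described in the post above can be sketched against a present-day Phobos. This is only a sketch under assumptions: it assumes a D2 compiler whose std.zlib.uncompress grows its output buffer as needed, and the helper name roundTripThroughFile and the scratch file name "Test.z" are hypothetical, not taken from the original post. The file holds a bare zlib stream rather than a .zip archive, so std.zip.ZipArchive is not involved.

```d
import std.file : read, remove, write;
import std.zlib : compress, uncompress;

/// Compress `text`, write it to `path`, read it back and uncompress it.
/// `path` is a hypothetical scratch file name chosen for this sketch.
string roundTripThroughFile(string text, string path)
{
    write(path, compress(text));     // raw zlib stream, not a .zip archive
    scope (exit) remove(path);       // clean up the scratch file afterwards
    void[] packed = read(path);      // same bytes that compress() produced
    return cast(string) uncompress(packed);
}

void main()
{
    // Same 80-character pattern as one of the failing buffers in the thread.
    string text = "0123456789012345678901234567890123456789"
                ~ "0123456789012345678901234567890123456789";
    assert(roundTripThroughFile(text, "Test.z") == text);
}
```

If the assert fires, or uncompress throws, on the compiler at hand, that would point at the library rather than at the file handling, which is what the rest of the thread goes on to find.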
The following test program does a simple read/write of a zip file. It might be helpful.

----------------------------
import std.file;
import std.date;
import std.zip;
import std.zlib;

int main(char[][] args)
{
    byte[] buffer;
    std.zip.ZipArchive zr;
    char[] zipname;
    ubyte[] data;

    testzlib();
    if (args.length > 1)
        zipname = args[1];
    else
        zipname = "test.zip";
    buffer = cast(byte[])std.file.read(zipname);
    zr = new std.zip.ZipArchive(cast(void[])buffer);
    printf("comment = '%.*s'\n", zr.comment);
    zr.print();

    foreach (ArchiveMember de; zr.directory)
    {
        de.print();
        printf("date = '%.*s'\n", std.date.toString(std.date.toDtime(de.time)));

        arrayPrint(de.compressedData);

        data = zr.expand(de);
        printf("data = '%.*s'\n", data);
    }

    printf("**Success**\n");

    zr = new std.zip.ZipArchive();
    ArchiveMember am = new ArchiveMember();
    am.compressionMethod = 8;
    am.name = "foo.bar";
    //am.extra = cast(ubyte[])"ExTrA";
    am.expandedData = cast(ubyte[])"We all live in a yellow submarine, a yellow submarine";
    am.expandedSize = am.expandedData.length;
    zr.addMember(am);
    void[] data2 = zr.build();
    std.file.write("foo.zip", cast(byte[])data2);

    return 0;
}

void arrayPrint(ubyte[] array)
{
    //printf("array %p,%d\n", (void*)array, array.length);
    for (int i = 0; i < array.length; i++)
    {
        printf("%02x ", array[i]);
        if (((i + 1) & 15) == 0)
            printf("\n");
    }
    printf("\n\n");
}

void testzlib()
{
    ubyte[] src = cast(ubyte[])
"the quick brown fox jumps over the lazy dog\r
the quick brown fox jumps over the lazy dog\r
";
    ubyte[] dst;

    arrayPrint(src);
    dst = cast(ubyte[])std.zlib.compress(cast(void[])src);
    arrayPrint(dst);
    src = cast(ubyte[])std.zlib.uncompress(cast(void[])dst);
    arrayPrint(src);
}
Aug 28 2004
<alert comment="newbie">
I'm still having problems so I removed the std.file logic to better illustrate the misbehavior I'm seeing. The exceptions thrown seem related to the size of the buffer handled by std.zlib.uncompress. Or that I never really woke up this morning???

The code is similar to Walter B.'s sample code for using zip, except that it uses larger buffers and doesn't "reuse" the original src buffer as the destination of uncompress.

Eventually, I want to read in a 1.1 meg file that has been compressed with std.zlib.compress from about 4.1 meg of plain text. The application will use std.zlib.uncompress and proceed. The original uncompressed buffer will be read in from a file, but this simplified sample code just uses arrays to check what happens when a plain text buffer is compressed, and then uncompressed.

To summarize, main declares different text buffers of varying sizes and then calls CompressThenUncompress.

Oddly, the same CompressThenUncompress code (below) that works for a buffer of 30 ubytes may fail inconsistently with 60 ubytes. The way the buffer is declared also seems to make a difference. There seems to be a 'threshold' of about 50 bytes, but that isn't consistent either. I suspect that I'm confused about declaring arrays of ubytes??

I've included the code below, which may be hard to view depending on word wrap. It may be more viewable at:
http://dsource.org/forums/viewtopic.php?t=321

Am I doing something wrong or leaving out a step or three? The output from running the program is shown at the bottom.
</alert>

// *******************************
// *******************************
import std.zlib;
import std.stdio;

void CompressThenUncompress (ubyte[] src)
{
    try {
        ubyte[] dst = cast(ubyte[])std.zlib.compress(cast(void[])src);
        writef("src.length: ", src.length, " dst: ", dst.length);
        ubyte[] uncompressedBuf;
        uncompressedBuf = cast(ubyte[])std.zlib.uncompress(cast(void[])dst);
        writefln(" ... Got past std.zlib.uncompress. dst.length: ", dst.length);
        assert(src.length == uncompressedBuf.length);
        assert(src == uncompressedBuf);
    }
    catch {
        writefln(" ... Exception thrown when src.length = ", src.length, ". Keep going");
    }
}

char[] outerBuf30 = "000000000011111111112222222222";
char[] outerBuf40 = "0000000000111111111122222222223333333333";
char[] outerBuf50 = "00000000001111111111222222222233333333334444444444";
char[] outerBuf100 = "00000000001111111111222222222233333333334444444444"
                     "01234567890123456789012345678901234567890123456789";

void main (char[][] args)
{
    char[] buf32 = "0123456789 0123456789 0123456789";
    CompressThenUncompress(cast(ubyte[])buf32); // Works ok

    char[] buf40 = "0123456789 0123456789 0123456789 0123456";
    CompressThenUncompress(cast(ubyte[])buf40); // Works ok

    char[] buf60 = "0123456789 0123456789 0123456789 0123456790 123456789 123456";
    CompressThenUncompress(cast(ubyte[])buf60); // Throws exception

    ubyte[] ubuf60 = cast(ubyte[])"0123456789 0123456789 0123456789 "
                                  "0123456790 123456789 123456";
    CompressThenUncompress(ubuf60); // Throws exception

    char[] buf80 = "0123456789012345678901234567890123456789"
                   "0123456789012345678901234567890123456789";
    CompressThenUncompress(cast(ubyte[])buf80); // Throws exception

    CompressThenUncompress(cast(ubyte[])"This string is 28 chars long"); //ok
    CompressThenUncompress(cast(ubyte[])"This string is 42 chars long "
                                        "0123456789012"); //ok
    CompressThenUncompress(cast(ubyte[])"This string is 46 chars long "
                                        "01234567890123456"); //ok
    CompressThenUncompress(cast(ubyte[])"This string is 60 chars long "
                                        "0123456789012345678901234567890"); //ok
    CompressThenUncompress(cast(ubyte[])"This string is 80 chars long "
                                        "0123456789012345678901234567890"
                                        "12345678901234567890"); //ok

    CompressThenUncompress(cast(ubyte[])outerBuf30);  // ok
    CompressThenUncompress(cast(ubyte[])outerBuf40);  // Throws exception
    CompressThenUncompress(cast(ubyte[])outerBuf50);  // Throws exception
    CompressThenUncompress(cast(ubyte[])outerBuf100); // Throws exception
}

// Results from running above code for different array declarations
src.length: 32 dst: 22 ... Got past std.zlib.uncompress. dst.length: 22
src.length: 40 dst: 22 ... Got past std.zlib.uncompress. dst.length: 22
src.length: 60 dst: 28 ... Exception thrown when src.length = 60. Keep going
src.length: 60 dst: 28 ... Exception thrown when src.length = 60. Keep going
src.length: 80 dst: 21 ... Exception thrown when src.length = 80. Keep going
src.length: 28 dst: 34 ... Got past std.zlib.uncompress. dst.length: 34
src.length: 42 dst: 46 ... Got past std.zlib.uncompress. dst.length: 46
src.length: 46 dst: 46 ... Got past std.zlib.uncompress. dst.length: 46
src.length: 60 dst: 46 ... Got past std.zlib.uncompress. dst.length: 46
src.length: 80 dst: 46 ... Got past std.zlib.uncompress. dst.length: 46
src.length: 30 dst: 16 ... Got past std.zlib.uncompress. dst.length: 16
src.length: 40 dst: 19 ... Exception thrown when src.length = 40. Keep going
src.length: 50 dst: 21 ... Exception thrown when src.length = 50. Keep going
src.length: 100 dst: 33 ... Exception thrown when src.length = 100. Keep going
Aug 28 2004
It looks like a bug in std.zlib.uncompress. The code

    if (!destlen)
        destlen = srcbuf.length * 2 + 1;

doesn't always allocate enough space for the result. When I change the 1 to 100 (or something big like that) all the examples in your test work. I have no idea what the "right" value should be. I was just playing around with different values.

-Ben
Aug 28 2004
Interesting ... I found the snippet you noted below in the phobos zlib code. Does that mean that there isn't really a workaround for someone using std.zlib?

My impression is that std.zlib was ported from the original C code.

"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:<cgr95t$u3q$1 digitaldaemon.com>...
> It looks like a bug in std.zlib.uncompress. The code
>     if (!destlen)
>         destlen = srcbuf.length * 2 + 1;
> doesn't always allocate enough space for the result. When I change the 1 to
> 100 (or something big like that) all the examples in your test work. I have
> no idea what the "right" value should be. I was just playing around with
> different values.
>
> -Ben
Aug 28 2004
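One possible workaround, offered as an assumption rather than a confirmed fix: the `if (!destlen)` test suggests that uncompress only guesses a size when the caller passes none, so a caller that recorded the original length could pass it as the second argument. The sketch below uses the present-day Phobos signature `uncompress(srcbuf, destlen, winbits)`; the helper name uncompressKnownSize is hypothetical, and the thread does not confirm that the 2004 release behaved the same way.

```d
import std.zlib : compress, uncompress;

/// Uncompress `packed` when the caller already knows the expanded size.
/// (Hypothetical helper name, introduced only for this sketch.)
ubyte[] uncompressKnownSize(const(void)[] packed, size_t expandedSize)
{
    // A non-zero destlen replaces the `srcbuf.length * 2 + 1` guess.
    return cast(ubyte[]) uncompress(packed, expandedSize);
}

void main()
{
    // outerBuf50 from the thread, one of the buffers that threw.
    auto src = cast(const(ubyte)[])
        "00000000001111111111222222222233333333334444444444";
    auto packed = compress(src);

    auto restored = uncompressKnownSize(packed, src.length);
    assert(restored == src);
}
```

The cost of this approach is that the expanded size has to be stored somewhere alongside the compressed data, for example in a small header that the application writes itself.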
> My impression is that std.zlib was ported from the original C code.

interestingly, those lines of code do not appear in the original zlib source. i think what walter tried to do was "approximate" a buffer size, which is not really the best way to go about it, as the size of the uncompressed data is not necessarily (2*compressed)+1. it would be better just to fail than try to carry on half-assedly in this case.

or, rather than returning a void[], it could accept an out void[] for the dest buffer. though it wouldn't be as elegant :P
Aug 28 2004
Ben Hinkle wrote:
> It looks like a bug in std.zlib.uncompress. The code
>     if (!destlen)
>         destlen = srcbuf.length * 2 + 1;
> doesn't always allocate enough space for the result. When I change the 1 to
> 100 (or something big like that) all the examples in your test work. I have
> no idea what the "right" value should be. I was just playing around with
> different values.

Typical usage of zlib is to loop on inflate until all the data has been extracted--it looks like the current implementation is trying to do everything in one pass. I'd be happy to fix this, though I won't have time until tomorrow.

Also, the core zlib inflate/deflate functions do not generate or parse a zip header. This process is only taken care of by the printf-type functions in the library (which don't operate on memory buffers). While not having a header is fine (and probably preferable) for application-specific data, it means that std.zlib will not be able to read or write zip files usable by other programs. I've written in-memory wrappers for zlib before that take care of this issue and would be happy to do something about it if folks are interested. For the free functions the best way to do this would be to add a bit parameter at the end to specify whether the header should be processed/generated. For the classes this could be a value passed on construction. The default would be off.

Frankly, it would be nice if the zlib routines didn't allocate a new buffer for every function call. Maybe a new set of functions that take both the input and output buffers as parameters? The output buffer might still have to grow if it's not big enough.

Sean
Aug 29 2004
"Sean Kelly" <sean f4.ca> wrote in message news:cgtc4a$1ojk$1 digitaldaemon.com...
> Ben Hinkle wrote:
>> It looks like a bug in std.zlib.uncompress. The code
>>     if (!destlen)
>>         destlen = srcbuf.length * 2 + 1;
>> doesn't always allocate enough space for the result. When I change the 1 to
>> 100 (or something big like that) all the examples in your test work. I have
>> no idea what the "right" value should be. I was just playing around with
>> different values.
>
> Typical usage of zlib is to loop on inflate until all the data has been
> extracted--it looks like the current implementation is trying to do
> everything in one pass. I'd be happy to fix this, though I won't have
> time until tomorrow.

I've posted as a std.zlib.decompress bug, and appreciate Sean K's offer to fix.
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/1677

Lynn A.
Aug 30 2004
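The "loop on inflate until all the data has been extracted" approach that Sean Kelly describes earlier in the thread might look roughly like the sketch below. It is not the fix that was actually applied to Phobos. It assumes a present-day D2 compiler and the C bindings in etc.c.zlib, and inflateAll is a hypothetical name. It starts from the same 2n+1 guess quoted in the thread and doubles the buffer whenever zlib fills it before reaching the end of the stream.

```d
import etc.c.zlib;
import std.zlib : compress;

/// Inflate a complete zlib stream, growing the output buffer as needed.
/// (Hypothetical helper for illustration, not the Phobos implementation.)
ubyte[] inflateAll(ubyte[] src)
{
    z_stream zs;                                   // D zero-initialises the struct
    if (inflateInit(&zs) != Z_OK)
        throw new Exception("inflateInit failed");
    scope (exit) inflateEnd(&zs);                  // always release zlib's state

    auto dst = new ubyte[](src.length * 2 + 1);    // first guess only
    zs.next_in = src.ptr;
    zs.avail_in = cast(uint) src.length;

    for (;;)
    {
        // Point zlib at the unused tail of dst (dst may have been reallocated).
        immutable done = cast(size_t) zs.total_out;
        zs.next_out = dst.ptr + done;
        zs.avail_out = cast(uint) (dst.length - done);

        immutable rc = inflate(&zs, Z_NO_FLUSH);
        if (rc == Z_STREAM_END)
            break;                                 // everything extracted
        if (rc != Z_OK)
            throw new Exception("inflate failed or input was truncated");
        if (zs.avail_out == 0)
            dst.length = dst.length * 2;           // output full: grow and retry
    }
    return dst[0 .. cast(size_t) zs.total_out];
}

void main()
{
    // buf80 from the thread: it reportedly packs to 21 bytes, so the
    // 2n+1 guess (43 bytes) is too small and the loop has to grow the buffer.
    string text = "0123456789012345678901234567890123456789"
                ~ "0123456789012345678901234567890123456789";
    auto packed = cast(ubyte[]) compress(text);
    auto unpacked = inflateAll(packed);
    assert(cast(const(char)[]) unpacked == text);
}
```

Doubling keeps the number of reallocations logarithmic in the expanded size. A variant that takes a caller-supplied output buffer, as suggested in the thread, would differ only in where dst comes from.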