digitalmars.D.bugs - std.zlib.decompress may have flawed logic for allocating buffer
- Lynn A. (11/11) Aug 30 2004 It appears there may be a bug in std.zlib.uncompress. The size of the ou...
- Sean Kelly (4/13) Aug 31 2004 Working on it. I'm just about done updating etc.c.zlib from v1.1.4 to v...
- Dave (63/79) Aug 31 2004 Before you posted this, I checked into it and believe I found a fix.
- Lynn A. (24/26) Sep 01 2004
- Dave (17/19) Sep 01 2004 1) On Windows, make sure the dm\bin and dmd\bin folders are in your PATH...
- Lynn Allan (23/34) Sep 01 2004
- Sean Kelly (5/11) Sep 01 2004 Yup, that should do it. I actually missed the while(1) bit and thought ...
- Dave (42/57) Sep 01 2004 Thanks!
- Dave (6/46) Sep 02 2004 By "What is the best way to submit this so it is included in the next bu...
- Sean Kelly (3/6) Sep 02 2004 Try emailing Walter with the fix.
- Lynn Allan (19/21) Sep 02 2004 My 2¢ ....
- Lynn Allan (15/17) Sep 04 2004
- Dave (3/20) Sep 04 2004 I'll take care of it..
- Lynn Allan (26/26) Sep 01 2004
It appears there may be a bug in std.zlib.uncompress. The size of the output buffer is apparently estimated from the size of the input, and can be inadequate when the decompression ratio is over 2x.

Rather than repeat other info, there is discussion in the following thread from the digitalmars.d newsgroup: "Having problems with uncompress of zip file created by std.zlib"

There is a corresponding thread that is more specific about the uncompress behavior, and has more info (including code) at: http://dsource.org/forums/viewtopic.php?t=321

HTH,
Lynn A.
Aug 30 2004
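[Editorial sketch] The reported condition is easy to reproduce with any zlib binding. Python's zlib module wraps the same underlying C library, so a minimal sketch can show how easily regular text exceeds a 2x output-size estimate; this illustrates the condition, not the D code from the thread:

```python
import zlib

# Extremely regular text compresses far better than 2:1, which is the
# condition the report says defeats std.zlib.uncompress's buffer estimate
# (the output buffer is apparently sized from the *compressed* length).
original = b"Test " * 5000            # 25,000 bytes of very regular text
compressed = zlib.compress(original)

ratio = len(original) / len(compressed)
print(f"original: {len(original)} bytes, compressed: {len(compressed)} bytes")
print(f"decompression ratio: {ratio:.1f}:1")

# An uncompress routine that sizes its output buffer at only 2x the
# compressed input would come up short for this data:
assert len(original) > 2 * len(compressed)
assert zlib.decompress(compressed) == original
```

Any routine that guesses the output size from the input size alone must be prepared for data like this, where the true ratio is far beyond 2:1.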
In article <cgv2k4$2dau$1 digitaldaemon.com>, Lynn A. says...

>It appears there may be a bug in std.zlib.uncompress. The size of the output buffer is apparently estimated from the size of the input, and can be inadequate when the decompression ratio is over 2x. Rather than repeat other info, there is discussion in the following thread from the digitalmars.d newsgroup: "Having problems with uncompress of zip file created by std.zlib" There is a corresponding thread that is more specific about the uncompress behavior, and has more info (including code) at: http://dsource.org/forums/viewtopic.php?t=321

Working on it. I'm just about done updating etc.c.zlib from v1.1.4 to v1.2.1. After that I'm going to look at this bug.

Sean
Aug 31 2004
In article <ch29mp$ugn$1 digitaldaemon.com>, Sean Kelly says...

>In article <cgv2k4$2dau$1 digitaldaemon.com>, Lynn A. says...
>>It appears there may be a bug in std.zlib.uncompress. The size of the output buffer is apparently estimated from the size of the input, and can be inadequate when the decompression ratio is over 2x. Rather than repeat other info, there is discussion in the following thread from the digitalmars.d newsgroup: "Having problems with uncompress of zip file created by std.zlib" There is a corresponding thread that is more specific about the uncompress behavior, and has more info (including code) at: http://dsource.org/forums/viewtopic.php?t=321
>
>Working on it. I'm just about done updating etc.c.zlib from v1.1.4 to v1.2.1. After that I'm going to look at this bug.
>
>Sean

Before you posted this, I checked into it and believe I found a fix. I'm not a zlib expert, but this seems to take care of the problem with 1.1.4. Change line 159 from:

err = etc.c.zlib.inflate(&zs, Z_FINISH);

to:

err = etc.c.zlib.inflate(&zs, Z_NO_FLUSH);

Here's how I tested it:

#import std.zlib;
#import std.stdio;
#import std.random;
#bool CompressThenUncompress (ubyte[] src) ratio);
#void main (char[][] args)
Aug 31 2004
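[Editorial sketch] Why the flush-mode change matters: per the zlib manual, calling inflate() with Z_FINISH assumes the caller has already supplied an output buffer large enough for the entire stream, whereas Z_NO_FLUSH lets a loop enlarge the buffer and call inflate() again. That grow-as-you-go pattern can be sketched with Python's streaming zlib interface (an illustration of the technique, not the actual Phobos fix; uncompress_growing is a hypothetical helper name):

```python
import zlib

def uncompress_growing(data, chunk=16384):
    """Decompress without knowing the output size in advance: request at
    most `chunk` bytes of output per call, then feed back the input that
    zlib has not yet consumed until the end of the stream is reached."""
    d = zlib.decompressobj()
    out = []
    buf = data
    while True:
        piece = d.decompress(buf, chunk)   # produce at most `chunk` bytes
        out.append(piece)
        buf = d.unconsumed_tail            # input zlib has not processed yet
        if d.eof or (not buf and not piece):
            break                          # stream finished (or no progress)
    return b"".join(out)

# Highly repetitive data decompresses to many times its compressed size,
# yet round-trips correctly without any up-front output-size estimate.
src = bytes(range(64)) * 4096              # 256 KiB repeating pattern
assert uncompress_growing(zlib.compress(src)) == src
```

The key point is that no step in the loop depends on guessing the final output size from the compressed length.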
<alert comment="newbie">
Looks like great progress is taking place. THANKS from a prospective user of std.zlib.

One suggestion: The above test is ingenious (and over my head), but with large buffers the resulting compression ratios are invariably about 46.9% for the more compressible buffer, and about 82.4% for the less compressible buffer. I would advocate logic that generates test buffers with much greater variability of the compression ratio ... perhaps from 1:10 to 2:1. And also a contrived buffer that is all one letter (e.g. '1'), all two letters repeated many times (e.g. "1212121212 ... 1212"), all three letters (e.g. "123"), etc.

Specifically, I anticipate using zlib for quite regular specialized text that compresses the original from about 4.1 mb to 1.1 mb, a 73% reduction. It would be an ironic instance of "Murphy's Law" if the fix worked well with a limited range of compression ratios, but couldn't handle a buffer of 5000 "Test Test Test Test ... repeat 4995 more times ... Test".

Another suggestion: perhaps the tests that are being put together could eventually be incorporated as zlib unittests? Also, could the people doing the fixing post the modified zlib object code somewhere for other people to try out? My impression is that recompiling Phobos is not for faint-hearted newbies.

(Another suggestion: perhaps I should do more than advocate the suggestions above, but "roll up my sleeves" and contribute time and effort. "People tend to be as useless as they can get away with, and as resourceful as the circumstances demand." L.Allan :-)

</alert>
Sep 01 2004
In article <ch4f53$26q7$1 digitaldaemon.com>, Lynn says...

>My impression is that recompiling Phobos is not for faint-hearted newbies.

1) On Windows, make sure the dm\bin and dmd\bin folders are in your PATH (you may have to move them towards the 'front' of the PATH if other make tools are already present in the PATH).

2) Next, cd to each:
dmd/src/phobos/etc/c/zlib/
dmd/src/phobos/etc/c/recls/
dmd/src/phobos/internal/gc/
dmd/src/phobos/

3) and edit the win32.mak files to make sure that the CC and DMD vars. are set properly (if your path is set up OK, then you can just change them to dmc and dmd respectively).

4) Then run "make -f win32.mak" in each.

5) copy dmd/src/phobos/phobos.lib to dmd/lib (after making a backup ;)

The steps are similar for Linux, except the make files are named linux.mak, the C compiler won't be DMC, and you need to copy dmd/src/phobos/libphobos.a to /usr/lib after the build.
Sep 01 2004
>2) Next, cd to each:
>dmd/src/phobos/etc/c/zlib/
>dmd/src/phobos/etc/c/recls/
>dmd/src/phobos/internal/gc/
>dmd/src/phobos/
>
>3) and edit the win32.mak files to make sure that the CC and DMD vars. are set properly (if your path is set up OK, then you can just change them to dmc and dmd respectively).
>
>4) Then run "make -f win32.mak" in each.

<alert comment="newbie">
I wasn't able to get dmd/src/phobos/etc/c/recls to recompile. I was able to rebuild zlib.lib, but couldn't figure out how to get D code to reference my changes rather than phobos.lib. I suppose I've used IDEs too much.

After some head scratching, I wonder if you really meant that I should attempt to rebuild: dmd/src/phobos/std/zlib.d

As a quick hack, I put test.d and zlib.d in the same directory, and put several writefln("Hello Lynn"); "fingerprints" within std.zlib.compress and std.zlib.uncompress. I compiled with:

dmd test.d zlib.d

The "Hello Lynn" outputs show up when running test.exe, so it seems like I have an ad hoc environment from which I can make further tests. Quick sanity check ... am I on the right track with this?

My intent is to incorporate the "fix candidates" identified in earlier posts, and then exercise the "CompressThenUncompress" test scaffolding with test cases that have compression ranges from 100:1 to 1:2 (such as "1111111 ... 1111", "12121212 ... 1212", "123123123 ... 123123").
</alert>
Sep 01 2004
In article <ch3ajm$1gks$1 digitaldaemon.com>, Dave says...

>Before you posted this, I checked into it and believe I found a fix. I'm not a zlib expert, but this seems to take care of the problem with 1.1.4. Change line 159 from:
>
>err = etc.c.zlib.inflate(&zs, Z_FINISH);
>
>to:
>
>err = etc.c.zlib.inflate(&zs, Z_NO_FLUSH);

Yup, that should do it. I actually missed the while(1) bit and thought inflate was only being called once. It should also work if you set the value to Z_SYNC_FLUSH.

Sean
Sep 01 2004
In article <ch4uav$2dfk$1 digitaldaemon.com>, Sean Kelly says...

>In article <ch3ajm$1gks$1 digitaldaemon.com>, Dave says...
>>Before you posted this, I checked into it and believe I found a fix. I'm not a zlib expert, but this seems to take care of the problem with 1.1.4. Change line 159 from:
>>err = etc.c.zlib.inflate(&zs, Z_FINISH);
>>to:
>>err = etc.c.zlib.inflate(&zs, Z_NO_FLUSH);
>
>Yup, that should do it. I actually missed the while(1) bit and thought inflate was only being called once. It should also work if you set the value to Z_SYNC_FLUSH.
>
>Sean

Thanks!

Timed both with the following on Linux and there was virtually no difference between the two averaged over 3 runs each. Also timed the previous code post - again no difference. The Z_NO_FLUSH was tested on Windows as well and "passed".

What is the best way to submit this so it is included in the next build?

import std.zlib;
import std.stdio;
import std.random;

bool CompressThenUncompress (char[] src)
{
    try
    {
        char[] dst = cast(char[])std.zlib.compress(cast(void[])src);
        double ratio = (src.length / cast(double)dst.length);
        if(src != cast(char[])std.zlib.uncompress(cast(void[])dst))
        {
            writefln(" ... NO MATCH when src.length = ", src.length, ".");
            return false;
        }
        else
        {
            writef("src.length: ", src.length, ", dst: ", dst.length, ", Ratio = ", ratio);
        }
    }
    catch
    {
        writefln(" ... Exception thrown when src.length = ", src.length, ".");
        return false;
    }
    return true;
}

void main (char[][] args)
{
    for(int idx = 0; idx < 50; idx++)
    {
        char[] buf = new char[(idx + 1) * 20000];

        // Alternate between more & less compressible
        foreach(inout char c; buf)
            c = ' ' + (rand() % (90 / ((idx % 5) + 1)));

        if(CompressThenUncompress(buf))
        {
            printf("; Success.\n");
        }
        else
        {
            return;
        }
    }
    printf("PASSED all tests.\n");
}
Sep 01 2004
By "What is the best way to submit this so it is included in the next build?" I meant the Z_NO_FLUSH change, not the included code as one might expect given the context ;)

Geesh, I guess I should review my own posts more often /before/ I send them <g>

- Dave

In article <ch52eq$2fbi$1 digitaldaemon.com>, Dave says...

>Timed both with the following on Linux and there was virtually no difference between the two averaged over 3 runs each. Also timed the previous code post - again no difference. The Z_NO_FLUSH was tested on Windows as well and "passed". What is the best way to submit this so it is included in the next build?
Sep 02 2004
In article <ch7ceo$fn2$1 digitaldaemon.com>, Dave says...

>By "What is the best way to submit this so it is included in the next build?" I meant the Z_NO_FLUSH change, not the included code as one might expect given the context ;)

Try emailing Walter with the fix.

Sean
Sep 02 2004
>I meant the Z_NO_FLUSH change, not the included code as one might expect given the context ;)

My 2¢ ....

I would advocate that at least some of "the included code" remain for unittest. We have a decent **start** of a rigorous set of test scaffolding for the zlib module.

Without implying the slightest criticism of the truly amazing WB, this problem would have been caught during week one (if not day one) of porting zlib from C to D if DBC and unittest had been incorporated (unittest may not have been implemented when the port was done?).

From The WB circa 2002-April: http://www.digitalmars.com/d/archives/4607.html "Unit testing [along with DBC] has in my mind risen to become a main feature of D."

Amen. Amen! AMEN! I would only humbly add that unittest has the potential to significantly subdue that truism typically attributed to D. Knuth, "premature optimization is the root of [most | all] software evil." (It may actually be T. Hoare's: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.")

Lynn A.
Sep 02 2004
"Sean Kelly" <sean f4.ca> wrote in message news:ch7ikr$ipi$1 digitaldaemon.com...

>In article <ch7ceo$fn2$1 digitaldaemon.com>, Dave says...
>
>Try emailing Walter with the fix [for std.zlib.uncompress].

<alert comment="newbie">
Walter,

Did this notification happen? I'm a D newbie and not familiar with the procedure to submit 'candidates' for code fixes to phobos.

There was a moderately extensive series of posts related to a possible flaw in std.zlib.uncompress. A probable fix was identified, as well as a number of routines to possibly incorporate as zlib unittests.

"std.zlib.decompress may have flawed logic for allocating buffer"
news:cgp8gc$2edj$1 digitaldaemon.com
news:cgv2k4$2dau$1 digitaldaemon.com

HTH,
Lynn A.
</alert>
Sep 04 2004
I'll take care of it..

- Dave

In article <chcq3n$2u7m$1 digitaldaemon.com>, Lynn Allan says...

>"Sean Kelly" <sean f4.ca> wrote in message news:ch7ikr$ipi$1 digitaldaemon.com...
>>Try emailing Walter with the fix [for std.zlib.uncompress].
>
><alert comment="newbie">
>Walter,
>Did this notification happen? I'm a D newbie and not familiar with the procedure to submit 'candidates' for code fixes to phobos. There was a moderately extensive series of posts related to a possible flaw in std.zlib.uncompress. A probable fix was identified, as well as a number of routines to possibly incorporate as zlib unittests.
>"std.zlib.decompress may have flawed logic for allocating buffer"
>news:cgp8gc$2edj$1 digitaldaemon.com
>news:cgv2k4$2dau$1 digitaldaemon.com
>HTH,
>Lynn A.
></alert>
Sep 04 2004
<alert comment="newbie">
With the Z_NO_FLUSH change noted in earlier posts, I was able to recompile all previous code using a rebuilt zlib.obj and ... (drum roll, please) ... IT WORKED. THANKS!

I also appended the following code to the code submitted by Dave. It has a test buffer that holds extremely regular text and grows from len=6 to about len=6,291,456 (6 * 2 ** 20). This caused the "compression ratio" to go from 1:2.33 for 6 bytes of "123123" to about 1000:1 for 6291456 bytes of "123123 ... 123123".
</alert>
Sep 01 2004
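[Editorial sketch] Lynn's appended D listing did not survive in the archive, but the test it describes can be approximated with Python's zlib module: sweep a "123123..." buffer at doubling sizes from 6 bytes up to 6 * 2**20 bytes, verify each compress/uncompress round trip, and report the ratio. The function name ratio_sweep is illustrative, not from the original post:

```python
import zlib

def ratio_sweep():
    """Compress an extremely regular buffer ("123123...") at doubling
    sizes, from 6 bytes up to 6 * 2**20 bytes, checking that each buffer
    survives a compress/uncompress round trip."""
    results = []
    size = 6
    while size <= 6 * 2 ** 20:
        src = b"123" * (size // 3)
        dst = zlib.compress(src)
        assert zlib.decompress(dst) == src   # round trip must hold
        results.append((len(src), len(dst), len(src) / len(dst)))
        size *= 2
    return results

for n, c, r in ratio_sweep():
    print(f"src={n:>8}  compressed={c:>7}  ratio={r:10.2f}:1")
```

As in the post, the tiny 6-byte buffer actually expands on compression (zlib's header and checksum dominate), while the multi-megabyte buffer compresses by several orders of magnitude: exactly the wide ratio range the fixed uncompress must handle.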