digitalmars.D.announce - LZ4 decompression at CTFE

Stefan Koch <uplink.coder googlemail.com> writes:

Hello,

Originally I wanted to wait with this announcement until DConf.
But since I am working on another toy, I can release this info early.

So, as per the title: you can decompress .lz4 files created by the 
standard lz4hc command-line tool at compile time.

No GitHub link yet, as there is a little bit of cleanup to do :)

Please comment.
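
For anyone who wants to see the idea in miniature, here is a minimal sketch of an LZ4 block decoder that D can evaluate at compile time. It is not the code announced in this post: the names lz4DecodeBlock, readExtraLength, sampleBlock and decoded are made up for illustration, and it only handles a raw block (no frame header, no checksums), so it will not read a file written by the lz4 command-line tool as-is.

```d
import std.stdio : writeln;

// Reads the extra length bytes that follow a nibble value of 15.
size_t readExtraLength(const(ubyte)[] src, ref size_t pos) pure @safe
{
    size_t extra = 0;
    ubyte b;
    do
    {
        b = src[pos++];
        extra += b;
    } while (b == 255);
    return extra;
}

// Decodes one raw LZ4 block (hypothetical helper, not the lz4-ctfe API).
ubyte[] lz4DecodeBlock(const(ubyte)[] src) pure @safe
{
    ubyte[] dst;
    size_t pos = 0;
    while (pos < src.length)
    {
        immutable ubyte token = src[pos++];

        // High nibble: number of literal bytes to copy verbatim.
        size_t litLen = token >> 4;
        if (litLen == 15)
            litLen += readExtraLength(src, pos);
        dst ~= src[pos .. pos + litLen];
        pos += litLen;

        // The final sequence of a block stops after its literals.
        if (pos >= src.length)
            break;

        // Two-byte little-endian distance back into the output.
        immutable size_t offset = src[pos] | (src[pos + 1] << 8);
        pos += 2;
        assert(offset != 0 && offset <= dst.length, "corrupt LZ4 block");

        // Low nibble: match length minus the minimum match of 4.
        size_t matchLen = (token & 0x0F) + 4;
        if ((token & 0x0F) == 15)
            matchLen += readExtraLength(src, pos);

        // Byte-wise copy, because a match may overlap its own output.
        size_t from = dst.length - offset;
        foreach (_; 0 .. matchLen)
            dst ~= dst[from++];
    }
    return dst;
}

// Hand-built block: literals "abc", a 6-byte match at distance 3, literal "d".
static immutable ubyte[] sampleBlock =
    [0x32, 0x61, 0x62, 0x63, 0x03, 0x00, 0x10, 0x64];

// Initializing a static immutable forces the call to run in CTFE.
static immutable ubyte[] decoded = lz4DecodeBlock(sampleBlock);
static assert(decoded.length == 10);
static assert(decoded[9] == 0x64);

void main()
{
    writeln(cast(const(char)[]) decoded); // prints abcabcabcd
}
```

With a real input, the bytes would typically come from a string import (import("name"), which needs the compiler's -J switch) rather than a hand-written array.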

Apr 26 2016

MrSmith <mrsmith33 yandex.ru> writes:

On Tuesday, 26 April 2016 at 22:05:39 UTC, Stefan Koch wrote:
 Hello,

 Originally I wanted to wait with this announcement until DConf.
 But since I am working on another toy, I can release this info 
 early.

 So, as per the title: you can decompress .lz4 files created by the 
 standard lz4hc command-line tool at compile time.

 No GitHub link yet, as there is a little bit of cleanup to do :)

 Please comment.

I would like to use this instead of a C++ static lib. Thanks! (I 
hope it works at runtime too).

Apr 26 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Tuesday, 26 April 2016 at 22:07:47 UTC, MrSmith wrote:
 I would like to use this instead of a C++ static lib. Thanks! (I 
 hope it works at runtime too).

Sure it does, but keep in mind the C++ version is heavily 
optimized.
I would have to make a special runtime version to achieve 
comparable performance, I think.

That said,
I already plan to write another optimized version.

Concerning compression:
I am fairly certain I can beat the compression ratio of lz4hc in 
a few cases.
But it is going to be slower.

Apr 26 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Tuesday, 26 April 2016 at 22:07:47 UTC, MrSmith wrote:

 I would like to use this instead of a C++ static lib. Thanks! (I 
 hope it works at runtime too).

Oh, and if you could please send me a sample of a file you are 
trying to uncompress, that would be most helpful.

Apr 26 2016

Walter Bright <newshound2 digitalmars.com> writes:

On 4/26/2016 3:05 PM, Stefan Koch wrote:
 Hello,

 Originally I wanted to wait with this announcement until DConf.
 But since I am working on another toy, I can release this info early.

 So, as per the title: you can decompress .lz4 files created by the standard lz4hc
 command-line tool at compile time.

 No GitHub link yet, as there is a little bit of cleanup to do :)

 Please comment.

Sounds nice. I'm curious how it would compare to:

https://www.digitalmars.com/sargon/lz77.html

https://github.com/DigitalMars/sargon/blob/master/src/sargon/lz77.d

Apr 26 2016

Marco Leise <Marco.Leise gmx.de> writes:

On Tue, 26 Apr 2016 23:55:46 -0700,
Walter Bright <newshound2 digitalmars.com> wrote:

 On 4/26/2016 3:05 PM, Stefan Koch wrote:
 Hello,

 Originally I wanted to wait with this announcement until DConf.
 But since I am working on another toy, I can release this info early.

 So, as per the title: you can decompress .lz4 files created by the standard lz4hc
 command-line tool at compile time.

 No GitHub link yet, as there is a little bit of cleanup to do :)

 Please comment.  

 
 Sounds nice. I'm curious how it would compare to:
 
 https://www.digitalmars.com/sargon/lz77.html
 
 https://github.com/DigitalMars/sargon/blob/master/src/sargon/lz77.d

There exist some comparisons for the C++ implementations
(zlib's DEFLATE being a variation of lz77):
http://catchchallenger.first-world.info//wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO
https://pdfs.semanticscholar.org/9b69/86f2fff8db7e080ef8b02aa19f3941a61a91.pdf
(pg.9)

The high compression variant of lz4 is basically like gzip with
9x faster decompression. That makes it well suited for use
cases where you compress once, decompress often and I/O
sequential reads are fast e.g. 200 MB/s or the program does
other computations meanwhile and one doesn't want
decompression to use a lot of CPU time.

-- 
Marco

Apr 27 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 06:03:46 UTC, Marco Leise wrote:
 There exist some comparisons for the C++ implementations
 (zlib's DEFLATE being a variation of lz77):
 http://catchchallenger.first-world.info//wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO
 https://pdfs.semanticscholar.org/9b69/86f2fff8db7e080ef8b02aa19f3941a61a91.pdf
(pg.9)

 The high compression variant of lz4 is basically like gzip with 9x 
 faster decompression. That makes it well suited for use cases 
 where you compress once, decompress often and I/O sequential 
 reads are fast e.g. 200 MB/s or the program does other 
 computations meanwhile and one doesn't want decompression to 
 use a lot of CPU time.

Thanks for the second link you posted.
It made me aware of a few things I was not aware of before.

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Wednesday, 27 April 2016 at 06:55:46 UTC, Walter Bright wrote:
 Sounds nice. I'm curious how it would compare to:

 https://www.digitalmars.com/sargon/lz77.html

 https://github.com/DigitalMars/sargon/blob/master/src/sargon/lz77.d

lz77 took 176 hnsecs uncompressing
lz4 took 92 hnsecs uncompressing

And another test in reversed order using the same data.

lz4 took 162 hnsecs uncompressing
lz77 took 245 hnsecs uncompressing
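
For context on how numbers like these can be collected, here is a minimal sketch using Phobos' StopWatch. It is not the harness used for the figures above: timeHnsecs and work are hypothetical names, and the workload is a stand-in that would be replaced by the two decompressor calls. The warm-up call is there because, as the two runs above suggest, whichever routine goes first can end up paying one-off costs.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Times one call of `fn` and returns the elapsed time in hnsecs (100 ns units).
long timeHnsecs(scope void delegate() fn)
{
    auto sw = StopWatch(AutoStart.yes);
    fn();
    sw.stop();
    return sw.peek.total!"hnsecs";
}

void main()
{
    ubyte[] sink;

    // Stand-in workload (an assumption); swap in the calls being compared.
    void work()
    {
        sink = new ubyte[](64 * 1024);
        sink[] = 0x2A;
    }

    work(); // warm-up, so the first timed run does not pay one-off costs
    foreach (run; 0 .. 3)
        writefln("run %s took %s hnsecs", run, timeHnsecs(&work));
}
```

Repeating each measurement and alternating the order gives a fairer picture than a single pair of runs.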

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 20:12:58 UTC, Stefan Koch wrote:
 On Wednesday, 27 April 2016 at 06:55:46 UTC, Walter Bright 
 wrote:
 Sounds nice. I'm curious how it would compare to:

 https://www.digitalmars.com/sargon/lz77.html

 https://github.com/DigitalMars/sargon/blob/master/src/sargon/lz77.d

 lz77 took 176 hnsecs uncompressing
 lz4 took 92 hnsecs uncompressing

 And another test in reversed order using the same data.

 lz4 took 162 hnsecs uncompressing
 lz77 took 245 hnsecs uncompressing

Though the compression ratio is worse.
But that is partially fixable.

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 20:58:25 UTC, Stefan Koch wrote:
 Though the compression ratio is worse.
 But that is partially fixable.

I have to go back on that: due to restrictions in the lz4 spec, 
many _very_ small files will have significant overhead.

Work on improving the compression ratio is ongoing, but no more 
than a 0.5-1.5% improvement is to be expected.

Apr 30 2016

Dejan Lekic <dejan.lekic gmail.com> writes:

On Tuesday, 26 April 2016 at 22:05:39 UTC, Stefan Koch wrote:
 Hello,

 Originally I wanted to wait with this announcement until DConf.
 But since I am working on another toy, I can release this info 
 early.

 So, as per the title: you can decompress .lz4 files created by the 
 standard lz4hc command-line tool at compile time.

 No GitHub link yet, as there is a little bit of cleanup to do :)

 Please comment.

That is brilliant! I need LZ4 compression for a small project I 
work on...

Apr 27 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Wednesday, 27 April 2016 at 07:51:30 UTC, Dejan Lekic wrote:
 That is brilliant! I need LZ4 compression for a small project I 
 work on...

The decompressor is ready to be released.
It should work for all files compressed with the vanilla
lz4c -9

Please regard this release as alpha quality.

https://github.com/UplinkCoder/lz4-ctfe

P.S. I did not tweak the source. The compressed file size just 
happens to be 1911.
I take this as a sign of correctness.

P.P.S. Actually, LZ4 is a much more interesting topic than SQLite.
If you don't mind, I am going to talk about that :)

Apr 27 2016

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 27-Apr-2016 01:05, Stefan Koch wrote:
 Hello,

 Originally I wanted to wait with this announcement until DConf.
 But since I am working on another toy, I can release this info early.

 So, as per the title: you can decompress .lz4 files created by the standard
 lz4hc command-line tool at compile time.

What's the benefit? I mean, after CTFE decompression they are going to 
add as much weight to the binary as the decompressed files.

Compression on the other hand might be helpful to avoid precompressing 
everything beforehand.

 No GitHub link yet, as there is a little bit of cleanup to do :)

 Please comment.


-- 
Dmitry Olshansky

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 17:29:05 UTC, Dmitry Olshansky 
wrote:
 What's the benefit? I mean, after CTFE decompression they are 
 going to add as much weight to the binary as the decompressed files.

 Compression on the other hand might be helpful to avoid 
 precompressing everything beforehand.

Files that are used by CTFE only, and which would be stripped out 
by the linker later, can be loaded faster by the compiler.
And keep in mind that it also works at runtime.

Memory is scarce at compile time, and this can help reduce the 
memory requirements once a bit of structure is added on top.

Apr 28 2016

deadalnix <deadalnix gmail.com> writes:

On Thursday, 28 April 2016 at 17:58:50 UTC, Stefan Koch wrote:
 On Thursday, 28 April 2016 at 17:29:05 UTC, Dmitry Olshansky 
 wrote:
 What's the benefit? I mean, after CTFE decompression they are 
 going to add as much weight to the binary as the decompressed 
 files.

 Compression on the other hand might be helpful to avoid 
 precompressing everything beforehand.

 Files that are used by CTFE only, and which would be stripped out 
 by the linker later, can be loaded faster by the compiler.
 And keep in mind that it also works at runtime.

 Memory is scarce at compile time, and this can help reduce the 
 memory requirements once a bit of structure is added on top.

Considering the speed and memory consumption of CTFE, I'd bet on 
the exact reverse.

Also, the damn thing is allocation in a loop.

Apr 28 2016

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 28-Apr-2016 21:31, deadalnix wrote:
 On Thursday, 28 April 2016 at 17:58:50 UTC, Stefan Koch wrote:
 On Thursday, 28 April 2016 at 17:29:05 UTC, Dmitry Olshansky wrote:
 What's the benefit? I mean, after CTFE decompression they are going to
 add as much weight to the binary as the decompressed files.

 Compression on the other hand might be helpful to avoid
 precompressing everything beforehand.

 Files that are used by CTFE only, and which would be stripped out
 by the linker later, can be loaded faster by the compiler.
 And keep in mind that it also works at runtime.

 Memory is scarce at compile time, and this can help reduce the memory
 requirements once a bit of structure is added on top.

 Considering the speed and memory consumption of CTFE, I'd bet on the
 exact reverse.

Yeah, the whole CTFE to save compile-time memory sounds like a bad joke 
to me;)
 Also, the damn thing is allocation in a loop.


-- 
Dmitry Olshansky

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 18:31:25 UTC, deadalnix wrote:
 Also, the damn thing is allocation in a loop.

I would like to have an allocation primitive for CTFE use.
But that would not help too much, as I don't know the size I need 
in advance.
Storing that in the header is optional, and unfortunately lz4c 
does not store it by default.
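
To make the "optional" part concrete, here is a small sketch that checks whether an LZ4 frame header carries the content size, assuming the published LZ4 frame layout (magic number 0x184D2204, then the FLG and BD bytes, with FLG bit 3 marking an 8-byte little-endian content size). FrameInfo and readFrameInfo are hypothetical names, the sample headers are hand-written, and the header checksum byte is only a placeholder that the code does not verify.

```d
import std.stdio : writeln;

struct FrameInfo
{
    bool hasContentSize; // FLG bit 3
    ulong contentSize;   // only meaningful when hasContentSize is true
}

// Reads the start of an LZ4 frame header; the header checksum is not verified.
FrameInfo readFrameInfo(const(ubyte)[] frame) pure @safe
{
    assert(frame.length >= 7, "too short for an LZ4 frame header");
    // Magic number 0x184D2204, stored little-endian.
    assert(frame[0] == 0x04 && frame[1] == 0x22 &&
           frame[2] == 0x4D && frame[3] == 0x18, "not an LZ4 frame");

    immutable ubyte flg = frame[4]; // frame[5] is the BD byte
    FrameInfo info;
    info.hasContentSize = (flg & 0x08) != 0;
    if (info.hasContentSize)
    {
        assert(frame.length >= 14, "truncated content size field");
        // 8-byte little-endian size directly after the BD byte.
        foreach (k; 0 .. 8)
            info.contentSize |= cast(ulong) frame[6 + k] << (8 * k);
    }
    return info;
}

// FLG 0x60: version 01, independent blocks, no content size.
static immutable ubyte[] noSize =
    [0x04, 0x22, 0x4D, 0x18, 0x60, 0x40, 0x00];

// FLG 0x68: same, plus the content size flag; size field holds 1024.
static immutable ubyte[] withSize =
    [0x04, 0x22, 0x4D, 0x18, 0x68, 0x40,
     0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00];

// Both checks run at compile time.
static assert(!readFrameInfo(noSize).hasContentSize);
static assert(readFrameInfo(withSize).contentSize == 1024);

void main()
{
    writeln(readFrameInfo(withSize));
}
```

When the flag is set, a decoder can allocate the output once up front; when it is not, it has to grow the buffer or find the size some other way.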

Decompressing the LZ family never takes more space than the 
uncompressed size of the data.
The working set is often bounded. In the case of lz4 it's 
restricted to 4k in the frame format,
and to 64k by design.
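
One possible way around the unknown size, sketched here as an assumption rather than as what lz4-ctfe does: since the output is exactly the sum of all literal and match lengths, a first pass over a raw block can add those up without copying anything, and the buffer can then be allocated once. The names lz4DecodedLength, readExtraLength and sampleBlock are made up for illustration, and the price is walking the input twice.

```d
// Reads the extra length bytes that follow a nibble value of 15.
size_t readExtraLength(const(ubyte)[] src, ref size_t pos) pure @safe
{
    size_t extra = 0;
    ubyte b;
    do
    {
        b = src[pos++];
        extra += b;
    } while (b == 255);
    return extra;
}

// Walks the sequences of one raw LZ4 block and sums the output size
// without copying any data (hypothetical helper).
size_t lz4DecodedLength(const(ubyte)[] src) pure @safe
{
    size_t total = 0;
    size_t pos = 0;
    while (pos < src.length)
    {
        immutable ubyte token = src[pos++];

        size_t litLen = token >> 4;
        if (litLen == 15)
            litLen += readExtraLength(src, pos);
        pos += litLen;   // skip the literal bytes themselves
        total += litLen;

        if (pos >= src.length)
            break;       // last sequence: literals only

        pos += 2;        // skip the two-byte match offset
        size_t matchLen = (token & 0x0F) + 4;
        if ((token & 0x0F) == 15)
            matchLen += readExtraLength(src, pos);
        total += matchLen;
    }
    return total;
}

// Hand-built block: 3 literals, a 6-byte match, 1 final literal.
static immutable ubyte[] sampleBlock =
    [0x32, 0x61, 0x62, 0x63, 0x03, 0x00, 0x10, 0x64];

static assert(lz4DecodedLength(sampleBlock) == 10); // evaluated in CTFE

void main()
{
    // One allocation of the exact size instead of growing in a loop.
    auto buffer = new ubyte[](lz4DecodedLength(sampleBlock));
    assert(buffer.length == 10);
}
```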

Apr 28 2016

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 28 April 2016 at 17:29:05 UTC, Dmitry Olshansky 
wrote:
 Compression on the other hand might be helpful to avoid 
 precompressing everything beforehand.

I fear that is going to be pretty slow and will eat at least 1.5 
times the memory of the file you are trying to store, 
if you want a good compression ratio.

Then again... it might be fast enough to still be useful.

Apr 28 2016
