digitalmars.D.learn - zlib performance
- yawniek (23/23) Aug 07 2015 hi,
- Daniel Kozák via Digitalmars-d-learn (5/36) Aug 07 2015 On Fri, 07 Aug 2015 07:19:43 +0000
- yawniek (5/7) Aug 07 2015 i'm on master. v2.068-devel-8f81ffc
- Daniel Kozák (3/12) Aug 07 2015 This is weird. I would say it should not crash
- yawniek (21/24) Aug 07 2015 exactly. but try it yourself.
- yawniek (3/8) Aug 07 2015 ok maybe not, there is another problem, not everything seems to
- Daniel Kozák (19/31) Aug 07 2015 import
- yawniek (4/21) Aug 07 2015 not here on os x:
- Daniel Kozák (6/31) Aug 07 2015 Maybe still some IO issues. On Linux it is OK. I remember a few days ago
- Daniel Kozák (4/29) Aug 07 2015 can you try it with ldc?
- yawniek (19/21) Aug 07 2015 ldc 0.15.2 beta2
- Daniel Kozák (5/37) Aug 07 2015 I am not sure, but I do not think it is currently possible.
- yawniek (10/22) Aug 07 2015 i can now reproduce the results and indeed, it's faster than zcat:
- Daniel Kozak (24/47) Aug 07 2015 Can you try it without write operation (comment out all write)?
- yawniek (3/28) Aug 07 2015 2.82s user 0.05s system 99% cpu 2.873 total
- Daniel Kozák (5/43) Aug 07 2015 So I/O seems to be OK
- Daniel Kozák (4/19) Aug 07 2015 Ok I see, it is not weird, because the UnCompress class probably keeps a slice
- Daniel Kozák (21/31) Aug 07 2015 It depends. In your case you don't need to.
hi,
unpacking files is kinda slow, probably i'm doing something wrong.
below code is about half the speed of gnu zcat on my os x machine.
why?
why do i need to .dup the buffer?
can i get rid of the casts?
the chunk size has only a marginal influence.

https://github.com/yannick/zcatd

import std.zlib, std.file, std.stdio;

void main(string[] args)
{
    auto f = File(args[1], "r");
    auto uncompressor = new UnCompress(HeaderFormat.gzip);
    foreach (ubyte[] buffer; f.byChunk(4096))
    {
        auto uncompressed = cast(immutable(string)) uncompressor.uncompress(buffer.dup);
        write(uncompressed);
    }
}
Aug 07 2015
On Fri, 07 Aug 2015 07:19:43 +0000
yawniek via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:

> hi,
> unpacking files is kinda slow, probably i'm doing something wrong.
> below code is about half the speed of gnu zcat on my os x machine.
> why? why do i need to .dup the buffer? can i get rid of the casts?
> [...]

Which compiler and version? There has been some performance problem with IO on OSX; it should be fixed in the 2.068 release.
Aug 07 2015
On Friday, 7 August 2015 at 07:29:15 UTC, Daniel Kozák wrote:
> Which compiler and version? There has been some performance problem with IO on OSX; it should be fixed in the 2.068 release.

i'm on master. v2.068-devel-8f81ffc
also changed file read mode to "rb".

i don't understand why the program crashes when i do not do the .dup
Aug 07 2015
On Fri, 07 Aug 2015 07:36:39 +0000
"yawniek" <dlang srtnwz.com> wrote:

> i'm on master. v2.068-devel-8f81ffc
> also changed file read mode to "rb".
> i don't understand why the program crashes when i do not do the .dup

This is weird. I would say it should not crash.
Aug 07 2015
On Friday, 7 August 2015 at 07:43:25 UTC, Daniel Kozák wrote:
> This is weird. I would say it should not crash

exactly. but try it yourself.
the fastest version i could come up with so far is below. std.conv slows it down. going from a 4kb to a 4mb buffer helped. now i'm within 30% of gzcat's performance.

import std.zlib, std.file, std.stdio;

void main(string[] args)
{
    auto f = File(args[1], "rb");
    auto uncompressor = new UnCompress(HeaderFormat.gzip);
    foreach (ubyte[] buffer; f.byChunk(1024*1024*4))
    {
        auto uncompressed = cast(immutable(string)) uncompressor.uncompress(buffer.dup);
        write(uncompressed);
    }
}
Aug 07 2015
On Friday, 7 August 2015 at 07:48:25 UTC, yawniek wrote:
> the fastest version i could come up with so far is below. std.conv slows it down. going from a 4kb to a 4mb buffer helped. now i'm within 30% of gzcat's performance.

ok maybe not, there is another problem: not everything seems to get flushed, i'm missing output
Aug 07 2015
On Fri, 07 Aug 2015 08:01:27 +0000
"yawniek" <dlang srtnwz.com> wrote:

> ok maybe not, there is another problem: not everything seems to get flushed, i'm missing output

import std.zlib, std.file, std.stdio, std.conv;

void main(string[] args)
{
    auto f = File(args[1], "rb");
    auto uncompressor = new UnCompress(HeaderFormat.gzip);
    foreach (buffer; f.byChunk(4096))
    {
        auto uncompressed = cast(char[])(uncompressor.uncompress(buffer.idup));
        write(uncompressed);
    }
    write(cast(char[])uncompressor.flush);
}

this is faster for me than zcat
Aug 07 2015
On Friday, 7 August 2015 at 08:05:01 UTC, Daniel Kozák wrote:
> [...]
> this is faster for me than zcat

not here on os x:
d version: 3.06s user 1.17s system 82% cpu 5.156 total
gzcat: 1.79s user 0.11s system 99% cpu 1.899 total
Aug 07 2015
On Fri, 07 Aug 2015 08:13:01 +0000
"yawniek" <dlang srtnwz.com> wrote:

> not here on os x:
> d version: 3.06s user 1.17s system 82% cpu 5.156 total
> gzcat: 1.79s user 0.11s system 99% cpu 1.899 total

Maybe still some IO issues. On Linux it is OK. I remember a few days ago there was some discussion about IO improvements for OS X:
http://forum.dlang.org/post/mailman.184.1437841312.16005.digitalmars-d puremagic.com
Aug 07 2015
On Fri, 07 Aug 2015 08:13:01 +0000
"yawniek" <dlang srtnwz.com> wrote:

> not here on os x:
> d version: 3.06s user 1.17s system 82% cpu 5.156 total
> gzcat: 1.79s user 0.11s system 99% cpu 1.899 total

can you try it with ldc?

ldc[2] -O -release -boundscheck=off -singleobj app.d
Aug 07 2015
On Friday, 7 August 2015 at 08:24:11 UTC, Daniel Kozák wrote:
> can you try it with ldc?
> ldc[2] -O -release -boundscheck=off -singleobj app.d

ldc 0.15.2 beta2
2.86s user 0.55s system 77% cpu 4.392 total

v2.068-devel-8f81ffc
2.86s user 0.67s system 78% cpu 4.476 total

v2.067
2.88s user 0.67s system 78% cpu 4.529 total

(different file, half the size of the one above:)
archlinux, virtualbox vm, DMD64 D Compiler v2.067
real 0m2.079s
user 0m1.193s
sys 0m0.637s

zcat:
real 0m3.023s
user 0m0.320s
sys 0m2.440s

is there a way to get rid of the flush at the end so everything happens in one loop? it's a bit inconvenient when i have another subloop that does work
Aug 07 2015
On Fri, 07 Aug 2015 08:42:45 +0000
"yawniek" <dlang srtnwz.com> wrote:

> is there a way to get rid of the flush at the end so everything happens in one loop? it's a bit inconvenient when i have another subloop that does work

I am not sure, but I do not think it is currently possible.

Btw, if you want to remove the [i]dup, you can use this code:
http://dpaste.dzfl.pl/f52c82935bb5
Aug 07 2015
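[A note on the flush question in the post above: one way to fold the flush into a single loop is to wrap the decompressor in a generator that yields the flushed tail as its final item. This is a sketch of the idea, shown with Python's zlib rather than D, since zlib.decompressobj/flush mirror UnCompress.uncompress/flush; the function name is made up.]

```python
import zlib

def gunzip_chunks(chunks):
    """Yield decompressed pieces; the trailing flush is folded into the
    same iteration, so the caller never handles it separately."""
    d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)  # expect a gzip header
    for chunk in chunks:
        piece = d.decompress(chunk)
        if piece:
            yield piece
    tail = d.flush()  # whatever is still buffered inside the decompressor
    if tail:
        yield tail

# Usage: round-trip a payload through gzip-framed compression in small chunks.
c = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
payload = b"some repetitive payload " * 500
blob = c.compress(payload) + c.flush()
chunks = [blob[i:i + 4096] for i in range(0, len(blob), 4096)]
assert b"".join(gunzip_chunks(chunks)) == payload
```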
On Friday, 7 August 2015 at 08:50:11 UTC, Daniel Kozák wrote:
> [...]

i can now reproduce the results and indeed, it's faster than zcat:
on a c4.xlarge aws instance running archlinux and dmd v2.067,
same file as above on my macbook.

best run: 2.72s user 0.39s system 99% cpu 3.134 total
worst run: 3.47s user 0.46s system 99% cpu 3.970 total

zcat:
best: 4.45s user 0.28s system 99% cpu 4.764 total
worst: 4.99s user 0.57s system 99% cpu 5.568 total

so i guess on os x there is still something to be optimized
Aug 07 2015
On Friday, 7 August 2015 at 09:12:32 UTC, yawniek wrote:
> i can now reproduce the results and indeed, it's faster than zcat
> [...]
> so i guess on os x there is still something to be optimized

Can you try it without the write operation (comment out all writes)? And then try it without uncompression?

// without uncompression:
void main(string[] args)
{
    auto f = File(args[1], "r");
    foreach (buffer; f.byChunk(4096))
    {
        write(cast(char[])buffer);
    }
}

// without write:
void main(string[] args)
{
    auto f = File(args[1], "r");
    auto uncompressor = new UnCompress(HeaderFormat.gzip);
    foreach (buffer; f.byChunk(4096))
    {
        auto uncompressed = cast(char[])(uncompressor.uncompress(buffer));
    }
    uncompressor.flush;
}
Aug 07 2015
On Friday, 7 August 2015 at 11:45:00 UTC, Daniel Kozak wrote:
> Can you try it without the write operation (comment out all writes)? And then try it without uncompression?
> // without uncompression: [...]

0.03s user 0.09s system 11% cpu 1.046 total

> // without write: [...]

2.82s user 0.05s system 99% cpu 2.873 total
Aug 07 2015
On Fri, 07 Aug 2015 12:29:26 +0000
"yawniek" <dlang srtnwz.com> wrote:

> > // without uncompression: [...]
> 0.03s user 0.09s system 11% cpu 1.046 total

So I/O seems to be OK.

> > // without write: [...]
> 2.82s user 0.05s system 99% cpu 2.873 total

So maybe it is a zlib problem on osx?
Aug 07 2015
On Fri, 7 Aug 2015 09:43:25 +0200
Daniel Kozák <kozzi dlang.cz> wrote:

> > i don't understand why the program crashes when i do not do the .dup
> This is weird. I would say it should not crash

Ok, I see. It is not weird, because the UnCompress class probably keeps a slice of the buffer.
Aug 07 2015
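[The slicing hazard guessed at in the post above is easy to demonstrate in miniature: if a consumer keeps a view into a buffer that the reader then refills, the retained data silently changes, and only an explicit copy, the role .dup plays in D, is unaffected. A small illustration in Python, with memoryview standing in for a D slice.]

```python
buf = bytearray(b"chunk-1")
kept = memoryview(buf)       # consumer keeps a slice into the buffer, no copy
buf[:] = b"chunk-2"          # the reader refills the same buffer in place
assert bytes(kept) == b"chunk-2"  # the kept slice now shows the new chunk

copied = bytes(buf)          # an explicit copy, like .dup in the D code
buf[:] = b"chunk-3"
assert copied == b"chunk-2"  # the copy is unaffected by buffer reuse
```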
On Fri, 07 Aug 2015 07:19:43 +0000
"yawniek" <dlang srtnwz.com> wrote:

> hi,
> unpacking files is kinda slow, probably i'm doing something wrong.
> below code is about half the speed of gnu zcat on my os x machine.
> why? why do i need to .dup the buffer?

It depends. In your case you don't need to. byChunk() reuses its buffer, which means the same buffer is used after each call, so all previous data are gone.

> can i get rid of the casts?

Yes, you can use std.conv.to:

import std.zlib, std.file, std.stdio, std.conv;

void main(string[] args)
{
    auto f = File(args[1], "rb");
    auto uncompressor = new UnCompress(HeaderFormat.gzip);
    foreach (buffer; f.byChunk(4096))
    {
        auto uncompressed = to!(char[])(uncompressor.uncompress(buffer));
        write(uncompressed);
    }
}
Aug 07 2015
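[The closing point, that the reused byChunk buffer is safe here because each chunk is fully consumed before the next read, can be sketched end to end. A self-contained Python analogue, with an in-memory io.BytesIO standing in for the file and one reused 4 KiB bytearray standing in for byChunk's buffer.]

```python
import io
import zlib

# Build a gzip-framed blob in memory so the sketch needs no file on disk.
c = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)
payload = bytes(range(256)) * 4096
stream = io.BytesIO(c.compress(payload) + c.flush())

# Decompress through ONE reused 4 KiB buffer: each readinto() overwrites
# the previous chunk in place, just as byChunk(4096) reuses its buffer.
d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
buf = bytearray(4096)
out = []
while True:
    n = stream.readinto(buf)                        # refill the same buffer
    if n == 0:
        break
    out.append(d.decompress(memoryview(buf)[:n]))   # a view, not a copy
out.append(d.flush())                               # drain the buffered tail

assert b"".join(out) == payload   # reuse was safe: the output is intact
```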