digitalmars.D.bugs - [Issue 14368] New: stdio.rawRead underperforms stdio
- via Digitalmars-d-bugs (100/116) Mar 28 2015 https://issues.dlang.org/show_bug.cgi?id=14368
https://issues.dlang.org/show_bug.cgi?id=14368

          Issue ID: 14368
           Summary: stdio.rawRead underperforms stdio
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: Phobos
          Assignee: nobody puremagic.com
          Reporter: cooper.charles.m gmail.com

std.stdio.rawRead is 50-75% slower than core.stdc.stdio.fread in a tight loop. A thin wrapper should match the performance of C stdio, or users will be unhappy.

// stdioperf.d
struct mystruct { long[4] data; }
void main() {
    enum bool CSTDIO = false;
    mystruct foo;
    static if (CSTDIO) {
        import core.stdc.stdio : stdin, fread;
        while (0 != fread(&foo, foo.sizeof, 1, stdin)) {}
    } else {
        static import std.stdio;
        while (0 != std.stdio.stdin.rawRead((&foo)[0..1]).length) {}
    }
}
//EOF

$ dmd --version
DMD64 D Compiler v2.067.0
Copyright (c) 1999-2014 by Digital Mars written by Walter Bright
$ dmd -O -inline -release -noboundscheck stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./stdioperf
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 7.0038 s, 1.2 GB/s

real    0m7.005s
user    0m5.792s
sys     0m6.924s

$ gdc --version
gdc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gdc -O3 -fno-bounds-check -fno-assert -fno-invariants -fno-in -fno-out stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 6.07485 s, 1.4 GB/s

real    0m6.076s
user    0m4.908s
sys     0m6.684s

With CSTDIO = true (performance is the same no matter the compiler):

$ gdc -O3 stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.18047 s, 2.1 GB/s

real    0m4.182s
user    0m2.888s
sys     0m3.888s

Profiling suggests the overhead comes from three sources: the compiler failing to inline calls to std.exception.enforce, errnoEnforce being called even when fread's return value indicates success, and buffer slicing. The following patch to d/4.9/std/stdio.d (front end D 2.065) confirms this, reducing the performance gap to ~2% (gdc -O2). It also gets rid of the undocumented null return value:

609c609,611
<     enforce(buffer.length, "rawRead must take a non-empty buffer");
---
>     if (!buffer.length) {
>         enforce(buffer.length, "rawRead must take a non-empty buffer");
>     }
625,626c627,631
<     errnoEnforce(!error);
<     return result ? buffer[0 .. result] : null;
---
>     if (result < buffer.length) {
>         errnoEnforce(!error);
>         return buffer[0..result];
>     }
>     return buffer;

$ gdc -O3 stdioperf.d mystdio.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.26723 s, 2.0 GB/s

real    0m4.269s
user    0m2.960s
sys     0m3.788s

The patch to dmd 2.067 phobos is similar, except that the line numbers differ:

715c715,717
<     enforce(buffer.length, "rawRead must take a non-empty buffer");
---
>     if (!buffer.length) {
>         enforce(false, "rawRead must take a non-empty buffer");
>     }
733,734c735,739
<     errnoEnforce(!error);
<     return result ? buffer[0 .. result] : null;
---
>     if (result < buffer.length) {
>         errnoEnforce(!error);
>         return buffer[0..result];
>     }
>     return buffer;

I also suggest updating the documentation of stdio.File.rawRead to include an example of idiomatic usage:

while (1)
    if (0 == rawRead(...).length) break;

Charles

--
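For reference, the control flow proposed in the patches can be sketched as a standalone function over a C FILE*. This is only an illustration, not the Phobos implementation: rawReadSketch, Record and countRecords are hypothetical names introduced here, and the sketch assumes that ferror is an adequate stand-in for the error check in the real rawRead. The loop in countRecords shows the read-until-empty-slice idiom suggested for the documentation.

```d
import core.stdc.stdio : FILE, fread, ferror, fwrite, fclose, tmpfile, rewind;
import std.exception : enforce, errnoEnforce;

// Hypothetical 32-byte record, same layout as mystruct in the report.
struct Record { long[4] data; }

// Hypothetical helper (not part of Phobos) mirroring the patched control flow.
T[] rawReadSketch(T)(FILE* fp, T[] buffer)
{
    // Only call enforce when the check has already failed.
    if (!buffer.length)
        enforce(false, "rawReadSketch must take a non-empty buffer");

    immutable n = fread(buffer.ptr, T.sizeof, buffer.length, fp);

    // Fast path: a full read needs neither an error check nor a re-slice.
    if (n == buffer.length)
        return buffer;

    // Short read: either end of file or an error; check only here.
    errnoEnforce(!ferror(fp));
    return buffer[0 .. n];
}

// Idiomatic loop: keep reading until an empty slice comes back.
size_t countRecords(FILE* fp)
{
    Record rec;
    size_t count;
    while (rawReadSketch(fp, (&rec)[0 .. 1]).length != 0)
        ++count;
    return count;
}

void main()
{
    import std.stdio : writeln;

    // Write three zeroed records to a temporary file, then read them back.
    FILE* fp = tmpfile();
    enforce(fp !is null, "tmpfile failed");
    scope (exit) fclose(fp);

    Record[3] recs;
    enforce(fwrite(recs.ptr, Record.sizeof, recs.length, fp) == recs.length);
    rewind(fp);

    writeln(countRecords(fp));
}
```

With this shape, a successful full read costs one comparison on top of fread, and an empty slice (rather than null) signals end of input.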
Mar 28 2015