digitalmars.D.learn - read till EOF from stdin
- kdevel (31/31) Dec 10 2020 Currently as a workaround I read all the chars from stdin with
- frame (7/17) Dec 11 2020 I see expected behaviour here if you use a buffer of length 4. I
- kdevel (14/35) Dec 11 2020 My code cannot do that because the function byChunk has control
- frame (15/23) Dec 11 2020 What do you mean by control? It just has the file handle, why do
- kdevel (29/34) Dec 11 2020 The error happens while the cpu executes code of the D runtime
- Adam D. Ruppe (6/8) Dec 11 2020 works for me.... looks like i have
Currently as a workaround I read all the chars from stdin with

    import std.file;
    auto s = cast (string) read("/dev/fd/0");

after I found that you can't read from stdin. This is of course non-portable Linux only code. In perl I frequently use the idiom

    $s = join ('', <>);

that corresponds to D's

    import std.stdio;
    import std.array;
    import std.typecons;
    auto s = stdin.byLineCopy(Yes.keepTerminator).join;

which alas needs an amazing amount of import boilerplate. BTW why does byLine not suffice in this case?

Then there is a third way of reading all the characters from stdin:

    import std.stdio;
    import std.array;
    auto s = cast (string) stdin.byChunk(1).join;

This version behaves correctly if Ctrl+D is pressed anywhere after the program is started. This is no longer the case if a larger chunk is read, e.g.:

    auto s = cast (string) stdin.byChunk(4).join;

As strace reveals the resulting program sometimes reads twice zero characters before it terminates:

    read(0, a                     <-- A, return
    "a\n", 1024) = 2
    read(0, "", 1024) = 0         <-- ctrl+d
    read(0, "", 1024) = 0         <-- ctrl+d

Any comments or ideas?
Dec 10 2020
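For the "read everything from stdin" goal in the post above, a portable variant that avoids /dev/fd/0 might look like the following minimal sketch. `slurp` is a hypothetical helper name, not a Phobos function; it only relies on `File.byChunk` and `std.array.appender`, and it does no UTF-8 validation.

```d
import std.array : appender;
import std.stdio : File, stdin, write;

/// Hypothetical helper: reads `f` until EOF and returns the bytes as a string.
/// The cast mirrors the one used in the post; nothing is validated as UTF-8.
string slurp(File f, size_t chunkSize = 4096)
{
    auto sink = appender!(ubyte[])();
    foreach (chunk; f.byChunk(chunkSize))
        sink.put(chunk); // copies the bytes, since byChunk reuses its buffer
    return cast(string) sink.data;
}

void main()
{
    write(slurp(stdin)); // echo everything that was read
}
```

Because this still goes through byChunk, it should show the same Ctrl+D behaviour on a terminal as the byChunk(4) line in the post; it only removes the Linux-specific path and some of the import boilerplate.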
On Friday, 11 December 2020 at 02:31:24 UTC, kdevel wrote:
>     auto s = cast (string) stdin.byChunk(4).join;
>
> As strace reveals the resulting program sometimes reads twice zero characters before it terminates:
>
>     read(0, a                     <-- A, return
>     "a\n", 1024) = 2
>     read(0, "", 1024) = 0         <-- ctrl+d
>     read(0, "", 1024) = 0         <-- ctrl+d
>
> Any comments or ideas?

I see expected behaviour here if you use a buffer of length 4. I don't know what you want to achieve here.

If you want to stop reading from stdin, you should check for eof() instead. You should not check yourself for the character. eof() can be lock in by multiple ways and it is the only correct way to handle all of them.
Dec 11 2020
On Friday, 11 December 2020 at 11:05:59 UTC, frame wrote:
> On Friday, 11 December 2020 at 02:31:24 UTC, kdevel wrote:
>>     auto s = cast (string) stdin.byChunk(4).join;
>>
>> As strace reveals the resulting program sometimes reads twice zero characters before it terminates:
>>
>>     read(0, a                     <-- A, return
>>     "a\n", 1024) = 2
>>     read(0, "", 1024) = 0         <-- ctrl+d
>>     read(0, "", 1024) = 0         <-- ctrl+d
>>
>> Any comments or ideas?
>
> I see expected behaviour here if you use a buffer of length 4. I don't know what you want to achieve here.

Read till EOF.

> If you want to stop reading from stdin, you should check for eof() instead.

My code cannot do that because the function byChunk has control over the file descriptor. The OS reports EOF by returning zero from read(2) [1]. The D documentation of byChunk [2] does not mention such a check for eof either.

> You should not check yourself for the character.

Where did I do that here?

    auto s = cast (string) stdin.byChunk(4).join;

> eof() can be lock in by multiple ways and it is the only correct way to handle all of them.

??

[1] https://linux.die.net/man/2/read
[2] https://dlang.org/phobos/std_stdio.html#byChunk
Dec 11 2020
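The point that the OS reports EOF by returning zero from read(2) can be illustrated by bypassing the C stdio layer entirely. The following is a minimal, POSIX-only sketch using the druntime binding `core.sys.posix.unistd.read`; `readUntilEof` is a hypothetical helper name, and error handling is reduced to retrying on EINTR and throwing otherwise.

```d
import core.stdc.errno : EINTR, errno;
import core.sys.posix.unistd : read;
import std.stdio : write;

/// Hypothetical helper: reads from file descriptor `fd` until read(2) returns 0.
string readUntilEof(int fd)
{
    ubyte[] result;
    ubyte[4096] buf;
    for (;;)
    {
        immutable n = read(fd, buf.ptr, buf.length);
        if (n == 0)
            break;                 // read(2) returned 0: end of file
        if (n < 0)
        {
            if (errno == EINTR)
                continue;          // interrupted by a signal, retry
            throw new Exception("read(2) failed");
        }
        result ~= buf[0 .. cast(size_t) n];
    }
    return cast(string) result;
}

void main()
{
    write(readUntilEof(0)); // 0 is the file descriptor of stdin
}
```

Since no FILE buffering is involved here, the loop stops at the first zero-length read, which on a terminal corresponds to a single Ctrl+D at the start of a line.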
On Friday, 11 December 2020 at 12:34:19 UTC, kdevel wrote:
> My code cannot do that because the function byChunk has control over the file descriptor.

What do you mean by control? It just has the file handle, why can't you call eof() on the file handle struct?

>> You should not check yourself for the character.
>
> Where did I do that here?

I was just assuming that...

>> eof() can be lock in by multiple ways and it is the only correct way to handle all of them.
>
> ??

I mean that it's safer to rely on eof(), which should return true if the stream becomes inaccessible, whether caused by read(2) or by other OS-dependent reasons.

...but I was looking in the source and... yes, byChunk() seems not to care about eof() - but it will just truncate the buffer on read failure which should work for your case. It basically just calls C's fread().

Are you sure that the read(0, "", 1024) trace comes from your ctrl+d? It could also be from the runtime checking if the handle can be closed or something. Please note that your terminal could also be the issue.
Dec 11 2020
On Friday, 11 December 2020 at 15:57:37 UTC, frame wrote:
> On Friday, 11 December 2020 at 12:34:19 UTC, kdevel wrote:
>> My code cannot do that because the function byChunk has control over the file descriptor.
>
> What do you mean by control?

The error happens while the cpu executes code of the D runtime (or the C library). After looking into std/stdio.d I found that byChunk uses fread (not read). Thus I think I ran into [1] which seems to affect quite a lot of programs [2] [3].

~~~bychunk.d
void main ()
{
   import std.stdio;
   foreach (buf; stdin.byChunk (4096)) {
      auto s = cast (string) buf;
      writeln ("buf = <", s, ">");
   }
}
~~~

STR:

1. ./bychunk
2. A, [RETURN]
3. CTRL+D

expected: program ends
found: program still reading

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=1190
    Bug 1190 Summary: fgetc()/fread() behaviour is not POSIX compliant
[2] https://unix.stackexchange.com/questions/517064/why-does-hexdump-try-to-read-through-eof
[3] https://stackoverflow.com/questions/52674057/why-does-an-fread-loop-require-an-extra-ctrld-to-signal-eof-with-glibc
Dec 11 2020
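One possible way to sidestep the extra read on an affected C library is to check the stream's EOF indicator before every further fread, as suggested earlier in the thread. The following is a minimal sketch under that assumption: `eachBlock` is a hypothetical helper, it uses `File.rawRead` and `File.eof` instead of byChunk, and it has not been tested against glibc 2.12.

```d
import std.stdio : File, stdin, writeln;

/// Hypothetical helper: calls `sink` for every block read from `f` and stops
/// as soon as the stream's EOF indicator is set, without another fread call.
void eachBlock(File f, scope void delegate(const(ubyte)[]) sink)
{
    ubyte[4096] buf;
    while (!f.eof)
    {
        auto got = f.rawRead(buf[]); // a short (possibly empty) slice at EOF
        if (got.length)
            sink(got);
    }
}

void main()
{
    eachBlock(stdin, (const(ubyte)[] b) {
        writeln("buf = <", cast(const(char)[]) b, ">");
    });
}
```

The design choice is that a short read already sets the EOF indicator, so the loop condition can end the loop without asking the C library to read again; rawRead throws on a read error, so the loop does not spin in that case.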
On Friday, 11 December 2020 at 16:37:42 UTC, kdevel wrote:
> expected: program ends
> found: program still reading

works for me.... looks like i have libc-2.30.so so i guess i have the fixed libc. Can you confirm what version you have?

I did `ls /lib/libc*` to pick that out but it might be different on your system.
Dec 11 2020
On Friday, 11 December 2020 at 16:49:18 UTC, Adam D. Ruppe wrote:
> libc-2.30.so

The bug was fixed in 2.28 IIRC.

> so i guess i have the fixed libc. Can you confirm what version you have?

Various. I tested the code on a machine running the yet EOL CENTOS-6 having glibc 2.12.
Dec 11 2020
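As an alternative to listing /lib/libc*, the glibc version can be queried at run time. The sketch below assumes a glibc-based system: `gnu_get_libc_version` is declared by glibc in <gnu/libc-version.h> and is bound by hand here, `atLeast` is a hypothetical helper, and the 2.28 threshold is taken from the "IIRC" in the post above, not verified independently.

```d
import std.array : split;
import std.conv : to;
import std.stdio : writeln;
import std.string : fromStringz;

// Declared by glibc in <gnu/libc-version.h>; this only links on glibc systems.
extern (C) const(char)* gnu_get_libc_version() nothrow @nogc;

/// Hypothetical helper: true if `ver` ("major.minor[...]") is >= major.minor.
bool atLeast(const(char)[] ver, int major, int minor)
{
    auto parts = ver.split('.');
    immutable maj = parts[0].to!int;
    immutable mnr = parts.length > 1 ? parts[1].to!int : 0;
    return maj > major || (maj == major && mnr >= minor);
}

void main()
{
    auto ver = fromStringz(gnu_get_libc_version());
    // 2.28 is the version named in the thread as containing the fix (IIRC).
    writeln("glibc ", ver,
            atLeast(ver, 2, 28) ? " (2.28 or later)" : " (older than 2.28)");
}
```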
On Friday, 11 December 2020 at 18:18:35 UTC, kdevel wrote:
> On Friday, 11 December 2020 at 16:49:18 UTC, Adam D. Ruppe wrote:
>> libc-2.30.so
>
> The bug was fixed in 2.28 IIRC.
>
>> so i guess i have the fixed libc. Can you confirm what version you have?
>
> Various. I tested the code on a machine running the yet EOL CENTOS-6 having glibc 2.12.

Of course that could be "your" bug. But you should test your program with a stream other than stdin to ensure the terminal is not the problem, because read(2) is low-level and you may not see where it really comes from. Maybe the terminal checks again or there are some buffers between the terminal and your program.
Dec 11 2020
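Testing with a stream other than the terminal, as suggested above, could be done with an in-process pipe. The following is a minimal sketch using `std.process.pipe`; `chunksOf` is a hypothetical helper, and the data written is the same "a\n" as in the reproduction steps.

```d
import std.process : pipe;
import std.stdio : File, writeln;

/// Hypothetical helper: collects the chunks that byChunk yields for `f`.
string[] chunksOf(File f, size_t chunkSize)
{
    string[] chunks;
    foreach (buf; f.byChunk(chunkSize))
        chunks ~= cast(string) buf.dup; // dup, since byChunk reuses its buffer
    return chunks;
}

void main()
{
    auto p = pipe();
    p.writeEnd.write("a\n");
    p.writeEnd.close(); // closing the write end is what produces EOF
    foreach (c; chunksOf(p.readEnd, 4096))
        writeln("buf = <", c, ">");
}
```

If the loop terminates here but not on a terminal, that would point at the interaction between the terminal and the C library's EOF handling, not at byChunk alone.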