
digitalmars.D - File() vs. std.cstream.din

Gerome Fournier <Gerome_member pathlink.com> writes:
The following code doesn't behave the same if I output a file by 
opening it, or through std.cstream.din. I get an extra line when using
std.cstream.din, and I'd like to understand why.

Here goes the code first to show the problem:

$ cat file_vs_stdin.d
import std.cstream;

void dump_stream(Stream s)
{
	while (!s.eof()) {
		char[] line = s.readLine();
		printf("Line: %.*s\n", line);
	}
}

int main (char[][] args)
{
	if (args.length == 2) {
		File f = new File(args[1]);
		dump_stream(f);
		f.close();
	} else {
		dump_stream(din);
	}

	return 0;
}

$ dmd file_vs_stdin.d
gcc file_vs_stdin.o -o file_vs_stdin -lphobos -lpthread -lm

Here goes a sample file to perform tests:

$ cat test.txt
one
two
three

$ ./file_vs_stdin test.txt
Line: one
Line: two
Line: three

$ ./file_vs_stdin < test.txt
Line: one
Line: two
Line: three
Line:

Why do I get an extra line here?
Apr 20 2006
"Regan Heath" <regan netwin.co.nz> writes:
On Thu, 20 Apr 2006 20:36:22 +0000 (UTC), Gerome Fournier  
<Gerome_member pathlink.com> wrote:
 The following code doesn't behave the same if I output a file by
 opening it, or through std.cstream.din. I get an extra line when using
 std.cstream.din, and I'd like to understand why.

 Here goes the code first to show the problem:
<snip>
 Why do I get an extra line here?
My guess is that it is related to when eof is flagged. In the File case it
reads the last line and flags eof; in the case of din it has to read past
the end to flag eof.

If you modify the code to be:

void dump_stream(Stream s)
{
	while (!s.eof()) {
		char[] line = s.readLine();
		if (line !is null) printf("Line: %.*s\n", line);
	}
}

you will get the same results for both. This handles the attempt to read
past the end (which results in a null char[] array, AKA a non-existent
line, as opposed to an empty line; see p.s. below).

Regan

(to flog again that horse I have previously flogged..)
p.s. You might also notice that without the current distinction between a
null array and an empty array you would not be able to distinguish this
case. You would have to change readLine to return a bool or similar. Not
that that is such a bad idea: a readLine function that re-used a buffer
might be more efficient, e.g.

bool readLine(inout char[] line);
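The difference in when eof is flagged can be reproduced without Phobos streams at all. The following is a minimal sketch in present-day D, not the std.stream implementation: `BufferSource`, `PredictiveSource`, `ReactiveSource` and `dump` are hypothetical names invented for this illustration. One source knows its size and reports eof as soon as the last byte is consumed (assumed here to be how a seekable File behaves); the other reports eof only after a read has already failed, as C stdio does.

```d
import std.stdio;

// Hypothetical in-memory line source; NOT the Phobos Stream class.
abstract class BufferSource
{
    protected string data;
    protected size_t pos;

    this(string data) { this.data = data; }

    abstract bool eof();

    // Returns null when called with nothing left to read.
    string readLine()
    {
        if (pos >= data.length)
        {
            onReadPastEnd();
            return null;
        }
        size_t start = pos;
        while (pos < data.length && data[pos] != '\n')
            ++pos;
        string line = data[start .. pos];
        if (pos < data.length)
            ++pos; // consume the '\n'
        return line;
    }

    protected void onReadPastEnd() {}
}

// Size is known up front: eof() is true once the position reaches the end.
final class PredictiveSource : BufferSource
{
    this(string d) { super(d); }
    override bool eof() { return pos >= data.length; }
}

// stdio-like: eof() becomes true only after a read has hit the end.
final class ReactiveSource : BufferSource
{
    private bool hitEnd;
    this(string d) { super(d); }
    override bool eof() { return hitEnd; }
    protected override void onReadPastEnd() { hitEnd = true; }
}

// Same loop shape as dump_stream in the original post.
string[] dump(BufferSource s)
{
    string[] seen;
    while (!s.eof())
    {
        string line = s.readLine();
        seen ~= "Line: " ~ line;
    }
    return seen;
}

void main()
{
    enum text = "one\ntwo\nthree\n";
    writeln(dump(new PredictiveSource(text)));
    writeln(dump(new ReactiveSource(text)));
}
```

With the reactive source the loop body runs one extra time, because eof() is still false after the third line has been read; that extra iteration is where the trailing "Line:" comes from.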
Apr 20 2006
Gerome Fournier <Gerome_member pathlink.com> writes:
In article <ops8bt9xwe23k2f5 nrage.netwin.co.nz>, Regan Heath says...
My guess is that it is related to when eof is flagged. In the File case it  
reads the last line and flags eof, in the case of din it has to read past  
the end to flag eof.
Thanks for your comment. You're right: din does appear to read past the end of the file before flagging eof. That sounds like a bug to me, as I would expect the same behaviour from a file-based stream (a seekable stream) and a din-based stream (a non-seekable stream).
If you modify the code to be:

void dump_stream(Stream s)
{
	while (!s.eof()) {
		char[] line = s.readLine();
		if (line !is null) printf("Line: %.*s\n", line);
	}
}
Unfortunately this change doesn't work when the file contains empty lines, as they're skipped: reading a line that contains only '\n' makes readLine return null.

I've also noticed that calling readLine on a non-seekable stream (like std.cstream.din) on a file using the DOS '\r\n' end-of-line sequence stops when '\r' is seen, and a second call to readLine stops on the following '\n'. The '\r\n' sequence is not consumed as a single end-of-line sequence, as it is for a seekable stream.
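For reference, one common way to treat '\r\n' as a single terminator on a forward-only source, where peeking ahead and seeking back is not possible, is to remember that the previous line ended in '\r' and drop a '\n' that immediately follows. The sketch below is in present-day D and is only an illustration of that technique under those assumptions; `LineSplitter` and `collectLines` are hypothetical names, and this is not a description of what std.stream actually does.

```d
import std.stdio;

// Hypothetical helper: splits a forward-only character source into lines,
// treating "\n", "\r" and "\r\n" each as one terminator.
struct LineSplitter
{
    private string input;   // stands in for a non-seekable stream
    private size_t pos;
    private bool pendingCr; // previous line ended in '\r'

    this(string input) { this.input = input; }

    // Returns false at end of input; otherwise stores the next line in `line`.
    bool next(out string line)
    {
        // Drop the '\n' half of a "\r\n" pair left over from the last call.
        if (pendingCr && pos < input.length && input[pos] == '\n')
            ++pos;
        pendingCr = false;

        if (pos >= input.length)
            return false;

        size_t start = pos;
        while (pos < input.length && input[pos] != '\r' && input[pos] != '\n')
            ++pos;
        line = input[start .. pos];

        if (pos < input.length)
        {
            pendingCr = input[pos] == '\r';
            ++pos; // consume the terminator character itself
        }
        return true;
    }
}

string[] collectLines(string text)
{
    auto splitter = LineSplitter(text);
    string[] lines;
    string line;
    while (splitter.next(line))
        lines ~= line;
    return lines;
}

void main()
{
    // Mixed terminators, including an empty line between two "\r\n" pairs.
    writeln(collectLines("one\r\n\r\ntwo\nthree\r"));
}
```

Because the flag is checked at the start of the next read, no lookahead is needed, and an empty line between two '\r\n' pairs is still reported as a line of length 0.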
Apr 22 2006
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 22 Apr 2006 15:01:40 +0000 (UTC), Gerome Fournier  
<Gerome_member pathlink.com> wrote:
 In article <ops8bt9xwe23k2f5 nrage.netwin.co.nz>, Regan Heath says...
 My guess is that it is related to when eof is flagged. In the File case it
 reads the last line and flags eof, in the case of din it has to read past
 the end to flag eof.
Thanks for your comment. You're right: din does appear to read past the end of the file before flagging eof. That sounds like a bug to me, as I would expect the same behaviour from a file-based stream (a seekable stream) and a din-based stream (a non-seekable stream).
 If you modify the code to be:

 void dump_stream(Stream s)
 {
 	while (!s.eof()) {
 		char[] line = s.readLine();
 		if (line !is null) printf("Line: %.*s\n", line);
 	}
 }
Unfortunately this change doesn't work when the file contains empty lines, as they're skipped: reading a line that contains only '\n' makes readLine return null.
Ideally it should return a line which is not null but has a length of 0,
i.e. an empty line. Until it does, it looks like you're stuck with:

while(true) {
   char[] line = s.readLine();
   if (line is null) {
     if (s.eof) break;
     line = "";
   }
   writefln("Line: %s",line);
}

or similar.
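The "ideal" behaviour described above relies on D distinguishing a null array from a non-null array of length 0. A minimal sketch in present-day D follows; `Lines` and `readAll` are hypothetical names, and the reader is an in-memory stand-in rather than a Phobos stream. It returns a non-null empty string for a blank line and null only when nothing is left, so the loop needs a single test.

```d
import std.stdio;

// Hypothetical reader over pre-split lines: returns null once exhausted,
// and a non-null zero-length string for an empty line.
struct Lines
{
    string[] items;
    size_t i;

    string readLine()
    {
        return i < items.length ? items[i++] : null;
    }
}

// Collects every line, keeping empty ones, and stops on null.
string[] readAll(ref Lines src)
{
    string[] result;
    for (;;)
    {
        string line = src.readLine();
        if (line is null)
            break;          // nothing left to read
        result ~= line;     // may have length 0
    }
    return result;
}

void main()
{
    string empty = "";
    string none = null;
    // Both have length 0, but only one of them is null.
    writeln(empty.length == 0, " ", empty is null);
    writeln(none.length == 0, " ", none is null);

    auto src = Lines(["one", "", "three"]);
    writeln(readAll(src).length);
}
```

The empty string literal has a non-null pointer, which is what lets `is null` tell "empty line" apart from "no line".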
 I've even noticed that calling readLine on a non seekable Stream (like
 std.cstream.din) on a file using the dos '\r\n' end of line sequence,
 will stop when '\r' is seen. And a second call to readLine will stop on
 the following '\n'. The sequence '\r\n' is not eaten as a whole end of  
 line sequence, like it's done for a seekable stream.
That's a bug that needs to be fixed.

Regan
Apr 23 2006
Gerome Fournier <Gerome_member pathlink.com> writes:
In article <ops8hex0ap23k2f5 nrage.netwin.co.nz>, Regan Heath says...
Ideally it should return a line which is not null but has a length of 0  
i.e. an empty line. Until it does, it looks like you're stuck with:

while(true) {
   char[] line = s.readLine();
   if (line is null) {
     if (s.eof) break;
     line = "";
   }
   writefln("Line: %s",line);
}

or similar.
Yep, this workaround sounds good.
 I've even noticed that calling readLine on a non seekable Stream (like
 std.cstream.din) on a file using the dos '\r\n' end of line sequence,
 will stop when '\r' is seen. And a second call to readLine will stop on
 the following '\n'. The sequence '\r\n' is not eaten as a whole end of  
 line sequence, like it's done for a seekable stream.
That's a bug that needs to be fixed.
I'll post something about it under digitalmars.D.bugs.
Apr 24 2006