digitalmars.D.learn - stream.getc() doesn't recognize eof

Brian White <bcwhite pobox.com> writes:

I was looking through the std.stream code of Phobos and found this function:

   // reads and returns next character from the stream,
   // handles characters pushed back by ungetc()
   // returns char.init on eof.
   char getc() {
     char c;
     if (prevCr) {
       prevCr = false;
       c = getc();
       if (c != '\n')
         return c;
     }
     if (unget.length > 1) {
       c = cast(char)unget[unget.length - 1];
       unget.length = unget.length - 1;
     } else {
       readBlock(&c,1);
     }
     return c;
   }


Is there something I don't understand?  How does it recognize EOF?  The 
"readBlock" function is defined as returning 0 (zero) if there is no 
more data but its return value is not checked.

-- Brian

Mar 12 2008

Regan Heath <regan netmail.co.nz> writes:

Brian White wrote:
 I was looking through the std.stream code of Phobos and found this 
 function:
 
   // reads and returns next character from the stream,
   // handles characters pushed back by ungetc()
   // returns char.init on eof.
   char getc() {
     char c;
     if (prevCr) {
       prevCr = false;
       c = getc();
       if (c != '\n')
         return c;
     }
     if (unget.length > 1) {
       c = cast(char)unget[unget.length - 1];
       unget.length = unget.length - 1;
     } else {
       readBlock(&c,1);
     }
     return c;
   }
 
 
 Is there something I don't understand?  How does it recognize EOF?  The 
 "readBlock" function is defined as returning 0 (zero) if there is no 
 more data but its return value is not checked.

At EOF readBlock returns 0, but more importantly it does not modify the 
value of 'c' which it is passed.

The value of 'c' is char.init, i.e. 0xFF (due to D's automatic 
initialisation of variables to their init value).

So, because c == char.init and nothing has modified it, the path which 
calls readBlock will return char.init when EOF is reached.
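
A minimal sketch of that reasoning, not taken from Phobos: `readBlockAtEof` is a hypothetical stand-in for a stream's readBlock once the data has run out, assumed to return 0 and leave the buffer alone.

```d
import std.stdio;

// Hypothetical stand-in for readBlock at EOF (an assumption, not the
// real std.stream code): reports 0 bytes read, buffer left untouched.
size_t readBlockAtEof(void* buffer, size_t size)
{
    return 0;
}

void main()
{
    char c;                      // default-initialised to char.init
    assert(c == char.init);
    assert(char.init == 0xFF);   // 0xFF is never a valid UTF-8 code unit

    readBlockAtEof(&c, 1);       // return value ignored, as in getc()
    assert(c == char.init);      // c still carries the EOF marker

    writefln("c = 0x%02X", cast(ubyte) c);
}
```

The EOF signal here comes entirely from the default initialiser; nothing in the call itself reports it.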

:)

Regan

Mar 13 2008

Brian White <bcwhite pobox.com> writes:

 So, because c == char.init and nothing has modified it, the path which 
 calls readBlock will return char.init when EOF is reached.

Ah, thanks!

I must say that this technique worries me somewhat.  "readBlock" is an 
abstract function definable by any derived class and I don't believe 
that "c must remain unchanged where data is not stored" is a defined 
output requirement of that method.

-- Brian

Mar 13 2008

Regan Heath <regan netmail.co.nz> writes:

Brian White wrote:
 So, because c == char.init and nothing has modified it, the path which 
 calls readBlock will return char.init when EOF is reached.

 
 Ah, thanks!
 
 I must say that this technique worries me somewhat.  "readBlock" is an 
 abstract function definable by any derived class and I don't believe 
 that "c must remain unchanged where data is not stored" is a defined 
 output requirement of that method.

Good point, might be safer to check for the 0 return and set c to 
char.init explicitly.
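
A rough sketch of the difference, under stated assumptions: `BlockReader`, `ClobberingReader`, `getcImplicit` and `getcExplicit` are hypothetical names made up for illustration and are not part of std.stream. The reader is always at EOF but writes to the buffer anyway, which the documented behaviour of readBlock does not appear to forbid.

```d
// Hypothetical minimal reader interface; not the real std.stream API.
interface BlockReader
{
    // Returns the number of bytes stored in buffer, 0 at EOF.
    size_t readBlock(void* buffer, size_t size);
}

// Always at EOF, but scribbles on the buffer before returning 0.
class ClobberingReader : BlockReader
{
    size_t readBlock(void* buffer, size_t size)
    {
        if (size > 0)
            (cast(char*) buffer)[0] = 'X';
        return 0;
    }
}

// Relies on the buffer being left untouched at EOF (the getc pattern).
char getcImplicit(BlockReader r)
{
    char c;
    r.readBlock(&c, 1);
    return c;
}

// Checks the return value and sets the EOF marker explicitly.
char getcExplicit(BlockReader r)
{
    char c;
    if (r.readBlock(&c, 1) == 0)
        c = char.init;
    return c;
}

void main()
{
    auto r = new ClobberingReader;
    assert(getcImplicit(r) == 'X');        // EOF is missed
    assert(getcExplicit(r) == char.init);  // EOF is reported
}
```

The explicit version costs one comparison per call and no longer depends on how a derived class treats the buffer.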

Your comment did get me thinking... Is there some way of expressing the 
requirement using design by contract?  I think the answer is, not 
easily, you'd have to do something like:

// the problem being that we need a global to copy the input buffer into
// and it could be potentially huge.
// when really all we want is some way to detect whether
// data was written to that address _at all_
byte* buffer_in;

abstract size_t readBlock(void* buffer, size_t size)
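in {
   // malloc returns void*, so D needs an explicit cast here
   buffer_in = cast(byte*) malloc(size);
   memcpy(buffer_in, buffer, size);
}
out (result) {
   assert(result > 0 ||
         (result == 0 && memcmp(buffer_in, buffer, size) == 0));
   free(buffer_in);   // release the copy made by the in contract
}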
/* note, no body, therefore function is still 'abstract' */

All that assuming it is legal to specify in/out contracts on an abstract 
method without a body.

It should be possible, it would simply follow the same rules given for 
inheritance here under "In, Out and Inheritance":
http://www.digitalmars.com/d/1.0/dbc.html

Regan

Mar 13 2008

Brian White <bcwhite pobox.com> writes:

 Good point, might be safer to check for the 0 return and set c to 
 char.init explicitly.

I think it makes a better design.  This way feels like relying on 
side-effects and I've spent enough time coding perl to know that making 
use of side-effects is a great start towards unreadable and 
unmaintainable code.

The more obvious you make code, the less likely there will be bugs and 
the easier it will be for someone else to maintain it.  A comment like 
"c still has .init value if readBlock failed" would also be sufficient.

If I were maintaining this code, I would have (wrongly) assumed a bug 
and "corrected" it, possibly introducing a new bug.


         (result == 0 && memcmp(buffer_in, buffer, size) == 0));

Eee-Gad, but that's painful!  Performance could easily be so bad that 
I'd turn off the checks and then they're no use at all.

I've never known a "read" function to modify bytes beyond the "count" 
amount returned, but I don't know if it's ever explicitly stated not to 
do so.

-- Brian

Mar 13 2008

Regan Heath <regan netmail.co.nz> writes:

Brian White wrote:
         (result == 0 && memcmp(buffer_in, buffer, size) == 0));

 
 Eee-Gad, but that's painful!  Performance could easily be so bad that 
 I'd turn off the checks and then they're no use at all.

You can use -release to turn off contracts and asserts, so only 
non-release builds would suffer the penalty.

 I've never known a "read" function to modify bytes beyond the "count" 
 amount returned, but I don't know if it's ever explicitly stated not to 
 do so.

True.  You could perhaps cheat a little and remember just the first byte 
of the output buffer, chances are if the first byte hasn't changed, 
nothing was written to the buffer.
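
A minimal sketch of the first-byte idea, written as a plain helper rather than a real in/out contract so that it runs on its own. `BlockReader`, `EofReader`, `ClobberingReader` and `readKeepsFirstByte` are hypothetical names, not std.stream API.

```d
// Hypothetical minimal reader interface; not the real std.stream API.
interface BlockReader
{
    size_t readBlock(void* buffer, size_t size);
}

// Well-behaved at EOF: returns 0 and leaves the buffer alone.
class EofReader : BlockReader
{
    size_t readBlock(void* buffer, size_t size) { return 0; }
}

// Returns 0 but overwrites the first byte of the buffer.
class ClobberingReader : BlockReader
{
    size_t readBlock(void* buffer, size_t size)
    {
        if (size > 0)
            (cast(char*) buffer)[0] = 'X';
        return 0;
    }
}

// Calls readBlock and reports whether the cheap postcondition held:
// either data was returned, or the first byte is unchanged.
// An empty buffer is accepted without calling readBlock.
bool readKeepsFirstByte(BlockReader r, void* buffer, size_t size)
{
    if (size == 0)
        return true;
    auto p = cast(ubyte*) buffer;
    const ubyte first = p[0];   // one byte saved, no malloc or memcpy
    const size_t result = r.readBlock(buffer, size);
    return result > 0 || p[0] == first;
}

void main()
{
    char c;                     // char.init, i.e. 0xFF
    assert(readKeepsFirstByte(new EofReader, &c, 1));

    c = char.init;
    assert(!readKeepsFirstByte(new ClobberingReader, &c, 1));
}
```

The check is only a heuristic: a reader that happens to write the same value the first byte already held would pass unnoticed, which is the "chances are" caveat above.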

Regan

Mar 14 2008

Brian White <bcwhite pobox.com> writes:

         (result == 0 && memcmp(buffer_in, buffer, size) == 0));

 Eee-Gad, but that's painful!  Performance could easily be so bad that 
 I'd turn off the checks and then they're no use at all.

 
 You can use -release to turn off contracts and asserts, so only 
 non-release builds would suffer the penalty.

My worry is that the test code would be such a performance hit that it 
would be impossible to use without -release.


 I've never known a "read" function to modify bytes beyond the "count" 
 amount returned, but I don't know if it's ever explicitly stated not 
 to do so.

 
 True.  You could perhaps cheat a little and remember just the first byte 
 of the output buffer, chances are if the first byte hasn't changed, 
 nothing was written to the buffer.

I was just thinking the exact same thing.

-- Brian

Mar 16 2008
