digitalmars.D.learn - Reading a string in binary mode

Christof Schardt (19/19) Mar 03 2014 I'm evaluating D and try to write a binary io class.

Ali Çehreli (11/29) Mar 03 2014 You need to have a buffer of 'size'. Not tested:

Christof Schardt (6/38) Mar 03 2014 Thanks, Ali, this works.

Ali Çehreli (5/7) Mar 03 2014 Yes, all of that will happen after I get back to working on the book and...

John Colvin (4/41) Mar 03 2014 Assuming you're not expecting pre-allocation (which I infer from

Christof Schardt (5/48) Mar 03 2014 Thanks, John, this works.

John Colvin (8/63) Mar 03 2014 Doesn't seem like trickery to me; you just make a new array of

Christof Schardt (6/67) Mar 03 2014 By "trickery" I meant having to know about things like

bearophile (61/73) Mar 03 2014 Your function has signature (you use "ref" instead of "in" or

Christof Schardt (4/73) Mar 03 2014 Great, thanks for this insight.

"Christof Schardt" <csnews schardt.info> writes:

I'm evaluating D and trying to write a binary io class.
I got stuck with strings:

    void rw(ref string x)
    {
        if(_isWriting)
        {
            int size = x.length;
            _f.rawWrite((&size)[0..1]);
            _f.rawWrite(x);
        }
        else
        {
            int size;
            _f.rawRead((&size)[0..1]);

            ... what now?
        }
    }

Writing is ok, but how do I read the bytes to the
string x after having its size?

Mar 03 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

On 03/03/2014 01:44 PM, Christof Schardt wrote:

 I'm evaluating D and trying to write a binary io class.
 I got stuck with strings:

      void rw(ref string x)
      {
          if(_isWriting)
          {
              int size = x.length;
              _f.rawWrite((&size)[0..1]);
              _f.rawWrite(x);
          }
          else
          {
              int size;
              _f.rawRead((&size)[0..1]);

              ... what now?

You need to have a buffer of 'size'. Not tested:

     auto s = new char[size];
     s = _f.rawRead(s);
     x = s;

However, the last line will not compile due to the difference in mutability. 
So you will need to do something like this:

     import std.exception : assumeUnique;

     x = assumeUnique(s);

          }
      }

 Writing is ok, but how do I read the bytes to the
 string x after having its size?
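A side note on why the snippet above reassigns s: rawRead returns the slice of the buffer that was actually filled, which is shorter than the buffer when the file ends early. Here is a minimal sketch of that behaviour. It is not from the original post: readUpTo is a hypothetical helper, and File.tmpfile is used only to keep the example self-contained.

```d
import std.stdio : File;

// Hypothetical helper: try to read n bytes and return whatever arrived.
char[] readUpTo(File f, size_t n)
{
    auto buf = new char[n];
    // The result is a slice of buf covering only the bytes actually read.
    return f.rawRead(buf);
}

void main()
{
    auto f = File.tmpfile();   // scratch file, deleted when it is closed
    f.rawWrite("abc");
    f.rewind();

    auto got = readUpTo(f, 8); // ask for 8 bytes, only 3 exist
    assert(got.length == 3);
    assert(got == "abc");
}
```

Comparing the returned length with the expected size is a cheap way to detect a truncated file.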

Ali

Mar 03 2014

"Christof Schardt" <csnews schardt.info> writes:

"Ali Çehreli" <acehreli yahoo.com> wrote in message 
news:lf2ude$1njf$1 digitalmars.com...

Thanks, Ali, this works.

BTW: will your excellent book be equipped with a TOC and an index?

I find it hard to look for answers to questions like above in all
the D docs I have (dpl.org, TDPL, your book)

Mar 03 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

On 03/03/2014 02:25 PM, Christof Schardt wrote:

 Thanks, Ali, this works.

Yay! :)

 book be equipped with a TOC and an index?

Yes, all of that will happen after I get back to working on the book and 
its ever increasing list of to-dos. :)

Ali

Mar 03 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Monday, 3 March 2014 at 21:44:16 UTC, Christof Schardt wrote:
 I'm evaluating D and trying to write a binary io class.
 I got stuck with strings:

     void rw(ref string x)
     {
         if(_isWriting)
         {
             int size = x.length;
             _f.rawWrite((&size)[0..1]);
             _f.rawWrite(x);
         }
         else
         {
             int size;
             _f.rawRead((&size)[0..1]);

             ... what now?
         }
     }

 Writing is ok, but how do I read the bytes to the
 string x after having its size?


Assuming you're not expecting pre-allocation (which I infer from 
your choice of "ref string" instead of "char[]"), you could do 
this:

     void rw(ref string x)
     {
         if(_isWriting)
         {
             size_t size = x.length;
             _f.rawWrite((&size)[0..1]);
             _f.rawWrite(x);
         }
         else
         {
             size_t size;
             _f.rawRead((&size)[0..1]);
             auto tmp = new char[size];
             _f.rawRead(tmp);
             import std.exception : assumeUnique;
             x = tmp.assumeUnique;
         }
     }
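For completeness, here is a round trip written as free functions. It is only a sketch, and it rests on a few assumptions that are not from the thread: the names writeString and readString are hypothetical, the length prefix is a fixed-width uint (size_t differs between 32-bit and 64-bit builds, so such files would not be portable), the prefix is written in host byte order, and the empty string is special-cased because rawRead throws on an empty buffer in the Phobos versions I know.

```d
import std.exception : assumeUnique, enforce;
import std.stdio : File;

// Hypothetical helper: write a 32-bit length prefix followed by the bytes.
void writeString(File f, string s)
{
    enforce(s.length <= uint.max, "string too long for a 32-bit prefix");
    uint size = cast(uint) s.length;   // fixed width, unlike size_t
    f.rawWrite((&size)[0 .. 1]);
    f.rawWrite(s);
}

// Hypothetical helper: read back what writeString produced.
string readString(File f)
{
    uint[1] size;
    enforce(f.rawRead(size[]).length == 1, "missing length prefix");
    if (size[0] == 0)
        return "";                     // avoid calling rawRead with an empty buffer
    auto tmp = new char[size[0]];
    enforce(f.rawRead(tmp).length == tmp.length, "truncated string");
    return assumeUnique(tmp);          // tmp is the only reference, so this is safe
}

void main()
{
    auto f = File.tmpfile();           // scratch file, deleted when it is closed
    writeString(f, "hello");
    writeString(f, "");
    writeString(f, "w\u00F6rld");
    f.rewind();

    assert(readString(f) == "hello");
    assert(readString(f) == "");
    assert(readString(f) == "w\u00F6rld");
}
```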

Mar 03 2014

"Christof Schardt" <csnews schardt.info> writes:

"John Colvin" <john.loughran.colvin gmail.com> wrote in message 
news:dyfkblqonigrtmkwtfjs forum.dlang.org...

Thanks, John, this works.

Though it feels a bit strange, that one has to do such trickery in order to
perform basic things like binary io of strings.

Mar 03 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Monday, 3 March 2014 at 22:22:06 UTC, Christof Schardt wrote:

 Thanks, John, this works.

 Though it feels a bit strange, that one has to do such trickery 
 in order to
 perform basic things like binary io of strings.

Doesn't seem like trickery to me; you just make a new array of 
the correct size and then fill it from the file. Is that not what 
you expected to do?

The only thing that is unusual is assumeUnique, but if you 
understand that string is an alias to immutable(char)[] then it 
should be apparent why it's there. You could just write "x = 
cast(string)tmp;" instead, it's the same.
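To make the relation between string, the cast and assumeUnique concrete, here is a small sketch (my own illustration, not part of the post above). One difference is worth knowing: assumeUnique takes its argument by reference and sets it to null, so no mutable alias to the now-immutable data is left behind, while a plain cast leaves the mutable slice usable.

```d
import std.exception : assumeUnique;

// string is just an alias for immutable(char)[].
static assert(is(string == immutable(char)[]));

void main()
{
    char[] a = "abc".dup;
    string viaCast = cast(string) a;     // no copy, no check
    assert(viaCast == "abc");
    assert(a !is null);                  // the mutable alias still exists

    char[] b = "abc".dup;
    string viaAssume = assumeUnique(b);  // no copy either
    assert(viaAssume == "abc");
    assert(b is null);                   // the mutable alias is gone
}
```

Neither form checks that the data really is unique; both rely on the programmer's promise.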

Mar 03 2014

"Christof Schardt" <csnews schardt.info> writes:

"John Colvin" <john.loughran.colvin gmail.com> wrote in message 
news:zjsykclxreagfhqsqpau forum.dlang.org...

 Doesn't seem like trickery to me; you just make a new array of the correct 
 size and then fill it from the file. Is that not what you expected to do?

 The only thing that is unusual is assumeUnique, but if you understand that 
 string is an alias to immutable(char)[] then it should be apparent why 
 it's there. You could just write "x = cast(string)tmp;" instead, it's the 
 same.

By "trickery" I meant having to know about things like
"import std.exception : assumeUnique" for this basic kind of task.

Anyway, since D has an incredible community, which answers questions
like mine within minutes, this is not really an obstacle.

Mar 03 2014

"bearophile" <bearophileHUGS lycos.com> writes:

Christof Schardt:

 By "trickery" I meant having to know about things like
 "import std.exception : assumeUnique" for this basic kind of 
 task.

Your function has signature (you use "ref" instead of "in" or 
"out" because it performs read/write):

void rw(ref string x)

A string is an immutable(char)[], that is, a dynamic array of 
immutable (UTF-8) chars. In D a dynamic array is a struct (so 
it's a value) that contains the length of the string (here in 
multiples of char.sizeof, that is, in bytes) and a pointer to the 
actual string data. Your function gets a string by reference, so 
it receives a pointer to a mutable struct that points to immutable chars.
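A small sketch of what that means in practice (my own illustration; rebind is a hypothetical name, and the two-word layout is what the current compilers use):

```d
// Receives the caller's (length, pointer) pair by reference and replaces it.
void rebind(ref string x)
{
    x = "new";
}

void main()
{
    // A slice is two machine words: a length and a pointer.
    static assert(string.sizeof == size_t.sizeof + (char*).sizeof);
    static assert(is(typeof("".ptr) == immutable(char)*));

    string s = "h\u00E9llo";
    assert(s.length == 6);       // UTF-8 code units (bytes), not characters

    string t = s;                // copies the two words, not the characters
    assert(t.ptr is s.ptr);

    rebind(s);                   // s now refers to other data...
    assert(s == "new");
    assert(t == "h\u00E9llo");   // ...while t still sees the old characters

    // s[0] = 'x';               // would not compile: the chars are immutable
}
```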

The else branch suggested by John Colvin was:

         else
         {
             size_t size;
             _f.rawRead((&size)[0..1]);
             auto tmp = new char[size];
             _f.rawRead(tmp);
             import std.exception : assumeUnique;
             x = tmp.assumeUnique;
         }

This allocates a GC-managed dynamic array of chars (the buffer 
tmp) and loads the data into it:

auto tmp = new char[size];
_f.rawRead(tmp);

Now you can't just perform:

x = tmp;

D manages the reference to the dynamic array x automatically, so x 
can be treated as an ordinary dynamic array. But the types 
differ: x refers to an immutable(char)[] while tmp is a char[]. 
In general you can't implicitly convert mutable data with 
indirections to immutable data with indirections (or the reverse), 
because this would break the assumptions immutability is based on 
(while in D you can assign a char[] to a const(char)[] variable; 
that's the difference between const and immutable). So the 
"trickery" comes from satisfying the strong typing of D. It's the 
price you have to pay for safety and (in theory) some improvements 
in concurrent code.
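The conversion rules described above can be checked at compile time. A minimal sketch (my own illustration, not code from the thread):

```d
// Implicit conversions the compiler accepts or rejects:
static assert( is(char[] : const(char)[]));  // mutable -> const: fine
static assert( is(string : const(char)[]));  // immutable -> const: fine
static assert(!is(char[] : string));         // mutable -> immutable: rejected
static assert(!is(string : char[]));         // immutable -> mutable: rejected

void main()
{
    char[] m = "abc".dup;
    const(char)[] c = m;   // a read-only view of the same memory
    string s = m.idup;     // an independent immutable copy

    m[0] = 'X';
    assert(c == "Xbc");    // const data may still change through m
    assert(s == "abc");    // immutable data never changes
}
```

If the extra copy is acceptable, .idup is the simplest safe way to get a string from a char[] without any cast.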

assumeUnique is essentially a better documented cast, that 
converts mutable to immutable. It's similar to cast(immutable). D 
doesn't have uniqueness typing so in many cases the D compiler is 
not able to infer the uniqueness of data for you (and unique data 
can be implicitly converted to immutable). But the situation on 
this is improving (this is already partially implemented and 
merged, and will be present in D 2.066: 
http://wiki.dlang.org/DIP29 ).

When the function you are calling is pure (unlike rawRead) you 
don't need assumeUnique:

import std.exception: assumeUnique;

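// Note: an "out" parameter would reset s to null on entry,
// so the slice is taken by value here and filled in place.
void foo(char[] s) pure {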
     foreach (immutable i, ref c; s)
         c = cast(char)i;
}

// Using assumeUnique:
void bar1(ref string s) {
     auto tmp = new char[10];
     foo(tmp);
     s = tmp.assumeUnique;
}

// Using the D type system:
void bar2(ref string s) {
     static string local() pure {
         auto tmp = new char[10];
         foo(tmp);
         return tmp;
     }
     s = local;
}

void main() {}

Bye,
bearophile

Mar 03 2014

"Christof Schardt" <csnews schardt.info> writes:

"bearophile" <bearophileHUGS lycos.com> wrote in message 
news:qqcdemwimcylaizjyhfg forum.dlang.org...

Great, thanks for this insight.
Christof

Mar 03 2014
