digitalmars.D.learn - Reading a string in binary mode

"Christof Schardt" <csnews schardt.info> writes:
I'm evaluating D and trying to write a binary I/O class.
I got stuck with strings:

    void rw(ref string x)
    {
        if(_isWriting)
        {
            int size = x.length;
            _f.rawWrite((&size)[0..1]);
            _f.rawWrite(x);
        }
        else
        {
            int size;
            _f.rawRead((&size)[0..1]);

            ... what now?
        }
    }

Writing is ok, but how do I read the bytes to the
string x after having its size?
Mar 03 2014
Ali Çehreli <acehreli yahoo.com> writes:
On 03/03/2014 01:44 PM, Christof Schardt wrote:

>           int size;
>           _f.rawRead((&size)[0..1]);
>
>           ... what now?

You need to have a buffer of 'size'. Not tested:

    auto s = new char[size];
    s = _f.rawRead(s);
    x = s;

However, the last line will not compile due to the difference in mutability. So you will need to do something like this:

    import std.exception : assumeUnique;
    x = assumeUnique(s);

Ali
Mar 03 2014
"Christof Schardt" <csnews schardt.info> writes:
"Ali Çehreli" <acehreli yahoo.com> wrote in message
news:lf2ude$1njf$1 digitalmars.com...
Thanks, Ali, this works.

BTW: will your excellent book be equipped with a TOC and an index? I find it hard to look up answers to questions like the one above in all the D docs I have (dpl.org, TDPL, your book).
Mar 03 2014
Ali Çehreli <acehreli yahoo.com> writes:
On 03/03/2014 02:25 PM, Christof Schardt wrote:

> Thanks, Ali, this works.

Yay! :)

> book be equipped with a TOC and an index?

Yes, all of that will happen after I get back to working on the book and its ever increasing list of to-dos. :)

Ali
Mar 03 2014
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 3 March 2014 at 21:44:16 UTC, Christof Schardt wrote:
Assuming you're not expecting pre-allocation (which I infer from your choice of "ref string" instead of "char[]"), you could do this:
     void rw(ref string x)
     {
         if(_isWriting)
         {
             size_t size = x.length;
             _f.rawWrite((&size)[0..1]);
             _f.rawWrite(x);
         }
         else
         {
             size_t size;
             _f.rawRead((&size)[0..1]);
             auto tmp = new char[size];
             _f.rawRead(tmp);
             import std.exception : assumeUnique;
             x = tmp.assumeUnique;
         }
     }
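
For reference, a self-contained round trip along these lines might look like the sketch below. The `BinaryIO` struct, its constructor and the file name `rw_demo.bin` are assumptions made for illustration, since the original class is not shown in the thread. The sketch also stores the length as a fixed-width `uint` rather than `size_t`, so the file layout does not depend on the platform's pointer size, and it skips the second read for empty strings because `rawRead` rejects an empty buffer.

```d
import std.conv : to;
import std.exception : assumeUnique, enforce;
import std.file : remove;
import std.stdio : File;

// Hypothetical stand-in for the binary I/O class discussed in this thread.
struct BinaryIO
{
    private File _f;
    private bool _isWriting;

    this(string path, bool isWriting)
    {
        _isWriting = isWriting;
        _f = File(path, isWriting ? "wb" : "rb");
    }

    void close() { _f.close(); }

    void rw(ref string x)
    {
        if (_isWriting)
        {
            // Fixed-width length prefix; to!uint throws if the string is too long.
            uint size = x.length.to!uint;
            _f.rawWrite((&size)[0 .. 1]);
            _f.rawWrite(x);
        }
        else
        {
            uint size;
            enforce(_f.rawRead((&size)[0 .. 1]).length == 1, "missing length prefix");
            auto tmp = new char[size];
            // rawRead returns the slice it actually filled, so a shorter
            // slice means the file ended early.
            if (size > 0)
                enforce(_f.rawRead(tmp).length == size, "truncated string data");
            // No other reference to tmp survives this function.
            x = tmp.assumeUnique;
        }
    }
}

void main()
{
    enum path = "rw_demo.bin"; // assumed scratch file name
    string original = "héllo, D";

    auto w = BinaryIO(path, true);
    w.rw(original);
    w.close();

    string restored;
    auto r = BinaryIO(path, false);
    r.rw(restored);
    r.close();
    remove(path);

    assert(restored == original);
}
```

The length is counted in bytes (UTF-8 code units), not characters, which is why the non-ASCII string above round-trips unchanged.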
Mar 03 2014
"Christof Schardt" <csnews schardt.info> writes:
"John Colvin" <john.loughran.colvin gmail.com> wrote in message
news:dyfkblqonigrtmkwtfjs forum.dlang.org...
Thanks, John, this works. Though it feels a bit strange that one has to resort to such trickery to do basic things like binary I/O of strings.
Mar 03 2014
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 3 March 2014 at 22:22:06 UTC, Christof Schardt wrote:
> Thanks, John, this works. Though it feels a bit strange, that one has to do such trickery in order to perform basic things like binary io of strings.

Doesn't seem like trickery to me; you just make a new array of the correct size and then fill it from the file. Is that not what you expected to do?

The only thing that is unusual is assumeUnique, but if you understand that string is an alias to immutable(char)[] then it should be apparent why it's there. You could just write "x = cast(string)tmp;" instead, it's the same.
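
The points above can be checked directly. The following is only an illustrative sketch: the helper `fillBuffer` is made up here to stand in for "fill a buffer from the file". It shows that `string` and `immutable(char)[]` are the same type, that a plain assignment from `char[]` is rejected, and that both the cast and `assumeUnique` reuse the buffer's memory without copying. One practical difference is that `assumeUnique` also sets its argument to null, while the cast leaves the mutable alias in place.

```d
import std.exception : assumeUnique;

// string is just an alias, so the compiler treats the two types as identical.
static assert(is(string == immutable(char)[]));

// Hypothetical helper standing in for "fill a buffer from a file".
char[] fillBuffer(size_t n)
{
    auto tmp = new char[n];
    tmp[] = 'x';
    return tmp;
}

void main()
{
    char[] a = fillBuffer(3);
    char[] b = fillBuffer(3);

    // A plain assignment is rejected because char[] is not immutable(char)[].
    static assert(!__traits(compiles, { string s = a; }));

    // Remember where the buffers live so the results can be compared.
    const(char)* pa = a.ptr;
    const(char)* pb = b.ptr;

    string viaCast = cast(string) a;
    string viaAssume = assumeUnique(b);

    assert(viaCast == "xxx" && viaAssume == "xxx");
    assert(viaCast.ptr is pa && viaAssume.ptr is pb); // no copy in either case
    assert(b is null);  // assumeUnique cleared its argument
    assert(a !is null); // the cast left the mutable alias alive

    // If the mutable buffer must stay usable, .idup makes an independent copy.
    string copy = a.idup;
    assert(copy.ptr !is pa);
}
```

Either way, the buffer must not be modified through a remaining mutable reference once it has been handed out as a string.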
Mar 03 2014
"Christof Schardt" <csnews schardt.info> writes:
"John Colvin" <john.loughran.colvin gmail.com> wrote in message
news:zjsykclxreagfhqsqpau forum.dlang.org...
By "trickery" I meant having to know about things like "import std.exception : assumeUnique" for this basic kind of task. Anyway, since D has an incredible community, which answers questions like mine within minutes, this is not really an obstacle.
Mar 03 2014
"bearophile" <bearophileHUGS lycos.com> writes:
Christof Schardt:

 By "trickery" I meant having to know about things like
 "import std.exception : assumeUnique" for this basic kind of 
 task.
Your function has this signature (you use "ref" instead of "in" or "out" because it performs both reading and writing):

    void rw(ref string x)

A string is an immutable(char)[], that is, a dynamic array of immutable (UTF-8) chars. In D a dynamic array is a struct (so it's a value) that contains the length of the string (here in multiples of char.sizeof, that is, bytes) and a pointer to the actual string data. Your function gets a string by reference, so it receives a pointer to a mutable struct that points to immutable chars.

The else branch suggested by John Colvin was:
         else
         {
             size_t size;
             _f.rawRead((&size)[0..1]);
             auto tmp = new char[size];
             _f.rawRead(tmp);
             import std.exception : assumeUnique;
             x = tmp.assumeUnique;
         }
This allocates a GC-managed dynamic array of chars (the buffer tmp), and loads the data into it:

    auto tmp = new char[size];
    _f.rawRead(tmp);

Now you can't just perform:

    x = tmp;

D manages the pointer to the dynamic array x automatically, so x can be seen as a dynamic array. But their types are different: x refers to an immutable(char)[] while tmp is a char[]. In general you can't implicitly convert mutable data with indirections to immutable data with indirections (or the other way around), because this breaks the assumptions immutability is based on (while in D you can assign a char[] to a const(char)[] variable; that's the difference between const and immutable).

So the "trickery" comes from satisfying the strong typing of D. It's the price you have to pay for safety and (in theory) a bit of improvement in concurrent code.

assumeUnique is essentially a better documented cast that converts mutable to immutable. It's similar to cast(immutable).

D doesn't have uniqueness typing, so in many cases the D compiler is not able to infer the uniqueness of data for you (and unique data can be implicitly converted to immutable). But the situation on this is improving (this is already partially implemented and merged, and will be present in D 2.066: http://wiki.dlang.org/DIP29 ).

When the function you are calling is pure (unlike rawRead) you don't need assumeUnique:

    import std.exception: assumeUnique;

    // Takes the slice by value: an "out" parameter would be
    // reset to null on entry, so nothing would be filled.
    void foo(char[] s) pure {
        foreach (immutable i, ref c; s)
            c = cast(char)i;
    }

    // Using assumeUnique:
    void bar1(ref string s) {
        auto tmp = new char[10];
        foo(tmp);
        s = tmp.assumeUnique;
    }

    // Using the D type system:
    void bar2(ref string s) {
        static string local() pure {
            auto tmp = new char[10];
            foo(tmp);
            return tmp;
        }
        s = local;
    }

    void main() {}

Bye,
bearophile
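
As a small illustration of the const/immutable distinction and of the conversion that pure functions allow, consider the sketch below. The function names `countSpaces` and `repeatChar` are invented for this example. It assumes a compiler that accepts the result of a pure function with no mutable-indirection parameters as implicitly convertible to immutable at the call site.

```d
// Accepts mutable, const and immutable character arrays alike.
size_t countSpaces(const(char)[] text)
{
    size_t n;
    foreach (c; text)
        if (c == ' ')
            ++n;
    return n;
}

// Pure, and no parameter has mutable indirections, so the returned
// array cannot be aliased by anything the caller already holds.
char[] repeatChar(char c, size_t n) pure
{
    auto tmp = new char[n];
    tmp[] = c;
    return tmp;
}

void main()
{
    char[] buf = "a b c".dup;
    string lit = "d e";

    // const is the common ground between mutable and immutable.
    assert(countSpaces(buf) == 2);
    assert(countSpaces(lit) == 1);

    // Neither direction converts implicitly, but both convert to const.
    static assert(!__traits(compiles, { string s = buf; }));
    static assert(!__traits(compiles, { char[] m = lit; }));
    static assert(__traits(compiles, {
        const(char)[] c1 = buf;
        const(char)[] c2 = lit;
    }));

    // The result of the pure call is unique, so no cast is needed here.
    string s = repeatChar('z', 4);
    assert(s == "zzzz");
}
```

This is why a reading function usually takes const(char)[] when it only inspects the data, and why only the step that hands out freshly filled memory as a string needs assumeUnique or a pure helper.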
Mar 03 2014
"Christof Schardt" <csnews schardt.info> writes:
"bearophile" <bearophileHUGS lycos.com> wrote in message
news:qqcdemwimcylaizjyhfg forum.dlang.org...
Great, thanks for this insight. Christof
Mar 03 2014