digitalmars.D.learn - more OO way to do hex string to bytes conversion


Ralph Doncaster <nerdralph github.com> writes:

I've been reading std.conv and std.range, trying to figure out a 
high-level way of converting a hex string to bytes.  The only way 
I've been able to do it is through pointer access:

import std.stdio;
import std.string;
import std.conv;

void main()
{
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
}


While it works, I'm wondering if there is a more object-oriented 
way of doing it in D.

Feb 06 2018

ag0aep6g <anonymous example.com> writes:

On 02/06/2018 07:33 PM, Ralph Doncaster wrote:
 I've been reading std.conv and std.range, trying to figure out a 
 high-level way of converting a hex string to bytes.  The only way I've 
 been able to do it is through pointer access:
 
 import std.stdio;
 import std.string;
 import std.conv;
 
 void main()
 {
      immutable char* hex = "deadbeef".toStringz;
      for (auto i=0; hex[i]; i += 2)
          writeln(to!byte(hex[i]));
 }
 
 
 While it works, I'm wondering if there is a more object-oriented way of 
 doing it in D.

I don't think that works as you intend. Your code is taking the numeric 
value of every other character.

But you want 0xDE, 0xAD, 0xBE, 0xEF, no? Here's one way to do that, but 
I don't think it qualifies as object oriented (but I'm also not sure how 
an object oriented solution is supposed to look):

----
void main()
{
     import std.algorithm: map;
     import std.conv: to;
     import std.range: chunks;
     import std.stdio;
     foreach (b; "deadbeef".chunks(2).map!(chars => chars.to!ubyte(16)))
     {
         writefln("%2X", b);
     }
}
----
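To make the difference concrete, here is a small sketch (not part of the original post) that checks both readings of "deadbeef" side by side. It uses `stride` to imitate the `i += 2` pointer loop; the asserts state what each approach yields.

```d
import std.algorithm : equal, map;
import std.conv : to;
import std.range : chunks, stride;

void main()
{
    // What the pointer loop computes: the character code of every
    // other character ('d', 'a', 'b', 'e'), not a parsed hex value.
    auto codes = "deadbeef".stride(2).map!(c => cast(int) c);
    assert(codes.equal([100, 97, 98, 101]));

    // What was wanted: each pair of hex digits parsed as one byte.
    auto bytes = "deadbeef".chunks(2).map!(pair => pair.to!ubyte(16));
    assert(bytes.equal([0xDE, 0xAD, 0xBE, 0xEF]));
}
```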

Feb 06 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via
Digitalmars-d-learn wrote:
 I've been reading std.conv and std.range, trying to figure out a
 high-level way of converting a hex string to bytes.  The only way I've
 been able to do it is through pointer access:
 
 import std.stdio;
 import std.string;
 import std.conv;
 
 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }
 
 
 While it works, I'm wondering if there is a more object-oriented way
 of doing it in D.

OO is outdated.  D uses the range-based idiom with UFCS for chaining
operations in a way that doesn't require you to write loops yourself.
For example:

	import std.array;
	import std.algorithm;
	import std.conv;
	import std.range;
	import std.stdio;	// needed for writefln below

	// No need to use .toStringz unless you're interfacing with C
	auto hex = "deadbeef";	// let compiler infer the type for you

	auto bytes = hex.chunks(2)	// lazily iterate over `hex` by digit pairs
	   .map!(s => s.to!ubyte(16))	// convert each pair to a ubyte
	   .array;			// make an array out of it

	// Do whatever you wish with the ubyte[] array.
	writefln("%(%02X %)", bytes);

If you want a reusable way to convert a hex string to bytes, you could
do something like this:

	import std.array;
	import std.algorithm;
	import std.conv;
	import std.range;

	ubyte[] hexToBytes(string hex)
	{
		return hex.chunks(2)
		          .map!(s => s.to!ubyte(16))
			  .array;
	}

Of course, this eagerly constructs an array to store the result, which
allocates, and also requires the hex string to be fully constructed
first.  You can make this code lazy by turning it into a range
algorithm, then you can actually generate the hex digits lazily from
somewhere else, and process the output bytes as they are generated, no
allocation necessary:

	/* Run this example by putting this in a file called 'test.d'
	 * and invoking `dmd -unittest -main -run test.d`
	 */
	import std.array;
	import std.algorithm;
	import std.conv;
	import std.format;
	import std.range;
	import std.stdio;

	auto hexToBytes(R)(R hex)
		if (isInputRange!R && is(ElementType!R : dchar))
	{
		return hex.chunks(2)
		          .map!(s => s.to!ubyte(16));
	}

	unittest
	{
		// Infinite stream of hex digits
		auto digits = "0123456789abcdef".cycle;

		digits.take(100)	// take the first 100 digits
		      .hexToBytes	// turn them into bytes
		      .map!(b => format("%02X", b)) // print in uppercase
		      .joiner(" ")	// nicely delimit bytes with spaces 
		      .chain("\n")	// end with a nice newline
		      .copy(stdout.lockingTextWriter);
		      			// write output directly to stdout
	}


T

-- 
Designer clothes: how to cover less by paying more.

Feb 06 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/6/18 1:46 PM, H. S. Teoh wrote:
 Of course, this eagerly constructs an array to store the result, which
 allocates, and also requires the hex string to be fully constructed
 first.  You can make this code lazy by turning it into a range
 algorithm, then you can actually generate the hex digits lazily from
 somewhere else, and process the output bytes as they are generated, no
 allocation necessary:
 
 	/* Run this example by putting this in a file called 'test.d'
 	 * and invoking `dmd -unittest -main -run test.d`
 	 */
 	import std.array;
 	import std.algorithm;
 	import std.conv;
 	import std.format;
 	import std.range;
 	import std.stdio;
 
 	auto hexToBytes(R)(R hex)
 		if (isInputRange!R && is(ElementType!R : dchar))
 	{
 		return hex.chunks(2)
 		          .map!(s => s.to!ubyte(16));
 	}
 
 	unittest
 	{
 		// Infinite stream of hex digits
 		auto digits = "0123456789abcdef".cycle;
 
 		digits.take(100)	// take the first 100 digits
 		      .hexToBytes	// turn them into bytes
 		      .map!(b => format("%02X", b)) // print in uppercase
 		      .joiner(" ")	// nicely delimit bytes with spaces
 		      .chain("\n")	// end with a nice newline
 		      .copy(stdout.lockingTextWriter);
 		      			// write output directly to stdout

Hm... format in a loop? That returns strings, and allocates. Yuck! ;)

writefln("%(%02X %)", digits.take(100).hexToBytes);
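A quick way to see what the `%( ... %)` range specifier produces is to run the same format string through `format` and compare the result. This is only a sketch for checking the output; `format` itself returns a newly allocated string, so `writefln` remains the better fit when the goal is to print without building one.

```d
import std.algorithm : map;
import std.conv : to;
import std.format : format;
import std.range : chunks;

void main()
{
    // A lazy range of bytes, same pipeline as in the thread.
    auto bytes = "deadbeef".chunks(2).map!(s => s.to!ubyte(16));

    // Inside %( ... %), "%02X" is applied to every element and the
    // trailing " " acts as a separator between elements.
    assert(format("%(%02X %)", bytes) == "DE AD BE EF");
}
```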

-Steve

Feb 06 2018

Craig Dillabaugh <craig.dillabaugh gmail.com> writes:

On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via 
 Digitalmars-d-learn wrote:

clip
 OO is outdated.  D uses the range-based idiom with UFCS for 
 chaining operations in a way that doesn't require you to write 
 loops yourself. For example:

 	import std.array;
 	import std.algorithm;
 	import std.conv;
 	import std.range;

 	// No need to use .toStringz unless you're interfacing with C
 	auto hex = "deadbeef";	// let compiler infer the type for you

 	auto bytes = hex.chunks(2)	// lazily iterate over `hex` by digit pairs
 	   .map!(s => s.to!ubyte(16))	// convert each pair to a ubyte
 	   .array;			// make an array out of it

 	// Do whatever you wish with the ubyte[] array.
 	writefln("%(%02X %)", bytes);

clip
 T

Wouldn't it be more accurate to say OO is not the correct tool 
for every job rather than it is "outdated"?  How would one write 
a GUI library with chains and CTFE?

Second, while 'auto' is nice, for learning examples I think 
putting the type there is actually more helpful to someone trying 
to understand what is happening. If you know the type, why not 
just write it ... it's not like using auto saves you any work in 
most cases. I understand that it's nice in templates and for 
ranges and the like, but for basic types I don't see any 
advantage to using it.
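For comparison, a sketch of the same snippet with the types spelled out (my own rewrite, not code from the thread). The `static assert` confirms that `auto` infers exactly the type written by hand, so the choice here is purely about readability.

```d
import std.algorithm : map;
import std.array : array;
import std.conv : to;
import std.range : chunks;

void main()
{
    // Explicit types, as suggested for learning examples.
    string hex = "deadbeef";
    ubyte[] bytes = hex.chunks(2).map!(s => s.to!ubyte(16)).array;

    // The same expression with the type left to the compiler.
    auto inferred = hex.chunks(2).map!(s => s.to!ubyte(16)).array;

    static assert(is(typeof(inferred) == ubyte[]));
    assert(bytes == inferred);
}
```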

Feb 06 2018

rikki cattermole <rikki cattermole.co.nz> writes:

On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via 
 Digitalmars-d-learn wrote:

 clip
 OO is outdated.  D uses the range-based idiom with UFCS for chaining 
 operations in a way that doesn't require you to write loops yourself. 
 For example:

     import std.array;
     import std.algorithm;
     import std.conv;
     import std.range;

     // No need to use .toStringz unless you're interfacing with C
     auto hex = "deadbeef";    // let compiler infer the type for you

     auto bytes = hex.chunks(2)    // lazily iterate over `hex` by digit pairs
        .map!(s => s.to!ubyte(16))    // convert each pair to a ubyte
        .array;            // make an array out of it

     // Do whatever you wish with the ubyte[] array.
     writefln("%(%02X %)", bytes);

 clip
 T

 
 Wouldn't it be more accurate to say OO is not the correct tool for every 
 job rather than it is "outdated".  How would one write a GUI library 
 with chains and CTFE?

But you could with signatures and structs instead ;)

Feb 06 2018

Craig Dillabaugh <craig.dillabaugh gmail.com> writes:

On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole 
wrote:
 On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 [...]

 clip
[...]

 clip
 [...]

 
 Wouldn't it be more accurate to say OO is not the correct tool 
 for every job rather than it is "outdated".  How would one 
 write a GUI library with chains and CTFE?

 But you could with signatures and structs instead ;)

I am not sure how this would work ... would this actually be a 
good idea, or are you just saying that technically it would be 
possible?

Feb 06 2018

rikki cattermole <rikki cattermole.co.nz> writes:

On 07/02/2018 4:06 AM, Craig Dillabaugh wrote:
 On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole wrote:
 On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
 On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
 [...]

 clip
 [...]

 clip
 [...]

 Wouldn't it be more accurate to say OO is not the correct tool for 
 every job rather than it is "outdated".  How would one write a GUI 
 library with chains and CTFE?

 But you could with signatures and structs instead ;)

 
 I am not sure how this would work ... would this actually be a good 
 idea, or are you just saying that technically it would be possible?

A very good idea :)

WIP: https://github.com/rikkimax/DIPs/blob/master/DIPs/DIP1xxx-RC.md
https://github.com/rikkimax/stdc-signatures/tree/master/stdc

Feb 06 2018

Machin <machgyl zbor.ue> writes:

On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }


 While it works, I'm wondering if there is a more 
 object-oriented way of doing it in D.

converting data has nothing to do with OOP.

In D we write like that:

```
import std.range : chunks;             // lazily consumes two by two
import std.algorithm.iteration : map;  // applies a func to the chunks
import std.conv : to;                  // the func: convert with a custom base
import std.array : array;              // renders the whole thing into an array
ubyte[] a = "deadbeef".chunks(2).map!(s => s.to!ubyte(16)).array;
```

Feb 06 2018

Ralph Doncaster <nerdralph github.com> writes:

On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }

Thanks for all the feedback.  I'll have to do some more reading 
about maps.  My initial thought is they don't seem as readable as 
loops.
The chunks() is useful, so for now what I'm going with is:
     ubyte[] arr;
     foreach (b; "deadbeef".chunks(2))
     {
         arr ~= b.to!ubyte(16);
     }

Feb 06 2018

Ali Çehreli <acehreli yahoo.com> writes:

On 02/06/2018 11:55 AM, Ralph Doncaster wrote:

 I'll have to do some more reading about
 maps.  My initial thought is they don't seem as readable as loops.

Surprisingly, they may be very easy to read in some situations.

 The chunks() is useful, so for now what I'm going with is:
      ubyte[] arr;
      foreach (b; "deadbeef".chunks(2))
      {
          arr ~= b.to!ubyte(16);
      }

That is great but it has two issues that may be important in some programs:

1) It makes multiple allocations as the array grows

2) You need another loop to do the actual work later on

If you needed to go through the elements just once, the memory 
allocations would be wasted. Instead, you can solve both issues with a 
chained expression like the ones that have been shown by others.
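As a minimal sketch of that single-pass case (the running total is just a stand-in for "the actual work"), the bytes can be consumed straight from the lazy range, so no ubyte[] is ever built:

```d
import std.algorithm : map;
import std.conv : to;
import std.range : chunks;

void main()
{
    uint total = 0;

    // Each byte is parsed on demand as the loop advances; there is
    // no intermediate array and no second loop.
    foreach (b; "deadbeef".chunks(2).map!(pair => pair.to!ubyte(16)))
        total += b;

    assert(total == 0xDE + 0xAD + 0xBE + 0xEF);
}
```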

The cool thing is, you can always hide the ugly bits in a function. 
Starting with ag0aep6g's code, I went overboard to pick a better 
destination type (other than ubyte) to support different chunk sizes:

void main()
{
     import std.stdio;
     foreach (b; "deadbeef".hexValues)
     {
         writefln("%2X", b);
     }

     // Works for different sized chunks as well:
     writeln("12345678".hexValues!4);
}

auto hexValues(size_t digits = 2, ToType = DefaultTypeForSize!digits)(string s) {
     import std.algorithm: map;
     import std.conv: to;
     import std.range: chunks;

     return s.chunks(digits).map!(chars => chars.to!ToType(16));
}

template DefaultTypeForSize(size_t s) {
     static if (s == 1) {
         alias DefaultTypeForSize = ubyte;
     } else static if (s == 2) {
         alias DefaultTypeForSize = ushort;
     } else static if (s == 4) {
         alias DefaultTypeForSize = uint;
     } else static if (s == 8) {
         alias DefaultTypeForSize = ulong;
     } else {
         import std.string : format;

         static assert(false, format("There is no default %s-byte type", s));
     }
}
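Note that the template above is keyed on the chunk size as if it were a byte count, so two hex digits default to ushort even though they fit in a ubyte. A possible variant, sketched here under the assumption that the smallest type that holds the digits is wanted, maps the digit count directly. `TypeForHexDigits` is a hypothetical name introduced for this sketch, not something from Phobos or from the code above.

```d
import std.algorithm : equal, map;
import std.conv : to;
import std.range : chunks;

// Hypothetical helper: the smallest unsigned type that can hold
// `digits` hexadecimal digits (each digit is 4 bits).
template TypeForHexDigits(size_t digits)
{
    static if (digits >= 1 && digits <= 2)
        alias TypeForHexDigits = ubyte;
    else static if (digits >= 3 && digits <= 4)
        alias TypeForHexDigits = ushort;
    else static if (digits >= 5 && digits <= 8)
        alias TypeForHexDigits = uint;
    else static if (digits >= 9 && digits <= 16)
        alias TypeForHexDigits = ulong;
    else
        static assert(false, "digits must be between 1 and 16");
}

auto hexValues(size_t digits = 2, ToType = TypeForHexDigits!digits)(string s)
{
    return s.chunks(digits).map!(chars => chars.to!ToType(16));
}

void main()
{
    static assert(is(TypeForHexDigits!2 == ubyte));
    static assert(is(TypeForHexDigits!4 == ushort));

    assert("deadbeef".hexValues.equal([0xDE, 0xAD, 0xBE, 0xEF]));
    assert("12345678".hexValues!4.equal([0x1234, 0x5678]));
}
```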

Ali

Feb 06 2018

Ralph Doncaster <nerdralph github.com> writes:

On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster 
wrote:
 I've been reading std.conv and std.range, trying to figure out 
 a high-level way of converting a hex string to bytes.  The only 
 way I've been able to do it is through pointer access:

 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
     immutable char* hex = "deadbeef".toStringz;
     for (auto i=0; hex[i]; i += 2)
         writeln(to!byte(hex[i]));
 }


 While it works, I'm wondering if there is a more 
 object-oriented way of doing it in D.

After a bunch of searching, I came across hex string literals.  
They are mentioned but not documented as a literal.
https://dlang.org/spec/lex.html#string_literals

Combined with the toHexString function in std.digest, it is easy 
to convert between hex strings and byte arrays.

import std.stdio;
import std.digest;

void main() {
     auto data = cast(ubyte[]) x"deadbeef";
     writeln("data: 0x", toHexString(data));
}

p.s. the cast should probably be to immutable ubyte[].  I'm 
guessing without it, there is an automatic copy of the data being 
made.

Feb 07 2018

Adam D. Ruppe <destructionator gmail.com> writes:

On Wednesday, 7 February 2018 at 14:47:04 UTC, Ralph Doncaster 
wrote:
 p.s. the cast should probably be to immutable ubyte[].  I'm 
 guessing without it, there is an automatic copy of the data 
 being made.

No copy - you just get undefined behavior if you actually try to 
modify it!
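A sketch of the safer spelling: keep `immutable` in the cast so the compiler rejects writes, and `.dup` when a mutable copy is really needed. This sketch uses `std.conv.hexString` instead of the `x"..."` literal only so that it does not depend on how a particular compiler release treats hex string literals; that substitution is my choice, not something proposed in the thread.

```d
import std.conv : hexString;
import std.digest : toHexString;

void main()
{
    // Casting to immutable(ubyte)[] keeps the read-only guarantee of
    // the literal, so accidental writes fail at compile time.
    immutable(ubyte)[] data = cast(immutable(ubyte)[]) hexString!"deadbeef";
    static assert(!__traits(compiles, data[0] = 0));

    // An explicit copy is the well-defined way to get mutable bytes.
    ubyte[] copy = data.dup;
    copy[0] = 0xFF;

    assert(toHexString(data) == "DEADBEEF");
    assert(toHexString(copy) == "FFADBEEF");
}
```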

Feb 07 2018
