digitalmars.D.learn - more OO way to do hex string to bytes conversion
- Ralph Doncaster (14/14) Feb 06 2018 I've been reading std.conv and std.range, trying to figure out a
- ag0aep6g (19/37) Feb 06 2018 I don't think that works as you intend. Your code is taking the numeric
- H. S. Teoh (63/81) Feb 06 2018 OO is outdated. D uses the range-based idiom with UFCS for chaining
- Steven Schveighoffer (4/40) Feb 06 2018 Hm... format in a loop? That returns strings, and allocates. Yuck! ;)
- Craig Dillabaugh (13/31) Feb 06 2018 clip
- rikki cattermole (2/33) Feb 06 2018 But you could with signatures and structs instead ;)
- Craig Dillabaugh (5/17) Feb 06 2018 I am not sure how this would work ... would this actually be a
- rikki cattermole (4/21) Feb 06 2018 A very good idea :)
- Machin (14/28) Feb 06 2018 converting data has nothing to do with OOP.
- Ralph Doncaster (11/23) Feb 06 2018 Thanks for all the feedback. I'll have to do some more reading
- Ali Çehreli (44/52) Feb 06 2018 That is great but it has two issues that may be important in some progra...
- Ralph Doncaster (16/30) Feb 07 2018 After a bunch of searching, I came across hex string literals.
- Adam D. Ruppe (4/7) Feb 07 2018 No copy - you just get undefined behavior if you actually try to
I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:

    import std.stdio;
    import std.string;
    import std.conv;

    void main()
    {
        immutable char* hex = "deadbeef".toStringz;
        for (auto i=0; hex[i]; i += 2)
            writeln(to!byte(hex[i]));
    }

While it works, I'm wondering if there is a more object-oriented way of doing it in D.
Feb 06 2018
On 02/06/2018 07:33 PM, Ralph Doncaster wrote:
> I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:
>
>     import std.stdio;
>     import std.string;
>     import std.conv;
>
>     void main()
>     {
>         immutable char* hex = "deadbeef".toStringz;
>         for (auto i=0; hex[i]; i += 2)
>             writeln(to!byte(hex[i]));
>     }
>
> While it works, I'm wondering if there is a more object-oriented way of doing it in D.

I don't think that works as you intend. Your code is taking the numeric value of every other character. But you want 0xDE, 0xAD, 0xBE, 0xEF, no?

Here's one way to do that, but I don't think it qualifies as object oriented (but I'm also not sure how an object oriented solution is supposed to look):

----
void main()
{
    import std.algorithm: map;
    import std.conv: to;
    import std.range: chunks;
    import std.stdio;
    foreach (b; "deadbeef".chunks(2).map!(chars => chars.to!ubyte(16)))
    {
        writefln("%2X", b);
    }
}
----
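To make the difference concrete, here is a minimal sketch that puts the two behaviours side by side. The function names `everyOtherCharCode` and `pairsAsBytes` are made up for this illustration and do not appear in the thread; the first mirrors the original loop using array indexing instead of a C pointer.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.conv : to;
import std.range : chunks;

// Hypothetical name: mirrors the original loop, which converts the
// character code of every other character.
byte[] everyOtherCharCode(string hex)
{
    byte[] result;
    for (size_t i = 0; i < hex.length; i += 2)
        result ~= hex[i].to!byte; // 'd' becomes 100 (its ASCII code), not 0xD
    return result;
}

// Hypothetical name: parses each two-digit pair as a base-16 number.
auto pairsAsBytes(string hex)
{
    return hex.chunks(2).map!(pair => pair.to!ubyte(16));
}

void main()
{
    // 'd', 'a', 'b', 'e' have the ASCII codes 100, 97, 98, 101.
    assert(everyOtherCharCode("deadbeef").equal([100, 97, 98, 101]));
    // The digit pairs "de", "ad", "be", "ef" parse to the intended bytes.
    assert(pairsAsBytes("deadbeef").equal([0xDE, 0xAD, 0xBE, 0xEF]));
}
```

The asserts pass silently, so the program prints nothing when both claims hold.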
Feb 06 2018
On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via Digitalmars-d-learn wrote:
> I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:
>
>     import std.stdio;
>     import std.string;
>     import std.conv;
>
>     void main()
>     {
>         immutable char* hex = "deadbeef".toStringz;
>         for (auto i=0; hex[i]; i += 2)
>             writeln(to!byte(hex[i]));
>     }
>
> While it works, I'm wondering if there is a more object-oriented way of doing it in D.

OO is outdated. D uses the range-based idiom with UFCS for chaining operations in a way that doesn't require you to write loops yourself. For example:

    import std.array;
    import std.algorithm;
    import std.conv;
    import std.range;
    import std.stdio;   // for writefln

    // No need to use .toStringz unless you're interfacing with C
    auto hex = "deadbeef";  // let compiler infer the type for you

    auto bytes = hex.chunks(2)  // lazily iterate over `hex` by digit pairs
        .map!(s => s.to!ubyte(16))  // convert each pair to a ubyte
        .array;                     // make an array out of it

    // Do whatever you wish with the ubyte[] array.
    writefln("%(%02X %)", bytes);

If you want a reusable way to convert a hex string to bytes, you could do something like this:

    import std.array;
    import std.algorithm;
    import std.conv;
    import std.range;

    ubyte[] hexToBytes(string hex)
    {
        return hex.chunks(2)
                  .map!(s => s.to!ubyte(16))
                  .array;
    }

Of course, this eagerly constructs an array to store the result, which allocates, and also requires the hex string to be fully constructed first.

You can make this code lazy by turning it into a range algorithm. Then you can generate the hex digits lazily from somewhere else and process the output bytes as they are generated, with no array allocation for the result:

    /* Run this example by putting this in a file called 'test.d'
     * and invoking `dmd -unittest -main -run test.d`
     */
    import std.array;
    import std.algorithm;
    import std.conv;
    import std.format;
    import std.range;
    import std.stdio;

    auto hexToBytes(R)(R hex)
        if (isInputRange!R && is(ElementType!R : dchar))
    {
        return hex.chunks(2)
                  .map!(s => s.to!ubyte(16));
    }

    unittest
    {
        // Infinite stream of hex digits
        auto digits = "0123456789abcdef".cycle;

        digits.take(100)                        // take the first 100 digits
              .hexToBytes                       // turn them into bytes
              .map!(b => format("%02X", b))     // print in uppercase
              .joiner(" ")                      // nicely delimit bytes with spaces
              .chain("\n")                      // end with a nice newline
              .copy(stdout.lockingTextWriter);  // write output directly to stdout
    }

T

-- 
Designer clothes: how to cover less by paying more.
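One thing the pipelines above do not show is what happens with malformed input. The following is only a sketch under stated assumptions: `hexToBytesStrict` is a hypothetical helper, not something proposed in the thread, and the even-length check is an added design choice. Without that check, `chunks(2)` would hand a trailing single digit to `to!ubyte`, so "abc" would quietly become 0xAB, 0x0C.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : map;
import std.array : array;
import std.conv : ConvException, to;
import std.exception : assertThrown, enforce;
import std.range : chunks;

// Hypothetical helper: eager conversion that rejects odd-length input.
ubyte[] hexToBytesStrict(string hex)
{
    // Assumption of this sketch: a dangling half-byte is an error.
    enforce(hex.length % 2 == 0,
            "hex string must contain an even number of digits");
    return hex.chunks(2)
              .map!(pair => pair.to!ubyte(16)) // throws ConvException on non-hex digits
              .array;
}

void main()
{
    assert(hexToBytesStrict("deadbeef").equal([0xDE, 0xAD, 0xBE, 0xEF]));
    assert(hexToBytesStrict("").length == 0);

    // Non-hex characters are rejected by std.conv itself.
    assertThrown!ConvException(hexToBytesStrict("zz"));
    // Odd-length input is rejected by the enforce check above.
    assertThrown!Exception(hexToBytesStrict("abc"));
}
```

With the lazy variant the same `ConvException` would surface only when the offending pair is actually consumed, which may be much later than the call that built the range.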
Feb 06 2018
On 2/6/18 1:46 PM, H. S. Teoh wrote:
> Of course, this eagerly constructs an array to store the result, which allocates, and also requires the hex string to be fully constructed first. You can make this code lazy by turning it into a range algorithm, then you can actually generate the hex digits lazily from somewhere else, and process the output bytes as they are generated, no allocation necessary:
>
>     /* Run this example by putting this in a file called 'test.d'
>      * and invoking `dmd -unittest -main -run test.d`
>      */
>     import std.array;
>     import std.algorithm;
>     import std.conv;
>     import std.format;
>     import std.range;
>     import std.stdio;
>
>     auto hexToBytes(R)(R hex)
>         if (isInputRange!R && is(ElementType!R : dchar))
>     {
>         return hex.chunks(2)
>                   .map!(s => s.to!ubyte(16));
>     }
>
>     unittest
>     {
>         // Infinite stream of hex digits
>         auto digits = "0123456789abcdef".cycle;
>
>         digits.take(100)                        // take the first 100 digits
>               .hexToBytes                       // turn them into bytes
>               .map!(b => format("%02X", b))     // print in uppercase
>               .joiner(" ")                      // nicely delimit bytes with spaces
>               .chain("\n")                      // end with a nice newline
>               .copy(stdout.lockingTextWriter);  // write output directly to stdout

Hm... format in a loop? That returns strings, and allocates. Yuck! ;)

    writefln("%(%02X %)", digits.take(100).hexToBytes);

-Steve
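For readers unfamiliar with the `%( ... %)` compound specifier used above, here is a small sketch of what it does. `toSpacedHex` is a hypothetical helper, and it uses `format` (which does allocate one result string) purely so the output can be checked with asserts; `writefln` with the same format string writes straight to stdout instead.

```d
import std.format : format;

// Hypothetical helper: render bytes as space-separated uppercase hex.
string toSpacedHex(const(ubyte)[] bytes)
{
    // %( ... %) applies the inner spec (%02X) to every element of the range.
    // The text after the inner spec (a single space here) acts as a
    // separator and is not emitted after the last element.
    return format("%(%02X %)", bytes);
}

void main()
{
    ubyte[] bytes = [0xDE, 0xAD, 0xBE, 0xEF];
    assert(toSpacedHex(bytes) == "DE AD BE EF");
    // %02X pads single-digit values with a leading zero.
    assert(toSpacedHex([0x0A]) == "0A");
}
```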
Feb 06 2018
On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
> On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via Digitalmars-d-learn wrote:
>> clip
>
> OO is outdated. D uses the range-based idiom with UFCS for chaining operations in a way that doesn't require you to write loops yourself. For example:
>
>     import std.array;
>     import std.algorithm;
>     import std.conv;
>     import std.range;
>
>     // No need to use .toStringz unless you're interfacing with C
>     auto hex = "deadbeef";  // let compiler infer the type for you
>
>     auto bytes = hex.chunks(2)  // lazily iterate over `hex` by digit pairs
>         .map!(s => s.to!ubyte(16))  // convert each pair to a ubyte
>         .array;                     // make an array out of it
>
>     // Do whatever you wish with the ubyte[] array.
>     writefln("%(%02X %)", bytes);
>
> clip
>
> T

Wouldn't it be more accurate to say OO is not the correct tool for every job, rather than that it is "outdated"? How would one write a GUI library with chains and CTFE?

Second, while 'auto' is nice, for learning examples I think putting the type there is actually more helpful to someone trying to understand what is happening. If you know the type, why not just write it ... it's not like using auto saves you any work in most cases. I understand that it's nice in templates and for ranges and the like, but for basic types I don't see any advantage to using it.
Feb 06 2018
On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
> On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
>> On Tue, Feb 06, 2018 at 06:33:02PM +0000, Ralph Doncaster via Digitalmars-d-learn wrote:
>>> clip
>>
>> OO is outdated. D uses the range-based idiom with UFCS for chaining operations in a way that doesn't require you to write loops yourself. For example:
>>
>>     import std.array;
>>     import std.algorithm;
>>     import std.conv;
>>     import std.range;
>>
>>     // No need to use .toStringz unless you're interfacing with C
>>     auto hex = "deadbeef";  // let compiler infer the type for you
>>
>>     auto bytes = hex.chunks(2)  // lazily iterate over `hex` by digit pairs
>>         .map!(s => s.to!ubyte(16))  // convert each pair to a ubyte
>>         .array;                     // make an array out of it
>>
>>     // Do whatever you wish with the ubyte[] array.
>>     writefln("%(%02X %)", bytes);
>>
>> clip
>>
>> T
>
> Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated". How would one write a GUI library with chains and CTFE?

But you could with signatures and structs instead ;)
Feb 06 2018
On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole wrote:
> On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
>> On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
>>> [...] clip [...] clip [...]
>>
>> Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated". How would one write a GUI library with chains and CTFE?
>
> But you could with signatures and structs instead ;)

I am not sure how this would work ... would this actually be a good idea, or are you just saying that technically it would be possible?
Feb 06 2018
On 07/02/2018 4:06 AM, Craig Dillabaugh wrote:
> On Wednesday, 7 February 2018 at 03:25:05 UTC, rikki cattermole wrote:
>> On 06/02/2018 8:46 PM, Craig Dillabaugh wrote:
>>> On Tuesday, 6 February 2018 at 18:46:54 UTC, H. S. Teoh wrote:
>>>> [...] clip [...] clip [...]
>>>
>>> Wouldn't it be more accurate to say OO is not the correct tool for every job rather than it is "outdated". How would one write a GUI library with chains and CTFE?
>>
>> But you could with signatures and structs instead ;)
>
> I am not sure how this would work ... would this actually be a good idea, or are you just saying that technically it would be possible?

A very good idea :)

WIP:
https://github.com/rikkimax/DIPs/blob/master/DIPs/DIP1xxx-RC.md
https://github.com/rikkimax/stdc-signatures/tree/master/stdc
Feb 06 2018
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster wrote:
> I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:
>
>     import std.stdio;
>     import std.string;
>     import std.conv;
>
>     void main()
>     {
>         immutable char* hex = "deadbeef".toStringz;
>         for (auto i=0; hex[i]; i += 2)
>             writeln(to!byte(hex[i]));
>     }
>
> While it works, I'm wondering if there is a more object-oriented way of doing it in D.

Converting data has nothing to do with OOP. In D we write it like this:

```
import std.range : chunks; // lazily consumes the input two by two
import std.algorithm.iteration : map; // applies a func to the chunks
import std.conv : to; // the func: convert with a custom base
import std.array : array; // renders the whole thing into an array
ubyte[] a = "deadbeef".chunks(2).map!(pair => pair.to!ubyte(16)).array;
```
Feb 06 2018
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster wrote:
> I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:
>
>     import std.stdio;
>     import std.string;
>     import std.conv;
>
>     void main()
>     {
>         immutable char* hex = "deadbeef".toStringz;
>         for (auto i=0; hex[i]; i += 2)
>             writeln(to!byte(hex[i]));
>     }

Thanks for all the feedback. I'll have to do some more reading about maps. My initial thought is that they don't seem as readable as loops. The chunks() is useful, so for now what I'm going with is:

    ubyte[] arr;
    foreach (b; "deadbeef".chunks(2))
    {
        arr ~= b.to!ubyte(16);
    }
Feb 06 2018
On 02/06/2018 11:55 AM, Ralph Doncaster wrote:
> I'll have to do some more reading about maps. My initial thought is they don't seem as readable as loops.

Surprisingly, they may be very easy to read in some situations.

> The chunks() is useful, so for now what I'm going with is:
>
>     ubyte[] arr;
>     foreach (b; "deadbeef".chunks(2))
>     {
>         arr ~= b.to!ubyte(16);
>     }

That is great but it has two issues that may be important in some programs:

1) It makes multiple allocations as the array grows

2) You need another loop to do the actual work later on

If you needed to go through the elements just once, the memory allocations would be wasted. Instead, you can solve both issues with a chained expression like the ones that have been shown by others. The cool thing is, you can always hide the ugly bits in a function.

Starting with ag0aep6g's code, I went overboard to pick a better destination type (other than ubyte) to support different chunk sizes:

    void main()
    {
        import std.stdio;

        foreach (b; "deadbeef".hexValues)
        {
            writefln("%2X", b);
        }

        // Works for different sized chunks as well:
        writeln("12345678".hexValues!4);
    }

    auto hexValues(size_t digits = 2, ToType = DefaultTypeForSize!digits)(string s)
    {
        import std.algorithm: map;
        import std.conv: to;
        import std.range: chunks;

        return s.chunks(digits).map!(chars => chars.to!ToType(16));
    }

    // Maps a chunk size (in hex digits) to the default destination type
    template DefaultTypeForSize(size_t s)
    {
        static if (s == 1)
        {
            alias DefaultTypeForSize = ubyte;
        }
        else static if (s == 2)
        {
            alias DefaultTypeForSize = ushort;
        }
        else static if (s == 4)
        {
            alias DefaultTypeForSize = uint;
        }
        else static if (s == 8)
        {
            alias DefaultTypeForSize = ulong;
        }
        else
        {
            import std.string : format;
            static assert(false, format("There is no default type for %s-digit chunks", s));
        }
    }

Ali
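As a sketch of the "go through the elements just once" case, the bytes can be consumed straight from the lazy range without ever building an array. `byteSum` is a hypothetical helper chosen only to have something checkable to compute; any single-pass consumer would do.

```d
import std.algorithm.iteration : map, sum;
import std.conv : to;
import std.range : chunks;

// Hypothetical helper: adds up the byte values of a hex string in a
// single pass, with no intermediate ubyte[] being built.
int byteSum(string hex)
{
    return hex.chunks(2)
              .map!(pair => pair.to!ubyte(16))
              .sum; // ubyte operands promote to int, so the sum is an int
}

void main()
{
    // 0xDE + 0xAD + 0xBE + 0xEF = 222 + 173 + 190 + 239 = 824
    assert(byteSum("deadbeef") == 824);
    assert(byteSum("") == 0);
}
```

If the bytes are needed more than once, materializing them with `.array` is still the simpler choice; the lazy form pays off when the result is consumed exactly once.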
Feb 06 2018
On Tuesday, 6 February 2018 at 18:33:02 UTC, Ralph Doncaster wrote:
> I've been reading std.conv and std.range, trying to figure out a high-level way of converting a hex string to bytes. The only way I've been able to do it is through pointer access:
>
>     import std.stdio;
>     import std.string;
>     import std.conv;
>
>     void main()
>     {
>         immutable char* hex = "deadbeef".toStringz;
>         for (auto i=0; hex[i]; i += 2)
>             writeln(to!byte(hex[i]));
>     }
>
> While it works, I'm wondering if there is a more object-oriented way of doing it in D.

After a bunch of searching, I came across hex string literals. They are mentioned but not documented as a literal.
https://dlang.org/spec/lex.html#string_literals

Combined with the toHexString function in std.digest, it is easy to convert between hex strings and byte arrays.

    import std.stdio;
    import std.digest;

    void main()
    {
        auto data = cast(ubyte[]) x"deadbeef";
        writeln("data: 0x", toHexString(data));
    }

p.s. the cast should probably be to immutable ubyte[]. I'm guessing without it, there is an automatic copy of the data being made.
Feb 07 2018
On Wednesday, 7 February 2018 at 14:47:04 UTC, Ralph Doncaster wrote:
> p.s. the cast should probably be to immutable ubyte[]. I'm guessing without it, there is an automatic copy of the data being made.

No copy - you just get undefined behavior if you actually try to modify it!
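A minimal sketch of one way to stay clear of that undefined behavior: keep the literal's bytes typed as immutable and make an explicit copy when mutable data is needed. It swaps in `std.conv.hexString` and `std.string.representation` for the `x"..."` literal and the cast used above, which is a substitution made for this example rather than something suggested in the thread. The helper names are hypothetical.

```d
import std.algorithm.comparison : equal;
import std.conv : hexString;
import std.string : representation;

// Hypothetical helper: the bytes of a compile-time hex string, viewed as
// immutable(ubyte)[] without a cast that discards immutability.
immutable(ubyte)[] deadbeefBytes()
{
    // hexString produces a string at compile time; representation
    // reinterprets it as immutable bytes.
    return hexString!"deadbeef".representation;
}

// Hypothetical helper: an explicit copy that is safe to modify.
ubyte[] mutableCopy(immutable(ubyte)[] data)
{
    return data.dup;
}

void main()
{
    auto data = deadbeefBytes();
    assert(data.equal([0xDE, 0xAD, 0xBE, 0xEF]));

    auto scratch = mutableCopy(data);
    scratch[0] = 0x00;           // modifies only the copy
    assert(data[0] == 0xDE);     // the immutable bytes are untouched
    // data[0] = 0x00;           // would not compile: data is immutable
}
```

The copy costs one allocation, which is the price of being allowed to write to the bytes at all.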
Feb 07 2018