digitalmars.D.learn - FormattedRead hex string

"Jason Spencer" <spencer8 sbcglobal.net> writes:
I imagine there's a slick way to do this, but I'm not seeing it.

I have a string of hex digits which I'd like to convert to an 
array of 8 ubytes:

0123456789abcdef --> [0x01, 0x23, 0x45, 0x67, 0x89, 0xAB, 0xCD, 
0xEF]

I'm looking at std.format.formattedRead, but the documentation 
is...lightish.  First of all, it seems there's no format 
specifier except %s on reads and type information is gleaned from 
the args' types.  I was able to experiment and show that %x 
works, but there's no documentation on exactly how.

Second, array syntax seems to work only if there's some 
delimiter.  With:

import std.format;

void main(string[] args)
{
    ubyte[8] b;

    // Expects the hex string as the first command-line argument.
    formattedRead(args[1], "%(%s%)", &b);
}

I get

std.conv.ConvOverflowException@C:\Tools\D\dmd2\windows\bin\..\..\src\phobos\std\conv.d(2006): Overflow in integral conversion

at least once. :)  But that makes sense--hard to tell how many 
input chars to assign to one byte versus another (although it 
seems to me a hungry algorithm would work--saturate one type's 
max and move to the next.)

There doesn't seem to be any support for field sizes or counts in 
formatted read, similar to old C "%16x".  This barks at me right 
away--"%1 not supported."

I know I could read (in this case) as two longs or a uint16, but 
I don't want to deal with endianness--just data.

Is there some trick to use the fact that b is a fixed size of 8 
bytes, know that this requires 16 hex digits, and convert 
automatically?  Is there some other suggestion for how to do this 
elegantly?  I can play around with split and join, but it seemed 
like there is probably some way to do this directly that I'm not 
seeing.

Thanks!
Jason
Sep 24 2012
"monarch_dodra" <monarchdodra gmail.com> writes:
I think that you are not supposed to use a static array: if there 
are not EXACTLY as many array elements as there are parse-able 
elements, then the formatted read will consider the parse to have 
failed.

Try this, it's what you want, right?

--------
import std.format, std.stdio;

void main()
{
    string s = "ffff fff ff f";
    ushort[] vals;
    formattedRead(s, "%(%x %)", &vals);
    writefln("%(%s - %)", vals);
}
--------
65535 - 4095 - 255 - 15
--------

Regarding the %1x, well, I guess it just isn't supported (yet?)
Sep 24 2012
"Jason Spencer" <spencer8 sbcglobal.net> writes:
On Monday, 24 September 2012 at 16:32:45 UTC, monarch_dodra wrote:
 I think that you are not supposed to use a static array: If 
 there are not EXACTLY as many array elements as there are 
 parse-able elements, then the formatted read will consider the 
 parse to have failed.
The sample code was just for testing convenience. In practice the string will be conditioned and known to have 16 characters in {0-9, a-f}.
 Try this, it's what you want, right?

 --------
 void main()
 {
     string s = "ffff fff ff f";
     ushort[] vals;
     formattedRead(s, "%(%x %)", &vals);
     writefln("%(%s - %)", vals);
 }
Not quite. You've taken the liberty of using a delimiter--spaces. 
I have to take 16 contiguous, NON-delimited hex digits and 
produce 8 bytes. So I could read it as a uint64 (not uint16, as I 
mistakenly posted before), but then I'd have to byte-reverse it. 
I could use slicing and do a byte at a time. I just wondered if 
there were a slick way to get in-place data from a contiguous hex 
string.

Thanks,
Jason
Sep 24 2012
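
Editorial sketch (not part of the original thread): the uint64 route 
mentioned in the post above could look roughly like this. The helper 
name hexToBytesViaUlong is hypothetical. It assumes the input is 
exactly 16 hex digits, and it uses std.bitmanip.nativeToBigEndian so 
that the byte order follows the digit order whatever the host 
endianness is (on a little-endian machine this performs the 
byte-reversal for you).

```d
import std.bitmanip : nativeToBigEndian;
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

// Hypothetical helper, not from the thread: parse 16 contiguous hex
// digits as one 64-bit value, then lay the bytes out big-endian so
// they appear in the same order as the digits in the string.
ubyte[8] hexToBytesViaUlong(string hex)
{
    // Assumption: the caller wants a hard failure on any other length.
    enforce(hex.length == 16, "expected exactly 16 hex digits");
    ulong value = hex.to!ulong(16);   // radix-16 conversion
    return nativeToBigEndian(value);  // returns ubyte[8]
}

void main()
{
    writeln(hexToBytesViaUlong("0123456789abcdef"));
    // prints [1, 35, 69, 103, 137, 171, 205, 239]
}
```

The length check matters here: a shorter string would still parse as 
a ulong, but the result would be padded with leading zero bytes 
instead of being rejected.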
"monarch_dodra" <monarchdodra gmail.com> writes:
I am unsure if the non-support of %2x is by design, or just "not 
yet supported".

Keep in mind that slicing a string *is* in place: it is 
equivalent to pointer arithmetic. I'd just do a loop:

--------
import std.format, std.stdio;

void main()
{
    string s = "0123456789abcdef";
    ushort[8] vals;
    foreach (size_t i; 0 .. 8)
    {
        // Take two hex digits at a time; slicing does not copy.
        string slice = s[2*i .. 2*(i+1)];
        slice.formattedRead("%x", &vals[i]);
    }
    writeln(vals);
}
--------
[1, 35, 69, 103, 137, 171, 205, 239]
--------

This will still get the job done pretty cleanly and efficiently. 
Chances are it is even faster than a supposed "%(%2x%)" scheme, 
since you reduce the work from parsing a list to a simple 
fixed-width extraction.

Alternatively, you could just use std.conv's "to" or "parse". 
I've had others argue that "formattedRead" is meant as an 
implementation detail that other functions should build on, and 
that "consumers" shouldn't use it directly. I found this strange 
at first, but I've grown fond of the power of "to":

--------
import std.conv, std.stdio;

void main()
{
    string s = "0123456789abcdef";
    ushort[8] vals;
    foreach (size_t i; 0 .. 8)
        vals[i] = s[2*i .. 2*(i+1)].to!ushort(16); // radix 16
    writeln(vals);
}
--------

Pretty nice, no?
Sep 25 2012
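
Editorial sketch (not part of the original thread): the replies above 
store the result in a ushort[8], while the original question asked 
for 8 ubytes. Assuming the input has already been conditioned to 
exactly 16 hex digits, the same slicing-plus-"to" idea can be wrapped 
in a small function that fills a ubyte[8] directly. The name 
hexToBytes is hypothetical.

```d
import std.conv : to;
import std.exception : enforce;
import std.stdio : writeln;

// Hypothetical helper, not from the thread: convert 16 contiguous hex
// digits into 8 bytes, in the same order as they appear in the string.
ubyte[8] hexToBytes(string hex)
{
    // Assumption: any other length is an error rather than being padded.
    enforce(hex.length == 16, "expected exactly 16 hex digits");
    ubyte[8] result;
    foreach (i, ref b; result)
    {
        // Each pair of digits is one byte; the slice does not copy.
        b = hex[2 * i .. 2 * i + 2].to!ubyte(16);
    }
    return result;
}

void main()
{
    writeln(hexToBytes("0123456789abcdef"));
    // prints [1, 35, 69, 103, 137, 171, 205, 239]
}
```

Because each slice is exactly two hex digits, every value fits in a 
ubyte, so the overflow seen with the undelimited "%(%s%)" format 
cannot occur; a non-hex character makes "to" throw a ConvException 
instead.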