digitalmars.D.learn - bigEndian in std.bitmanip
- Salih Dincer (45/45) Oct 31 2023 Hello,
- Jonathan M Davis (9/12) Oct 31 2023 Why would you expect little endian to be the default? The typical thing ...
- Salih Dincer (26/36) Oct 31 2023 Because when we create a structure with a Union, it does reverse
- Jonathan M Davis (24/62) Oct 31 2023 I fail to see what the situation with the union has to do with anything.
- Imperatorn (6/51) Oct 31 2023 It might make sense to change since little endian is the most
- Salih Dincer (78/83) Nov 02 2023 I realized that I had to make my prefer based on the most common.
- Imperatorn (4/16) Nov 02 2023 Nice to hear you found a solution. Little endian is *most common*
Hello,

Why isn't Endian.littleEndian the default setting for read() in std.bitmanip?

Okay, we can easily change this if we want (I could use enum LE in the example), and it can also be reversed with data.retro.array().

```d
void main()
{
  import std.conv : hexString;
  string helloD = hexString!"48656C6C6F204421";
  // the literal above is converted to a string at compile time

  import std.string : format;
  auto hexF = helloD.format!"%(%02X%)";

  import std.digest: toHexString;
  auto arr = cast(ubyte[])"Hello D!";
  auto hex = arr.toHexString;
  assert(hex == hexF);

  import std.stdio : writeln;
  hex.writeln(": ", helloD);
  // 48656C6C6F204421: Hello D!
  assert(helloD == "Hello D!");

  auto data = arr.readBytes!size_t;
  data.code.writeln(": ", data.bytes);
  // 2397076564600448328: Hello D!
}

template readBytes(T, R)
{
  union Bytes
  {
    T code;
    char[T.sizeof] bytes;
  }
  import std.bitmanip;
  enum LE = Endian.littleEndian;

  auto readBytes(ref R data)
  {
    import std.range : retro, array;
    auto reverse = data.retro.array;
    return Bytes(reverse.read!T);
  }
}
```

However, I think the default is not compatible with a union. Thanks...

SDB 79
Oct 31 2023
On Tuesday, October 31, 2023 4:09:53 AM MDT Salih Dincer via Digitalmars-d-learn wrote:
> Hello, Why isn't Endian.littleEndian the default setting for read() in std.bitmanip?

Why would you expect little endian to be the default? The typical thing to do when encoding integral values in a platform-agnostic manner is to use big endian, not little endian. Either way, read supports both big endian and little endian, so if your use case requires little endian, you can do that. You just have to specify the endianness, and if you find that to be too verbose, you can create a wrapper to use in your own code.

- Jonathan M Davis
Oct 31 2023
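As a minimal sketch of the two options mentioned in the reply above (passing the endianness explicitly, or hiding it behind a wrapper): the name `readLE` is a hypothetical helper invented for this illustration and is not part of Phobos; `read` and `Endian` are the real library names.

```d
import std.bitmanip : read;
import std.system : Endian;

// Hypothetical convenience wrapper (not in Phobos): always reads little endian.
T readLE(T, R)(ref R range)
{
    return range.read!(T, Endian.littleEndian);
}

void main()
{
    ubyte[] bigBuf    = [0x12, 0x34, 0x56, 0x78];
    ubyte[] littleBuf = [0x78, 0x56, 0x34, 0x12];
    ubyte[] wrapped   = littleBuf.dup;

    // With no endianness argument, read assumes big endian.
    assert(bigBuf.read!uint == 0x12345678);

    // Little endian has to be requested explicitly...
    assert(littleBuf.read!(uint, Endian.littleEndian) == 0x12345678);

    // ...or through the wrapper.
    assert(wrapped.readLE!uint == 0x12345678);

    // read consumes the bytes it decodes from the range.
    assert(bigBuf.length == 0);
}
```

Note that `read` takes the range by `ref` and pops the bytes it has decoded, so each buffer is empty afterwards.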
On Tuesday, 31 October 2023 at 10:24:56 UTC, Jonathan M Davis wrote:
> On Tuesday, October 31, 2023 4:09:53 AM MDT Salih Dincer via Digitalmars-d-learn wrote:
>> Hello, Why isn't Endian.littleEndian the default setting for read() in std.bitmanip?
>
> Why would you expect little endian to be the default? The typical thing to do when encoding integral values in a platform-agnostic manner is to use big endian, not little endian...

Because when we create a structure with a union, the bytes are inserted in reverse order with respect to the index of the static array (bytes); I showed this above. I also have a convenience template like this:

```d
template readBytes(T, bool big = false, R)
{ // pair endian version 2.0
  import bop = std.bitmanip;

  static if(big)
    enum E = bop.Endian.bigEndian;
  else
    enum E = bop.Endian.littleEndian;

  auto readBytes(ref R dat)
    => bop.read!(T, E)(dat);
}
```

Sorry for taking up more of your time, because I already solved the problem with readBytes(). Thank you for your answer, but there is 1 more problem, or even 2! The read() in the library, which is the 2nd function, conflicts with std.write. Yeah, there are many solutions to this, but what it does is just read bytes. However, you can insert 4 ushorts into one ulong. Don't you think the name of the function should be readBytes, not read? Because it doesn't work with any type other than ubyte[]!

SDB 79
Oct 31 2023
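One possible reading of the remark above about ushorts and a ulong, sketched under that assumption: the same eight bytes can be decoded either as four `ushort` values or as one `ulong`, and the pieces can be reassembled by shifting. The buffer contents are the bytes of "Hello D!" from the thread; the variable names are made up for this illustration.

```d
import std.bitmanip : read;
import std.system : Endian;

void main()
{
    // The bytes of "Hello D!", in two independent buffers.
    ubyte[] asShorts = [0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x44, 0x21];
    ubyte[] asLong   = asShorts.dup;

    // Decode four little-endian ushorts; each call consumes two bytes.
    ushort[4] parts;
    foreach (ref part; parts)
        part = asShorts.read!(ushort, Endian.littleEndian);
    assert(asShorts.length == 0);
    assert(parts[] == [0x6548, 0x6C6C, 0x206F, 0x2144]);

    // Decode the same bytes as one little-endian ulong.
    auto whole = asLong.read!(ulong, Endian.littleEndian);
    assert(whole == 0x2144206F6C6C6548);

    // Reassemble the ulong from the four ushorts, lowest part first.
    ulong joined;
    foreach (i, part; parts)
        joined |= cast(ulong) part << (i * 16);
    assert(joined == whole);
}
```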
On Tuesday, October 31, 2023 8:23:28 AM MDT Salih Dincer via Digitalmars-d-learn wrote:
> Because when we create a structure with a union, the bytes are inserted in reverse order with respect to the index of the static array (bytes); I showed this above.

I fail to see what the situation with the union has to do with anything. Sure, you can convert between an array of bytes and an int with a union if you want to, but what that does is going to be dependent on your local architecture. read and its related functions in std.bitmanip are architecture-independent. So, they will convert from little endian or big endian regardless of what your local architecture is. You would typically use them on ranges of bytes that come from the network or from serialized data. The most common scenario there is likely to be that they'll be in big endian, because that's what platform-independent binary formats typically do, but you can explicitly tell read that the range is in little endian if your range of bytes happens to be in little endian. Both scenarios can occur, and it supports both. It just defaults to big endian, because that's the more common scenario when dealing with binary formats.

> The read() in the library, which is the 2nd function, conflicts with std.write. Yeah, there are many solutions to this, but what it does is just read bytes. However, you can insert 4 ushorts into one ulong. Don't you think the name of the function should be readBytes, not read? Because it doesn't work with any type other than ubyte[]!

D's module system makes it so that names do not need to be unique across modules, and this is not the only case in Phobos where multiple modules use the same function name. It's easy enough to import only the functions you're using or to rename them via the import if you happen to be importing from multiple modules containing functions with the same name. E.g., if you want to do `import std.bitmanip : readBytes = read;` then you can.

- Jonathan M Davis
Oct 31 2023
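A small sketch of both points in the reply above, under stated assumptions: the union `IntBytes` is a hypothetical type made up for this illustration, while the renamed import is the one suggested in the post. Reinterpreting bytes through a union follows the host's byte order, whereas `read` (imported here as `readBytes`) decodes the same way on any host.

```d
import std.bitmanip : readBytes = read; // renamed import, as suggested above
import std.system : Endian;

// Hypothetical helper type for this sketch.
union IntBytes
{
    uint value;
    ubyte[4] bytes;
}

void main()
{
    IntBytes u;
    u.bytes = [0x12, 0x34, 0x56, 0x78];

    // Reinterpreting through the union depends on the host architecture.
    version (LittleEndian)
        assert(u.value == 0x78563412);
    else
        assert(u.value == 0x12345678);

    // read gives the same answer regardless of the host.
    ubyte[] big = [0x12, 0x34, 0x56, 0x78];
    assert(big.readBytes!uint == 0x12345678);

    ubyte[] little = [0x12, 0x34, 0x56, 0x78];
    assert(little.readBytes!(uint, Endian.littleEndian) == 0x78563412);
}
```

With the renamed import, the name `read` from std.bitmanip is not brought into scope at all, so it cannot clash with a `read` imported from another module.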
On Tuesday, 31 October 2023 at 10:09:53 UTC, Salih Dincer wrote:
> Hello, Why isn't Endian.littleEndian the default setting for read() in std.bitmanip? Okay, we can easily change this if we want (I could use enum LE in the example), and it can also be reversed with data.retro.array().

It might make sense to change, since little endian is the most common when it comes to hardware. But big endian is the most common when it comes to networking. So I guess it depends on your view of what is most common: interacting with your local hardware or networking.
Oct 31 2023
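To make the hardware-versus-network distinction above concrete, here is a short sketch using the conversion functions in std.bitmanip. It only shows the two byte layouts of one value; the variable names are made up for this illustration.

```d
import std.bitmanip : bigEndianToNative, littleEndianToNative,
                      nativeToBigEndian, nativeToLittleEndian;

void main()
{
    uint value = 0x12345678;

    // "Network order": most significant byte first.
    ubyte[4] wire = nativeToBigEndian(value);
    assert(wire[] == [0x12, 0x34, 0x56, 0x78]);

    // Layout used in memory by little-endian hardware: least significant byte first.
    ubyte[4] le = nativeToLittleEndian(value);
    assert(le[] == [0x78, 0x56, 0x34, 0x12]);

    // Both layouts convert back to the same native value.
    assert(bigEndianToNative!uint(wire) == value);
    assert(littleEndianToNative!uint(le) == value);
}
```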
On Tuesday, 31 October 2023 at 14:43:43 UTC, Imperatorn wrote:
> It might make sense to change since little endian is the most common when it comes to hardware. But big endian is most common when it comes to networking. So I guess it depends on your view of what is most common. Interacting with your local hardware or networking.

I realized that I had to make my choice based on what is most common. But I have to use a union. That's why I have to choose littleEndian, because it is compatible with both the union and HexString. My test code works perfectly, as seen below. I'm grateful to everyone who helped here and [on the other thread](https://forum.dlang.org/thread/ekpvajiablcfueyipcal@forum.dlang.org).

```d
enum sampleText = "Hello D!"; // length <= 8 char

void main()
{
  //import sdb.string : UnionBytes;
  mixin UnionBytes!size_t;
  bytes.init = sampleText;

  import std.digest: toHexString;
  auto hexF = bytes.cell.toHexString;
  assert(hexF == "48656C6C6F204421");

  import std.string : format;
  auto helloD = sampleText.format!"%(%02X%)";
  assert(hexF == helloD);

  import std.stdio;
  bytes.code.writeln(": ", helloD);
  /* Prints:
     2397076564600448328: 48656C6C6F204421
  */

  import std.conv : hexString;
  static assert(sampleText == hexString!"48656C6C6F204421");

  //import sdb.string : readBytes;
  auto code = bytes.cell.readBytes!size_t;
  assert(code == bytes.code);

  bytes.init = code;
  code.writeln(": ", bytes);
  /* Prints:
     2397076564600448328: Hello D!
  */
  assert(bytes[] == [72, 101, 108, 108, 111, 32, 68, 33]);

  //import sdb.string : HexString
  import std.algorithm : each; // needed for each below
  auto str = "0x";
  auto hex = HexString!size_t(bytes.code);
  hex.each!(chr => str ~= chr);
  str.writeln; // 0x48656C6C6F204421
}
```

My core template (UnionBytes) is initialized like this, and underneath I have the readBytes template, which also works with static arrays:

```d
// ...
      import std.range : front, popFront;
      size_t i;
      do // new version: range support
      {
        char chr;                  // default init: 0xFF
        chr &= str.front;          // masking
        code |= T(chr) << (i * 8); // shifting
        str.popFront;              // next char
      } while(++i < size);
    }

    auto opCast(Cast : T)() const
      => code;

    auto opCast(Cast : string)() const
      => this.format!"%s";

    auto toString(void delegate(in char[]) sink) const
      => sink.formattedWrite("%s", cast(char[])cell);
  }
  UnionBytes bytes; // for mixin
}

template readBytes(T, bool big = false, R)
{ // pair endian version 2.1
  import std.bitmanip;

  static if(big) enum E = Endian.bigEndian;
  else enum E = Endian.littleEndian;

  import std.range : ElementType;
  alias ET = ElementType!R;

  auto readBytes(ref R dat)
  {
    auto data = cast(ET[])dat;
    return read!(T, E)(data);
  }
}
```

SDB 79
Nov 02 2023
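Since the post above shows only a fragment of UnionBytes, here is a self-contained sketch of the same shift-and-or idea, not the author's actual template. The function `packLittle` is a hypothetical stand-in invented for this illustration: it places the first character in the lowest byte, which is why the result agrees with a little-endian `read` of the same bytes and with the number printed in the thread.

```d
import std.bitmanip : read;
import std.system : Endian;

// Hypothetical helper (not the author's UnionBytes): pack up to 8 chars
// into a ulong, with the first char in the least significant byte.
ulong packLittle(const(char)[] text)
{
    assert(text.length <= ulong.sizeof);
    ulong code;
    foreach (i, char c; text)
        code |= cast(ulong) c << (i * 8); // shift each char into its byte slot
    return code;
}

void main()
{
    enum sampleText = "Hello D!";

    auto code = packLittle(sampleText);
    assert(code == 0x2144206F6C6C6548);
    assert(code == 2397076564600448328UL); // the value printed in the thread

    // A little-endian read of the same bytes yields the same number.
    ubyte[] buf = cast(ubyte[]) sampleText.dup;
    assert(buf.read!(ulong, Endian.littleEndian) == code);
}
```

Because the packing is done with shifts rather than by reinterpreting memory, this sketch gives the same value on big-endian and little-endian hosts.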
On Thursday, 2 November 2023 at 11:29:05 UTC, Salih Dincer wrote:
> I realized that I had to make my choice based on what is most common. But I have to use a union. That's why I have to choose littleEndian, because it is compatible with both the union and HexString. My test code works perfectly, as seen below.

Nice to hear you found a solution. Little endian is *most common* in hardware but big endian is *most common* in networking, so defining a default endianness can be tricky.
Nov 02 2023