digitalmars.D - C bitfields guarantees

reply Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
Today many people have spent some time to try and understand 
Walter's belief that C is "good enough" for bit fields in terms 
of guarantees.

I believe I have understood a core component to this.

 From the C23 standard:

 An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

What matters is the _initial_ type in the bit-field, the rest of the types _do not_ matter. As long as you _do not_ start and finish in two separate memory addresses for that initial type it will be predictable.

I have filed a [ticket](https://github.com/dlang-community/D-Scanner/issues/955) for dscanner to introduce a warning to tell you that the compiler is going to do a bad thing, that will cause you problems and the compiler will not assist you. Ideally, we wouldn't allow it for ``extern(D)`` code at all.

As of right now, assuming we get Dscanner to give the warning I can withdraw my concerns, although I do think that ``extern(D)`` shouldn't be offering you such a heavy foot-gun.
Jul 04 2024
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/5/24 07:37, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand Walter's 
 belief that C is "good enough" for bit fields in terms of guarantees.
 
 I believe I have understood a core component to this.
 
  From the C23 standard:
 
 An implementation may allocate any addressable storage unit large 
 enough to hold a bit-field. If
enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified. What matters is the _initial_ type in the bit-field, the rest of the types _do not_ matter. ...
According to this text, none of the types matter for layout guarantees. Only the bit sizes matter somewhat. And then the implementation still has way too much leeway in how it allocates things. Walter's reasoning has been that _in practice_, C implementations are a bit more sane than what the standard allows. I don't think it is fruitful to try and find any useful guarantees in the standard. If there were any, that's what Walter would point to instead.
 As long as you _do not_ start and finish in two separate memory 
 addresses for that initial type it will be predictable.
 ...
According to the standard, no. E.g.:

    int a:7;
    int b:25;

According to the standard, this could put `a` in a 1-byte unit and `b` in a subsequent 4-byte unit. It could put `a` in a 1-byte unit, use the last bit for `b`, then put the remaining 24 bits of `b` in a new unit. It could also put both in separate 4-byte integers. Or it could pack them into a single 4-byte location. It is not specified.

In practice, implementations will usually put both of them in a single 4-byte location, and this is what Walter is relying on. The C standard gives you almost nothing (it could even choose to put both `a` and `b` into an 8-byte or larger unit, there is no upper limit on the size, only a lower one.) And I did not even get into different possible orderings of bit fields within a unit.
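As a rough illustration of the point above, here is a minimal C sketch. The struct name `AB` and the helper names are made up for this example, and `unsigned int` is used instead of `int` to sidestep signedness questions. It shows what *is* dependable (values round-trip through the fields) next to what is not (the size and internal layout, which the implementation chooses).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct mirroring the a:7 / b:25 example. */
struct AB {
    unsigned int a : 7;
    unsigned int b : 25;
};

/* Size the implementation picked; commonly 4, but not promised by the standard. */
size_t ab_size(void)
{
    return sizeof(struct AB);
}

/* Returns 1 if both fields hold their values independently of each other. */
int ab_roundtrip(void)
{
    struct AB s;
    memset(&s, 0, sizeof s);
    s.a = 0x55;        /* fits in 7 bits  */
    s.b = 0x1ABCDEF;   /* fits in 25 bits */
    return s.a == 0x55 && s.b == 0x1ABCDEF;
}
```

Calling `ab_size()` on a given compiler tells you which of the permitted choices that compiler made; it says nothing about any other compiler.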
Jul 05 2024
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/5/24 11:13, Timon Gehr wrote:
 ...
 It could also put both in separate 4-byte integers.
Actually no, this is one of the few things it cannot do. I got a bit too excited there. Anyway, the point stands.
Jul 05 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 05/07/2024 9:42 PM, Timon Gehr wrote:
 On 7/5/24 11:13, Timon Gehr wrote:
 ...
 It could also put both in separate 4-byte integers.
Actually no, this is one of the few things it cannot do. I got a bit too excited there. Anyway, the point stands.
Oh oh no, you are so right, I was applying the type there that I shouldn't have been. Don't read the C standard after you've been awake more than 12 hours, folks!

However, in saying that, the point that we can mitigate it using a dscanner warning does still stand. Therefore my original post stating I withdraw my concerns is valid. The only problem is that it'll now be a word-size-specific and alignment-specific check.

I hate every bit that we need to make such a specific mitigation for what amounts to a brand new feature. It is quite frankly ridiculous to need a _mitigation_ for this.
Jul 05 2024
parent reply Salih Dincer <salihdb hotmail.com> writes:
On Friday, 5 July 2024 at 14:18:43 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Don't read the C standard after you've been awake more than 12 
 hour folks!

 However in saying that, the point that we can mitigate it using 
 a dscanner warning does still stand. Therefore my original post 
 stating I withdraw my concerns is valid.
Given that today is July 5, 2024, the publication of the C23 standard is imminent, with the limit date for publication being July 12, 2024. This means that within a week, the C23 standard should be officially published, marking a significant milestone for the C programming language and for D. Is it a good time to start planning for any necessary updates to our existing codebases or libraries to ensure compatibility with C23? Can we say that DMD will also support this in parallel with the developments? SDB 79
Jul 05 2024
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 06/07/2024 4:48 AM, Salih Dincer wrote:
 Is it a good time to start planning for any necessary updates to our 
 existing codebases or libraries to ensure compatibility with C23? Can we 
 say that DMD will also support this in parallel with the developments?
As of right now, the only thing planned is changing our identifiers to match the C23 identifier tables, which are UAX31 based. I've implemented this and it has been in a release, although we have not transitioned over yet; the breakage is expected as of 2.119 (the tables are both bigger and smaller than C99 *sigh*, right now we are in a recombination of all the different tables). Walter really does not want the normalization stuff that UAX31, and with it C23, requires, and some of it was implemented, but alas.

But the other things like different float types are not currently planned to be supported as far as I know. We should probably discuss that at some point as a community.

Other things like nodiscard on a function have no D equivalent just yet, although we have allowed for it to occur in the future as part of ``@mustuse``.

Apart from identifiers there isn't much you should need to deal with for your code base :)
Jul 05 2024
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/4/2024 10:37 PM, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand Walter's belief 
 that C is "good enough" for bit fields in terms of guarantees.
It's straightforward. If you use uint as the field type, you'll get the same layout across every C compiler I've ever heard of. The reason for this is straightforward:

1. it's the obvious way to do things
2. professional C compiler developers are sensible people
3. professional C compiler developers want to compile existing code and have it behave the same way on the same platform, they don't care to antagonize their users

The differences crop up when using multiple field types *and* porting to a different ecosystem. These problems are trivially avoided.

Even so, within a particular ecosystem, the C compilers are all compatible with each other. Why? Because C compiler developers want their compiler to be useful! Is anyone surprised that gcc/clang/ImportC work exactly the same on each ecosystem?

Consider also that the C standard does not specify the size of a 'char'. There are C compilers for special CPUs that have different char sizes - notably 32 bit chars for some DSP processors, and 10 bit chars for the CPU on a Mattel Intellivision game computer. C on a PDP-10 has 36 bit ints, too! and 18 bit shorts.

I can pretty much guarantee that all C code developed on a conventional CPU will fail to work on those machines.

But so what. When you port to a diverse machine, you expect such problems.
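For readers who want the "conventional CPU" assumptions written down rather than implied, a minimal C11 sketch follows. The helper name `char_bits` is made up for this example, and the specific widths checked are assumptions about the target, not anything the standard promises.

```c
#include <limits.h>

/* Fail the build early on targets where the usual assumptions do not hold. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit chars");
_Static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
               "this code assumes 32-bit unsigned int");

/* Hypothetical helper: report the char width at run time. */
int char_bits(void)
{
    return CHAR_BIT;
}
```

On a DSP with 32-bit chars this would refuse to compile, which is the "expect such problems" case made explicit.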
Jul 05 2024
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/5/24 18:35, Walter Bright wrote:
 
 Consider also that the C standard does not specify the size of a 'char'.
D does specify it.
 There are C compilers for special CPUs that have different char sizes - 
 notably 32 bit chars for some DSP processors, and 10 bit chars for the 
 CPU on a Mattel Intellivision game computer. C on a PDP-10 has 36 bit 
 ints, too! and 18 bit shorts.
 
 I can pretty much guarantee that all C code developed on a conventional 
 CPU will fail to work on those machines.
 
 But so what. When you port to a diverse machine, you expect such problems.
Well, this is the D newsgroup.
Jul 05 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 10:02 AM, Timon Gehr wrote:
 On 7/5/24 18:35, Walter Bright wrote:
 Consider also that the C standard does not specify the size of a 'char'.
D does specify it.
Yes. And I have no concern at all about some C compiler that uses a different size. None of those C compilers will compile "portable" C code, either, even though the Standard permits such compilers. If we go through a dimensional warp into an alternate universe, where C chars are 9 bits, we'll change the D compiler to match.
Jul 05 2024
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/6/24 01:23, Walter Bright wrote:
 On 7/5/2024 10:02 AM, Timon Gehr wrote:
 On 7/5/24 18:35, Walter Bright wrote:
 Consider also that the C standard does not specify the size of a 'char'.
D does specify it.
Yes. And I have no concern at all about some C compiler that uses a different size. None of those C compilers will compile "portable" C code, either, even though the Standard permits such compilers. If we go though a dimensional warp into an alternate universe, where C chars are 9 bits, we'll change the D compiler to match.
The point was: D should actually specify more bitfield layout guarantees than the C standard.
Jul 06 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2024 8:54 AM, Timon Gehr wrote:
 The point was: D should actually specify more bitfield layout guarantees than 
 the C standard.
I understand that. Given that any desired portable bitfield layout can be done with minimal effort, there is no need to add more semantics to the language than what C does. I.e. portable not only to the associated C compiler, but to any C compiler with 8 bit chars and 32 bit ints. Throw me an example that shows me wrong! Personally, I would find this to be much more readable code than adding more syntactical constructs.
Jul 06 2024
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 07/07/2024 11:19 AM, Walter Bright wrote:
 On 7/6/2024 8:54 AM, Timon Gehr wrote:
 
     The point was: D should actually specify more bitfield layout
     guarantees than the C standard.
 
 I understand that. Given that any desired portable bitfield layout can 
 be done with minimal effort, there is no need to add more semantics to 
 the language than what C does.
You have an expert understanding of the subject matter. Nobody else around here has this knowledge or expertise.

As of right now there does not appear to be a single person on the D Discord server who understands how to use C bit fields to get predictable behavior, let alone portable behavior.

I understand that you think that this is simple, but nobody else can understand it, and you are failing to explain it sufficiently. If somebody has their program failing, it will be hard to diagnose the problem, let alone explain it. The only person who can do this is you. That does not scale.

At this point multiple people who are usually responsible for explaining language features to other people and diagnosing programs are telling you that they cannot use it as intended. This should be sending up major red flags that only you can use this feature.

Please seriously reconsider the ``extern(D)``/``extern(C)`` split, because right now we will have no choice but to have DScanner issue a warning for improper use of bit fields, and it is quite frankly ridiculous that a brand new ``extern(D)`` language feature needs a warning.
Jul 06 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
C and D programmers already know how to align things with pad fields. It's a 
basic skill.

If extern(C) and extern(D) were added to bitfields, then the programmer would 
have to learn two new syntactic constructs and what they mean. It's a 
distinction that is easily forgettable, too. Quick, what's the difference in 
calling convention between extern(C) and extern(D) functions?

With pad fields, there's nothing new to learn, and it's quite obvious even for a 
naive programmer what is happening with them.
Jul 06 2024
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Friday, 5 July 2024 at 16:35:43 UTC, Walter Bright wrote:
 On 7/4/2024 10:37 PM, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand 
 Walter's belief that C is "good enough" for bit fields in 
 terms of guarantees.
It's straightforward. If you use uint as the field type, you'll get the same layout across every C compiler I've ever heard of.
What if you need > 32 bits or want to pack into a `ulong`? Is the behavior sane across compilers? -Steve
Jul 05 2024
next sibling parent reply Tim <tim.dlang t-online.de> writes:
On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer 
wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is 
 the behavior sane across compilers?
The following struct has a different layout for different platforms:

```
struct S
{
    unsigned int x;
    unsigned long long a:20, b:20, c:24;
};
```

Windows layout:

```
         0 | struct S
         0 |   unsigned int x
    8:0-19 |   unsigned long long a
   10:4-23 |   unsigned long long b
   13:0-23 |   unsigned long long c
           | [sizeof=16, align=8]
```

Linux x86_64 layout:

```
         0 | struct S
         0 |   unsigned int x
    4:0-19 |   unsigned long long a
    8:0-19 |   unsigned long long b
   10:4-27 |   unsigned long long c
           | [sizeof=16, align=8]
```

Linux i686 layout:

```
         0 | struct S
         0 |   unsigned int x
    4:0-19 |   unsigned long long a
    6:4-23 |   unsigned long long b
    9:0-23 |   unsigned long long c
           | [sizeof=12, align=4]
```
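If you do not have a layout-dumping compiler at hand, the byte offsets in tables like these can be probed at run time. The following is only a sketch: the helper names (`image_of`, `bit_count`, `first_byte`) are made up for this example, and it assumes the compiler accepts `unsigned long long` bit-fields and leaves padding bits untouched when a single field is assigned, which is typical but not guaranteed.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

struct S {
    unsigned int x;
    unsigned long long a : 20, b : 20, c : 24;
};

enum field { FIELD_A, FIELD_B, FIELD_C };

/* Copy the bytes of a zeroed struct S, with one field set to all ones, into image. */
static void image_of(enum field f, unsigned char *image)
{
    struct S s;
    memset(&s, 0, sizeof s);
    switch (f) {
    case FIELD_A: s.a = 0xFFFFF;  break;  /* 20 one-bits */
    case FIELD_B: s.b = 0xFFFFF;  break;  /* 20 one-bits */
    case FIELD_C: s.c = 0xFFFFFF; break;  /* 24 one-bits */
    }
    memcpy(image, &s, sizeof s);
}

/* Number of bits the field occupies in the object representation. */
int bit_count(enum field f)
{
    unsigned char image[sizeof(struct S)];
    int count = 0;
    image_of(f, image);
    for (size_t i = 0; i < sizeof image; i++)
        for (int bit = 0; bit < CHAR_BIT; bit++)
            count += (image[i] >> bit) & 1;
    return count;
}

/* Index of the first byte the field touches, or -1 if it touches none. */
int first_byte(enum field f)
{
    unsigned char image[sizeof(struct S)];
    image_of(f, image);
    for (size_t i = 0; i < sizeof image; i++)
        if (image[i] != 0)
            return (int)i;
    return -1;
}
```

Comparing `first_byte(FIELD_A)`, `first_byte(FIELD_B)` and `first_byte(FIELD_C)` across targets should reproduce the leading byte offsets in the dumps above; the bit positions within a byte would need a slightly finer probe.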
Jul 05 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
 On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer 
 wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is 
 the behavior sane across compilers?
The following struct has a different layout for different platforms:
...

Thanks for this.

I also tested the following, and found it too shows discrepancies.

```c
struct S {
    unsigned short x;
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};
```

Here there are only `uint` bitfields, yet the compiler chooses to layout the bits differently based on the *preceding* field.

Walter, I have to unfortunately withdraw my support for defining D bitfields to just be the same as C bitfields -- the minefields are too subtle.

The statement that "If you use uint as the field type, you'll get the same layout across every C compiler" is not true. And I don't think we can really specify the true nature of what you must do for portable bitfields in a way that is straightforward. Saying something like "you can only use `uint` bitfields in structs that contain only `uint` types" is not a good feature.

I'm back to requesting that we have a mechanism to request C bitfields (such as marking a struct as `extern(C)`), or picking one C style and going with that.

-Steve
Jul 05 2024
next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 06/07/2024 9:12 AM, Steven Schveighoffer wrote:
 On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
 On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is the 
 behavior sane across compilers?
The following struct has a different layout for different platforms:
 ...

 Thanks for this.

 I also tested the following, and found it too shows discrepancies.

 ```c
 struct S {
     unsigned short x;
     unsigned int a : 12;
     unsigned int b : 12;
     unsigned int c : 8;
 };
 ```

 Here there are only `uint` bitfields, yet the compiler chooses to layout the bits differently based on the *preceding* field.

 Walter, I have to unfortunately withdraw my support for defining D bitfields to just be the same as C bitfields -- the minefields are too subtle.

 The statement that "If you use uint as the field type, you'll get the same layout across every C compiler" is not true. And I don't think we can really specify the true nature of what you must do for portable bitfields in a way that is straightforward. Saying something like "you can only use `uint` bitfields in structs that contain only `uint` types" is not a good feature.

 I'm back to requesting that we have a mechanism to request C bitfields (such as marking a struct as `extern(C)`), or picking one C style and going with that.

 -Steve
I did not expect this.

This prevents my mitigation from working.

So now we also have to put it into an anonymous struct to even get the layout we think it should be.

```c
struct Foo {
    unsigned short x;

    struct {
        unsigned int a : 12;
        unsigned int b : 12;
        unsigned int c : 8;
    };

    //void* next;
};

int main() {
    struct Foo foo;
    foo.a = 1;
    foo.b = 0;
    return 0;
}
```

```asm
main:
 push   rbp
 mov    rbp,rsp
 mov    DWORD PTR [rbp-0x4],0x0
 mov    eax,DWORD PTR [rbp-0x8]
 and    eax,0xfffff000
 or     eax,0x1
 mov    DWORD PTR [rbp-0x8],eax
 mov    eax,DWORD PTR [rbp-0x8]
 and    eax,0xff000fff
 or     eax,0x0
 mov    DWORD PTR [rbp-0x8],eax
 xor    eax,eax
 pop    rbp
 ret
```
Jul 05 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 2:25 PM, Richard (Rikki) Andrew Cattermole wrote:
 So now we also have to put it into an anonymous struct
See my other reply.
Jul 05 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
 I also tested the following, and found it too shows discrepancies.
 
 ```c
 struct S {
      unsigned short x;
      unsigned int a : 12;
      unsigned int b : 12;
      unsigned int c : 8;
 };
 ```
The following will also show discrepancies:

```
struct T {
    unsigned short x;
    unsigned int y;
}
```

for the same reason.
 Here there are only uint bitfields, yet the compiler chooses to layout the 
 bits differently based on the preceding field.
It's actually based on the *alignment* of the preceding field. I regret not saying that, but that's what I meant with the fields need to be of the same type, so they have the same alignment. If the uint bitfield started off aligned at a uint boundary, my statement holds.

When mixing field types of different sizes, there will be different alignments of those fields on different platforms/compilers, whether or not bitfields are involved.

The layout can be portably controlled as desired, by being cognizant of field alignment:

```c
struct S {
    unsigned short x;
    unsigned short a : 12; // at offset 2
    unsigned int b : 12;   // at offset 4
    unsigned int c : 8;    // at offset 4
};
```

```c
struct S {
    unsigned short x;
    unsigned short dummy;  // for alignment porpoises
    unsigned int a : 12;   // at offset 4
    unsigned int b : 12;   // at offset 4
    unsigned int c : 8;    // at offset 4
};
```

Simply put, avoiding fields that straddle alignment boundaries avoids portability issues. This is true with both bitfields and regular fields.
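To see the effect of the pad member concretely, here is a small C sketch comparing the unpadded struct with the padded variant. The struct and helper names are made up for this example, and it assumes 16-bit `short`, 32-bit `int` and 8-bit bytes. With the explicit pad, the first byte touched by `a` should be 4 on mainstream compilers; without it, expect 2 on gcc/clang for Linux and 4 with the Microsoft ABI.

```c
#include <stddef.h>
#include <string.h>

/* Bit-fields start right after a 2-byte member. */
struct Unpadded {
    unsigned short x;
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

/* An explicit pad member moves the run of bit-fields to offset 4. */
struct Padded {
    unsigned short x;
    unsigned short dummy;
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

/* Index of the first nonzero byte in p[0..n), or -1 if all bytes are zero. */
static int first_nonzero(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            return (int)i;
    return -1;
}

/* First byte touched by field a in the unpadded struct (compiler dependent). */
int unpadded_a_byte(void)
{
    struct Unpadded s;
    memset(&s, 0, sizeof s);
    s.a = 0xFFF;
    return first_nonzero((const unsigned char *)&s, sizeof s);
}

/* First byte touched by field a in the padded struct. */
int padded_a_byte(void)
{
    struct Padded s;
    memset(&s, 0, sizeof s);
    s.a = 0xFFF;
    return first_nonzero((const unsigned char *)&s, sizeof s);
}
```

The padded result is the one you can reason about from the source alone; the unpadded result has to be looked up per ABI.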
Jul 05 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
 I also tested the following, and found it too shows 
 discrepancies.
 
 ```c
 struct S {
      unsigned short x;
      unsigned int a : 12;
      unsigned int b : 12;
      unsigned int c : 8;
 };
 ```
 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.
I tested this struct, and there were no discrepancies between compilers. All compilers put 2 bytes of padding between the `ushort` and the `uint`.
 It's actually based on the *alignment* of the preceding field. 
 I'm regret not saying that, but that's what I meant with the 
 fields need to be of the same type, so they have the same 
 alignment. If the uint bitfield started off aligned at a uint 
 boundary, my statement holds.
Hm..., well it's not ideal to require the user to nudge the compiler for the desired layout. It's an odd thing to say that a uint bitfield may not be uint aligned, even if the equivalent uint value would be. The documentation note we talked about was simple -- just always use the same type for your bitfields and it works. This is different. Not impossible to learn, but for sure more challenging.
 When mixing field types of different sizes, there will be 
 different alignments of those fields on different 
 platforms/compilers, whether or not bitfields are involved.
The confusing thing here is that the alignment does *not* obey the alignment of the containing type. And how it is aligned depends instead on the *previous* member (sometimes). This is not the case for full-sized uints.

I will note that I'm reading that ulong is aligned to 4-bytes on 32-bit linux, and so this does make an alignment difference even for non-bitfields.

My recommendation still is either:

1. Denote D bitfields by a specified layout system (pick the most common C one and do that). C bitfields can match the C compiler.
2. Simply forbid problematic alignments at compile time:

```d
struct S {
    uint x;
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}
// error, alignment of bitfield `a` may not match C layout, please use padding or aligned bitfields to specify intended layout.

// these are OK.
struct SWithPadding {
    uint x;
    uint _; // padding
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}

struct SPacked {
    ulong x : 32;
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}
```

Maybe the error only occurs if you specify a compiler switch?

-Steve
Jul 05 2024
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.
I tested this struct, and there were no discrepancies between compilers. All compilers put 2 bytes of padding between the `ushort` and the `uint`.
Try it with a 16 bit compiler, which aligns on 16 bits rather than 32 bits. No, I'm not cheating with this - I wanted to point out the consistency between 32 bit compilers, despite the Standard saying nothing about it.

But I can still break the example, with a 32/64 bit compiler:

```
struct U {
    unsigned int x;
    unsigned long y;
}
```

You'll get different sizes for 32 vs 64 bit compilations, including with D.
Jul 06 2024
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Saturday, 6 July 2024 at 23:26:43 UTC, Walter Bright wrote:
 On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.
I tested this struct, and there were no discrepancies between compilers. All compilers put 2 bytes of padding between the `ushort` and the `uint`.
 Try it with a 16 bit compiler, which aligns on 16 bits rather than 32 bits. No, I'm not cheating with this - I wanted to point out the consistency between 32 bit compilers, despite the Standard saying nothing about it.

 But I can still break the example, with a 32/64 bit compiler:

 ```
 struct U {
     unsigned int x;
     unsigned long y;
 }
 ```

 You'll get different sizes for 32 vs 64 bit compilations, including with D.
Right, but with bitfields, you get discrepancies within the *same compiler*.

Let's take another example:

```c
struct U {
   unsigned int x;
   unsigned long long y: 30;
   unsigned long long z: 34;
}

struct U2 {
   unsigned int x;
   unsigned long long y: 34;
   unsigned long long z: 30;
}
```

In the first case, Linux 64-bit clang will layout y to be right after x, and z will be pushed 2 bits further so it lines up on a 64-bit address. In the second case, in the same compiler, y is pushed *32* bits off so it lines up on a 64-bit address space, and z is past that.

In both cases, 96 bits of data consumes 128 bits (sizeof both structs is 16).

In other words, the compiler makes probably unexpected layout decisions, and the reason is "because C does it". Truly, with C, you cannot count on *any* explicit layout. Yes, there are reasons, and all of those have to do with performance. But when the goal is explicit layouts, then confusion rules.

This is why I said, either define what D does explicitly, or warn when it does something really bizarre for the sake of C.

Another option is just to say, "don't use bitfields for anything other than space-saving. Do not attempt to use bitfields for defined bit layout, because the compiler can make arbitrary decisions on layout." But you will have to tell [this guy](https://forum.dlang.org/post/v622m1$mu4$1 digitalmars.com).

D has a chance to do better, but it's clear we are just going to saddle ourselves with C insanity for the sake of zero real use cases.

-Steve
Jul 06 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:
 
 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }
 
 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```
Simple solution:

```
struct U {
     unsigned int x;
     unsigned int y:30;
     unsigned long long z:34;
}
```

or:

```
struct U2 {
     unsigned int x;
     unsigned int pad;
     unsigned long long y:30;
     unsigned long long z:34;
}
```

depending on which layout is desired. This is simple, predictable, and portable. It's not going to be a mystery to anyone reading the code - it's eminently readable.

An anonymous union can be pressed into service, which can be handy if the type of `x` is opaque:

```
struct U {
     T x;
     union {
         ulong pad; // for alignment
         struct {
             ulong y: 30;
             ulong z: 34;
         }
     }
}
```

or use align:

```
struct U {
     T x;
     align(8) ulong y:30, z:34;
}
```

There are many existing ways to accomplish this. Adding more language features to duplicate existing capability needs a very strong case.
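On the C side, the padded variant can also carry its own layout checks, so a surprising target fails at compile time instead of silently shifting fields. This is a sketch only: it assumes a C11 compiler, 32-bit `unsigned int`, a 64-bit `unsigned long long` that is accepted as a bit-field type, and the helper name `u2_size` is made up for this example.

```c
#include <stddef.h>

struct U2 {
    unsigned int x;
    unsigned int pad;            /* explicit padding up to offset 8 */
    unsigned long long y : 30;
    unsigned long long z : 34;
};

/* offsetof cannot be applied to bit-fields, so check their neighbours instead. */
_Static_assert(offsetof(struct U2, pad) == 4, "pad should directly follow x");
_Static_assert(sizeof(struct U2) == 16, "expected x + pad + one 64-bit unit");

/* Hypothetical helper: report the size at run time. */
size_t u2_size(void)
{
    return sizeof(struct U2);
}
```

The assertions do not pin down the bit order inside the 64-bit unit; they only confirm that the run of bit-fields starts where the pad member was meant to put it.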
Jul 06 2024
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/7/24 06:47, Walter Bright wrote:
 On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:

 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }

 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```
Simple solution: ...
You said: Same type, same alignment. It's clearly not true in Steven's example. It seems alignment depends on bit width.

Also consider this:

```d
struct S{
    uint x;
    ulong y:30;
    ulong z:34;
}
pragma(msg, S.y.offsetof, " ", S.y.alignof); // 4LU 8LU
```

The offset of `y` does not even respect its alignment! This is insanity.

It also happens with `uint`:

```d
struct S{
    ushort x;
    uint y:16;
}
pragma(msg, S.y.offsetof, " ", S.y.alignof); // 2LU 4LU
```

I.e., "stick to `int`/`uint` bitfields, things will be predictable" is not even true. They may be laid out differently based on what's before them.
 ```
 struct U {
      unsigned int x;
      unsigned int y:30;
      unsigned long long z:34;
 }
 ```
 
 or:
 
 ```
 struct U2 {
      unsigned int x;
      unsigned int pad;
      unsigned long long y:30;
      unsigned long long z:34;
 }
 ```
 
 depending on which layout is desired. This is simple,
If it is simple, you should have no trouble stating how it works completely in a couple sentences.
 predictable, and 
 portable. It's not going to be a mystery to anyone reading the code - 
 it's eminently readable.
 ...
Walter, this is frustrating. It is only obvious to you because having reverse-engineered and implemented it, you already know how it works. Note that the things you were saying earlier suggested it would actually work differently in Steven's example. I hope you understand that this is confusing. I am as a result now not sure whether what you stated is the full truth, or it is still some inadmissible simplification that glosses over some further dragons. Also, I hope `.offsetof % .alignof != 0` is just a bug in your bitfield implementation.
Jul 07 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/7/2024 3:42 AM, Timon Gehr wrote:
 If it is simple, you should have no trouble stating how it works completely in 
 a couple sentences.
One sentence: If the bitfields of type T start on a T alignment boundary and do not straddle a T alignment boundary, then the bitfields will be portable. I agree I sometimes have trouble writing exact specifications, but I'm also confident that you understand this.
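The one-sentence rule can be turned into a mechanical check. The C sketch below is a simplified model, not how any compiler decides layout: the function name `run_is_portable` is made up for this example, it assumes every field in the run has the same declared type T, that T's alignment equals its size, and that fields are packed back to back starting at `start_bit`.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if a run of bit-fields of one type T follows the rule:
 * the run starts on a T boundary and no field straddles a T boundary.
 *   start_bit : bit offset of the first field within the struct
 *   t_bits    : size (and assumed alignment) of T in bits
 *   widths    : bit widths of the fields, in declaration order
 *   n         : number of fields
 */
bool run_is_portable(size_t start_bit, size_t t_bits,
                     const unsigned *widths, size_t n)
{
    if (t_bits == 0 || start_bit % t_bits != 0)
        return false;                       /* run does not start on a T boundary */

    size_t pos = start_bit;
    for (size_t i = 0; i < n; i++) {
        if (widths[i] == 0 || widths[i] > t_bits)
            return false;                   /* width outside what this model handles */
        size_t first_unit = pos / t_bits;
        size_t last_unit  = (pos + widths[i] - 1) / t_bits;
        if (first_unit != last_unit)
            return false;                   /* field would straddle a T boundary */
        pos += widths[i];
    }
    return true;
}
```

Under this model, `unsigned int a:12, b:12, c:8` starting at bit 16 (right after an `unsigned short`) is rejected, while the same run starting at bit 32 is accepted; likewise `unsigned long long y:30, z:34` is rejected at bit 32 and accepted at bit 64.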
 I am as a result now not sure whether what you stated is the full truth, or it 
 is still some inadmissible simplification that glosses over some further dragons.
Feel free to try pathological examples and let me know of any adverse discoveries.
 Also, I hope `.offsetof % .alignof != 0` is just a bug in your bitfield 
 implementation.
??
Jul 08 2024
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/9/24 01:52, Walter Bright wrote:
 On 7/7/2024 3:42 AM, Timon Gehr wrote:
 If it is simple, you should have no trouble stating how it works 
 completely in a couple sentences.
One sentence: If the bitfields of type T start on a T alignment boundary and do not straddle a T alignment boundary, then the bitfields will be portable. ...
Well, this is not a complete characterization, but good enough I guess. So the preferred alignment of a bitfield of a given width is not portable? I.e., are there so-called sane C compilers where a `uint:16` has an (actual) alignment of 4 instead of 2?
 I agree I sometimes have trouble writing exact specifications, but I'm 
 also confident that you understand this.
 ...
Sure, but I really think we should just enforce this kind of rule for `extern(D)` bitfields. If a programmer does not follow the rule, just error out and present options to the programmer for how to make the code compile:

error: bitfield layout is ambiguous

- add extern(C) to match the layout of the associated C compiler
- add padding and/or 0-width bitfields to unambiguously start bitfields on a T alignment boundary without straddling

A priori you just don't know which of those was intended. It's good to require explicit input here, as it is subtle.
 
 I am as a result now not sure whether what you stated is the full 
 truth, or it is still some inadmissible simplification that glosses 
 over some further dragons.
Feel free to try pathological examples and let me know of any adverse discoveries.
 Also, I hope `.offsetof % .alignof != 0` is just a bug in your 
 bitfield implementation.
??
It's elaborated upon in the part of the post you ignored: On 7/7/24 12:42, Timon Gehr wrote:
 
 Also consider this:
 
 ```d
 struct S{
      uint x;
      ulong y:30;
      ulong z:34;
 }
 pragma(msg, S.y.offsetof, " ", S.y.alignof); // 4LU 8LU
 ```
 
 The offset of `y` does not even respect its alignment! This is insanity.
 
 It also happens with `uint`:
 
 ```d
 struct S{
      ushort x;
      uint y:16;
 }
 pragma(msg, S.y.offsetof, " ", S.y.alignof); // 2LU 4LU
 ```
Jul 09 2024
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 02:32, Timon Gehr wrote:
 
 error: bitfield layout is ambiguous
 
 - add extern(C) to match the layout of the associated C compiler
 - add padding and/or 0-width bitfields to unambiguously start bitfields 
 on a T alignment boundary without straddling
Or change some of the bitfield types to ones with smaller alignment I guess. (If that is necessary at all. It's still not so obvious exactly what assumptions are portable in practice.)
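The rule being debated here (a run of bitfields of type T starts on a T alignment boundary and no field straddles one) can be written down as a small check. The following is only a sketch in C, not something any compiler in this thread implements: `run_is_unambiguous` is a hypothetical helper, and it assumes that the alignment of T equals its size, expressed as `unit_bits`, which is exactly the kind of assumption the thread says may not be portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: start_bit is the bit offset (from the start of the
 * struct) at which the first bitfield of the run would be placed, widths[]
 * holds the declared widths in order, and unit_bits is the size of T in bits.
 * Assumption: the alignment of T equals its size. */
static bool run_is_unambiguous(unsigned start_bit, const unsigned *widths,
                               size_t n, unsigned unit_bits)
{
    assert(unit_bits > 0);

    /* The run must begin on a T boundary. */
    if (start_bit % unit_bits != 0)
        return false;

    unsigned pos = start_bit;
    for (size_t i = 0; i < n; i++) {
        unsigned w = widths[i];
        if (w == 0 || w > unit_bits)
            return false;               /* zero-width fields are not modelled here */
        /* First and last bit of the field must fall in the same unit. */
        if (pos / unit_bits != (pos + w - 1) / unit_bits)
            return false;
        pos += w;
    }
    return true;
}
```

For the `uint x; ulong y:30; ulong z:34;` example quoted above, the run would start at bit 32 with a 64-bit unit, so the check rejects it; with 32 bits of explicit padding after `x` the run starts at bit 64 and is accepted.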
Jul 09 2024
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Wednesday, 10 July 2024 at 00:32:53 UTC, Timon Gehr wrote:
 On 7/9/24 01:52, Walter Bright wrote:
 I agree I sometimes have trouble writing exact specifications, 
 but I'm also confident that you understand this.
 ...
Sure, but I really think we should just enforce this kind of rule for `extern(D)` bitfields. If a programmer does not follow the rule, just error out and present options to the programmer for how to make the code compile:

error: bitfield layout is ambiguous

- add extern(C) to match the layout of the associated C compiler
- add padding and/or 0-width bitfields to unambiguously start bitfields on a T alignment boundary without straddling

A priori you just don't know which of those was intended. It's good to require explicit input here, as it is subtle.
Yes, this is the correct answer. I stayed away from `extern(C)` specification because I *kinda* see the point that we have no precedent for `extern(C)` to adjust field layout. But this seems so obvious to me, I challenge anyone to fault this as a bad experience. For those who want C Compatibility, just say so. The D compiler has you covered. For those who want exact bitfield layout, you can use D, because D ensures you have not shot yourself in the foot by making an ambiguous layout request. -Steve
Jul 09 2024
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 03:41, Steven Schveighoffer wrote:
 I stayed away from `extern(C)` specification because I *kinda* see the 
 point that we have no precedent for `extern(C)` to adjust field layout.
Well, it does affect layout:

```d
extern(C) struct S{}
pragma(msg, S.sizeof); // 0LU
pragma(msg, (S[100]).sizeof); // 0LU

struct T{}
pragma(msg, T.sizeof); // 1LU
pragma(msg, (T[100]).sizeof); // 100LU
```

In any case, here, the usage is a bit different, in that the `extern(D)` version would just be a bit more restrictive, but still fully compatible.
Jul 09 2024
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/9/2024 6:57 PM, Timon Gehr wrote:
 ```d
 extern(C) struct S{}
 pragma(msg, S.sizeof); // 0LU
 pragma(msg, (S[100]).sizeof); // 0LU
 
 struct T{}
 pragma(msg, T.sizeof); // 1LU
 pragma(msg, (T[100]).sizeof); // 100LU
 ```
 
 In any case, here, the usage is a bit different, in that the `extern(D)`
 version would just be a bit more restrictive, but still fully compatible.
C and C++ differ here, too. D defaults to the C++ route because they wanted distinct objects to have distinct addresses, which made sense to me.
Jul 09 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/9/2024 5:32 PM, Timon Gehr wrote:
 The offset of `y` does not even respect its alignment! This is insanity.
That's right. It's not a bug, it matches what the associated C compiler does. It's the same thing as Steven pointed out. I posted how to portably get either arrangement.
Jul 09 2024
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 08:53, Walter Bright wrote:
 On 7/9/2024 5:32 PM, Timon Gehr wrote:
 The offset of `y` does not even respect its alignment! This is insanity.
That's right. It's not a bug, it matches what the associated C compiler does.
Nonsense. The issue is the inconsistency between `S.y.alignof` and `S.y.offsetof`. In C, neither `offsetof` nor `alignof` work with bitfields in the first place, so the question does not even pose itself.
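Since C offers neither `offsetof` nor `alignof` on a bitfield, the usual way to see where a C compiler actually put one is to probe the object representation at run time. The sketch below is only an illustration: `first_set_bit` and `bit_offset_of_y` are hypothetical names, it assumes 8-bit bytes and a 16-bit `unsigned short`, and the bit index it reports uses a least-significant-bit-first numbering within each byte, which is a convention chosen here rather than anything the standard defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* C counterpart of the second D example quoted in this subthread. */
struct S {
    unsigned short x;
    unsigned int y : 16;
};

/* Index of the first set bit in an object's bytes, or -1 if all bits are zero. */
static long first_set_bit(const void *p, size_t n)
{
    const unsigned char *b = p;
    assert(p != NULL);
    for (size_t i = 0; i < n; i++)
        for (int bit = 0; bit < 8; bit++)
            if (b[i] & (1u << bit))
                return (long)(i * 8 + bit);
    return -1;
}

/* Zero the struct, set every bit of y, then look for the first set bit. */
static long bit_offset_of_y(void)
{
    struct S s;
    memset(&s, 0, sizeof s);
    s.y = 0xFFFF;
    return first_set_bit(&s, sizeof s);
}
```

The result depends on the compiler and ABI, which is the point of contention: the only thing the sketch can rely on is that `y` lies somewhere after the bits occupied by `x` and inside the struct.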
Jul 10 2024
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Sunday, 7 July 2024 at 04:47:30 UTC, Walter Bright wrote:
 On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:
 
 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }
 
 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```
Simple solution:

```
struct U {
    unsigned int x;
    unsigned int y:30;
    unsigned long long z:34;
}
```

or:

```
struct U2 {
    unsigned int x;
    unsigned int pad;
    unsigned long long y:30;
    unsigned long long z:34;
}
```

depending on which layout is desired. This is simple, predictable, and portable. It's not going to be a mystery to anyone reading the code - it's eminently readable.
Simple, no. Predictable, yes (it's unambiguous). And not obvious. What I want is for the compiler to *require* you to do this to avoid inconsistencies. It is going to be a mystery to anyone reading it *why* they put these things in there. (hey, I simplified your code by getting rid of the pad, it comes out the same anyway due to [my wholly understandable but mistaken understanding of] alignment!) To give some examples, we require empty if statements to use {} and not ;. It doesn't require any new syntax but it helps you avoid issues that many people make, even though it is allowed in C. We require explicit conversion when narrowing the range of an integer (i.e. assigning a long to an int). This avoids issues that many people would make, even though it is allowed in C.
 There are many existing ways to accomplish this. Adding more 
 language features to duplicate existing capability needs a very 
 strong case.
I'm not asking for any new features. -Steve
Jul 07 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
 Simple, no. Predictable, yes (it's unambiguous). And not obvious.
It is trivially obvious to the most casual observer! Joking aside, it's the same technique used to inure a struct layout against member alignment issues.
 What I want is for the compiler to *require* you to do this to avoid inconsistencies. It is 
 going to be a mystery to anyone reading it *why* they put these things in there.
I've seen fields named "pad" or "padding" many times in C code. It's normal practice. Failing that, the purpose of comments is to add the 'why'. One could also use `static assert` for extra insurance. I've also seen fields named "reserved". No comment needed.
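The "pad plus `static assert`" approach mentioned here might look roughly like the following in C11. This is a sketch under stated assumptions: the trailing member `tail` is added purely for illustration so that there is an ordinary member whose offset can be checked, the struct otherwise follows the `U2` example from earlier in the thread, and `uint64_t` as a bitfield type is itself implementation-defined in C, although common compilers accept it.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct U2 {
    uint32_t x;
    uint32_t pad;        /* explicit padding: the 64-bit unit starts at byte 8 */
    uint64_t y : 30;
    uint64_t z : 34;     /* 30 + 34 = 64, so the unit is exactly filled */
    uint32_t tail;       /* illustrative member, not part of the original example */
};

/* offsetof cannot be applied to y or z, so the checks pin down the
 * ordinary members on either side of the bitfield run instead. */
static_assert(offsetof(struct U2, pad) == 4,
              "pad must directly follow x");
static_assert(offsetof(struct U2, tail) == 16,
              "the bitfield run must occupy bytes 8..15");
```

If someone later removes `pad` because it looks redundant, the second assertion is what would be expected to fail at compile time, which addresses the concern about the padding being a mystery to readers.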
 To give some examples, we require empty if statements to use {} and not ;. It 
 doesn't require any new syntax but it helps you avoid issues that many people 
 make, even though it is allowed in C.
Then one could not write a C compatible bitfield.
 We require explicit conversion when narrowing the range of an integer (i.e. 
 assigning a long to an int). This avoids issues that many people would make, 
 even though it is allowed in C.
The C semantics are still allowed by adding a cast. Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know you've done some of this yourself! Bob doesn't want to go through it line by line. Isn't it nice for Bob if it "just works"? If all those data declarations just work? Especially if the result still has to be compatible with the files that C code wrote out? But what if the compiler says "Bob, you can't lay out a bitfield like that!" Or worse, it lays out the bitfield into a portable (but different) layout. Then it doesn't just work, Bob has got some debugging to do (while Bob curses D and me), and Bob's got to figure out an alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.
 There are many existing ways to accomplish this. Adding more language features 
 to duplicate existing capability needs a very strong case.
I'm not asking for any new features.
Every switch that changes the semantics is a new feature and a new source of complexity and bugs. One of my original requirements for D was no switches that change language semantics. I have failed at that. But I wasn't wrong to aspire towards it.
Jul 08 2024
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On Tuesday, 9 July 2024 at 00:29:20 UTC, Walter Bright wrote:
 On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
 Simple, no. Predictable, yes (it's unambiguous). And not 
 obvious.
It is trivially obvious to the most casual observer! Joking aside, it's the same technique used to inure a struct layout against member alignment issues.
Yes, but there is a subtle difference -- the compiler ignores its own rules. In other words, explicit padding is required way more than with normal fields, which have consistent layout expectations. As Timon points out, the compiler doesn't obey its own alignment requirements for bitfields.
 What I want is for the compiler to *require* you to do this to 
 avoid inconsistencies. It is going to be a mystery to anyone 
 reading it *why* they put these things in there.
I've seen fields named "pad" or "padding" many times in C code. It's normal practice. Failing that, the purpose of comments is to add the 'why'. One could also use `static assert` for extra insurance. I've also seen fields named "reserved". No comment needed.
I concede that this is probably true. This does rely on convention though, and having the compiler yell at you if you try to remove it is even better.
 To give some examples, we require empty if statements to use 
 {} and not ;. It doesn't require any new syntax but it helps 
 you avoid issues that many people make, even though it is 
 allowed in C.
Then one could not write a C compatible bitfield.
Yes you can. You can use C to write a C compatible bitfield (ImportC is a thing). If you are using C bitfields as part of an API, it's either to do register layout or protocol processing. In both of these cases, layout matters more than arbitrary implementation matching. If you have a use case that relies on the arbitrariness of C bitfields (i.e. doesn't care), then yeah, I guess you have to go through ImportC. I don't see a problem with this -- this is almost always not public API (due to the problems with C bitfields). See for instance how the linux kernel doesn't use bitfields for anything other than internal flags to save space. It's not something we need to cater to.
 We require explicit conversion when narrowing the range of an 
 integer (i.e. assigning a long to an int). This avoids issues 
 that many people would make, even though it is allowed in C.
The C semantics are still allowed by adding a cast.
The C bitfield layout is achievable with D as well, it just might not be the same exact syntax. i.e. you may need to use a uint instead of unsigned long long, or you might need to insert padding.
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C 
 code to D. I know you've done some of this yourself! Bob 
 doesn't want to go through it line by line. Isn't it nice for 
 Bob if it "just works"? If all those data declarations just 
 work? Especially if the result still has to be compatible with 
 the files that C code wrote out?
ImportC is a thing. Leave the bitfield structs defined in C until you are fully in D, then use D bitfields. Or you modify your C code to use the recommended layouts that D uses. If you don't care about layout, it shouldn't be a problem. And the D port should tell you exactly which parts you need to change through the errors.
 But what if the compiler says "Bob, you can't lay out a 
 bitfield like that!" Or worse, it lays out the bitfield into a 
 portable (but different) layout. Then it doesn't just work, Bob 
 has got some debugging to do (while Bob curses D and me), and 
 Bob's got to figure out an alternative. Who wants to do that? 
 Not Bob. Not me. Not nobody not nohow.
This already happens, we don't need bitfields for this kind of pain. ImportC is the solution. Note that this follows the rule "if it looks like C and compiles, it should act like C". It's OK for things *not* to compile because we decided they are too error prone.
 I'm not asking for any new features.
Every switch that changes the semantics is a new feature and a new source of complexity and bugs. One of my original requirements for D was no switches that change language semantics. I have failed at that. But I wasn't wrong to aspire towards it.
How convenient that we draw the line here. I have no rebuttal for this as it's totally arbitrary, so if this is your only qualm, I guess you got me. -Steve
Jul 08 2024
parent reply Walter Bright <newshound2 digitalmars.com> writes:
I had written a detailed reply, but realized you and I were simply running 
around in the same circle saying the same things.
Jul 10 2024
parent reply Daniel N <no public.email> writes:
On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were 
 simply running around in the same circle saying the same things.
Maybe some input from 3rd party could help? I use bitfields daily and never had any issues. What I do is to always use fix size types and then simply take all freedom away from the compiler.

    uint32_t a;
    uint32_t :32; // Forced padding
    uint64_t b:10;
    uint64_t c:10;
    uint64_t :44; // Forced padding
    uint32_t d;

I guess one can use 0 size bitfields also but I usually prefer to visualize how much padding remains for potential future use.
Jul 10 2024
parent reply Daniel N <no public.email> writes:
On Wednesday, 10 July 2024 at 07:43:40 UTC, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were 
 simply running around in the same circle saying the same 
 things.
Maybe some input from 3rd party could help? I use bitfields daily and never had any issues. What I do is to always use fix size types and then simply take all freedom away from the compiler.

    uint32_t a;
    uint32_t :32; // Forced padding
    uint64_t b:10;
    uint64_t c:10;
    uint64_t :44; // Forced padding
    uint32_t d;

I guess one can use 0 size bitfields also but I usually prefer to visualize how much padding remains for potential future use.
PS To avoid relying on convention, you could make an incomplete bitfield a compilation error in D, then D bitfields would have C layout *AND* be deterministic.
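In C today, the "incomplete bitfield is an error" idea can be approximated by hand. The sketch below wraps the declarations above in a struct and adds a compile-time check that the widths of the 64-bit run add up to exactly 64. The names `Msg`, `B_BITS`, `C_BITS`, `PAD_BITS` and `msg_roundtrip_ok` are made up for illustration, and the check only works because the widths are spelled as named constants; it is a convention, not something the compiler enforces on its own.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Widths of the 64-bit run, named so they can be summed at compile time. */
enum { B_BITS = 10, C_BITS = 10, PAD_BITS = 44 };

static_assert(B_BITS + C_BITS + PAD_BITS == 64,
              "bitfield run must exactly fill its uint64_t unit");

struct Msg {
    uint32_t a;
    uint32_t   : 32;        /* forced padding up to the next 8-byte boundary */
    uint64_t b : B_BITS;
    uint64_t c : C_BITS;
    uint64_t   : PAD_BITS;  /* forced padding, visible room for future fields */
    uint32_t d;
};

/* Returns 1 if values written to the fields read back unchanged. */
static int msg_roundtrip_ok(void)
{
    struct Msg m = {0};
    m.b = 1023;             /* largest value that fits in 10 bits */
    m.c = 5;
    m.d = 0xDEADBEEFu;
    return m.b == 1023 && m.c == 5 && m.d == 0xDEADBEEFu;
}
```

Changing one width without adjusting `PAD_BITS` makes the `static_assert` fail, which is a manual version of the compile error proposed here.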
Jul 10 2024
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 09:55, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:43:40 UTC, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were simply 
 running around in the same circle saying the same things.
Maybe some input from 3rd party could help? I use bitfields daily and never had any issues. What I do is to always use fix size types and then simply take all freedom away from the compiler.

    uint32_t a;
    uint32_t  :32; // Forced padding
    uint64_t b:10;
    uint64_t c:10;
    uint64_t  :44; // Forced padding
    uint32_t d;

I guess one can use 0 size bitfields also but I usually prefer to visualize how much padding remains for potential future use.
PS To avoid relying on convention, you could make an incomplete bitfield a compilation error in D, then D bitfields would have C layout *AND* be deterministic.
Yes, something like that I think would be great, but I think Walter has a point that there should still be a way to match C bitfields even if the original author was less competent than you w.r.t. bitfield layout. Hence the proposal that anything goes if there is an `extern(C)` annotation, but for `extern(D)` bitfields, something like your approach would be enforced by the compiler.
Jul 10 2024
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 7/9/24 02:29, Walter Bright wrote:
 
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I 
 know you've done some of this yourself! Bob doesn't want to go through 
 it line by line. Isn't it nice for Bob if it "just works"?
It won't, some edits will be necessary.
 If all those 
 data declarations just work? Especially if the result still has to be 
 compatible with the files that C code wrote out?
 
 But what if the compiler says "Bob, you can't lay out a bitfield like 
 that!"
The compiler should simply say: "Bob, are you sure you want to lay out a bitfield like this?" If Bob is comfortable with it, he can add `extern(C)` and move on.
 Or worse, it lays out the bitfield into a portable (but 
 different) layout.
Well I think this is not an option.
 Then it doesn't just work, Bob has got some debugging 
 to do (while Bob curses D and me), and Bob's got to figure out an 
 alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.
As far as I am concerned, this is an irrelevant straw man. I don't want this. I never suggested anything that would cause this. It's pure FUD. Similarly, I don't want to go chasing down subtle differences in behavior/cache performance etc. between platforms. Portability may be important. It shouldn't be insane by default, it should be insane by choice. Informed consent. Especially given that bitfields have a "much nicer syntax" than alternative approaches. It's not nice to hand out a footgun disguised as candy.
Jul 09 2024
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 02:44, Timon Gehr wrote:
 
 Especially given that bitfields have a "much nicer syntax" than 
 alternative approaches. It's not nice to hand out a footgun disguised as 
 candy.
Maybe check out this guy's take on this kind of thing: https://youtu.be/3iWn4S8JV8g We should take it to heart.
Jul 09 2024
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/9/2024 5:44 PM, Timon Gehr wrote:
 On 7/9/24 02:29, Walter Bright wrote:
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know 
 you've done some of this yourself! Bob doesn't want to go through it line by 
 line. Isn't it nice for Bob if it "just works"?
It won't, some edits will be necessary.
Yes, we know it is imperfect. The fewer nits, the better.
 Then it doesn't just work, Bob has got some debugging to do (while Bob curses 
 D and me), and Bob's got to figure out an alternative. Who wants to do that? 
 Not Bob. Not me. Not nobody not nohow.
As far as I am concerned, this is an irrelevant straw man. I don't want this. I never suggested anything that would cause this. It's pure FUD.
Having a D code with the same declarations as C code, but the code generated is different, is going to lead to subtle memory bugs. I.e. just another footgun.
 It's not nice to hand out a footgun disguised as candy.
Requiring an extern(C) to make it compatible with a C layout is just another footgun, and there's no way for the compiler to detect it. The implementation-defined C layout has been there for what, 50 years? If it was so awful there'd be proposals to the C Standard to change it. People gripe about it now and then, but just go and fix their code and move on. Neither has C++ ever made any effort to change it, even though C++ has `extern "C"`. I do not understand why this is such a problem, since C compilers change the struct member layout based on compiler switches (which I showed in another post), and elicits no complaint from anybody. Having the default D struct member layout not line up with the associated C compiler layout is a memory safety issue. Not lining up with an externally imposed layout is not a memory safety issue. The bottom line, whether D supports bitfields or not, whether extern(C) is applied or not, to conform to an externally specified layout, you're going to have to check and see if it matches. If it doesn't match, there are really simple ways to get it to match.
Jul 10 2024
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 7/10/24 09:44, Walter Bright wrote:
 On 7/9/2024 5:44 PM, Timon Gehr wrote:
 On 7/9/24 02:29, Walter Bright wrote:
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to 
 D. I know you've done some of this yourself! Bob doesn't want to go 
 through it line by line. Isn't it nice for Bob if it "just works"?
It won't, some edits will be necessary.
Yes, we know it is imperfect. The fewer nits, the better. ...
No, there are other considerations, otherwise D would be identical to C.
 
 Then it doesn't just work, Bob has got some debugging to do (while 
 Bob curses D and me), and Bob's got to figure out an alternative. Who 
 wants to do that? Not Bob. Not me. Not nobody not nohow.
As far as I am concerned, this is an irrelevant straw man. I don't want this. I never suggested anything that would cause this. It's pure FUD.
Having a D code with the same declarations as C code, but the code generated is different, is going to lead to subtle memory bugs. I.e. just another footgun. ...
My position is: no footguns. It's easily achievable. Your position is: one footgun or another footgun, does it really matter, let's just choose the footgun with the simpler design.
 
 It's not nice to hand out a footgun disguised as candy.
Requiring an extern(C) to make it compatible with a C layout is just another footgun, and there's no way for the compiler to detect it. ...
Again: You are arguing against something you made up yourself. Something that is not even on the table. I am however glad you agree there should not be footguns.
 The implementation-defined C layout has been there for what, 50 years? 
 If it was so awful there'd be proposals to the C Standard to change it. 
I think you know very well that C has many design errors that were never fixed. Those people put up with C in the first place. They often even think it is a well-designed language.
 People gripe about it now and then, but just go and fix their code and 
 move on. Neither has C++ ever made any effort to change it, even though 
 C++ has `extern "C"`.
 
 I do not understand why this is such a problem,
Because D prides itself on fixing mistakes, including underspecified layout.
 since C compilers change 
 the struct member layout based on compiler switches (which I showed in 
 another post),
I use D because it does not do stupid things like that. You seem to think that "some C implementations did it that way" is in some way a good way to justify something. It just is not.
 and elicits no complaint from anybody.
 ...
I highly doubt it. I would complain about this.
 Having the default D struct member layout not line up with the 
 associated C compiler layout is a memory safety issue.
I know. I care about memory safety. I do however not care about your argument, because it is a pure straw man. I am _not_ suggesting to make `extern(D)` bitfields silently have a different layout, just a bit more restrictions, rejecting sloppily written bit-field code.

Anyway, if you interoperate with C, you are on your own w.r.t. memory safety anyway, as you have no idea what kind of compiler extensions, attributes, and switches will change the struct layout in a way that ImportC does not yet understand, but will silently ignore.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
struct __attribute((packed)) S{
    int x;
    int* y;
};
int main(){
    printf("%ld\n",offsetof(struct S, x));
    printf("%ld\n",offsetof(struct S, y));
}
```

```d
dmd -run test.c
0
8
```

```d
gcc test.c && ./a.out
0
4
```
 Not lining up 
 with an externally imposed layout is not a memory safety issue.
 ...
It does not even need to be externally imposed. Consistency and reproducibility are important on their own.
 The bottom line, whether D supports bitfields or not, whether extern(C) 
 is applied or not, to conform to an externally specified layout, you're 
 going to have to check and see if it matches. If it doesn't match, there 
 are really simple ways to get it to match.
 
On the platform you happen to be using. It may not work on another platform. The entire point is that it should be sufficient to check once, and to match it once. If what you need to match is insane inconsistent C behavior, just be explicit about it. That is all I am asking.
Jul 10 2024
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 My recommendation still is either:
 
 1. Denote D bitfields by a specified layout system (pick the most common C one 
 and do that). C bitfields can match the C compiler.
 2. Simply forbid problematic alignments at compile time:
 
 ```d
 struct S {
     uint x;
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 
 // error, alignment of bitfield `a` may not match C layout, please use padding 
 or aligned bitfields to specify intended layout.
 
 // these are OK.
 struct SWithPadding {
     uint x;
     uint _; // padding
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 
 struct SPacked {
     uint64 x : 32;
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 ```
 
 Maybe the error only occurs if you specify a compiler switch?
It's clear how to get the desired portable layout. Consider that nobody ever requested a warning on:

```
struct S {
    uint x;
    ulong y;
}
```

which is the equivalent. Not for D, C, or C++. At least none directed at my compilers. I would be good with a note about this technique in the specification.

P.S. We've already got soooo many compiler switches, adding another one needs a strong case. Every such switch is a bug :-/
Jul 06 2024
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/5/2024 12:35 PM, Steven Schveighoffer wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is the behavior
 sane across compilers?
Yes. The trouble happens when you mix different field types. There are also differences when declaring "packed" bit fields - a C extension that ImportC does not implement. You can see which cases are different in:

ImportC:
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix64.c

D:
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix64.c
Jul 05 2024
prev sibling parent reply cc <cc nevernet.com> writes:
On Friday, 5 July 2024 at 05:37:50 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Today many people have spent some time to try and understand 
 Walter's belief that C is "good enough" for bit fields in terms 
 of guarantees.
Can I define my D struct using bitfields, and then correctly unpack them in a GLSL shader using bitfieldExtract() with the predicted offsets and sizes? If the answer is no, then that bitfield implementation belongs in a garbage can. Currently the answer to this question is "yes" for std.bitmanip and "no" for "native" D bitfields.
Jul 06 2024
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 07/07/2024 2:22 PM, cc wrote:
 On Friday, 5 July 2024 at 05:37:50 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Today many people have spent some time to try and understand Walter's 
 belief that C is "good enough" for bit fields in terms of guarantees.
Can I define my D struct using bitfields, and then correctly unpack them in a GLSL shader using bitfieldExtract() with the predicted offsets and sizes?  If the answer is no, then that bitfield implementation belongs in a garbage can. Currently the answer to this question is "yes" for std.bitmanip and "no" for "native" D bitfields.
Yes and no. If it is exactly 32 bits, it's fine. If it unintentionally crosses above that (due to implementation-defined behavior), or if it is (u)int 2/3/4 then it likely won't work.
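When the offsets and sizes have to match what a shader reads with `bitfieldExtract()`, one option that avoids compiler-chosen layout entirely is to pack the 32-bit word by hand with shifts and masks. The following is a minimal sketch in C, not a description of how std.bitmanip is implemented: `insert_bits` and `extract_bits` are hypothetical helpers, with `extract_bits` intended to mirror the unsigned form of GLSL's `bitfieldExtract(value, offset, bits)` for `offset + bits <= 32`.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; handles bits == 32 without an
 * out-of-range shift. */
static uint32_t low_mask(unsigned bits)
{
    assert(bits <= 32);
    return bits == 32 ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
}

/* Read `bits` bits starting at bit `offset` (bit 0 = least significant). */
static uint32_t extract_bits(uint32_t value, unsigned offset, unsigned bits)
{
    assert(offset + bits <= 32);
    if (bits == 0)
        return 0;
    return (value >> offset) & low_mask(bits);
}

/* Return `base` with `bits` bits at `offset` replaced by the low bits of `field`. */
static uint32_t insert_bits(uint32_t base, uint32_t field,
                            unsigned offset, unsigned bits)
{
    assert(offset + bits <= 32);
    if (bits == 0)
        return base;
    uint32_t mask = low_mask(bits) << offset;
    return (base & ~mask) | ((field << offset) & mask);
}
```

Because the offsets are written out explicitly on both sides, the layout does not depend on any compiler's bitfield allocation order, at the cost of the nicer declaration syntax that native bitfields offer.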
Jul 06 2024