digitalmars.D - C bitfields guarantees

Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:

Today many people have spent some time to try and understand 
Walter's belief that C is "good enough" for bit fields in terms 
of guarantees.

I believe I have understood a core component to this.

From the C23 standard:

 An implementation may allocate any addressable storage unit large enough
 to hold a bit-field. If enough space remains, a bit-field that immediately
 follows another bit-field in a structure shall be packed into adjacent bits
 of the same unit. If insufficient space remains, whether a bit-field that
 does not fit is put into the next unit or overlaps adjacent units is
 implementation-defined. The order of allocation of bit-fields within a unit
 (high-order to low-order or low-order to high-order) is
 implementation-defined. The alignment of the addressable storage unit is
 unspecified.

What matters is the _initial_ type in the bit-field, the rest of 
the types _do not_ matter.

As long as you _do not_ start and finish in two separate memory 
addresses for that initial type it will be predictable.

I have filed a 
[ticket](https://github.com/dlang-community/D-Scanner/issues/955) 
for dscanner to introduce a warning to tell you that the compiler 
is going to do a bad thing, that will cause you problems and the 
compiler will not assist you.

Ideally, we wouldn't allow it for ``extern(D)`` code at all.

As of right now, assuming we get Dscanner to give the warning I 
can withdraw my concerns, although I do think that ``extern(D)`` 
shouldn't be offering you such a heavy foot-gun.

Jul 04 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/5/24 07:37, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand Walter's 
 belief that C is "good enough" for bit fields in terms of guarantees.
 
 I believe I have understood a core component to this.
 
  From the C23 standard:
 
 An implementation may allocate any addressable storage unit large 
 enough to hold a bit-field. If

 enough space remains, a bit-field that immediately follows another 
 bit-field in a structure shall be
 packed into adjacent bits of the same unit. If insufficient space 
 remains, whether a bit-field that
 does not fit is put into the next unit or overlaps adjacent units is 
 implementation-defined. The
 order of allocation of bit-fields within a unit (high-order to low-order 
 or low-order to high-order) is
 implementation-defined. The alignment of the addressable storage unit is 
 unspecified.
 
 What matters is the _initial_ type in the bit-field, the rest of the 
 types _do not_ matter.
 ...

According to this text,  none of the types matter for layout guarantees. 
Only the bit sizes matter somewhat. And then the implementation still 
has way too much leeway in how it allocates things.

Walter's reasoning has been that _in practice_, C implementations are a 
bit more sane than what the standard allows. I don't think it is 
fruitful to try and find any useful guarantees in the standard. If there 
were any, that's what Walter would point to instead.

 As long as you _do not_ start and finish in two separate memory 
 addresses for that initial type it will be predictable.
 ...

According to the standard, no.

E.g.:

int a:7;
int b:25;

According to the standard, this could put `a` in a 1-byte unit and `b` 
in a subsequent 32-byte unit. It could put `a` in a 1-byte unit, use the 
last bit for `b`, then put the remaining 24 bits of `b` in a new unit.

It could also put both in separate 4-byte integers. Or it could pack 
them into a single 4-byte location. It is not specified. In practice, 
implementations will usually put both of them in a single 4-byte 
location, and this is what Walter is relying on. The C standard gives 
you almost nothing (it could even choose to put both `a` and `b` into a 
8-byte or larger unit, there is no upper limit on the size, only a lower 
one.)

And I did not even get into different possible orderings of bit fields 
within a unit.
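
Since the declaration alone does not say which of these layouts a compiler picks, the choice can at least be observed. The following is a minimal sketch, not part of the original post, assuming 8-bit bytes and that a store to one bit-field leaves the other bits of a zeroed struct untouched. `struct AB` and all helper names are hypothetical. It sets one field to all ones and inspects the object representation.

```c
#include <stddef.h>
#include <string.h>

/* The two declarations from the example above, wrapped in a struct. */
struct AB {
    int a : 7;
    int b : 25;
};

/* Index of the lowest set bit in the object representation
 * (byte index * 8 + bit within the byte), or -1 if no bit is set. */
static int lowest_set_bit(const struct AB *s)
{
    unsigned char bytes[sizeof *s];
    memcpy(bytes, s, sizeof bytes);
    for (size_t i = 0; i < sizeof bytes; i++)
        for (int bit = 0; bit < 8; bit++)
            if ((bytes[i] >> bit) & 1u)
                return (int)(i * 8) + bit;
    return -1;
}

/* Number of set bits in the object representation. */
static int set_bit_count(const struct AB *s)
{
    unsigned char bytes[sizeof *s];
    int n = 0;
    memcpy(bytes, s, sizeof bytes);
    for (size_t i = 0; i < sizeof bytes; i++)
        for (int bit = 0; bit < 8; bit++)
            n += (bytes[i] >> bit) & 1u;
    return n;
}

/* A zeroed struct with only `a` set to all ones (-1 fits a signed field
 * and wraps to all ones for an unsigned one). */
static struct AB only_a(void)
{
    struct AB s;
    memset(&s, 0, sizeof s);
    s.a = -1;
    return s;
}

/* Same, with only `b` set to all ones. */
static struct AB only_b(void)
{
    struct AB s;
    memset(&s, 0, sizeof s);
    s.b = -1;
    return s;
}

int ab_a_start(void) { struct AB s = only_a(); return lowest_set_bit(&s); }
int ab_a_width(void) { struct AB s = only_a(); return set_bit_count(&s); }
int ab_b_start(void) { struct AB s = only_b(); return lowest_set_bit(&s); }
int ab_b_width(void) { struct AB s = only_b(); return set_bit_count(&s); }
```

On a little-endian target the start values are the fields' bit offsets, and comparing them with `sizeof(struct AB)` shows whether the compiler packed both fields into one 4-byte unit, which is the common choice described above.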

Jul 05 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/5/24 11:13, Timon Gehr wrote:
 ...
 It could also put both in separate 4-byte integers.

Actually no, this is one of the few things it cannot do. I got a bit too 
excited there. Anyway, the point stands.

Jul 05 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 05/07/2024 9:42 PM, Timon Gehr wrote:
 On 7/5/24 11:13, Timon Gehr wrote:
 ...
 It could also put both in separate 4-byte integers.

 
 Actually no, this is one of the few things it cannot do. I got a bit too 
 excited there. Anyway, the point stands.

Oh oh no, you are so right, I was applying the type there that I 
shouldn't have been.

Don't read the C standard after you've been awake more than 12 hours, folks!

However in saying that, the point that we can mitigate it using a 
dscanner warning does still stand. Therefore my original post stating I 
withdraw my concerns is valid.

The only problem is it'll be a word-size-specific and alignment-specific 
check now.

I hate every bit that we need to make such a specific mitigation for 
what amounts to a brand new feature. It is quite frankly ridiculous to 
need a _mitigation_ for this.

Jul 05 2024

Salih Dincer <salihdb hotmail.com> writes:

On Friday, 5 July 2024 at 14:18:43 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Don't read the C standard after you've been awake more than 12 
 hour folks!

 However in saying that, the point that we can mitigate it using 
 a dscanner warning does still stand. Therefore my original post 
 stating I withdraw my concerns is valid.

Given that today is July 5, 2024, the publication of the C23 
standard is imminent, with the limit date for publication being 
July 12, 2024. This means that within a week, the C23 standard 
should be officially published, marking a significant milestone 
for the C programming language and for D.

Is it a good time to start planning for any necessary updates to 
our existing codebases or libraries to ensure compatibility with 
C23? Can we say that DMD will also support this in parallel with 
the developments?

SDB 79

Jul 05 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 06/07/2024 4:48 AM, Salih Dincer wrote:
 Is it a good time to start planning for any necessary updates to our 
 existing codebases or libraries to ensure compatibility with C23? Can we 
 say that DMD will also support this in parallel with the developments?

As of right now, the only thing planned is changing our 
identifiers to match the C23 identifier tables, which are UAX31-based.

I've implemented it and it has been in a release, although we have not 
transitioned over yet; the breakage is expected as of 2.119 (the tables are 
both bigger and smaller than C99 *sigh*, right now we are in a 
recombination of all the different tables).

Walter really does not want the normalization stuff that UAX31, and with 
it C23, requires. Some of it was implemented, but alas.

But the other things like different float types are not currently 
planned to be supported as far as I know. We should probably discuss 
that at some point as a community.

Other things like nodiscard on a function have no D equivalent just yet 
although we have allowed for it to occur in the future as part of 
``@mustuse``.

Apart from identifiers there isn't much you should need to deal with for 
your code base :)

Jul 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/4/2024 10:37 PM, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand Walter's belief 
 that C is "good enough" for bit fields in terms of guarantees.

It's straightforward. If you use uint as the field type, you'll get the same 
layout across every C compiler I've ever heard of.

The reason for this is straightforward:

1. it's the obvious way to do things
2. professional C compiler developers are sensible people
3. professional C compiler developers want to compile existing code and have it 
behave the same way on the same platform, they don't care to antagonize their 
users

The differences crop up when using multiple field types *and* porting to a 
different ecosystem. These problems are trivially avoided. Even so, within a 
particular ecosystem, the C compilers are all compatible with each other. Why? 
Because C compiler developers want their compiler to be useful!

Is anyone surprised that gcc/clang/ImportC work exactly the same on each 
ecosystem?

Consider also that the C standard does not specify the size of a 'char'. There 
are C compilers for special CPUs that have different char sizes - notably 32 
bit chars for some DSP processors, and 10 bit chars for the CPU on a Mattel 
Intellivision game computer. C on a PDP-10 has 36 bit ints, too! and 18 bit 
shorts.

I can pretty much guarantee that all C code developed on a conventional CPU 
will fail to work on those machines.

But so what. When you port to a diverse machine, you expect such problems.

Jul 05 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/5/24 18:35, Walter Bright wrote:
 
 Consider also that the C standard does not specify the size of a 'char'.

D does specify it.

 There are C compilers for special CPUs that have different char sizes - 
 notably 32 bit chars for some DSP processors, and 10 bit chars for the 
 CPU on a Mattel Intellivision game computer. C on a PDP-10 has 36 bit 
 ints, too! and 18 bit shorts.
 
 I can pretty much guarantee that all C code developed on a conventional 
 CPU will fail to work on those machines.
 
 But so what. When you port to a diverse machine, you expect such problems.

Well, this is the D newsgroup.

Jul 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 10:02 AM, Timon Gehr wrote:
 On 7/5/24 18:35, Walter Bright wrote:
 Consider also that the C standard does not specify the size of a 'char'.

 
 D does specify it.

Yes. And I have no concern at all about some C compiler that uses a different 
size. None of those C compilers will compile "portable" C code, either, even 
though the Standard permits such compilers.

If we go through a dimensional warp into an alternate universe, where C chars 
are 9 bits, we'll change the D compiler to match.

Jul 05 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/6/24 01:23, Walter Bright wrote:
 On 7/5/2024 10:02 AM, Timon Gehr wrote:
 On 7/5/24 18:35, Walter Bright wrote:
 Consider also that the C standard does not specify the size of a 'char'.

 D does specify it.

 
 Yes. And I have no concern at all about some C compiler that uses a 
 different size. None of those C compilers will compile "portable" C 
 code, either, even though the Standard permits such compilers.
 
 If we go though a dimensional warp into an alternate universe, where C 
 chars are 9 bits, we'll change the D compiler to match.

The point was: D should actually specify more bitfield layout guarantees 
than the C standard.

Jul 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/6/2024 8:54 AM, Timon Gehr wrote:
 The point was: D should actually specify more bitfield layout guarantees than 
 the C standard.

I understand that. Given that any desired portable bitfield layout can be done 
with minimal effort, there is no need to add more semantics to the language 
than what C does.

I.e. portable not only to the associated C compiler, but to any C compiler with 
8 bit chars and 32 bit ints.

Throw me an example that shows me wrong!

Personally, I would find this to be much more readable code than adding more 
syntactical constructs.

Jul 06 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 07/07/2024 11:19 AM, Walter Bright wrote:
 On 7/6/2024 8:54 AM, Timon Gehr wrote:
 
     The point was: D should actually specify more bitfield layout
     guarantees than the C standard.
 
 I understand that. Given that any desired portable bitfield layout can 
 be done with minimal effort, there is no need to add more semantics to 
 the language than what C does.

You have an expert understanding of the subject matter.

Nobody else around here has this knowledge or expertise.

As of right now there does not appear to be a single person on the D 
Discord server that understands how to use C bit fields to have 
predictable behavior let alone portable.

I understand that you think that this is simple, but nobody else can 
understand it, and you are failing to explain it sufficiently.

If somebody has their program failing, it will be hard to diagnose the 
problem let alone explain it. The only person who can do this is you. 
That does not scale.

At this point multiple people who are usually responsible for explaining 
language features to other people and diagnosing programs, are telling 
you that they cannot use it as intended, this should be sending up major 
red flags that only you can use this feature.

Please seriously reconsider the ``extern(D)``/``extern(C)`` split, 
because right now we will have no choice but to have DScanner issue a 
warning for improper use of bit fields, and it is quite frankly 
ridiculous that a brand new ``extern(D)`` language feature needs a warning.

Jul 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

C and D programmers already know how to align things with pad fields. It's a 
basic skill.

If extern(C) and extern(D) were added to bitfields, then the programmer would 
have to learn two new syntactic constructs and what they mean. It's a 
distinction that is easily forgettable, too. Quick, what's the difference in 
calling convention between extern(C) and extern(D) functions?

With pad fields, there's nothing new to learn, and it's quite obvious even for 
a naive programmer what is happening with them.

Jul 06 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Friday, 5 July 2024 at 16:35:43 UTC, Walter Bright wrote:
 On 7/4/2024 10:37 PM, Richard (Rikki) Andrew Cattermole wrote:
 Today many people have spent some time to try and understand 
 Walter's belief that C is "good enough" for bit fields in 
 terms of guarantees.

 It's straightforward. If you use uint as the field type, you'll 
 get the same layout across every C compiler I've ever heard of.

What if you need > 32 bits or want to pack into a `ulong`? Is the 
behavior sane across compilers?

-Steve

Jul 05 2024

Tim <tim.dlang t-online.de> writes:

On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer 
wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is 
 the behavior sane across compilers?

The following struct has a different layout for different 
platforms:
```
struct S { unsigned int x; unsigned long long a:20, b:20, c:24; };
```

Windows layout:
```
          0 | struct S
          0 |   unsigned int x
     8:0-19 |   unsigned long long a
    10:4-23 |   unsigned long long b
    13:0-23 |   unsigned long long c
            | [sizeof=16, align=8]
```

Linux x86_64 layout:
```
          0 | struct S
          0 |   unsigned int x
     4:0-19 |   unsigned long long a
     8:0-19 |   unsigned long long b
    10:4-27 |   unsigned long long c
            | [sizeof=16, align=8]
```

Linux i686 layout:
```
          0 | struct S
          0 |   unsigned int x
     4:0-19 |   unsigned long long a
     6:4-23 |   unsigned long long b
     9:0-23 |   unsigned long long c
            | [sizeof=12, align=4]
```
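
Tables like the ones above can be approximated without a compiler's record-layout dump. This is a minimal sketch, not from the original post, assuming 8-bit bytes and a little-endian target where a field's lowest stored bit is its starting bit. `FIELD_START`, `field_start` and `print_layout` are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

struct S {
    unsigned int x;
    unsigned long long a : 20, b : 20, c : 24;
};

/* Lowest set bit of a byte buffer as byte * 8 + bit, or -1 if none is set. */
static int lowest_set_bit(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (int bit = 0; bit < 8; bit++)
            if ((p[i] >> bit) & 1u)
                return (int)(i * 8) + bit;
    return -1;
}

/* Store all ones into one field of a zeroed struct S and record where the
 * field's lowest bit landed. The value comes from a variable, so the store
 * is an ordinary truncating conversion to the field width. */
#define FIELD_START(field, out)                                   \
    do {                                                          \
        struct S tmp_;                                            \
        unsigned long long ones_ = ~0ULL;                         \
        memset(&tmp_, 0, sizeof tmp_);                            \
        tmp_.field = ones_;                                       \
        (out) = lowest_set_bit((const unsigned char *)&tmp_,      \
                               sizeof tmp_);                      \
    } while (0)

/* Starting bit of field a (index 0), b (1) or c (2); -1 otherwise. */
int field_start(int index)
{
    int start = -1;
    switch (index) {
    case 0: FIELD_START(a, start); break;
    case 1: FIELD_START(b, start); break;
    case 2: FIELD_START(c, start); break;
    default: break;
    }
    return start;
}

/* Print one line per field in a byte:bit style. */
void print_layout(void)
{
    static const char *names[3] = { "a", "b", "c" };
    for (int i = 0; i < 3; i++) {
        int start = field_start(i);
        printf("%2d:%d | unsigned long long %s\n",
               start / 8, start % 8, names[i]);
    }
    printf("[sizeof=%zu]\n", sizeof(struct S));
}
```

Calling `print_layout()` from `main` on each target of interest gives the starting byte and bit of every field, which can then be compared against the three layouts listed above.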

Jul 05 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
 On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer 
 wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is 
 the behavior sane across compilers?

 The following struct has a different layout for different 
 platforms:

...

Thanks for this.

I also tested the following, and found it too shows discrepancies.

```c
struct S {
     unsigned short x;
     unsigned int a : 12;
     unsigned int b : 12;
     unsigned int c : 8;
};
```

Here there are only `uint` bitfields, yet the compiler chooses to 
layout the bits differently based on the *preceding* field.

Walter, I have to unfortunately withdraw my support for defining 
D bitfields to just be the same as C bitfields -- the minefields 
are too subtle. The statement that "If you use uint as the field 
type, you'll get the same layout across every C compiler" is not 
true. And I don't think we can really specify the true nature of 
what you must do for portable bitfields in a way that is 
straightforward. Saying something like "you can only use `uint` 
bitfields in structs that contain only `uint` types" is not a 
good feature.

I'm back to requesting that we have a mechanism to request C 
bitfields (such as marking a struct as `extern(C)`), or picking 
one C style and going with that.

-Steve

Jul 05 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 06/07/2024 9:12 AM, Steven Schveighoffer wrote:
 On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
 On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is the 
 behavior sane across compilers?

 The following struct has a different layout for different platforms:

 
 ...
 
 Thanks for this.
 
 I also tested the following, and found it too shows discrepancies.
 
 ```c
 struct S {
      unsigned short x;
      unsigned int a : 12;
      unsigned int b : 12;
      unsigned int c : 8;
 };
 ```
 
 Here there are only `uint` bitfields, yet the compiler chooses to layout 
 the bits differently based on the *preceding* field.
 
 Walter, I have to unfortunately withdraw my support for defining D 
 bitfields to just be the same as C bitfields -- the minefields are too 
 subtle. The statement that "If you use uint as the field type, you'll 
 get the same layout across every C compiler" is not true. And I don't 
 think we can really specify the true nature of what you must do for 
 portable bitfields in a way that is straightforward. Saying something 
 like "you can only use `uint` bitfields in structs that contain only 
 `uint` types" is not a good feature.
 
 I'm back to requesting that we have a mechanism to request C bitfields 
 (such as marking a struct as `extern(C)`), or picking one C style and 
 going with that.
 
 -Steve

I did not expect this.

This prevents my mitigation from working.

So now we also have to put it into an anonymous struct to even get the 
layout we think it should be.

```c
struct Foo {
      unsigned short x;

      struct {
         unsigned int a : 12;
         unsigned int b : 12;
         unsigned int c : 8;
      };

      //void* next;
};

int main() {
     struct Foo foo;
     foo.a = 1;
     foo.b = 0;

     return 0;
}
```

```asm
main:
  push   rbp
  mov    rbp,rsp
  mov    DWORD PTR [rbp-0x4],0x0
  mov    eax,DWORD PTR [rbp-0x8]
  and    eax,0xfffff000
  or     eax,0x1
  mov    DWORD PTR [rbp-0x8],eax
  mov    eax,DWORD PTR [rbp-0x8]
  and    eax,0xff000fff
  or     eax,0x0
  mov    DWORD PTR [rbp-0x8],eax
  xor    eax,eax
  pop    rbp
  ret
```

Jul 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 2:25 PM, Richard (Rikki) Andrew Cattermole wrote:
 So now we also have to put it into an anonymous struct

See my other reply.

Jul 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
 I also tested the following, and found it too shows discrepancies.
 
 ```c
 struct S {
      unsigned short x;
      unsigned int a : 12;
      unsigned int b : 12;
      unsigned int c : 8;
 };
 ```

The following will also show discrepancies:

```
struct T {
     unsigned short x;
     unsigned int y;
}
```

for the same reason.

 Here there are only uint bitfields, yet the compiler chooses to layout the 
 bits differently based on the preceding field.

It's actually based on the *alignment* of the preceding field. I regret not 
saying that, but that's what I meant with the fields need to be of the same 
type, so they have the same alignment. If the uint bitfield started off aligned 
at a uint boundary, my statement holds.

When mixing field types of different sizes, there will be different alignments 
of those fields on different platforms/compilers, whether or not bitfields are 
involved.

The layout can be portably controlled as desired, by being cognizant of field 
alignment:

```c
struct S {
      unsigned short x;
      unsigned short a : 12;  // at offset 2
      unsigned int   b : 12;  // at offset 4
      unsigned int   c : 8;   // at offset 4
};
```

```c
struct S {
      unsigned short x;
      unsigned short dummy;   // for alignment porpoises
      unsigned int a : 12;    // at offset 4
      unsigned int b : 12;    // at offset 4
      unsigned int c : 8;     // at offset 4
};
```

Simply put, avoiding fields that straddle alignment boundaries avoids 
portability issues. This is true with both bitfields and regular fields.
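
The assumptions behind the padded variant can be pinned down so that a port which breaks them fails loudly. A minimal sketch, not from the original post, assuming a C11 compiler, 8-bit bytes, 16-bit `short` and 32-bit `int`; `struct Padded` and `bitfields_fill_bytes_4_to_7` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct Padded {
    unsigned short x;
    unsigned short dummy;      /* explicit pad so the next field starts at offset 4 */
    unsigned int   a : 12;
    unsigned int   b : 12;
    unsigned int   c : 8;
};

/* Fail the build if the assumptions behind the layout do not hold. */
_Static_assert(sizeof(unsigned short) == 2, "expects 16-bit short");
_Static_assert(sizeof(unsigned int) == 4, "expects 32-bit int");
_Static_assert(offsetof(struct Padded, dummy) == 2, "dummy must follow x directly");
_Static_assert(sizeof(struct Padded) == 8, "bit-fields must share one 32-bit unit");

/* Returns 1 if a, b and c together occupy exactly bytes 4..7, else 0.
 * The check does not depend on bit order inside the unit. */
int bitfields_fill_bytes_4_to_7(void)
{
    struct Padded s;
    unsigned char bytes[sizeof s];

    memset(&s, 0, sizeof s);
    s.a = 0xFFF;               /* 12 bits, all ones */
    s.b = 0xFFF;               /* 12 bits, all ones */
    s.c = 0xFF;                /*  8 bits, all ones */
    memcpy(bytes, &s, sizeof s);

    for (size_t i = 0; i < 4; i++)
        if (bytes[i] != 0)
            return 0;          /* a bit-field leaked into x or dummy */
    for (size_t i = 4; i < 8; i++)
        if (bytes[i] != 0xFF)
            return 0;          /* the unit at offset 4 is not fully used */
    return 1;
}
```

The static assertions cover what `offsetof` can see. Bit-fields have no `offsetof`, so their placement is checked at run time instead.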

Jul 05 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
 I also tested the following, and found it too shows 
 discrepancies.
 
 ```c
 struct S {
      unsigned short x;
      unsigned int a : 12;
      unsigned int b : 12;
      unsigned int c : 8;
 };
 ```

 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.

I tested this struct, and there were no discrepancies between 
compilers. All compilers put 2 bytes of padding between the 
`ushort` and the `uint`.

 It's actually based on the *alignment* of the preceding field. 
 I'm regret not saying that, but that's what I meant with the 
 fields need to be of the same type, so they have the same 
 alignment. If the uint bitfield started off aligned at a uint 
 boundary, my statement holds.

Hm..., well it's not ideal to require the user to nudge the 
compiler for the desired layout. It's an odd thing to say that a 
uint bitfield may not be uint aligned, even if the equivalent 
uint value would be.

The documentation note we talked about was simple -- just always 
use the same type for your bitfields and it works. This is 
different. Not impossible to learn, but for sure more challenging.

 When mixing field types of different sizes, there will be 
 different alignments of those fields on different 
 platforms/compilers, whether or not bitfields are involved.

The confusing thing here is that the alignment does *not* obey 
the alignment of the containing type. And how it is aligned 
depends instead on the *previous* member (sometimes). This is not 
the case for full-sized uints.

I will note that I'm reading that ulong is aligned to 4-bytes on 
32-bit linux, and so this does make an alignment difference even 
for non-bitfields.

My recommendation still is either:

1. Denote D bitfields by a specified layout system (pick the most 
common C one and do that). C bitfields can match the C compiler.
2. Simply forbid problematic alignments at compile time:

```d
struct S {
    uint x;
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}

// error, alignment of bitfield `a` may not match C layout,
// please use padding or aligned bitfields to specify intended
// layout.

// these are OK.
struct SWithPadding {
    uint x;
    uint _; // padding
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}

struct SPacked {
    ulong x : 32;
    ulong a : 24;
    ulong b : 24;
    ulong c : 16;
}
```

Maybe the error only occurs if you specify a compiler switch?

-Steve

Jul 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.

 
 I tested this struct, and there were no discrepancies between compilers. All 
 compilers put 2 bytes of padding between the `ushort` and the `uint`.

Try it with a 16 bit compiler, which aligns on 16 bits rather than 32 bits.

No, I'm not cheating with this - I wanted to point out the consistency between 
32 bit compilers, despite the Standard saying nothing about it. But I can still 
break the example, with a 32/64 bit compiler:

```
struct U {
     unsigned int x;
     unsigned long y;
}
```

You'll get different sizes for 32 vs 64 bit compilations, including with D.
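
A minimal sketch, not from the original post, of how to see this for the C side. The helper names are hypothetical. It reports where `y` lands and how large the struct is for whatever target it is compiled for.

```c
#include <stddef.h>
#include <stdio.h>

struct U {
    unsigned int  x;
    unsigned long y;   /* 4 or 8 bytes depending on the target's data model */
};

/* Byte offset of y inside struct U on this target. */
size_t u_offset_of_y(void) { return offsetof(struct U, y); }

/* Total size of struct U on this target, including padding. */
size_t u_size(void) { return sizeof(struct U); }

/* Print both values so builds for different targets can be compared. */
void print_u_layout(void)
{
    printf("offsetof(y)=%zu sizeof(struct U)=%zu\n",
           u_offset_of_y(), u_size());
}
```

Where `unsigned long` is 4 bytes the typical result is offset 4 and size 8. On LP64 targets, where it is 8 bytes with 8-byte alignment, it is offset 8 and size 16.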

Jul 06 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Saturday, 6 July 2024 at 23:26:43 UTC, Walter Bright wrote:
 On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
 The following will also show discrepancies:

 ```
 struct T {
     unsigned short x;
     unsigned int y;
 }
 ```

 for the same reason.

 
 I tested this struct, and there were no discrepancies between 
 compilers. All compilers put 2 bytes of padding between the 
 `ushort` and the `uint`.

 Try it with a 16 bit compiler, which aligns on 16 bits rather 
 than 32 bits.

 No, I'm not cheating with this - I wanted to point out the 
 consistency between 32 bit compilers, despite the Standard 
 saying nothing about it. But I can still break the example, 
 with a 32/64 bit compiler:

 ```
 struct U {
     unsigned int x;
     unsigned long y;
 }
 ```

 You'll get different sizes for 32 vs 64 bit compilations, 
 including with D.

Right, but with bitfields, you get discrepancies within the *same 
compiler*.

Let's take another example:

```c
struct U {
   unsigned int x;
   unsigned long long y: 30;
   unsigned long long z: 34;
};

struct U2 {
   unsigned int x;
   unsigned long long y: 34;
   unsigned long long z: 30;
};
```

In the first case, Linux 64-bit clang will layout y to be right 
after x, and z will be pushed 2 bits further so it lines up on a 
64-bit address.

In the second case, in the same compiler, y is pushed *32* bits 
off so it lines up on a 64-bit address space, and z is past that.

In both cases, 96 bits of data consumes 128 bits (sizeof both 
structs is 16).

In other words, the compiler makes probably unexpected layout 
decisions, and the reason is "because C does it".
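
The difference between the two structs can be checked directly. A minimal sketch, not from the original post, assuming 8-bit bytes and a little-endian target. The unions and helper names are hypothetical. Each struct is overlaid with a byte array so the bits written through `y` can be read back.

```c
#include <stddef.h>
#include <string.h>

struct U  { unsigned int x; unsigned long long y : 30; unsigned long long z : 34; };
struct U2 { unsigned int x; unsigned long long y : 34; unsigned long long z : 30; };

/* Overlay each struct with its bytes so the stored bits can be inspected. */
union RawU  { struct U  s; unsigned char bytes[sizeof(struct U)];  };
union RawU2 { struct U2 s; unsigned char bytes[sizeof(struct U2)]; };

/* Lowest set bit of a byte buffer as byte * 8 + bit, or -1 if none is set. */
static int lowest_set_bit(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (int bit = 0; bit < 8; bit++)
            if ((p[i] >> bit) & 1u)
                return (int)(i * 8) + bit;
    return -1;
}

/* Starting bit of U.y: zero the object, set y to all ones, find the first bit. */
int u_y_start(void)
{
    union RawU r;
    unsigned long long ones = ~0ULL;   /* truncated to the field width on store */
    memset(&r, 0, sizeof r);
    r.s.y = ones;
    return lowest_set_bit(r.bytes, sizeof r.bytes);
}

/* Starting bit of U2.y, measured the same way. */
int u2_y_start(void)
{
    union RawU2 r;
    unsigned long long ones = ~0ULL;
    memset(&r, 0, sizeof r);
    r.s.y = ones;
    return lowest_set_bit(r.bytes, sizeof r.bytes);
}
```

If the behaviour described above holds, a 64-bit Linux build returns 32 from `u_y_start()` and 64 from `u2_y_start()`, even though both structs declare the same field types in the same order.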

Truly, with C, you cannot count on *any* explicit layout. Yes, 
there are reasons, and all of those have to do with performance. 
But when the goal is explicit layouts, then confusion rules.

This is why I said, either define what D does explicitly, or warn 
when it does something really bizarre for the sake of C.

Another option is just to say, "don't use bitfields for anything 
other than space-saving. Do not attempt to use bitfields for 
defined bit layout, because the compiler can make arbitrary 
decisions on layout." But you will have to tell [this 
guy](https://forum.dlang.org/post/v622m1$mu4$1@digitalmars.com).

D has a chance to do better, but it's clear we are just going to 
saddle ourselves with C insanity for the sake of zero real use 
cases.

-Steve

Jul 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:
 
 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }
 
 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```

Simple solution:

```
struct U {
     unsigned int x;
     unsigned int y:30;
     unsigned long long z:34;
}
```

or:

```
struct U2 {
     unsigned int x;
     unsigned int pad;
     unsigned long long y:30;
     unsigned long long z:34;
}
```

depending on which layout is desired. This is simple, predictable, and portable. 
It's not going to be a mystery to anyone reading the code - it's eminently 
readable.

An anonymous union can be pressed into service, which can be handy if the type 
of `x` is opaque:

```
struct U {
     T x;
     union {
         ulong pad; // for alignment
         struct {
             ulong y: 30;
             ulong z: 34;
         }
     }
}
```

or use align:

```
struct U {
     T x;
   align(8)
     ulong y:30, z:34;
}
```

There are many existing ways to accomplish this. Adding more language features 
to duplicate existing capability needs a very strong case.

Jul 06 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/7/24 06:47, Walter Bright wrote:
 On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:

 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }

 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```

 
 Simple solution:
 ...

You said: Same type, same alignment. It's clearly not true in Steven's 
example. It seems alignment depends on bit width.

Also consider this:

```d
struct S{
     uint x;
     ulong y:30;
     ulong z:34;
}
pragma(msg, S.y.offsetof, " ", S.y.alignof); // 4LU 8LU
```

The offset of `y` does not even respect its alignment! This is insanity.

It also happens with `uint`:

```d
struct S{
     ushort x;
     uint y:16;
}
pragma(msg, S.y.offsetof, " ", S.y.alignof); // 2LU 4LU
```

I.e., "stick to `int`/`uint` bitfields, things will be predictable" is 
not even true. They may be laid out differently based on what's before them.

 ```
 struct U {
      unsigned int x;
      unsigned int y:30;
      unsigned long long z:34;
 }
 ```
 
 or:
 
 ```
 struct U2 {
      unsigned int x;
      unsigned int pad;
      unsigned long long y:30;
      unsigned long long z:34;
 }
 ```
 
 depending on which layout is desired. This is simple,

If it is simple, you should have no trouble stating how it works 
completely in a couple sentences.

 predictable, and 
 portable. It's not going to be a mystery to anyone reading the code - 
 it's eminently readable.
 ...

Walter, this is frustrating. It is only obvious to you because having 
reverse-engineered and implemented it, you already know how it works. 
Note that the things you were saying earlier suggested it would actually 
work differently in Steven's example. I hope you understand that this is 
confusing. I am as a result now not sure whether what you stated is the 
full truth, or it is still some inadmissible simplification that glosses 
over some further dragons.

Also, I hope `.offsetof % .alignof != 0` is just a bug in your bitfield 
implementation.

Jul 07 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/7/2024 3:42 AM, Timon Gehr wrote:
 If it is simple, you should have no trouble stating how it works completely in 
 a couple sentences.

One sentence:

If the bitfields of type T start on a T alignment boundary and do not straddle 
a T alignment boundary, then the bitfields will be portable.

I agree I sometimes have trouble writing exact specifications, but I'm also 
confident that you understand this.


 I am as a result now not sure whether what you stated is the full truth, or it 
 is still some inadmissible simplification that glosses over some further dragons.

Feel free to try pathological examples and let me know of any adverse
discoveries.


 Also, I hope `.offsetof % .alignof != 0` is just a bug in your bitfield 
 implementation.

??

Jul 08 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/9/24 01:52, Walter Bright wrote:
 On 7/7/2024 3:42 AM, Timon Gehr wrote:
 If it is simple, you should have no trouble stating how it works 
 completely in a couple sentences.

 
 One sentence:
 
 If the bitfields of type T start on a T alignment boundary and do not 
 straddle a T alignment boundary, then the bitfields will be portable.
 ...

Well, this is not a complete characterization, but good enough I guess.

So the preferred alignment of a bitfield of a given width is not 
portable? I.e., are there so-called sane C compilers where a `uint:16` 
has an (actual) alignment of 4 instead of 2?

 I agree I sometimes have trouble writing exact specifications, but I'm 
 also confident that you understand this.
 ...

Sure, but I really think we should just enforce this kind of rule for 
`extern(D)` bitfields. If a programmer does not follow the rule, just 
error out and present options to the programmer for how to make the code 
compile:

error: bitfield layout is ambiguous

- add extern(C) to match the layout of the associated C compiler
- add padding and/or 0-width bitfields to unambiguously start bitfields 
on a T alignment boundary without straddling

A priori you just don't know which of those was intended. It's good to 
require explicit input here, as it is subtle.

 
 I am as a result now not sure whether what you stated is the full 
 truth, or it is still some inadmissible simplification that glosses 
 over some further dragons.

 
 Feel free to try pathological examples and let me know of any adverse 
 discoveries.
 
 
 Also, I hope `.offsetof % .alignof != 0` is just a bug in your 
 bitfield implementation.

 
 ??
 

It's elaborated upon in the part of the post you ignored:

On 7/7/24 12:42, Timon Gehr wrote:
 
 Also consider this:
 
 ```d
 struct S{
      uint x;
      ulong y:30;
      ulong z:34;
 }
 pragma(msg, S.y.offsetof, " ", S.y.alignof); // 4LU 8LU
 ```
 
 The offset of `y` does not even respect its alignment! This is insanity.
 
 It also happens with `uint`:
 
 ```d
 struct S{
      ushort x;
      uint y:16;
 }
 pragma(msg, S.y.offsetof, " ", S.y.alignof); // 2LU 4LU
 ```

Jul 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 02:32, Timon Gehr wrote:
 
 error: bitfield layout is ambiguous
 
 - add extern(C) to match the layout of the associated C compiler
 - add padding and/or 0-width bitfields to unambiguously start bitfields 
 on a T alignment boundary without straddling

Or change some of the bitfield types to ones with smaller alignment I 
guess. (If that is necessary at all. It's still not so obvious exactly 
what assumptions are portable in practice.)

Jul 09 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Wednesday, 10 July 2024 at 00:32:53 UTC, Timon Gehr wrote:
 On 7/9/24 01:52, Walter Bright wrote:
 I agree I sometimes have trouble writing exact specifications, 
 but I'm also confident that you understand this.
 ...

 Sure, but I really think we should just enforce this kind of 
 rule for `extern(D)` bitfields. If a programmer does not follow 
 the rule, just error out and present options to the programmer 
 for how to make the code compile:

 error: bitfield layout is ambiguous

 - add extern(C) to match the layout of the associated C compiler
 - add padding and/or 0-width bitfields to unambiguously start 
 bitfields on a T alignment boundary without straddling

 A priori you just don't know which of those was intended. It's 
 good to require explicit input here, as it is subtle.

Yes, this is the correct answer. I stayed away from `extern(C)` 
specification because I *kinda* see the point that we have no 
precedent for `extern(C)` to adjust field layout. But this seems 
so obvious to me, I challenge anyone to fault this as a bad 
experience. For those who want C Compatibility, just say so. The 
D compiler has you covered. For those who want exact bitfield 
layout, you can use D, because D ensures you have not shot 
yourself in the foot by making an ambiguous layout request.

-Steve

Jul 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 03:41, Steven Schveighoffer wrote:
 I stayed away from `extern(C)` specification because I *kinda* see the 
 point that we have no precedent for `extern(C)` to adjust field layout.

Well, it does affect layout:

```d
extern(C) struct S{}
pragma(msg, S.sizeof); // 0LU
pragma(msg, (S[100]).sizeof); // 0LU

struct T{}
pragma(msg, T.sizeof); // 1LU
pragma(msg, (T[100]).sizeof); // 100LU
```

In any case, here, the usage is a bit different, in that the `extern(D)` 
version would just be a bit more restrictive, but still fully compatible.

Jul 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/9/2024 6:57 PM, Timon Gehr wrote:
 ```d
 extern(C) struct S{}
 pragma(msg, S.sizeof); // 0LU
 pragma(msg, (S[100]).sizeof); // 0LU
 
 struct T{}
 pragma(msg, T.sizeof); // 1LU
 pragma(msg, (T[100]).sizeof); // 100LU
 ```
 
 In any case, here, the usage is a bit different, in that the `extern(D)` 
 version would just be a bit more restrictive, but still fully compatible.

C and C++ differ here, too. D defaults to the C++ route because they wanted 
distinct objects to have distinct addresses, which made sense to me.

Jul 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/9/2024 5:32 PM, Timon Gehr wrote:
 The offset of `y` does not even respect its alignment! This is insanity.


That's right. It's not a bug, it matches what the associated C compiler does. 
It's the same thing as Steven pointed out. I posted how to portably get either 
arrangement.

Jul 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 08:53, Walter Bright wrote:
 On 7/9/2024 5:32 PM, Timon Gehr wrote:
 The offset of `y` does not even respect its alignment! This is insanity.


 
 That's right. It's not a bug, it matches what the associated C compiler 
 does.

Nonsense. The issue is the inconsistency between `S.y.alignof` and 
`S.y.offsetof`. In C, neither `offsetof` nor `alignof` work with 
bitfields in the first place, so the question does not even pose itself.

Jul 10 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Sunday, 7 July 2024 at 04:47:30 UTC, Walter Bright wrote:
 On 7/6/2024 8:50 PM, Steven Schveighoffer wrote:
 Let's take another example:
 
 ```c
 struct U {
    unsigned int x;
    unsigned long long y: 30;
    unsigned long long z: 34;
 }
 
 struct U2 {
    unsigned int x;
    unsigned long long y: 34;
    unsigned long long z: 30;
 }
 ```

 Simple solution:

 ```
 struct U {
     unsigned int x;
     unsigned int y:30;
     unsigned long long z:34;
 }
 ```

 or:

 ```
 struct U2 {
     unsigned int x;
     unsigned int pad;
     unsigned long long y:30;
     unsigned long long z:34;
 }
 ```

 depending on which layout is desired. This is simple, 
 predictable, and portable. It's not going to be a mystery to 
 anyone reading the code - it's eminently readable.

Simple, no. Predictable, yes (it's unambiguous). And not obvious. 
What I want is for the compiler to *require* you to do this to 
avoid inconsistencies. It is going to be a mystery to anyone 
reading it *why* they put these things in there. (hey, I 
simplified your code by getting rid of the pad, it comes out the 
same anyway due to [my wholly understandable but mistaken 
understanding of] alignment!)

To give some examples, we require empty if statements to use {} 
and not ;. It doesn't require any new syntax but it helps you 
avoid issues that many people make, even though it is allowed in 
C.

We require explicit conversion when narrowing the range of an 
integer (i.e. assigning a long to an int). This avoids issues 
that many people would make, even though it is allowed in C.

 There are many existing ways to accomplish this. Adding more 
 language features to duplicate existing capability needs a very 
 strong case.

I'm not asking for any new features.

-Steve

Jul 07 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
 Simple, no. Predictable, yes (it's unambiguous). And not obvious.

It is trivially obvious to the most casual observer!

Joking aside, it's the same technique used to inure a struct layout against 
member alignment issues.


 What I want is for the compiler to *require* you to do this to avoid 
 inconsistencies. It is going to be a mystery to anyone reading it *why* they 
 put these things in there.

I've seen fields named "pad" or "padding" many times in C code. It's normal 
practice. Failing that, the purpose of comments is to add the 'why'. One could 
also use `static assert` for extra insurance.

I've also seen fields named "reserved". No comment needed.


 To give some examples, we require empty if statements to use {} and not ;. It 
 doesn't require any new syntax but it helps you avoid issues that many people 
 make, even though it is allowed in C.

Then one could not write a C compatible bitfield.


 We require explicit conversion when narrowing the range of an integer (i.e. 
 assigning a long to an int). This avoids issues that many people would make, 
 even though it is allowed in C.

The C semantics are still allowed by adding a cast.

Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know 
you've done some of this yourself! Bob doesn't want to go through it line by 
line. Isn't it nice for Bob if it "just works"? If all those data declarations 
just work? Especially if the result still has to be compatible with the files 
that C code wrote out?

But what if the compiler says "Bob, you can't lay out a bitfield like that!" Or 
worse, it lays out the bitfield into a portable (but different) layout. Then it 
doesn't just work, Bob has got some debugging to do (while Bob curses D and me), 
and Bob's got to figure out an alternative. Who wants to do that? Not Bob. Not 
me. Not nobody not nohow.


 There are many existing ways to accomplish this. Adding more language features 
 to duplicate existing capability needs a very strong case.

 I'm not asking for any new features.

Every switch that changes the semantics is a new feature and a new source of 
complexity and bugs.

One of my original requirements for D was no switches that change language 
semantics. I have failed at that. But I wasn't wrong to aspire towards it.

Jul 08 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Tuesday, 9 July 2024 at 00:29:20 UTC, Walter Bright wrote:
 On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
 Simple, no. Predictable, yes (it's unambiguous). And not 
 obvious.

 It is trivially obvious to the most casual observer!

 Joking aside, it's the same technique used to inure a struct 
 layout against member alignment issues.

Yes, but there is a subtle difference -- the compiler ignores its 
own rules. In other words, explicit padding is required way more 
than with normal fields, which have consistent layout 
expectations.

As Timon points out, the compiler doesn't obey its own alignment 
requirements for bitfields.

 What I want is for the compiler to *require* you to do this to 
 avoid inconsistencies. It is going to be a mystery to anyone 
 reading it *why* they put these things in there.

 I've seen fields named "pad" or "padding" many times in C code. 
 It's normal practice. Failing that, the purpose of comments is 
 to add the 'why'. One could also use `static assert` for extra 
 insurance.

 I've also seen fields named "reserved". No comment needed.

I concede that this is probably true. This does rely on 
convention though, and having the compiler yell at you if you try 
to remove it is even better.

 To give some examples, we require empty if statements to use 
 {} and not ;. It doesn't require any new syntax but it helps 
 you avoid issues that many people make, even though it is 
 allowed in C.

 Then one could not write a C compatible bitfield.

Yes you can. You can use C to write a C compatible bitfield 
(ImportC is a thing).

If you are using C bitfields as part of an API, it's either to do 
register layout or protocol processing. In both of these cases, 
layout matters more than arbitrary implementation matching.

If you have a use case that relies on the arbitrariness of C 
bitfields (i.e. doesn't care), then yeah, I guess you have to go 
through ImportC. I don't see a problem with this -- this is 
almost always not public API (due to the problems with C 
bitfields). See for instance how the linux kernel doesn't use 
bitfields for anything other than internal flags to save space.

It's not something we need to cater to.

 We require explicit conversion when narrowing the range of an 
 integer (i.e. assigning a long to an int). This avoids issues 
 that many people would make, even though it is allowed in C.

 The C semantics are still allowed by adding a cast.

The C bitfield layout is achievable with D as well, it just might 
not be the same exact syntax. i.e. you may need to use a uint instead 
of unsigned long long, or you might need to insert padding.

 Let's say Bob (poor Bob) needs to convert 20,000 lines of C 
 code to D. I know you've done some of this yourself! Bob 
 doesn't want to go through it line by line. Isn't it nice for 
 Bob if it "just works"? If all those data declarations just 
 work? Especially if the result still has to be compatible with 
 the files that C code wrote out?

ImportC is a thing. Leave the bitfield structs defined in C until 
you are fully in D, then use D bitfields.

Or you modify your C code to use the recommended layouts that D 
uses. If you don't care about layout, it shouldn't be a problem. 
And the D port should tell you exactly which parts you need to 
change through the errors.

 But what if the compiler says "Bob, you can't lay out a 
 bitfield like that!" Or worse, it lays out the bitfield into a 
 portable (but different) layout. Then it doesn't just work, Bob 
 has got some debugging to do (while Bob curses D and me), and 
 Bob's got to figure out an alternative. Who wants to do that? 
 Not Bob. Not me. Not nobody not nohow.

This already happens, we don't need bitfields for this kind of 
pain. ImportC is the solution.

Note that this follows the rule "if it looks like C and compiles, 
it should act like C". It's OK for things *not* to compile 
because we decided they are too error prone.

 I'm not asking for any new features.

 Every switch that changes the semantics is a new feature and a 
 new source of complexity and bugs.

 One of my original requirements for D was no switches that 
 change language semantics. I have failed at that. But I wasn't 
 wrong to aspire towards it.

How convenient that we draw the line here.

I have no rebuttal for this as it's totally arbitrary, so if this 
is your only qualm, I guess you got me.

-Steve

Jul 08 2024

Walter Bright <newshound2 digitalmars.com> writes:

I had written a detailed reply, but realized you and I were simply running 
around in the same circle saying the same things.

Jul 10 2024

Daniel N <no public.email> writes:

On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were 
 simply running around in the same circle saying the same things.

Maybe some input from 3rd party could help?

I use bitfields daily and never had any issues. What I do is to 
always use fixed-size types and then simply take all freedom away 
from the compiler.

uint32_t a;
uint32_t  :32; // Forced padding
uint64_t b:10;
uint64_t c:10;
uint64_t  :44; // Forced padding
uint32_t d;

I guess one can use 0 size bitfields also but I usually prefer to 
visualize how much padding remains for potential future use.

Jul 10 2024

Daniel N <no public.email> writes:

On Wednesday, 10 July 2024 at 07:43:40 UTC, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were 
 simply running around in the same circle saying the same 
 things.

 Maybe some input from 3rd party could help?

 I use bitfields daily and never had any issues. What I do is to 
 always use fix size types and then simply take all freedom away 
 from the compiler.

 uint32_t a;
 uint32_t  :32; // Forced padding
 uint64_t b:10;
 uint64_t c:10;
 uint64_t  :44; // Forced padding
 uint32_t d;

 I guess one can use 0 size bitfields also but I usually prefer 
 to visualize how much padding remains for potential future use.

PS To avoid relying on convention, you could make an incomplete 
bitfield a compilation error in D, then D bitfields would have C 
layout *AND* be deterministic.

Jul 10 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 09:55, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:43:40 UTC, Daniel N wrote:
 On Wednesday, 10 July 2024 at 07:09:10 UTC, Walter Bright wrote:
 I had written a detailed reply, but realized you and I were simply 
 running around in the same circle saying the same things.

 Maybe some input from 3rd party could help?

 I use bitfields daily and never had any issues. What I do is to always 
 use fix size types and then simply take all freedom away from the 
 compiler.

 uint32_t a;
 uint32_t  :32; // Forced padding
 uint64_t b:10;
 uint64_t c:10;
 uint64_t  :44; // Forced padding
 uint32_t d;

 I guess one can use 0 size bitfields also but I usually prefer to 
 visualize how much padding remains for potential future use.

 
 PS To avoid relying on convention, you could make an incomplete bitfield 
 a compilation error in D, then D bitfields would have C layout *AND* be 
 deterministic.
 

Yes, something like that I think would be great, but I think Walter has 
a point that there should still be a way to match C bitfields even if 
the original author was less competent than you w.r.t. bitfield layout. 
Hence the proposal that anything goes if there is an `extern(C)` 
annotation, but for `extern(D)` bitfields, something like your approach 
would be enforced by the compiler.

Jul 10 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/9/24 02:29, Walter Bright wrote:
 
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I 
 know you've done some of this yourself! Bob doesn't want to go through 
 it line by line. Isn't it nice for Bob if it "just works"?

It won't, some edits will be necessary.

 If all those 
 data declarations just work? Especially if the result still has to be 
 compatible with the files that C code wrote out?
 
 But what if the compiler says "Bob, you can't lay out a bitfield like 
 that!"

The compiler should simply say: "Bob, are you sure you want to lay out a 
bitfield like this?" If Bob is comfortable with it, he can add 
`extern(C)` and move on.

 Or worse, it lays out the bitfield into a portable (but 
 different) layout.

Well I think this is not an option.

 Then it doesn't just work, Bob has got some debugging 
 to do (while Bob curses D and me), and Bob's got to figure out an 
 alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.

As far as I am concerned, this is an irrelevant straw man. I don't want 
this. I never suggested anything that would cause this. It's pure FUD.

Similarly, I don't want to go chasing down subtle differences in 
behavior/cache performance etc. between platforms. Portability may be 
important. It shouldn't be insane by default, it should be insane by 
choice. Informed consent.

Especially given that bitfields have a "much nicer syntax" than 
alternative approaches. It's not nice to hand out a footgun disguised as 
candy.

Jul 09 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 02:44, Timon Gehr wrote:
 
 Especially given that bitfields have a "much nicer syntax" than 
 alternative approaches. It's not nice to hand out a footgun disguised as 
 candy.

Maybe check out this guy's take on this kind of thing:
https://youtu.be/3iWn4S8JV8g

We should take it to heart.

Jul 09 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/9/2024 5:44 PM, Timon Gehr wrote:
 On 7/9/24 02:29, Walter Bright wrote:
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know 
 you've done some of this yourself! Bob doesn't want to go through it line by 
 line. Isn't it nice for Bob if it "just works"?

 
 It won't, some edits will be necessary.

Yes, we know it is imperfect. The fewer nits, the better.


 Then it doesn't just work, Bob has got some debugging to do (while Bob curses 
 D and me), and Bob's got to figure out an alternative. Who wants to do that? 
 Not Bob. Not me. Not nobody not nohow.

 As far as I am concerned, this is an irrelevant straw man. I don't want this. 
 I never suggested anything that would cause this. It's pure FUD.

Having a D code with the same declarations as C code, but the code generated is 
different, is going to lead to subtle memory bugs. I.e. just another footgun.


 It's not nice to hand out a footgun disguised as candy.

Requiring an extern(C) to make it compatible with a C layout is just another 
footgun, and there's no way for the compiler to detect it.

The implementation-defined C layout has been there for what, 50 years? If it 
was so awful there'd be proposals to the C Standard to change it. People gripe 
about it now and then, but just go and fix their code and move on. Neither has 
C++ ever made any effort to change it, even though C++ has `extern "C"`.

I do not understand why this is such a problem, since C compilers change the 
struct member layout based on compiler switches (which I showed in another 
post), and elicits no complaint from anybody.

Having the default D struct member layout not line up with the associated C 
compiler layout is a memory safety issue. Not lining up with an externally 
imposed layout is not a memory safety issue.

The bottom line, whether D supports bitfields or not, whether extern(C) is 
applied or not, to conform to an externally specified layout, you're going to 
have to check and see if it matches. If it doesn't match, there are really 
simple ways to get it to match.

Jul 10 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 7/10/24 09:44, Walter Bright wrote:
 On 7/9/2024 5:44 PM, Timon Gehr wrote:
 On 7/9/24 02:29, Walter Bright wrote:
 Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to 
 D. I know you've done some of this yourself! Bob doesn't want to go 
 through it line by line. Isn't it nice for Bob if it "just works"?

 It won't, some edits will be necessary.

 
 Yes, we know it is imperfect. The fewer nits, the better.
 ...

No, there are other considerations, otherwise D would be identical to C.

 
 Then it doesn't just work, Bob has got some debugging to do (while 
 Bob curses D and me), and Bob's got to figure out an alternative. Who 
 wants to do that? Not Bob. Not me. Not nobody not nohow.

 As far as I am concerned, this is an irrelevant straw man. I don't 
 want this. I never suggested anything that would cause this. It's pure 
 FUD.

 
 Having a D code with the same declarations as C code, but the code 
 generated is different, is going to lead to subtle memory bugs. I.e. 
 just another footgun.
 ...

My position is: no footguns. It's easily achievable.

Your position is: one footgun or another footgun, does it really matter, 
let's just choose the footgun with the simpler design.

 
 It's not nice to hand out a footgun disguised as candy.

 
 Requiring an extern(C) to make it compatible with a C layout is just 
 another footgun, and there's no way for the compiler to detect it.
 ...

Again: You are arguing against something you made up yourself. Something 
that is not even on the table. I am however glad you agree there should 
not be footguns.

 The implementation-defined C layout has been there for what, 50 years? 
 If it was so awful there'd be proposals to the C Standard to change it. 

I think you know very well that C has many design errors that were never 
fixed. Those people put up with C in the first place. They often even 
think it is a well-designed language.

 People gripe about it now and then, but just go and fix their code and 
 move on. Neither has C++ ever made any effort to change it, even though 
 C++ has `extern "C"`.
 
 I do not understand why this is such a problem,

Because D prides itself on fixing mistakes, including underspecified layout.

 since C compilers change 
 the struct member layout based on compiler switches (which I showed in 
 another post),

I use D because it does not do stupid things like that.

You seem to think that "some C implementations did it that way" is in 
some way a good way to justify something. It just is not.

 and elicits no complaint from anybody.
 ...

I highly doubt it. I would complain about this.

 Having the default D struct member layout not line up with the 
 associated C compiler layout is a memory safety issue.

I know. I care about memory safety. I do however not care about your 
argument, because it is a pure straw man. I am _not_ suggesting to make 
`extern(D)` bitfields silently have a different layout, just a bit more 
restrictions, rejecting sloppily written bit-field code.

Anyway, if you interoperate with C, you are on your own w.r.t. memory 
safety anyway, as you have no idea what kind of compiler extensions, 
attributes, and switches will change the struct layout in a way that 
ImportC does not yet understand, but will silently ignore.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
struct __attribute((packed)) S{
     int x;
     int* y;
};

int main(){
     printf("%zu\n",offsetof(struct S, x)); // offsetof yields a size_t
     printf("%zu\n",offsetof(struct S, y));
}
```

```
dmd -run test.c
0
8
```

```
gcc test.c && ./a.out
0
4
```


 Not lining up 
 with an externally imposed layout is not a memory safety issue.
 ...

It does not even need to be externally imposed. Consistency and 
reproducibility is important on its own.

 The bottom line, whether D supports bitfields or not, whether extern(C) 
 is applied or not, to conform to an externally specified layout, you're 
 going to have to check and see if it matches. If it doesn't match, there 
 are really simple ways to get it to match.
 

On the platform you happen to be using. It may not work on another 
platform. The entire point is that it should be sufficient to check 
once, and to match it once. If what you need to match is insane 
inconsistent C behavior, just be explicit about it. That is all I am asking.

Jul 10 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
 My recommendation still is either:
 
 1. Denote D bitfields by a specified layout system (pick the most common C one 
 and do that). C bitfields can match the C compiler.
 2. Simply forbid problematic alignments at compile time:
 
 ```d
 struct S {
     uint x;
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 
 // error, alignment of bitfield `a` may not match C layout, please use padding 
 or aligned bitfields to specify intended layout.
 
 // these are OK.
 struct SWithPadding {
     uint x;
     uint _; // padding
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 
 struct SPacked {
     uint64 x : 32;
     uint64 a : 24;
     uint64 b : 24;
     uint64 c : 16;
 }
 ```
 
 Maybe the error only occurs if you specify a compiler switch?

It's clear how to get the desired portable layout.

Consider that nobody ever requested a warning on:

```
struct S {
     uint x;
     ulong y;
}
```
which is the equivalent. Not for D, C, or C++. At least none directed at my 
compilers.

I would be good with a note about this technique in the specification.

P.S. We've already got soooo many compiler switches, adding another one needs a 
strong case. Every such switch is a bug :-/

Jul 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 7/5/2024 12:35 PM, Steven Schveighoffer wrote:
 What if you need > 32 bits or want to pack into a `ulong`? Is the behavior 
 sane across compilers?

Yes. The trouble happens when you mix different field types. There are also 
differences when declaring "packed" bit fields - a C extension that ImportC 
does not implement.

You can see which cases are different in:

ImportC:

https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix64.c

D:

https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix64.c

Jul 05 2024

cc <cc nevernet.com> writes:

On Friday, 5 July 2024 at 05:37:50 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 Today many people have spent some time to try and understand 
 Walter's belief that C is "good enough" for bit fields in terms 
 of guarantees.

Can I define my D struct using bitfields, and then correctly 
unpack them in a GLSL shader using bitfieldExtract() with the 
predicted offsets and sizes?  If the answer is no, then that 
bitfield implementation belongs in a garbage can.

Currently the answer to this question is "yes" for std.bitmanip 
and "no" for "native" D bitfields.

Jul 06 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 07/07/2024 2:22 PM, cc wrote:
 On Friday, 5 July 2024 at 05:37:50 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 Today many people have spent some time to try and understand Walter's 
 belief that C is "good enough" for bit fields in terms of guarantees.

 
 Can I define my D struct using bitfields, and then correctly unpack them 
 in a GLSL shader using bitfieldExtract() with the predicted offsets and 
 sizes?  If the answer is no, then that bitfield implementation belongs 
 in a garbage can.
 
 Currently the answer to this question is "yes" for std.bitmanip and "no" 
 for "native" D bitfields.

Yes and no.

If it is exactly 32 bits, it's fine.

If it unintentionally crosses above it (due to implementation defined 
behavior), or if it is (u)int 2/3/4 then it likely won't work.

Jul 06 2024
