digitalmars.dip.development - second draft: add Bitfields to D


Walter Bright <newshound2 digitalmars.com> writes:

https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Apr 22 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

"The specific layout of bitfields in C is implementation-defined, and 
varies between the Digital Mars, Microsoft, and gdc/ldc compilers. 
gdc/lcd are lumped together because they produce identical results on 
the same platform."

s/lcd/ldc/



Worth mentioning here: as long as you don't use string mixins, 
attempting semantic analysis to determine compilability is actually 
pretty cheap, now that I think about the fact that it's the same entry 
point internally.

Not ideal, will need an example in the specification on how to do this, 
if there is no trait. But in saying that, you'll need to use a trait 
anyway, so...

```d
T t;
// `member` stands for a string holding the field name
enum isBitField = !__traits(compiles, &__traits(getMember, t, member));
```

Not ideal.

```d
void main() {
    Foo t;
    enum isBitField = !__traits(compiles, &__traits(getMember, t, "member"));
    pragma(msg, isBitField);
}

struct Foo {
    enum member;
}
```

Okay yes, not having the trait is a bad idea.
It leaves D's introspection less able to determine what a symbol 
actually is.



I also mentioned this previously, but I want to see 
std.bitmanip.bitfields gone for PhobosV3.

Anything that uses string mixins that the user interacts with makes 
tooling fail with it.

This is not an acceptable solution to be recommending to people, we can 
do significantly better than that.

It also means that people have to remember and understand the two 
separate solutions that we are recommending that in no way are 
comparable in how they are implemented.

Apr 22 2024

Steven Schveighoffer <schveiguy gmail.com> writes:

On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Suffers from the same major problem as last time - nobody is 
going to be using C bitfield structs from D, yet we are 
inheriting all the problems. Keeping C compatibility is 
meaningless. We should pick one way and do it that way for D 
bitfields.

Have you considered that people might build some libraries with 
ldc, but build applications with dmd? If LDC picks one mechanism 
for laying out bitfields, but DMD picks a different one, then 
what happens when you try to use the two together? Do we really 
want to make D incompatible with itself?

This already happens with C. See for instance 
https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Adding more `__traits` is trivial, don't skimp here.

Still does not address `sizeof`.

The mechanism described to get the bit offset is... horrific. 
Please just add some `__traits`.

-Steve

Apr 26 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, April 26, 2024 9:26:06 AM MDT Steven Schveighoffer via 
dip.development wrote:
 On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

 Suffers from the same major problem as last time - nobody is
 going to be using C bitfield structs from D, yet we are
 inheriting all the problems. Keeping C compatibility is
 meaningless. We should pick one way and do it that way for D
 bitfields.

C compatibility matters a lot for importC and for C bindings in general -
not that we have to have a bitfields feature for general D which matches
that, but if we don't have a way to match what C does, then we have trouble
creating bindings for C code that uses bitfields. extern(C++) code
potentially needs the same thing.

Personally, binding to C is the primary way that I've ever had to deal with
bitfields, and not having the ability to do that has made dealing with such
bindings... interesting.

Now, if we want to do something like have extern(C) bitfields and extern(D)
bitfields so that we can have clean and consistent behavior in normal D
code, I'm perfectly fine with that, but I don't agree at all that binding to
C doesn't matter. For me at least, that's the primary place that bitfields
matter, particularly since I can use other solutions in D if need be,
whereas if a C API is designed to use bitfields, then you kind of need
support for that in D if you want the bindings to work correctly.

 Have you considered that people might build some libraries with
 ldc, but build applications with dmd? If LDC picks one mechanism
 for laying out bitfields, but DMD picks a different one, then
 what happens when you try to use the two together? Do we really
 want to make D incompatible with itself?

Completely aside from this specific issue, isn't it already the case that
you can't mix code built with different D compilers? I didn't think that
there was any guarantee of ABI compatibility across compilers, and I would
fully expect there to be trouble if I built parts of my code with one
compiler and other parts with another. I typically get linker errors at work
if I fail to clean out the build files when switching between dmd and ldc.

- Jonathan M Davis

Apr 27 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
 Now, if we want to do something like have extern(C) bitfields and extern(D)
 bitfields so that we can have clean and consistent behavior in normal D
 code

D used to have its own function call ABI, because I thought I'd make a clean 
and consistent one.

It turned out, nobody cared about clean and consistent. They wanted C 
compatibility. For example, debuggers could not handle anything other than what 
the associated C compiler emitted, regardless of what the debug info spec says.

There really is not a clean and consistent layout. There is only C 
compatibility. Just like we do for endianness and alignment.

All of the portability issues people have mentioned are easily dealt with.

There is always writing functions that do shifts and masks as a last resort. 
(Shifts and masks is what the code generator does anyway, so this won't cost 
any performance.)
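
A minimal sketch of what such a last-resort accessor pair could look like. The names `getBits`, `setBits` and `Packed` are made up for illustration, and the field positions (4 bits at bit 0, 5 bits at bit 4) are arbitrary assumptions, not something the DIP prescribes.

```d
/// Read `width` bits of `word`, starting at bit `shift` (bit 0 = least significant).
uint getBits(uint word, uint shift, uint width) pure nothrow @nogc @safe
{
    assert(width >= 1 && width < 32 && shift + width <= 32);
    immutable uint mask = (1u << width) - 1;
    return (word >> shift) & mask;
}

/// Return `word` with those same bits replaced by the low bits of `value`.
uint setBits(uint word, uint shift, uint width, uint value) pure nothrow @nogc @safe
{
    assert(width >= 1 && width < 32 && shift + width <= 32);
    immutable uint mask = ((1u << width) - 1) << shift;
    return (word & ~mask) | ((value << shift) & mask);
}

/// Hypothetical record with an explicitly chosen layout:
/// `a` occupies bits 0..3 and `b` occupies bits 4..8 of `bits`.
struct Packed
{
    uint bits;

    @property uint a() const { return getBits(bits, 0, 4); }
    @property void a(uint v) { bits = setBits(bits, 0, 4, v); }

    @property uint b() const { return getBits(bits, 4, 5); }
    @property void b(uint v) { bits = setBits(bits, 4, 5, v); }
}

unittest
{
    Packed p;
    p.a = 7;
    p.b = 9;
    assert(p.a == 7 && p.b == 9);
    assert(p.bits == (7 | (9 << 4))); // the layout is fixed by the code, not by a C ABI
}
```

Because the shift and width are spelled out in the source, the position of each field inside `bits` does not depend on which C compiler the D compiler is paired with.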

Apr 27 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Sunday, April 28, 2024 12:44:41 AM MDT Walter Bright via dip.development 
wrote:
 On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
 Now, if we want to do something like have extern(C) bitfields and extern(D)
 bitfields so that we can have clean and consistent behavior in normal D
 code

 D used to have its own function call ABI, because I thought I'd make a clean
 and consistent one.

 It turned out, nobody cared about clean and consistent. They wanted C
 compatibility. For example, debuggers could not handle anything other than
 what the associated C compiler emitted, regardless of what the debug info
 spec says.

 There really is not a clean and consistent layout. There is only C
 compatibility. Just like we do for endianness and alignment.

 All of the portability issues people have mentioned are easily dealt with.

 There is always writing functions that do shifts and masks as a last resort.
 (Shifts and masks is what the code generator does anyway, so this won't
 cost any performance.)

In this particular case, as I understand it, there are use cases that
definitely need to be able to have a guaranteed bit layout (e.g.
serialization code). So, I don't think that this is quite the same situation
as something like the call ABI. Even if a particular call ABI might
theoretically be better, it's not something that code normally cares about
in practice so long as it works, whereas some code will actually care what
the exact layout of bitfields is. The call ABI is largely a language
implementation detail, whereas the layout of bitfields actually affects the
behavior of the code.

It seems to me that we're dealing with three use cases here:

1. Code that is specifically binding to C bitfields. It needs to match what
the C compiler does, or it won't work. That comes with whatever pros and
cons the C layout has, but since the D code needs to match the C layout to
work, we just have to deal with whatever the layout is, and realistically,
the D code using it should not be written to care what the layout is,
because it could differ across OSes and architectures.

2. Code that needs a guaranteed bit layout, because it's actually taking the
integers that the bitfields are compacted into and storing them elsewhere
(e.g. storing the data on disk or sending it across the network). What C
does with bitfields for such code is utterly irrelevant, and it's
undesirable to even attempt compatibility. The bits need to be laid out
precisely in the way that the programmer indicates.

3. Code that just wants to store bits in a compact manner, and how that's
done doesn't particularly matter as long as the code just operates on the
individual bitfields and doesn't actually do anything with the integer
values that they're compacted into where the layout would matter.

For the third use case, it's arguably the case that we'd be better off with
a guaranteed bit layout so that it would be consistent across OSes and
architectures, and anyone who accidentally wrote code that relied on the bit
layout wouldn't have issues as a result (similar to how we make it so that
long is guaranteed to be 64 bits across OSes and architectures regardless of
what C does; we avoid whole classes of bugs that way). If I understand
correctly, it's the issues that come from accidentally relying on the exact
bit layout when it's not guaranteed which are why folks like Steven are
arguing that it's a terrible idea to follow C's layout.

However, it's also true that since such code in theory doesn't care what the
bit layout is (since it's just using bitfields for compact storage and not
for something like serialization), the third use case could be solved with
either C-compatible bitfields or with bitfields which have a guaranteed
layout. It would be less error-prone (and thus more desirable) if the bit
layout were consistent, but as long as code doesn't accidentally depend on
the layout, it shouldn't matter.



Use cases 1 and 2, however, are completely incompatible, and we therefore
need separate solutions for them.

For C compatibility, the obvious solution is to have the compiler deal with
it like this DIP is doing. It already has to deal with C compatibility for a
variety of things, and it's just going to be far easier and cleaner to have
the compiler set up to provide C-compatible bitfields than it is to try to
provide a library solution. I wouldn't expect a library solution to cover
all of the possible targets correctly, whereas it should be much more
straightforward for the compiler to do it.


For use case 2, the layout simply needs to be guaranteed.

I get the impression that you favor leaving the guaranteed bit layout to a
library solution, since you don't think that that use case matters much,
whereas you think that C compatibility matters a great deal, and you don't
think that the issues with accidentally relying on the layout when it's not
guaranteed are a big enough concern to avoid using C bitfields for code that
just wants to compact the bits. On the other hand, a number of the folks in
this thread don't think that C compatibility matters and don't want the bugs
that come from accidentally relying on the bit layout when it's not
guaranteed, so they're arguing for just making our bitfields have a
guaranteed layout and not worrying about C.

Personally, I'm inclined to argue that it would just be better to treat this
like we do extern(C++). extern(C++) structs and classes have whatever tweaks
are necessary to make them work with C++, whereas extern(D) code does what
we want to do with D types. We can do the same with extern(C) bitfields and
extern(D) bitfields. That way, we get C compatibility for the code that
needs it and a guaranteed bit layout for the code that needs that. And since
the guaranteed layout would be the default, we'd largely avoid bugs related
to relying on the bit layout when it's not guaranteed. It would be like how
D code in general uses long rather than c_long, so normal D code can rely on
the size of long and avoid the bugs that come with the type's size varying
depending on the target, whereas the code that actually needs C
compatibility uses c_long and takes the risks that come with a variable
integer size, because it has to. The issues with C bitfields would be
restricted to the code that actually needs the compatibility. It would also
make it cleaner to write code that has a guaranteed bit layout than it would
be with a library solution, since it could use the nice syntax too rather
than treating it as a second-class citizen.

However, in terms of what's actually necessary, I think that realistically,
extern(C) bitfields need to be in the language like this DIP is proposing,
since it's just too risky to do that with a library solution, whereas
extern(D) bitfields _can_ be solved with a library solution like they are
right now. I don't think that that's the best solution, but it's certainly
better than what we have right now, since we don't have C-compatible
bitfields anywhere at the moment (outside of a preview switch).

In any case, it seems like the core issue that's resulting in most of the
debate over this DIP is how important some people think that it is to have a
guaranteed bit layout by default so that bugs which come from relying on a
layout that isn't guaranteed will be avoided. You don't seem to think that
that's much of a concern, whereas some of the other folks think that it's a
big concern.

Either way, I completely agree that we need a C-compatible solution in the
language so that we can sanely bind to C code that uses bitfields.

- Jonathan M Davis

Apr 28 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 29/04/2024 1:32 AM, Jonathan M Davis wrote:
 In any case, it seems like the core issue that's resulting in most of the
 debate over this DIP is how important some people think that it is to have a
 guaranteed bit layout by default so that bugs which come from relying on a
 layout that isn't guaranteed will be avoided. You don't seem to think that
 that's much of a concern, whereas some of the other folks think that it's a
 big concern.

I'm not sure that anyone cares what the default is.


It is only when you are dealing with a binding or serialization that you 
care, and each of those is specialized enough to opt in to whatever 
strategy is appropriate.

But one thing that has been on my kill list for PhobosV3 is string 
mixins publicly introducing any new symbols like... bitfields.

Simply because auto-completion cannot see it, and may never be able to 
see it due to the CTFE requirement.

https://github.com/LightBender/PhobosV3-Design/discussions/32

Apr 28 2024

Walter Bright <newshound2 digitalmars.com> writes:

I listed these 3 use cases in the second draft, and it seems we are mostly in 
agreement.

Using bit fields to reduce memory consumption, and to be compatible with C 
code, is handled by default nicely with the proposal.

Conformance to an externally imposed layout sometimes is necessary, but it is 
much less common. It is almost always easily done with a minor bit of 
attention. The worst case is writing a shift/mask accessor function, very easy 
to do. I suspect these workarounds are even less effort than reading the spec 
on how to use special syntax for it. Nobody is obliged to use 
std.bitmanip.bitfields to get the job done.

I can help with any externally defined format anyone is having difficulty with.

Apr 28 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/26/2024 8:26 AM, Steven Schveighoffer wrote:
 Suffers from the same major problem as last time - nobody is going to be using 
 C bitfield structs from D,

I am, as soon as they become available in the D bootstrap compiler. I don't 
much care for the ugly workarounds used currently.

 yet we are inheriting all the problems.

There aren't any problems if one is using bitfields for reducing memory 
consumption or for C compatibility.

 Keeping C compatibility is meaningless.

In the D compiler source code, it means gdc and ldc with their C++ backends 
won't have any issues with it.

 Have you considered that people might build some libraries with ldc, but build 
 applications with dmd? If LDC picks one mechanism for laying out bitfields, 
 but DMD picks a different one, then what happens when you try to use the two 
 together? Do we really want to make D incompatible with itself?

I have considered that. dmd will pick the same layout as the associated C 
compiler, which is gcc (used by gdc), and clang (used by ldc).

 This already happens with C. See for instance 
 https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Can you even mix/match object files between vc and gdc, or vc and ldc, anyway?

dmd on Windows generates DMC layout for -m32, and VC layout for -m64 and 
-m32mscoff.


 Adding more `__traits` is trivial, don't skimp here.

Can be added later. The point is, the information is available.


 Still does not address `sizeof`.

Oops forgot that. It would return the size of the bitfield's type.

 The mechanism described to get the bit offset is... horrific. Please just add 
 some `__traits`.

It can be added later. But in general it is not a good idea to add things that 
are deducible from existing things. In this case, it's a loop. A function could 
be written to do it.

Apr 27 2024

Adam Wilson <flyboynw gmail.com> writes:

On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

I would approve this because we gain C compatibility and we can 
drop the `std.bitmanip.bitfields` type entirely from Phobos 3.

Apr 27 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Saturday, April 27, 2024 6:31:37 PM MDT Adam Wilson via dip.development 
wrote:
 On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

 I would approve this because we gain C compatibility and we can
 drop the `std.bitmanip.bitfields` type entirely from Phobos 3.

Actually, it doesn't fix the need for std.bitmanip.bitfields, though it does
reduce it. Use cases that need a guaranteed layout (e.g. for serialization)
won't work with C-compatible bitfields, because the layout could change
depending on the target platform. So adding this feature to the language
doesn't help them at all, and they'd still need something like the Phobos
solution.
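
For reference, a small sketch of the Phobos solution being discussed. `std.bitmanip.bitfields` is the real library template; the struct `Header` and the field names `kind` and `count` are invented for the example. The check on the raw storage relies on the documented behavior that fields are filled in declaration order starting at the least significant bit.

```d
import std.bitmanip : bitfields;

struct Header
{
    // The widths must add up to 8, 16, 32 or 64 bits;
    // the unnamed 7-bit entry is padding to reach 16.
    mixin(bitfields!(
        uint, "kind",  4,   // bits 0..3
        uint, "count", 5,   // bits 4..8
        uint, "",      7)); // bits 9..15, unused
}

unittest
{
    Header h;
    h.kind = 7;
    h.count = 9;
    assert(h.kind == 7 && h.count == 9);

    // The generated storage is a single 16-bit integer.
    static assert(Header.sizeof == 2);

    // `kind` sits in the low bits and `count` directly above it,
    // whatever the target platform's C compiler would have chosen.
    assert(*cast(ushort*) &h == (7 | (9 << 4)));
}
```

This is the property serialization code depends on: the position of each field within the integer is chosen by the library, not by the platform ABI.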

Of course, this DIP helps quite a bit with regards to C bindings (which the
Phobos solution does not help with), because those cases need to match the C
layout rather than guaranteeing a layout that will be the same across all
OSes and architectures. This DIP could also be used in cases where you don't
care what C is doing, but you also don't care exactly how the bitfields are
laid out. So, it would reduce the need for a Phobos solution, but it doesn't
replace it.

- Jonathan M Davis

Apr 27 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/23/24 03:01, Walter Bright wrote:
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

In practice, however, if one sticks to int, uint, long and ulong bitfields,
they are laid out the same.

Maybe only those cases should be allowed without `extern(C)`. I think
that might be an ok compromise.

However, I would still much prefer a solution that explicitly introduces
the underlying `int`, `uint`, `long` and `ulong` fields, which would be
the ones visible to introspection in a portable way, so that
introspection code does not really need to concern itself with bitfields
at all if it is not important and we do not break existing introspection
libraries, such as all serialization libraries.

Symbolic Debug Info

This does not seem like a strong argument. I am pretty confident debug
info can work pretty well regardless of how D lays out the bits.

["a", "b", "c"]
["a", "_b_c_d_bf", "b", "b_min", "b_max", "c", "c_min", "c_max", "d", "d_min",
"d_max"]

I like that the members are not as cluttered. I guess maybe some people
still would like to access the underlying data (e.g., to implement a
pointer to bitfield as a struct with a pointer plus bit offset and bit
length, or something), so perhaps you could add a note that explains how
to do that.

You forgot to say what `.tupleof` will do for a struct with bitfields in it.

There isn't a specific trait for "is this a bitfield".

I think it would be better to have such a `__traits` even just for
discoverability when people look at the `__traits` page to implement
some introspection code.

testing to see if the address of a field can be taken, enables discovery of a
bitfield.

Not really, a field could be an `enum` field, and you cannot take the
address of that either. And if we ever add another feature that has
fields whose address cannot be taken, existing introspection code may
break. It is better to be explicit.

The values of .max or .min enable determining the number of bits in a bitfield.

I do not like this a lot, it does not seem like the canonical way to
determine it. `.bitlength`?

The bit offset can be introspected by summing the number of bits in each
preceding bitfield that has the same value of .offsetof.

I think it would be much better to just add a `__traits` for this or add
something like `.bitoffsetof`. This is a) much more user friendly and b)
is a bit more likely to work reliably in practice. D currently does not
give any guarantees on the order you will see members when using
`__traits(allMembers, ...)`.

Apr 28 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/29/24 00:30, Timon Gehr wrote:
 On 4/23/24 03:01, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

 ...
 
 However, I would still much prefer a solution that explicitly introduces 
 the underlying `int`, `uint`, `long` and `ulong` fields, which would be 
 the ones visible to introspection in a portable way, so that 
 introspection code does not really need to concern itself with bitfields 
 at all if it is not important and we do not break existing introspection 
 libraries, such as all serialization libraries.
 ...

This also renders somewhat moot the following claims from the DIP:

 This is an additive feature and does not break any existing code. Its use is
 entirely optional.

I get that combinations of code that exist today won't break, but it 
still does break libraries that do "just works" serialization if new 
code uses that library with bitfields, and the breakage might be silent.

Apr 28 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly introduces the 
 underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones 
 visible to introspection in a portable way, so that introspection code does 
 not really need to concern itself with bitfields at all if it is not important 
 and we do not break existing introspection libraries, such as all 
 serialization libraries.

I doubt introspection libraries would break. If they are not checking for 
bitfields, but are just looking at .offsetof and the type, they'll interpret 
the bitfields as a union (which, in a way, is accurate).


 Symbolic Debug Info

 
 This does not seem like a strong argument. I am pretty confident debug info 
 can work pretty well regardless of how D lays out the bits.

I'm not. I'd follow the dwarf spec and it didn't work, because the only thing 
that was ever tested was apparently what the C compiler actually did. In order 
to get gdb to work, I wound up ignoring the spec and doing what gcc did. It's 
the same with object file formats. The spec is somewhat of a fairy tale, it's 
what the associated C compiler actually does that matters.


 I like that the members are not as cluttered. I guess maybe some people still 
 would like to access the underlying data (e.g., to implement a pointer to 
 bitfield as a struct with a pointer plus bit offset and bit length, or 
 something), so perhaps you could add a note that explains how to do that.

Pointer to bitfields will work just the same as they do in C. I don't 
understand what you're asking for.

 You forgot to say what `.tupleof` will do for a struct with bitfields in it.

They do exactly what you'd expect them to do:

```
import std.stdio;
struct S { int a:4, b:5; }
void main()
{
     S s;
     s.a = 7;
     s.b = 9;
     writeln(s.tupleof);
}
```
prints:
```
79
```
It's not necessary to specify this, because this behavior does not diverge from 
field access semantics. Only things that differ need to be specified. 
Specifying "it works like X except for A,B,C" is a lot more reliable and 
compact than reiterating everything X does.


 I think it would be better to have such a `__traits` even just for 
 discoverability when people look at the `__traits` page to implement some 
 introspection code.

There isn't for other members, it's just "allMembers".


 testing to see if the address of a field can be taken, enables discovery of a 
 bitfield.

 
 Not really, a field could be an `enum` field, and you cannot take the address 
 of that either. And if we ever add another feature that has fields whose 
 address cannot be taken, existing introspection code may break. It is better 
 to be explicit.

An enum is distinguished by it not being possible to use .offsetof with it.
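
A sketch of how the two indirect checks could be combined, assuming (as the draft describes) that a bit field has an `.offsetof` but no address. The template name `looksLikeBitfield` and the struct `Foo` are hypothetical. Only the non-bit-field cases are exercised here, since bit fields themselves are not available without the preview feature, and the address check assumes `T` is default-constructible.

```d
template looksLikeBitfield(T, string name)
{
    // Instance fields have a storage offset; enum members do not.
    enum hasOffset =
        __traits(compiles, __traits(getMember, T, name).offsetof);

    // Ordinary fields can have their address taken.
    enum addressable =
        __traits(compiles, { T t; auto p = &__traits(getMember, t, name); });

    // Assumption from the draft: a bit field has an offset but no address.
    enum looksLikeBitfield = hasOffset && !addressable;
}

struct Foo
{
    int field;
    enum constant = 1;
}

static assert(!looksLikeBitfield!(Foo, "field"));    // ruled out: addressable
static assert(!looksLikeBitfield!(Foo, "constant")); // ruled out: no .offsetof
```

Any future kind of member that has an offset but no address would also satisfy this test, which is the fragility raised in the quoted objection.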


 The values of .max or .min enable determining the number of bits in a bitfield.

 I do not like this a lot, it does not seem like the canonical way to determine 
 it. `.bitlength`?

I agree it's a bit(!) jarring at first blush, but it's easy and perfectly 
reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of 
introspection via indirect things like this.
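
The arithmetic behind that can be sketched as follows. `widthFromMax` is a hypothetical helper, and it assumes two's-complement ranges, so a signed field of n bits has `.max == 2^(n-1) - 1` and an unsigned one has `.max == 2^n - 1`.

```d
/// Derive a field's bit width from its `.max` value and signedness.
uint widthFromMax(ulong maxValue, bool isSigned) pure nothrow @nogc @safe
{
    uint bits = 0;
    for (ulong m = maxValue; m != 0; m >>= 1)
        ++bits;                        // value bits needed to represent .max
    return isSigned ? bits + 1 : bits; // a signed field spends one more bit on the sign
}

static assert(widthFromMax(7, true) == 4);   // signed 4-bit field: -8 .. 7
static assert(widthFromMax(15, false) == 4); // unsigned 4-bit field: 0 .. 15
static assert(widthFromMax(0, true) == 1);   // signed 1-bit field: -1 .. 0
```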


 The bit offset can be introspected by summing the number of bits in each 
 preceding bitfield that has the same value of .offsetof.

 
 I think it would be much better to just add a `__traits` for this or add 
 something like `.bitoffsetof`. This is a) much more user friendly and b) is a 
 bit more likely to work reliably in practice. D currently does not give any 
 guarantees on the order you will see members when using `__traits(allMembers, 
 ...)`.

I overlooked that bitfields can have holes in them, so something like 
.bitoffsetof is probably necessary.
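
To illustrate the problem with holes, here is a sketch of the summing approach applied to hand-written field descriptions. `FieldInfo` and `naiveBitOffset` are hypothetical, and the data models an assumed declaration such as `uint a:3; uint :2; uint b:4;`, where the unnamed 2-bit field is not visible as a member.

```d
/// What introspection could see for each named bit field (hypothetical).
struct FieldInfo
{
    string name;
    size_t offsetof; // byte offset of the containing storage unit
    uint width;      // number of bits in the field
}

/// Sum the widths of all preceding fields that share the same .offsetof.
uint naiveBitOffset(const FieldInfo[] fields, size_t index) pure nothrow @nogc @safe
{
    uint sum = 0;
    foreach (f; fields[0 .. index])
        if (f.offsetof == fields[index].offsetof)
            sum += f.width;
    return sum;
}

// Only the named fields show up; the 2-bit hole between them does not.
static immutable FieldInfo[] visible = [
    FieldInfo("a", 0, 3),
    FieldInfo("b", 0, 4),
];

static assert(naiveBitOffset(visible, 0) == 0);
// The sum places `b` at bit 3, although a layout that honors the hole
// would be expected to start it at bit 5.
static assert(naiveBitOffset(visible, 1) == 3);
```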

Apr 28 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/29/24 08:44, Walter Bright wrote:
 On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly 
 introduces the underlying `int`, `uint`, `long` and `ulong` fields, 
 which would be the ones visible to introspection in a portable way, so 
 that introspection code does not really need to concern itself with 
 bitfields at all if it is not important and we do not break existing 
 introspection libraries, such as all serialization libraries.

 
 I doubt introspection libraries would break.

You are breaking even simple patterns like
`foreach(ref field;s.tupleof){ }`.

It would be a miracle if libraries did not break.

 If they are not checking 
 for bitfields, but are just looking at .offsetof and the type, they'll 
 interpret the bitfields as a union (which, in a way, is accurate).
 ...

No, it is not accurate.

 ...
 I like that the members are not as cluttered. I guess maybe some 
 people still would like to access the underlying data (e.g., to 
 implement a pointer to bitfield as a struct with a pointer plus bit 
 offset and bit length, or something), so perhaps you could add a note 
 that explains how to do that.

 
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.
 ...

Well, you can't take a pointer to a bitfield.


 You forgot to say what `.tupleof` will do for a struct with bitfields 
 in it.

 
 They do exactly what you'd expect them to do:
 
 ```
 import std.stdio;
 struct S { int a:4, b:5; }
 void main()
 {
      S s;
      s.a = 7;
      s.b = 9;
      writeln(s.tupleof);
 }
 ```
 prints:
 ```
 79
 ```
 It's not necessary to specify this,

Well, so far everything in `.tupleof` had an address.
It should at least be mentioned in the DIP, if nowhere else you should 
put it in the breaking language changes section.

 because this behavior does not 
 diverge from field access semantics.

There is a difference between a DIP (that can change the language) and 
the specification (that can indeed be written in a way that does not 
explicitly mention bitfields under the `.tupleof` documentation.)

 ...
 
 I think it would be better to have such a `__traits` even just for 
 discoverability when people look at the `__traits` page to implement 
 some introspection code.

 
 There isn't for other members, it's just "allMembers".
 ...

Despite not being very relevant to what I was asking for, this is simply 
untrue. `allMembers` gives you the members, and `.tupleof` gives you the 
fields.
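
A small sketch of that distinction, using only features that exist today (no bitfields). The struct `S` and its members are made up for illustration.

```d
import std.algorithm.searching : canFind;
import std.stdio;

struct S
{
    int a;
    enum int c = 3;   // manifest constant: a member, but not a field
    int b;
    void f() {}       // member function: a member, but not a field
}

// .tupleof only sees the fields that occupy storage.
static assert(S.tupleof.length == 2);

// allMembers lists every member name, including the enum and the function.
enum members = [__traits(allMembers, S)];
static assert(members.canFind("a") && members.canFind("b"));
static assert(members.canFind("c") && members.canFind("f"));

void main()
{
    // Walk the fields and report where each one lives.
    foreach (i, field; S.init.tupleof)
        writeln(__traits(identifier, S.tupleof[i]),
                " at offset ", S.tupleof[i].offsetof);
}
```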

 
 testing to see if the address of a field can be taken, enables 
 discovery of a bitfield.

 Not really, a field could be an `enum` field, and you cannot take the 
 address of that either. And if we ever add another feature that has 
 fields whose address can be taken, existing introspection code may 
 break. It is better to be explicit.

 
 An enum is distinguished by it not being possible to use .offsetof with it.
 ...

Well, if you are trying to deliberately make introspection unnecessarily 
complicated, I guess that's your prerogative.

 
 The values of .max or .min enable determining the number of bits in a 
 bitfield.

 I do not like this a lot, it does not seem like the canonical way to 
 determine it. `.bitlength`?

 
 I agree it's a bit(!) jarring at first blush, but it's easy and 
 perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do 
 a lot of introspection via indirect things like this.
 ...

All of those things are ugly hacks. This kind of brain teaser is how 
metaprogramming works (or increasingly: used to work) in C++, but I 
think it is not very wise to continue this tradition in D.

 
 The bit offset can be introspected by summing the number of bits in 
 each preceding bitfield that has the same value of .offsetof.

 I think it would be much better to just add a `__trait` for this or 
 add something like `.bitoffsetof`. This is a) much more user friendly 
 and b) is a bit more likely to work reliably in practice. D currently 
 does not give any guarantees on the order you will see members when 
 using `__traits(allMembers, ...)`.

 
 I overlooked that bitfields can have holes in them, so something like 
 .bitoffsetof is probably necessary.

Sounds good.

Apr 29 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/29/24 14:04, Timon Gehr wrote:
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.
 ...

 
 Well, you can't take a pointer to a bitfield.

Forgot to fully answer this.

I am asking for example code how you would implement a function that 
gives you a "fat pointer" to a bitfield that lets you read and write 
from that bitfield.

It cannot be the same as in C, as I think this inherently requires 
introspection.

Apr 29 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/29/2024 5:07 AM, Timon Gehr wrote:
 I am asking for example code how you would implement a function that gives you
 a "fat pointer" to a bitfield that lets you read and write from that bitfield.

The fat pointer in D is a delegate, and that's how I'd do it.
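
No code was given in the thread, so the following is only a sketch of what a delegate-based "fat pointer" might look like. `BitRef` and `bitRef` are hypothetical names, and the bit range is modelled by hand with shift and mask on a `uint` so the example compiles without bitfield support; with real bitfields the delegates would presumably just read and assign the field.

```d
import std.stdio;

// Hypothetical "fat pointer" to a range of bits: a getter/setter delegate pair.
struct BitRef
{
    uint delegate() get;
    void delegate(uint) set;
}

// Build a BitRef for `width` bits starting at bit `shift` of *unit.
// Assumes width < 32 and that *unit outlives the returned delegates.
BitRef bitRef(uint* unit, uint shift, uint width)
{
    immutable uint mask = ((1u << width) - 1) << shift;
    BitRef r;
    r.get = () => (*unit & mask) >> shift;
    r.set = (uint v) { *unit = (*unit & ~mask) | ((v << shift) & mask); };
    return r;
}

void main()
{
    uint unit = 0;                  // stands in for the shared storage unit
    auto a = bitRef(&unit, 0, 4);   // bits 0..3
    auto b = bitRef(&unit, 4, 5);   // bits 4..8
    a.set(7);
    b.set(9);
    writeln(a.get(), " ", b.get()); // 7 9
    assert(unit == (7 | (9 << 4)));
}
```

A generic version that works for an arbitrary bitfield member would still need the bit offset and width from introspection, which is the point raised in the question.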

Apr 29 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Monday, April 29, 2024 6:04:13 AM MDT Timon Gehr via dip.development wrote:
 On 4/29/24 08:44, Walter Bright wrote:
 On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly
 introduces the underlying `int`, `uint`, `long` and `ulong` fields,
 which would be the ones visible to introspection in a portable way, so
 that introspection code does not really need to concern itself with
 bitfields at all if it is not important and we do not break existing
 introspection libraries, such as all serialization libraries.

 I doubt introspection libraries would break.

 You are breaking even simple patterns like
 `foreach(ref field;s.tupleof){ }`.

 It would be a miracle if libraries did not break.

druntime and Phobos both specifically use tupleof to look for the actual
members of a type which take up storage space in that type and whose address
can be taken. Traits such as std.traits.Fields do that and document it as
such. If bitfields show up as part of tupleof, I would fully expect that to
cause problems with any type introspection that operates on the member
variables of a type. The breakage may be minimal in practice due to the fact
that bitfields aren't currently part of the language, and it's only new code
which would encounter this problem, but any existing type introspection code
looking at fields is going to expect that all of those fields take up
storage space and that their address can be taken, so if it's given a type
which has bitfields, and those show up in tupleof, that code is not going to
work correctly.

Such code does already need to take unions into account (and there is _some_
similarity between those and bitfields), but it's going to have done that by
checking things like is(T == union), which won't help with bitfields at all.
And really, even if bitfields matched that, you wouldn't necessarily get the
right result anyway, because while both bitfields and unions have members
which are not proper fields on their own, the way they behave and take up
space in the type is completely different.

Maybe we should add a check for bitfields? Presumably, it would have to be
something more like __traits(isBitfield, member), since unlike with a union,
you can't check the type, and we're not adding a bitfields keyword, but
regardless of how you'd check whether something is a bitfield, existing type
introspection code is going to have to be updated in some fashion to take
bitfields into account, or it's going to do the wrong thing when it's given
a type that has bitfields. There's no way that bitfields are going to just
magically work correctly with code that does type introspection.

It does make sense that __traits(allMembers, T) would give you the
bitfields, but I don't think that it makes sense that tupleof would, since
you cannot take their addresses, but either way, it _will_ break Phobos code
if tupleof gives bitfields - and not in a way that would be easily detected,
because doing so would require having tests that used bitfields, which of
course, don't exist, because bitfields have to be added first.

- Jonathan M Davis

Apr 29 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/29/2024 8:54 AM, Jonathan M Davis wrote:
 Such code does already need to take unions into account (and there is _some_
 similarity between those and bitfields), but it's going to have done that by
 checking things like is(T == union), which won't help with bitfields at all.

Using `is(T==union)` is incomplete because anonymous unions are not fields. The 
compiler doesn't do a union check internally for that reason. The correct check 
would be:

```
if (S.a.offsetof + typeof(S.a).sizeof <= S.b.offsetof ||
     S.b.offsetof + typeof(S.b).sizeof <= S.a.offsetof)
{
     // S.a and S.b do not overlap
}
else
{
     // S.a and S.b overlap
}
```
This will work without change for bitfields.
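
A runnable sketch of the same test applied to an anonymous union, which compiles today without bitfields. The struct layout and the helper name `disjoint` are made up for illustration.

```d
import std.stdio;

struct S
{
    int x;
    union        // anonymous union: a and b share storage inside S
    {
        int a;
        float b;
    }
    int y;
}

// The offsetof/sizeof range test, written as a helper over offsets and sizes.
bool disjoint(size_t offA, size_t sizeA, size_t offB, size_t sizeB)
{
    return offA + sizeA <= offB || offB + sizeB <= offA;
}

// a and b overlap; x and a do not.
static assert(!disjoint(S.a.offsetof, typeof(S.a).sizeof,
                        S.b.offsetof, typeof(S.b).sizeof));
static assert( disjoint(S.x.offsetof, typeof(S.x).sizeof,
                        S.a.offsetof, typeof(S.a).sizeof));

void main()
{
    writeln(S.a.offsetof, " ", S.b.offsetof); // 4 4
}
```

Note that this test works at byte granularity: two bitfields packed into the same storage unit would be reported as overlapping, which is the behavior debated elsewhere in this thread.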

Apr 29 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/29/2024 5:04 AM, Timon Gehr wrote:
 You are breaking even simple patterns like
 `foreach(ref field;s.tupleof){ }`.

Let's see what happens.

```
import core.stdc.stdio;

struct S { int a; int b; enum int c = 3; int d:3, e:4; }

void main()
{
     S s;
     s.a = 1;
     s.b = 2;
     s.d = 3;
     s.e = 4;
     foreach (ref f; s.tupleof) { printf("%d\n", f); }
}
```
which prints:
```
1
2
3
4
```
What is going on here? foreach over a tuple is not really a loop. It's a 
shorthand for a sequence of statements that the compiler unrolls the "loop" 
into. When the compiler sees a 'ref' for something that cannot have its address 
taken, it ignores the 'ref'. This can also be seen with:
```
foreach(ref i; 0 .. 10) { }
```
which works. You can see this in action when compiling with -vasm.

Additionally, for such unrolled loops the 'f' is not a loop variable. It is a 
new variable created for each unroll of the loop. You can see this with:
```
import core.stdc.stdio;

struct S { int a; int b; enum int c = 3; int d:3, e:4; }

void main()
{
     S s;
     s.a = 1;
     s.b = 2;
     s.d = 3;
     s.e = 4;
     foreach (ref f; s.tupleof) { foo(f); }
     foreach (ref f; s.tupleof) { printf("%d\n", f); }
}

void foo(ref int f)
{
     printf("f: %d\n", f);
     ++f;
}
```
where s.a and s.b get incremented, but s.d and s.e do not.

I do not recall exactly why this `ref` behavior was done for foreach, but it was
either a mistake or was done precisely to make generic code work. Either way, 
what's done is done, and there doesn't seem to be much point in breaking it.

Apr 29 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, 
 but it was either a mistake or was done precisely to make generic code work.

You mean fail silently.

Apr 30 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/30/2024 3:00 AM, Timon Gehr wrote:
 On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, but it 
 was either a mistake or was done precisely to make generic code work.

 
 You mean fail silently.

No, I did not mean that.

May 03 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/4/24 03:45, Walter Bright wrote:
 On 4/30/2024 3:00 AM, Timon Gehr wrote:
 On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, 
 but it was either a mistake or was done precisely to make generic 
 code work.

 You mean fail silently.

 
 No, I did not mean that.

Well, that is what it will do if you assign a value to the `ref` 
iteration variable.

May 04 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking at .offsetof and 
 the type, they'll interpret the bitfields as a union (which, in a way, is 
 accurate).
 ...

 
 No, it is not accurate.

Getting and setting bit fields reads/writes all the bits in the underlying 
field, so it definitely is like a union. std.bitmanip.bitfields also implements 
it as a union, because there is no other way. The CPU does not provide any 
instructions to access bit fields. (This is why atomics won't work on bitfields.)

If the user of bitfields does not understand the underlying physical reality of 
bitfields, they will forever have problems with them. Just like programmers who 
do not understand the physical reality of pointers, floating point, 2s 
complement, etc., are always crippled and would probably be better off using 
Excel as their programming language :-/


 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.

 Well, you can't take a pointer to a bitfield.

Exactly what I meant!


 Well, so far everything in `.tupleof` had an address.

When you mentioned enums not having an address, I had assumed you were talking 
about __traits(allMembers). .tupleof skips over enums.


 It should at least be mentioned in the DIP, if nowhere else you should put it
 in the breaking language changes section.

I can mention it, sure.


 Well, if you are trying to deliberately make introspection unnecessarily 
 complicated, I guess that's your prerogative.

__traits has an ugly syntax. The idea was to provide the ability, and the user 
(or Phobos) would put a pretty face on it.


 All of those things are ugly hacks. This kind of brain teaser is how 
 metaprogramming works (or increasingly: used to work) in C++, but I think it
 is not very wise to continue this tradition in D.

std.traits definitely continues the tradition. While I'm fine with ugly 
implementations in it, std.traits fails to document the behavior of the 
functions that supposedly put a pretty face on it. I've asked Adam Wilson to 
consider completely re-engineering std.traits.

As long as it is possible to put a pretty face on it, I'm ok with an underlying 
ugliness in the service of not having N>1 diverse ways to do X.

Apr 29 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/30/24 05:30, Walter Bright wrote:
 On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking at 
 .offsetof and the type, they'll interpret the bitfields as a union 
 (which, in a way, is accurate).
 ...

 No, it is not accurate.

 
 Getting and setting bit fields reads/writes all the bits in the 
 underlying field, so it definitely is like a union.

No, more than one bitfield is valid at a time even if they have the same 
offsetof. This is definitely breaking expectations that used to be true.

 std.bitmanip.bitfields also implements it as a union,

No, this is not correct. It implements it as a field with accessors for 
different groups of bits. The only reason why `union` appears in that 
file is to support bitfields inside a union. This again highlights that 
those are not the same thing.
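
A short sketch of the library approach, using the existing `std.bitmanip.bitfields` mixin. The struct and field names are made up for illustration; the mixin requires the widths to add up to 8, 16, 32 or 64 bits.

```d
import std.bitmanip : bitfields;
import std.stdio;

struct S
{
    // Generates one private storage field plus getter/setter properties.
    mixin(bitfields!(
        uint, "a", 4,
        uint, "b", 4));
}

void main()
{
    S s;
    s.a = 7;
    s.b = 9;
    assert(s.a == 7 && s.b == 9); // both values are valid at the same time
    s.a = 0;
    assert(s.b == 9);             // writing a does not disturb b
    writeln(s.a, " ", s.b);       // 0 9
    writeln(S.sizeof);            // size of the generated storage field
}
```

Both accessors go through the same storage, yet each keeps its own value, which is the behavior a union of two full-size fields would not give.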

 because there is 
 no other way. The CPU does not provide any instructions to access bit 
 fields. (This is why atomics won't work on bitfields.)
 ...

Sure! I guess this opens the question what happens with bitfields and 
type qualifiers. The DIP currently says you can have `int`, `uint`, 
`long` and `ulong` bitfields.

Are e.g. `immutable(int)` bitfields allowed?
I'd expect `shared(int)` bitfields are not allowed?


 If the user of bitfields does not understand the underlying physical 
 reality of bitfields, they will forever have problems with them. Just 
 like programmers who do not understand the physical reality of pointers, 
 floating point, 2s complement, etc.,

I understand the underlying reality of all of those concepts and I still 
disagree that interpreting bitfields as a union is correct. There are 
bitfields and there are unions.

 are always crippled and would 
 probably be better off using Excel as their programming language :-/
 ...

However, this seems like an exaggeration. I think there are programmers 
who are gainfully employed and fall into neither of those categories.

 
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.

 Well, you can't take a pointer to a bitfield.

 
 Exactly what I meant!
 
 
 Well, so far everything in `.tupleof` had an address.

 
 When you mentioned enums not having an address, I had assumed you were 
 talking about __traits(allMembers). .tupleof skips over enums.
 ...

But it will include bitfields, and not the underlying "physical" variables.

 
 It should at least be mentioned in the DIP, if nowhere else you should 
 put it in the breaking language changes section.

 
 I can mention it, sure.
 ...

Thanks!

 
 Well, if you are trying to deliberately make introspection 
 unnecessarily complicated, I guess that's your prerogative.

 
 __traits has an ugly syntax. The idea was to provide the ability, and 
 the user (or Phobos) would put a pretty face on it.
 ...

In practice people often do use `__traits`, either because it is more 
efficient or wrapping is impossible. In any case, providing exactly what 
is needed is much simpler.

 
 All of those things are ugly hacks. This kind of brain teaser is how 
 metaprogramming works (or increasingly: used to work) in C++, but I 
 think it is not very wise to continue this tradition in D.

 
 std.traits definitely continues the tradition. While I'm fine with ugly 
 implementations in it, std.traits fails to document the behavior of the 
 functions that supposedly put a pretty face on it. I've asked Adam 
 Wilson to consider completely re-engineering std.traits.
 
 As long as it is possible to put a pretty face on it, I'm ok with an 
 underlying ugliness in the service of not having N>1 diverse ways to do X.
 

I think the main thing is it should be immediately obvious to readers, 
as it is actually not hard.

Apr 30 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development 
wrote:
 Sure! I guess this opens the question what happens with bitfields and
 type qualifiers. The DIP currently says you can have `int`, `uint`,
 `long` and `ulong` bitfields.

 Are e.g. `immutable(int)` bitfields allowed?

I don't see why not. You shouldn't be able to mutate them, but reading them
should be fine, since they're not going to change.

 I'd expect `shared(int)` bitfields are not allowed?

There should be no problem with shared bitfields existing. However, it
shouldn't be legal to read them or write them so long as they're shared. But
with the preview switch to lock down shared, that's true of any type,
including int. Accessing shared data while it's not protected is always a
problem.

Atomics shouldn't work with bitfields, since they can't, whereas if you
protect them with a mutex, you can then temporarily cast away shared and
operate on them just like you'd do with any other shared data. So, I don't
see why bitfields would be at all special with regards to what needs to
happen with shared.

- Jonathan M Davis

Apr 30 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 4/30/24 18:42, Jonathan M Davis wrote:
 On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development
 wrote:
 Sure! I guess this opens the question what happens with bitfields and
 type qualifiers. The DIP currently says you can have `int`, `uint`,
 `long` and `ulong` bitfields.

 Are e.g. `immutable(int)` bitfields allowed?

 
 I don't see why not. You shouldn't be able to mutate them, but reading them
 should be fine, since they're not going to change.
 
 I'd expect `shared(int)` bitfields are not allowed?

 
 There should be no problem with shared bitfields existing. However, it
 shouldn't be legal to read them or write them so long as they're shared. But
 with the preview switch to lock down shared, that's true of any type,
 including int. Accessing shared data while it's not protected is always a
 problem.
 
 Atomics shouldn't work with bitfields, since they can't, whereas if you
 protect them with a mutex, you can then temporarily cast away shared and
 operate on them just like you'd do with any other shared data. So, I don't
 see why bitfields would be at all special with regards to what needs to
 happen with shared.
 
 - Jonathan M Davis
 
 
 

Well, I am bringing it up because the DIP draft ignores type qualifiers 
so far (and explicitly only lists unqualified types for support). What 
is happening with `shared` I think has not been fully pinned down, but 
last I heard the goal was to get implicit atomics.

Apr 30 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, April 30, 2024 2:48:46 PM MDT Timon Gehr via dip.development 
wrote:
 Well, I am bringing it up because the DIP draft ignores type qualifiers
 so far (and explicitly only lists unqualified types for support). What
 is happening with `shared` I think has not been fully pinned down, but
 last I heard the goal was to get implicit atomics.

Atila was talking about possibly doing implicit atomics, and we may get
that, but either way, for types that _can't_ use atomics (like bitfields),
it's pretty clearly going to have to be the case that they can't be read or
written to without casting if shared is actually going to protect against
accessing shared data in a manner which isn't guaranteed to be thread-safe
like it's theoretically supposed to and -preview=nosharedaccess is supposed
to enforce.

So, the normal rules for type qualifiers should apply to bitfields exactly
like they would with any other type, and there shouldn't be any surprises
here, but yes, if we want to be thorough about things, then the DIP should
probably mention what happens with type qualifiers.

- Jonathan M Davis

Apr 30 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/30/2024 1:48 PM, Timon Gehr wrote:
 Well, I am bringing it up because the DIP draft ignores type qualifiers so far 
 (and explicitly only lists unqualified types for support). What is happening 
 with `shared` I think has not been fully pinned down, but last I heard the
 goal was to get implicit atomics.

Since a bitfield can be part of a shared object, of course the shared type 
should exist for it. But since you cannot take a reference to a bitfield, you're
going to have to dip into @system code to manipulate it, and it's up to the 
programmer to figure out what to do.

Immutable means read only. I don't see any issue with that, either.

May 03 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 4/30/2024 6:43 AM, Timon Gehr wrote:
 No, more than one bitfield is valid at a time even if they have the same 
 offsetof. This is definitely breaking expectations that used to be true.

Any expectations that there would be no two fields with the same offset were 
incorrect anyway, as that is what happens with anonymous unions.


 std.bitmanip.bitfields also implements it as a union,

 No, this is not correct. It implements it as a field with accessors for 
 different groups of bits. The only reason why `union` appears in that file is
 to support bitfields inside a union. This again highlights that those are not the 
 same thing.

They are the same thing, it is not substantive what label is painted on it. 
Anonymous unions can lay fields on top of each other without explicitly labeling
it as a union.


 I understand the underlying reality of all of those concepts and I still 
 disagree that interpreting bitfields as a union is correct. There are
 bitfields and there are unions.

There is no value to that distinction. As I replied to Jonathan in this thread, 
D can have fields laying over the top of each other right now without any unions
or bitfields declared.


 But it will include bitfields, and not the underlying "physical" variables.

If the introspection code does not take into account anonymous unions, it's 
already broken anyway.

```
struct S
{
     union
     {
         int a;
         int b;
         int c:32;
     }
}
```

May 03 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/4/24 04:07, Walter Bright wrote:

 No, this is not correct. It implements it as a field with accessors 
 for different groups of bits. The only reason why `union` appears in 
 that file is to support bitfields inside a union. This again 
 highlights that those are not the same thing.

 
 They are the same thing, it is not substantive what label is painted on 
 it. Anonymous unions can lay fields on top of each other without 
 explicitly labeling it as a union.

This is explicitly called a `union`, so I really do not see what you are 
trying to say.

Bitfields do not imply union, but they can be in a union, e.g.:

```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
     union{
         struct {
             unsigned int x:1;
             unsigned int y:1;
         };
         unsigned int z:2;
     };
};

int main(){
     S s;
     s.z=3;
     cout<<s.x<<" "<<s.y<<endl; // 1 1
     s.x=0;
     cout<<s.y<<" "<<s.z<<endl; // 1 2
     s.y=0;
     s.x=1;
     cout<<s.z<<endl; // 1
}
```

Clearly bitfields are a new and distinct way fields can share the same 
`.offsetof` in D. Before bitfields, such fields would overlap given they 
both were of a type with a positive size. With bitfields there is no 
such overlap.

BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.

May 04 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/4/24 19:05, Timon Gehr wrote:
 On 5/4/24 04:07, Walter Bright wrote:

 No, this is not correct. It implements it as a field with accessors 
 for different groups of bits. The only reason why `union` appears in 
 that file is to support bitfields inside a union. This again 
 highlights that those are not the same thing.

 They are the same thing, it is not substantive what label is painted 
 on it. Anonymous unions can lay fields on top of each other without 
 explicitly labeling it as a union.

 
 This is explicitly called a `union`, so I really do not see what you are 
 trying to say.
 
 Bitfields do not imply union, but they can be in a union, e.g.:
 
 ```C++
 #include <bits/stdc++.h>
 using namespace std;
 
 struct S{
      union{
          struct {
              unsigned int x:1;
              unsigned int y:1;
          };
          unsigned int z:2;
      };
 };
 
 int main(){
      S s;
      s.z=3;
      cout<<s.x<<" "<<s.y<<endl; // 1 1
      s.x=0;
      cout<<s.y<<" "<<s.z<<endl; // 1 2
      s.y=0;
      s.x=1;
      cout<<s.z<<endl; // 1
 }
 ```
 
 Clearly bitfields are a new and distinct way fields can share the same 
 `.offsetof` in D. Before bitfields, such fields would overlap given they 
 both were of a type with a positive size. With bitfields there is no 
 such overlap.
 
 BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.

Another example:

```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
     union{
         unsigned int x:1;
         unsigned int y:1;
     };
};

int main(){
     S s;
     cout<<s.x<<" "<<s.y<<endl; // 0 0
     s.x=1;
     cout<<s.x<<" "<<s.y<<endl; // 1 1
}
```

May 04 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/4/24 19:21, Timon Gehr wrote:
      S s;

I guess this should have been `S s={};` to ensure the POD struct is 
initialized.

May 04 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2024 10:05 AM, Timon Gehr wrote:
 This is explicitly called a `union`, so I really do not see what you are
 trying to say.

An anonymous union is simply a way to specify layout. No actual union is 
created, as there is no point to it. How would one refer to an anonymous union? 
The same goes for anonymous structs.


 Clearly bitfields are a new and distinct way fields can share the same 
 `.offsetof` in D. Before bitfields, such fields would overlap given they both 
 were of a type with a positive size. With bitfields there is no such overlap.

Bitfields do overlap, which is why they are accessed with shift and mask.

Besides, the context here is with existing introspection. Existing introspection
will treat them as overlapping fields.


 BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.

```
struct S { int b:3; }
pragma(msg, S.b.sizeof);
```
prints 4LU, as it applies to the type. To get the field width, use .max, as 
already discussed.

May 04 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/4/24 22:17, Walter Bright wrote:
 On 5/4/2024 10:05 AM, Timon Gehr wrote:
 This is explicitly called a `union`, so I really do not see what you 
 are trying to say.

 
 An anonymous union is simply a way to specify layout. No actual union is 
 created, as there is no point to it. How would one refer to an anonymous 
 union? The same goes for anonymous structs.
 ...

Well, now you are simply slicing the terminology in a weird way. `union` 
and `struct` are tools to lay out data, anonymous or otherwise, whether 
you generate typeinfo or otherwise.

 
 Clearly bitfields are a new and distinct way fields can share the same 
 `.offsetof` in D. Before bitfields, such fields would overlap given 
 they both were of a type with a positive size. With bitfields there is 
 no such overlap.

 
 Bitfields do overlap, which is why they are accessed with shift and mask.
 ...

They do not overlap if not put in a union (anonymous or otherwise). 
Otherwise, changing the value of one bitfield would affect the value of 
another one. The fact that they occupy space in the same byte and that 
the processor can only address memory at byte granularity does not imply 
that the bitfields themselves overlap.

 Besides, the context here is with existing introspection. Existing 
 introspection will treat them as overlapping fields.
 
 
 BTW: what about `sizeof`? I think in C++ this is disallowed on a 
 bitfield.

 
 ```
 struct S { int b:3; }
 pragma(msg, S.b.sizeof);
 ```
 prints 4LU, as it applies to the type. To get the field width, use .max, 
 as already discussed.

Well, then code that is set up to work with data using `.tupleof`, 
`.offsetof` and `.sizeof` will silently break. Whether you acknowledge 
that or not, it's simply the truth.

You are breaking the previous invariant that data in a struct lives at 
relative addresses `data.offsetof..data.offsetof+data.sizeof`.

May 04 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/4/2024 4:01 PM, Timon Gehr wrote:
 You are breaking the previous invariant that data in a struct lives at
 relative addresses `data.offsetof..data.offsetof+data.sizeof`.

The data.sizeof for a bitfield will always be the size of the memory object 
containing the field. The invariant is not broken.

May 04 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/5/24 04:11, Walter Bright wrote:
 On 5/4/2024 4:01 PM, Timon Gehr wrote:
 You are breaking the previous invariant that data in a struct lives at 
 relative addresses `data.offsetof..data.offsetof+data.sizeof`.

 
 The data.sizeof for a bitfield will always be the size of the memory 
 object containing the field. The invariant is not broken.

I do not understand. I thought bitfields are supposed to match the 
layout of the associated C compiler. Instead, you seem to now be arguing 
that there should actually be stringent layout guarantees.

GCC 11.4.0, clang 14.0.0:

```c
#include <stdio.h>
struct __attribute__((packed)) S{
     long long x:8;
};
int main(){
     printf("%ld\n",sizeof(long long)); // 8
     printf("%ld\n",sizeof(struct S)); // 1
}
```

It indeed seems `dmd 2.108.1` disagrees and gives `8` and `8`, but I 
guess this is a mistake.

In any case, laying the struct out like this is in compliance with C 
standards even without the additional attribute:

 An implementation may allocate any addressable storage unit large enough to
 hold a bit-field. If enough space remains, a bit-field that immediately follows
 another bit-field in a structure shall be packed into adjacent bits of the same
 unit. If insufficient space remains, whether a bit-field that does not fit is
 put into the next unit or overlaps adjacent units is implementation-defined.
 The order of allocation of bit-fields within a unit (high-order to low-order or
 low-order to high-order) is implementation-defined.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

Maybe at least redefine `.sizeof` to give the size of the underlying 
storage unit for a bitfield. Otherwise, you exacerbate the risk of 
memory corruption due to invalid assumptions about layout.

May 05 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:
 
 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%zu\n",sizeof(long long)); // 8
      printf("%zu\n",sizeof(struct S)); // 1
 }
 ```

The sizeof there, in both cases, is giving the size in bytes of the memory 
object the field is a subset of.


 Maybe at least redefine `.sizeof` to give the size of the underlying storage 
 unit for a bitfield.

That's what it's doing in the example.

BTW, I didn't implement packed bitfields in ImportC. It never occurred to me :-/
I suppose it should get a bugzilla issue.

https://issues.dlang.org/show_bug.cgi?id=24538

May 06 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/6/24 09:14, Walter Bright wrote:
 On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:

 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%zu\n",sizeof(long long)); // 8
      printf("%zu\n",sizeof(struct S)); // 1
 }
 ```

 
 The sizeof there, in both cases, is giving the size in bytes of the 
 memory object the field is a subset of.
 ...

This is C and neither sizeof is on a memory object, they are both on 
types. sizeof on x gives a compile error. However, with the DIP, given 
that you implement packed bitfields in DMD, when importing an example 
like this one, `x.sizeof` would be eight times as big as the size of the 
`struct` it is a part of.

 
 Maybe at least redefine `.sizeof` to give the size of the underlying 
 storage unit for a bitfield.

 
 That's what it's doing in the example.
 ...

Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`. 
Unless I misunderstand and `typeof(x)` is not `long long`, this should 
not be the case in this example, because a `long long` is longer than 
the memory location it is packed into in this case. (I think this is 
another broken invariant.)

 BTW, I didn't implement packed bitfields in ImportC. It never occurred 
 to me :-/ I suppose it should get a bugzilla issue.
 
 https://issues.dlang.org/show_bug.cgi?id=24538

Well, that will help, but the point was that the C standard does not give the 
guarantees you assumed to hold earlier, and in practice they in fact do 
not hold, as in this example.

May 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/6/2024 1:37 PM, Timon Gehr wrote:
 On 5/6/24 09:14, Walter Bright wrote:
 On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:

 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%zu\n",sizeof(long long)); // 8
      printf("%zu\n",sizeof(struct S)); // 1
 }
 ```

 The sizeof there, in both cases, is giving the size in bytes of the memory 
 object the field is a subset of.
 ...

 
 This is C and neither sizeof is on a memory object, they are both on types. 
 sizeof on x gives a compile error. However, with the DIP, given that you 
 implement packed bitfields in DMD, when importing an example like this one, 
 `x.sizeof` would be eight times as big as the size of the `struct` it is a
 part of.

Since the memory object that x is in is 1 byte, the sizeof would be 1 byte (if I 
implemented the packed logic).


 Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`.

Yes, as I didn't know about the packed thing then.

 Well, that will help, but the point was that the C standard does not give the 
 guarantees you assumed to hold earlier, and in practice they in fact do not 
 hold, as in this example.

The C standard says nothing about __attribute__((packed)), and C doesn't allow 
sizeof on bit fields, so we can make .sizeof work as we like. The most practical 
thing is to make it mean the size of the memory object the bitfield is a subset 
of. Unless (unimplemented) packed bitfields are used, the sizeof is the size of 
the type.

May 06 2024

Timon Gehr <timon.gehr gmx.ch> writes:

On 5/7/24 05:39, Walter Bright wrote:
 The most practical thing is to make it mean the size of the memory 
 object the bitfield is a subset of.

I agree that given the C-like bitfield design you seem to have set your 
mind on, and given the behavior of `.offsetof` this is a decent behavior 
for `.sizeof`.

However, this is not what the DIP currently says, so it should be updated.

 Unless (unimplemented) packed 
 bitfields are used, the sizeof is the size of the type.

The size of the memory object has to be whatever the associated C 
compiler allocates, and according to the standard, it is in principle 
allowed to pack by default. There is no guarantee that the memory object 
is in fact at least as big as the type of the bitfield. I am not 
familiar with bitfield layout on all platforms that D supports via GDC 
and LDC, but I would not be surprised if on some of them this is 
actually an issue in practice.

May 07 2024

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Tuesday, 30 April 2024 at 03:30:15 UTC, Walter Bright wrote:
 On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking 
 at .offsetof and the type, they'll interpret the bitfields as 
 a union (which, in a way, is accurate).
 ...

 
 No, it is not accurate.

 Getting and setting bit fields reads/writes all the bits in the 
 underlying field, so it definitely is like a union. 
 std.bitmanip.bitfields also implements it as a union, because 
 there is no other way. The CPU does not provide any 
 instructions to access bit fields.

Not true. x86 provides BMI1 instructions which are present in x86 
CPUs at least since 2013.
ARM also provides bit field instructions and quite a number of 
legacy CPU's also had bitfield instructions (m68k, NEC V30, 
Itanium, PowerPC, etc.).
Doesn't change the issues with language bitfields

May 03 2024

user1234 <user1234 12.de> writes:

On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in 
 x86 CPUs at least since 2013.
 ARM also provides bit field instructions and quite a number of 
 legacy CPU's also had bitfield instructions (m68k, NEC V30, 
 Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields

About BMI/BMI2, it would be interesting to see if optimizing 
compilers actually generate instructions from these extensions for 
C++ bitfields. I've tried for styx enum-sets; sure, that's a bit of a 
special case of bitfields, but so far the only visible difference 
is a BMI2 `shlxl` emitted. But once again, a very special case.

May 03 2024

user1234 <user1234 12.de> writes:

On Friday, 3 May 2024 at 15:50:42 UTC, user1234 wrote:
 On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in 
 x86 CPUs at least since 2013.
 ARM also provides bit field instructions and quite a number of 
 legacy CPU's also had bitfield instructions (m68k, NEC V30, 
 Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields

 About BMI/BMI2, it would be interesting to see if optimizing 
 compilers actually generate instructions from these extensions 
 for C++ bitfields. I've tried for styx enum-sets; sure, that's a 
 bit of a special case of bitfields, but so far the only visible 
 difference is a BMI2 `shlxl` emitted. But once again, a very special 
 case.

![](https://i.imgur.com/Uw6qZ1g.png)

May 03 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/3/2024 5:52 AM, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in x86 CPUs at least 
 since 2013.
 ARM also provides bit field instructions and quite a number of legacy CPU's also 
 had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields

I'd be very surprised if it didn't work by reading the entire field first.

https://www.felixcloutier.com/x86/bextr

May 03 2024

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Monday, April 29, 2024 12:44:08 AM MDT Walter Bright via dip.development 
wrote:
 An enum is distinguished by it not being possible to use .offsetof with it.

I don't think that I have _ever_ seen anyone use offsetof to determine
anything with type introspection other than the actual offset. Existing code
will almost certainly be using & to determine whether a member is an enum or
not.

That being said, _usually_, code doing type introspection cares whether a
member is an enum because it's looking for something else (e.g. whether the
member is a static member variable), so I don't know whether suddenly having
additional members that cannot have their address taken will break anything.
However, any situation where there isn't a trait that outright tells you what
you're looking for makes it highly likely that existing code which needed to
figure it out did so by trying out a variety of checks, finding some
combination of things that must be true and things that must be false, and
then testing enough to be reasonably sure that the combination told them what
they needed to know. Even if they did get it right, because the check is quite
indirect, adding more categories of things which could affect introspection
will ultimately run a pretty high risk of breaking _something_.

There's only so much that we can do about that, but I do think that we need
to be very careful about saying that X is the way to test for something and
have any expectation that that's how folks are actually doing it unless that
something is a specific trait from __traits or std.traits which checks for
that exact thing.

- Jonathan M Davis

Apr 29 2024

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Randomly came across people talking about C bitfields, and how the 
non-defined bit layout is causing them problems.

https://twitter.com/__phantomderp/status/1786628836953604201

Turns out even they think they need control over the layout including 
predictable LSB..MSB byte by byte definition.

Making it default to C for the layout is not a good addition to the 
language.

May 03 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/3/2024 11:07 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

 
 Randomly came across people talking about C bitfields, and how the non-defined 
 bit layout is causing them problems.
 
 https://twitter.com/__phantomderp/status/1786628836953604201
 
 Turns out even they think they need control over the layout including 
 predictable LSB..MSB byte by byte definition.
 
 Making it default to C for the layout is not a good addition to the language.

All the tweet says is:

```
As they should.
(But now it's time for C and C++ to give users explicit layout control, so that 
eventually we can use our chairs on other more heinous programming criminals.)
```

I've responded thoroughly to every complaint about the layout. The only 
substantive external one is Linus', which is linked to in the DIP, and I 
responded to that, too.

If you don't like bitfields at this point, don't use them. If you need help 
getting a specific layout, post here and I can help you.

May 04 2024

Per Nordlöw <per.nordlow gmail.com> writes:

On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of 
conformance with `.sizeof`?

May 06 2024

Walter Bright <newshound2 digitalmars.com> writes:

On 5/6/2024 6:52 AM, Per Nordlöw wrote:
 Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of conformance 
 with `.sizeof`?

Sizeof is in bytes, so I use a different word for number of bits.

May 06 2024

Mike Parker <aldacron gmail.com> writes:

A thread for review of the third draft was opened subsequent to 
this one. Please leave further feedback there:

https://forum.dlang.org/post/v193hc$b9c$1 digitalmars.com

This thread is now closed.

May 21 2024
