digitalmars.dip.development - second draft: add Bitfields to D

Walter Bright <newshound2 digitalmars.com> writes:
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Apr 22
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
"The specific layout of bitfields in C is implementation-defined, and varies between the Digital Mars, Microsoft, and gdc/ldc compilers. gdc/lcd are lumped together because they produce identical results on the same platform."

s/lcd/ldc/

Worth mentioning here is that, as long as you don't use string mixins, attempting semantic analysis is actually a pretty cheap way to determine compilability, now that I'm thinking about the fact that it's the same entry point internally.

Not ideal, will need an example in the specification on how to do this, if there is no trait. But in saying that, you'll need to use a trait anyway, so...

```d
T t;
enum isBitField = !__traits(compiles, &__traits(getMember, t, member));
```

Not ideal.

```d
void main() {
    Foo t;
    // `member` is not a bitfield, yet its address cannot be taken either.
    enum isBitField = !__traits(compiles, &__traits(getMember, t, "member"));
    pragma(msg, isBitField);
}

struct Foo {
    enum member;
}
```

Okay yes, not having the trait is a bad idea. It makes the introspection capabilities of D less able to determine what a symbol is.

I also mentioned this previously, but I want to see std.bitmanip.bitfields gone for PhobosV3. Anything that uses string mixins that the user interacts with makes tooling fail with it. This is not an acceptable solution to be recommending to people, we can do significantly better than that. It also means that people have to remember and understand the two separate solutions that we are recommending, which are in no way comparable in how they are implemented.
Apr 22
Steven Schveighoffer <schveiguy gmail.com> writes:
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.

Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

This already happens with C. See for instance https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Adding more `__traits` is trivial, don't skimp here.

Still does not address `sizeof`.

The mechanism described to get the bit offset is... horrific. Please just add some `__traits`.

-Steve
Apr 26
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, April 26, 2024 9:26:06 AM MDT Steven Schveighoffer via dip.development wrote:
 On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.
C compatibility matters a lot for importC and for C bindings in general - not that we have to have a bitfields feature for general D which matches that, but if we don't have a way to match what C does, then we have trouble creating bindings for C code that uses bitfields. extern(C++) code potentially needs the same thing. Personally, binding to C is the primary way that I've ever had to deal with bitfields, and not having the ability to do that has made dealing with such bindings... interesting.

Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code, I'm perfectly fine with that, but I don't agree at all that binding to C doesn't matter. For me at least, that's the primary place that bitfields matter, particularly since I can use other solutions in D if need be, whereas if a C API is designed to use bitfields, then you kind of need support for that in D if you want the bindings to work correctly.
 Have you considered that people might build some libraries with
 ldc, but build applications with dmd? If LDC picks one mechanism
 for laying out bitfields, but DMD picks a different one, then
 what happens when you try to use the two together? Do we really
 want to make D incompatible with itself?
Completely aside from this specific issue, isn't it already the case that you can't mix code built with different D compilers? I didn't think that there was any guarantee of ABI compatibility across compilers, and I would fully expect there to be trouble if I built parts of my code with one compiler and other parts with another. I typically get linker errors at work if I fail to clean out the build files when switching between dmd and ldc. - Jonathan M Davis
Apr 27
Walter Bright <newshound2 digitalmars.com> writes:
On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
 Now, if we want to do something like have extern(C) bitfields and extern(D)
 bitfields so that we can have clean and consistent behavior in normal D
 code
D used to have its own function call ABI, because I thought I'd make a clean and consistent one. It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says. There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment. All of the portability issues people have mentioned are easily dealt with. There is always writing functions that do shifts and masks as a last resort. (Shifts and masks are what the code generator does anyway, so this won't cost any performance.)
Apr 27
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Sunday, April 28, 2024 12:44:41 AM MDT Walter Bright via dip.development wrote:
 On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
 Now, if we want to do something like have extern(C) bitfields and extern(D)
 bitfields so that we can have clean and consistent behavior in normal D
 code
D used to have its own function call ABI, because I thought I'd make a clean and consistent one. It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says. There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment. All of the portability issues people have mentioned are easily dealt with. There is always writing functions that do shifts and masks as a last resort. (Shifts and masks are what the code generator does anyway, so this won't cost any performance.)
In this particular case, as I understand it, there are use cases that definitely need to be able to have a guaranteed bit layout (e.g. serialization code). So, I don't think that this is quite the same situation as something like the call ABI. Even if a particular call ABI might theoretically be better, it's not something that code normally cares about in practice so long as it works, whereas some code will actually care what the exact layout of bitfields is. The call ABI is largely a language implementation detail, whereas the layout of bitfields actually affects the behavior of the code.

It seems to me that we're dealing with three use cases here:

1. Code that is specifically binding to C bitfields. It needs to match what the C compiler does, or it won't work. That comes with whatever pros and cons the C layout has, but since the D code needs to match the C layout to work, we just have to deal with whatever the layout is, and realistically, the D code using it should not be written to care what the layout is, because it could differ across OSes and architectures.

2. Code that needs a guaranteed bit layout, because it's actually taking the integers that the bitfields are compacted into and storing them elsewhere (e.g. storing the data on disk or sending it across the network). What C does with bitfields for such code is utterly irrelevant, and it's undesirable to even attempt compatibility. The bits need to be laid out precisely in the way that the programmer indicates.

3. Code that just wants to store bits in a compact manner, and how that's done doesn't particularly matter as long as the code just operates on the individual bitfields and doesn't actually do anything with the integer values that they're compacted into where the layout would matter.
For the third use case, it's arguably the case that we'd be better off with a guaranteed bit layout so that it would be consistent across OSes and architectures, and anyone who accidentally wrote code that relied on the bit layout wouldn't have issues as a result (similar to how we make it so that long is guaranteed to be 64 bits across OSes and architectures regardless of what C does; we avoid whole classes of bugs that way). If I understand correctly, it's the issues that come from accidentally relying on the exact bit layout when it's not guaranteed which are why folks like Steven are arguing that it's a terrible idea to follow C's layout.

However, it's also true that since such code in theory doesn't care what the bit layout is (since it's just using bitfields for compact storage and not for something like serialization), the third use case could be solved with either C-compatible bitfields or with bitfields which have a guaranteed layout. It would be less error-prone (and thus more desirable) if the bit layout were consistent, but as long as code doesn't accidentally depend on the layout, it shouldn't matter.

[The first two use cases are] completely incompatible, and we therefore need separate solutions for them.

For C compatibility, the obvious solution is to have the compiler deal with it like this DIP is doing. It already has to deal with C compatibility for a variety of things, and it's just going to be far easier and cleaner to have the compiler set up to provide C-compatible bitfields than it is to try to provide a library solution. I wouldn't expect a library solution to cover all of the possible targets correctly, whereas it should be much more straightforward for the compiler to do it.

[...] to be guaranteed.
I get the impression that you favor leaving the guaranteed bit layout to a library solution, since you don't think that that use case matters much, whereas you think that C compatibility matters a great deal, and you don't think that the issues with accidentally relying on the layout when it's not guaranteed are a big enough concern to avoid using C bitfields for code that just wants to compact the bits. On the other hand, a number of the folks in this thread don't think that C compatibility matters and don't want the bugs that come from accidentally relying on the bit layout when it's not guaranteed, so they're arguing for just making our bitfields have a guaranteed layout and not worrying about C.

Personally, I'm inclined to argue that it would just be better to treat this like we do extern(C++). extern(C++) structs and classes have whatever tweaks are necessary to make them work with C++, whereas extern(D) code does what we want to do with D types. We can do the same with extern(C) bitfields and extern(D) bitfields. That way, we get C compatibility for the code that needs it and a guaranteed bit layout for the code that needs that. And since the guaranteed layout would be the default, we'd largely avoid bugs related to relying on the bit layout when it's not guaranteed.

It would be like how D code in general uses long rather than c_long, so normal D code can rely on the size of long and avoid the bugs that come with the type's size varying depending on the target, whereas the code that actually needs C compatibility uses c_long and takes the risks that come with a variable integer size, because it has to. The issues with C bitfields would be restricted to the code that actually needs the compatibility. It would also make it cleaner to write code that has a guaranteed bit layout than it would be with a library solution, since it could use the nice syntax too rather than treating it as a second-class citizen.
However, in terms of what's actually necessary, I think that realistically, extern(C) bitfields need to be in the language like this DIP is proposing, since it's just too risky to do that with a library solution, whereas extern(D) bitfields _can_ be solved with a library solution like they are right now. I don't think that that's the best solution, but it's certainly better than what we have right now, since we don't have C-compatible bitfields anywhere at the moment (outside of a preview switch). In any case, it seems like the core issue that's resulting in most of the debate over this DIP is how important some people think that it is to have a guaranteed bit layout by default so that bugs which come from relying on a layout that isn't guaranteed will be avoided. You don't seem to think that that's much of a concern, whereas some of the other folks think that it's a big concern. Either way, I completely agree that we need a C-compatible solution in the language so that we can sanely bind to C code that uses bitfields. - Jonathan M Davis
Apr 28
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 29/04/2024 1:32 AM, Jonathan M Davis wrote:
 In any case, it seems like the core issue that's resulting in most of the
 debate over this DIP is how important some people think that it is to have a
 guaranteed bit layout by default so that bugs which come from relying on a
 layout that isn't guaranteed will be avoided. You don't seem to think that
 that's much of a concern, whereas some of the other folks think that it's a
 big concern.
I'm not sure that anyone cares what the default is. [It is only when you are] dealing with a binding or serialization that you care, and each of those is specialized enough to opt in to whatever strategy is appropriate.

But one thing that has been on my kill list for PhobosV3 is string mixins publicly introducing any new symbols like... bitfields. Simply because auto-completion cannot see it, and may never be able to see it due to the CTFE requirement.

https://github.com/LightBender/PhobosV3-Design/discussions/32
Apr 28
Walter Bright <newshound2 digitalmars.com> writes:
I listed these 3 use cases in the second draft, and it seems we are mostly in agreement.

Using bit fields to reduce memory consumption, and to be compatible with C code, is handled by default nicely with the proposal.

Conformance to an externally imposed layout sometimes is necessary, but it is much less common. It is almost always easily done with a minor bit of attention. The worst case is writing a shift/mask accessor function, very easy to do. I suspect these workarounds are even less effort than reading the spec on how to use special syntax for it. Nobody is obliged to use std.bitmanip.bitfields to get the job done.

I can help with any externally defined format anyone is having difficulty with.
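
For readers who want to see what such a shift/mask accessor might look like, here is a minimal sketch. The `Header` type, its 16-bit format (bits 0-3 for `kind`, bits 4-8 for `len`) and the helper names are hypothetical and chosen only for illustration; they are not taken from the DIP.

```d
// Hypothetical externally defined 16-bit header:
// bits 0-3 hold `kind`, bits 4-8 hold `len`.
struct Header
{
    ushort raw; // the integer that is actually stored or transmitted

    // Read `width` bits starting at bit `shift`.
    private uint get(uint shift, uint width) const
    {
        return (raw >> shift) & ((1u << width) - 1);
    }

    // Overwrite `width` bits starting at bit `shift`, leaving the rest intact.
    private void set(uint shift, uint width, uint value)
    {
        const uint mask = ((1u << width) - 1) << shift;
        raw = cast(ushort)((raw & ~mask) | ((value << shift) & mask));
    }

    @property uint kind() const { return get(0, 4); }
    @property void kind(uint v) { set(0, 4, v); }

    @property uint len() const { return get(4, 5); }
    @property void len(uint v) { set(4, 5, v); }
}

void main()
{
    Header h;
    h.kind = 7;
    h.len = 9;
    assert(h.kind == 7 && h.len == 9);
    assert(h.raw == (7 | (9 << 4))); // layout is fixed by the code, not the target
}
```

Because the shifts and widths are spelled out, the value of `raw` does not depend on which C compiler the target is associated with.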
Apr 28
Walter Bright <newshound2 digitalmars.com> writes:
On 4/26/2024 8:26 AM, Steven Schveighoffer wrote:
 Suffers from the same major problem as last time - nobody is going to be using C
 bitfield structs from D,
I am, as soon as they become available in the D bootstrap compiler. I don't much care for the ugly workarounds used currently.
 yet we are inheriting all the problems.
There aren't any problems if one is using bitfields for reducing memory consumption or for C compatibility.
 Keeping C compatibility is meaningless.
In the D compiler source code, it means gdc and ldc with their C++ backends won't have any issues with it.
 Have you considered that people might build some libraries with ldc, but build
 applications with dmd? If LDC picks one mechanism for laying out bitfields, but
 DMD picks a different one, then what happens when you try to use the two
 together? Do we really want to make D incompatible with itself?
I have considered that. dmd will pick the same layout as the associated C compiler, which is gcc (used by gdc), and clang (used by ldc).
 This already happens with C. See for instance 
 https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl
Can you even mix/match object files between vc and gdc, or vc and ldc, anyway? dmd on Windows generates DMC layout for -m32, and VC layout for -m64 and -m32mscoff
 Adding more `__traits` is trivial, don't skimp here.
Can be added later. The point is, the information is available.
 Still does not address `sizeof`.
Oops forgot that. It would return the size of the bitfield's type.
 The mechanism described to get the bit offset is... horrific. Please just add 
 some `__traits`.
It can be added later. But in general it is not a good idea to add things that are deducible from existing things. In this case, it's a loop. A function could be written to do it.
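
A rough sketch of the loop described here, under stated assumptions: `FieldInfo` and `bitOffset` are hypothetical names, the records are filled in by hand rather than obtained from the compiler, and the fields are assumed to be listed in declaration order with no holes between them. Other posts in this thread question both assumptions.

```d
// Hypothetical record of what introspection would report for one bitfield.
struct FieldInfo
{
    string name;
    size_t offsetof; // byte offset of the storage unit, as .offsetof would report
    uint bits;       // field width, e.g. derived from .max
}

// Bit offset of fields[index]: the sum of the widths of all preceding
// fields that share the same .offsetof value.
uint bitOffset(const FieldInfo[] fields, size_t index)
{
    uint sum = 0;
    foreach (f; fields[0 .. index])
        if (f.offsetof == fields[index].offsetof)
            sum += f.bits;
    return sum;
}

void main()
{
    // Mirrors `int a:4, b:5;` followed by a field in the next storage unit.
    auto fields = [FieldInfo("a", 0, 4), FieldInfo("b", 0, 5), FieldInfo("c", 4, 3)];
    assert(bitOffset(fields, 0) == 0);
    assert(bitOffset(fields, 1) == 4);
    assert(bitOffset(fields, 2) == 0);
}
```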
Apr 27
Adam Wilson <flyboynw gmail.com> writes:
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
I would approve this because we gain C compatibility and we can drop the `std.bitmanip.bitfields` type entirely from Phobos 3.
Apr 27
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, April 27, 2024 6:31:37 PM MDT Adam Wilson via dip.development wrote:
 On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
I would approve this because we gain C compatibility and we can drop the `std.bitmanip.bitfields` type entirely from Phobos 3.
Actually, it doesn't fix the need for std.bitmanip.bitfields, though it does reduce it. Use cases that need a guaranteed layout (e.g. for serialization) won't work with C-compatible bitfields, because the layout could change depending on the target platform. So adding this feature to the language doesn't help them at all, and they'd still need something like the Phobos solution. Of course, this DIP helps quite a bit with regards to C bindings (which the Phobos solution does not help with), because those cases need to match the C layout rather than guaranteeing a layout that will be the same across all OSes and architectures. This DIP could also be used in cases where you don't care what C is doing, but you also don't care exactly how the bitfields are laid out. So, it would reduce the need for a Phobos solution, but it doesn't replace it. - Jonathan M Davis
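
For reference, a minimal sketch of the existing Phobos solution mentioned above. The `Packet` struct and its field names are made up for illustration. `std.bitmanip.bitfields` fills bits in the order given, starting with the least significant bit, and the widths must add up to 8, 16, 32 or 64.

```d
import std.bitmanip : bitfields;

// Hypothetical example type; only the bitfields template is from Phobos.
struct Packet
{
    // 4 + 5 + 23 = 32 bits; the unnamed entry is padding.
    mixin(bitfields!(
        uint, "kind", 4,
        uint, "len",  5,
        uint, "",    23));
}

void main()
{
    Packet p;
    p.kind = 7;
    p.len = 9;
    assert(p.kind == 7 && p.len == 9);
    static assert(Packet.sizeof == 4);
}
```

The accessors are generated by a string mixin, which is the property of this approach that other posts in the thread object to on tooling grounds.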
Apr 27
Timon Gehr <timon.gehr gmx.ch> writes:
On 4/23/24 03:01, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
 In practice, however, if one sticks to int, uint, long and ulong bitfields, they are laid out the same.
Maybe only those cases should be allowed without `extern(C)`. I think that might be an ok compromise. However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
 Symbolic Debug Info
This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.
 ["a", "b", "c"]
 ["a", "_b_c_d_bf", "b", "b_min", "b_max", "c", "c_min", "c_max", "d", "d_min", "d_max"]
I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that. You forgot to say what `.tupleof` will do for a struct with bitfields in it.
 There isn't a specific trait for "is this a bitfield".
I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.
 testing to see if the address of a field can be taken, enables discovery of a bitfield.
Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.
 The values of .max or .min enable determining the number of bits in a bitfield.
I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?
 
 The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.
I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
Apr 28
Timon Gehr <timon.gehr gmx.ch> writes:
On 4/29/24 00:30, Timon Gehr wrote:
 On 4/23/24 03:01, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
... However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries. ...
This also renders somewhat moot the following claims from the DIP:
 This is an additive feature and does not break any existing code. Its use is entirely optional.
I get that combinations of code that exist today won't break, but it still does break libraries that do "just works" serialization if new code uses that library with bitfields, and the breakage might be silent.
Apr 28
Walter Bright <newshound2 digitalmars.com> writes:
On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly introduces the
 underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones
 visible to introspection in a portable way, so that introspection code does not
 really need to concern itself with bitfields at all if it is not important and
 we do not break existing introspection libraries, such as all serialization
 libraries.
I doubt introspection libraries would break. If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
 Symbolic Debug Info
This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.
I'm not. I followed the DWARF spec and it didn't work, because the only thing that was ever tested was apparently what the C compiler actually did. In order to get gdb to work, I wound up ignoring the spec and doing what gcc did. It's the same with object file formats. The spec is somewhat of a fairy tale, it's what the associated C compiler actually does that matters.
 I like that the members are not as cluttered. I guess maybe some people still 
 would like to access the underlying data (e.g., to implement a pointer to 
 bitfield as a struct with a pointer plus bit offset and bit length, or 
 something), so perhaps you could add a note that explains how to do that.
Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
 You forgot to say what `.tupleof` will do for a struct with bitfields in it.
They do exactly what you'd expect them to do:

```
import std.stdio;

struct S { int a:4, b:5; }

void main()
{
    S s;
    s.a = 7;
    s.b = 9;
    writeln(s.tupleof);
}
```

prints:

```
79
```

It's not necessary to specify this, because this behavior does not diverge from field access semantics. Only things that differ need to be specified. Specifying "it works like X except for A,B,C" is a lot more reliable and compact than reiterating everything X does.
 I think it would be better to have such a `__traits` even just for 
 discoverability when people look at the `__traits` page to implement some 
 introspection code.
There isn't for other members, it's just "allMembers".
 testing to see if the address of a field can be taken, enables discovery of a 
 bitfield.
Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.
An enum is distinguished by it not being possible to use .offsetof with it.
 The values of .max or .min enable determining the number of bits in a bitfield.
I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?
I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this.
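
As an illustration of the indirect approach, the width can be recovered from `.max` with a small helper. This is only a sketch: `bitWidth` is a hypothetical function, and the caller is assumed to know whether the field is signed (for example from its declared type).

```d
// Number of bits in a field whose largest representable value is `max`.
// A signed field spends one additional bit on the sign.
uint bitWidth(ulong max, bool isSigned)
{
    uint bits = 0;
    while (max != 0)
    {
        ++bits;
        max >>= 1;
    }
    return isSigned ? bits + 1 : bits;
}

void main()
{
    // Per the post above: a .max of 15 (unsigned) or 7 (signed) means 4 bits.
    assert(bitWidth(15, false) == 4);
    assert(bitWidth(7, true) == 4);
    // Sanity check against a full-width type, evaluated at compile time.
    static assert(bitWidth(int.max, true) == 32);
}
```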
 The bit offset can be introspected by summing the number of bits in each 
 preceding bitfield that has the same value of .offsetof.
I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
I overlooked that bitfields can have holes in them, so something like .bitoffsetof is probably necessary.
Apr 28
Timon Gehr <timon.gehr gmx.ch> writes:
On 4/29/24 08:44, Walter Bright wrote:
 On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly 
 introduces the underlying `int`, `uint`, `long` and `ulong` fields, 
 which would be the ones visible to introspection in a portable way, so 
 that introspection code does not really need to concern itself with 
 bitfields at all if it is not important and we do not break existing 
 introspection libraries, such as all serialization libraries.
I doubt introspection libraries would break.
You are breaking even simple patterns like `foreach(ref field;s.tupleof){ }`. It would be a miracle if libraries did not break.
 If they are not checking 
 for bitfields, but are just looking at .offsetof and the type, they'll 
 interpret the bitfields as a union (which, in a way, is accurate).
 ...
No, it is not accurate.
 ...
 I like that the members are not as cluttered. I guess maybe some 
 people still would like to access the underlying data (e.g., to 
 implement a pointer to bitfield as a struct with a pointer plus bit 
 offset and bit length, or something), so perhaps you could add a note 
 that explains how to do that.
Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for. ...
Well, you can't take a pointer to a bitfield.
 You forgot to say what `.tupleof` will do for a struct with bitfields 
 in it.
They do exactly what you'd expect them to do:

```
import std.stdio;

struct S { int a:4, b:5; }

void main()
{
    S s;
    s.a = 7;
    s.b = 9;
    writeln(s.tupleof);
}
```

prints:

```
79
```

It's not necessary to specify this,
Well, so far everything in `.tupleof` had an address. It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.
 because this behavior does not 
 diverge from field access semantics.
There is a difference between a DIP (that can change the language) and the specification (that can indeed be written in a way that does not explicitly mention bitfields under the `.tupleof` documentation.)
 ...
 
 I think it would be better to have such a `__traits` even just for 
 discoverability when people look at the `__traits` page to implement 
 some introspection code.
There isn't for other members, it's just "allMembers". ...
Despite not being very relevant to what I was asking for, this is simply untrue. `allMembers` gives you the members, and `.tupleof` gives you the fields.
 
 testing to see if the address of a field can be taken, enables 
 discovery of a bitfield.
Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.
An enum is distinguished by it not being possible to use .offsetof with it. ...
Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.
 
 The values of .max or .min enable determining the number of bits in a 
 bitfield.
I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?
I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this. ...
All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.
 
 The bit offset can be introspected by summing the number of bits in 
 each preceding bitfield that has the same value of .offsetof.
I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
I overlooked that bitfields can have holes in them, so something like .bitoffsetof is probably necessary.
Sounds good.
Apr 29
Timon Gehr <timon.gehr gmx.ch> writes:
On 4/29/24 14:04, Timon Gehr wrote:
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.
 ...
Well, you can't take a pointer to a bitfield.
Forgot to fully answer this. I am asking for example code how you would implement a function that gives you a "fat pointer" to a bitfield that lets you read and write from that bitfield. It cannot be the same as in C, as I think this inherently requires introspection.
Apr 29
Walter Bright <newshound2 digitalmars.com> writes:
On 4/29/2024 5:07 AM, Timon Gehr wrote:
 I am asking for example code how you would implement a function that gives you a
 "fat pointer" to a bitfield that lets you read and write from that bitfield.
The fat pointer in D is a delegate, and that's how I'd do it.
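
One way to read this suggestion, sketched on a plain `uint` rather than on the DIP's bitfield syntax: a pair of delegates closes over the storage word and the bit position. `BitRef` and `bitRef` are hypothetical names, and the sketch assumes `width` is less than 32 and that the pointed-to word outlives the delegates.

```d
// A getter/setter pair standing in for a "pointer" to a bit range.
struct BitRef
{
    uint delegate() get;
    void delegate(uint) set;
}

// Hypothetical helper: builds a BitRef for `width` bits at bit `shift` of `*word`.
// Assumes width < 32 and that `word` stays valid while the BitRef is in use.
BitRef bitRef(uint* word, uint shift, uint width)
{
    const uint mask = ((1u << width) - 1) << shift;
    BitRef r;
    r.get = () => (*word & mask) >> shift;
    r.set = (uint v) { *word = (*word & ~mask) | ((v << shift) & mask); };
    return r;
}

void main()
{
    uint storage = 0;
    auto b = bitRef(&storage, 4, 5);
    b.set(9);
    assert(b.get() == 9);
    assert(storage == (9 << 4));
}
```

With language-level bitfields the delegates could presumably wrap the field access itself instead of explicit shifts and masks, which would avoid needing to know the bit offset at all.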
Apr 29
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, April 29, 2024 6:04:13 AM MDT Timon Gehr via dip.development wrote:
 On 4/29/24 08:44, Walter Bright wrote:
 On 4/28/2024 3:30 PM, Timon Gehr wrote:
 However, I would still much prefer a solution that explicitly
 introduces the underlying `int`, `uint`, `long` and `ulong` fields,
 which would be the ones visible to introspection in a portable way, so
 that introspection code does not really need to concern itself with
 bitfields at all if it is not important and we do not break existing
 introspection libraries, such as all serialization libraries.
I doubt introspection libraries would break.
You are breaking even simple patterns like `foreach(ref field;s.tupleof){ }`. It would be a miracle if libraries did not break.
druntime and Phobos both specifically use tupleof to look for the actual members of a type which take up storage space in that type and whose address can be taken. Traits such as std.traits.Fields do that and document it as such. If bitfields show up as part of tupleof, I would fully expect that to cause problems with any type introspection that operates on the member variables of a type. The breakage may be minimal in practice due to the fact that bitfields aren't currently part of the language, and it's only new code which would encounter this problem, but any existing type introspection code looking at fields is going to expect that all of those fields take up storage space and that their address can be taken, so if it's given a type which has bitfields, and those show up in tupleof, that code is not going to work correctly.
Such code does already need to take unions into account (and there is _some_ similarity between those and bitfields), but it's going to have done that by checking things like is(T == union), which won't help with bitfields at all. And really, even if bitfields matched that, you wouldn't necessarily get the right result anyway, because while both bitfields and unions have members which are not proper fields on their own, the way they behave and take up space in the type is completely different.
Maybe we should add a check for bitfields? Presumably, it would have to be something more like __traits(isBitfield, member), since unlike with a union, you can't check the type, and we're not adding a bitfields keyword, but regardless of how you'd check whether something is a bitfield, existing type introspection code is going to have to be updated in some fashion to take bitfields into account, or it's going to do the wrong thing when it's given a type that has bitfields. There's no way that bitfields are going to just magically work correctly with code that does type introspection.
It does make sense that __traits(allMembers, T) would give you the bitfields, but I don't think that it makes sense that tupleof would, since you cannot take their addresses, but either way, it _will_ break Phobos code if tupleof gives bitfields - and not in a way that would be easily detected, because doing so would require having tests that used bitfields, which of course, don't exist, because bitfields have to be added first. - Jonathan M Davis
Apr 29
parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/29/2024 8:54 AM, Jonathan M Davis wrote:
 Such code does already need to take unions into account (and there is _some_
 similarity between those and bitfields), but it's going to have done that by
 checking things like is(T == union), which won't help with bitfields at all.
Using `is(T==union)` is incomplete because anonymous unions are not fields. The compiler doesn't do a union check internally for that reason. The correct check would be:
```
if (S.a.offsetof + typeof(S.a).sizeof <= S.b.offsetof ||
    S.b.offsetof + typeof(S.b).sizeof <= S.a.offsetof)
{
    // S.a and S.b do not overlap
}
else
{
    // S.a and S.b overlap
}
```
This will work without change for bitfields.
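For reference, the same byte-range check can be wrapped in a small compile-time helper. This is only a sketch: `startOf`, `endOf`, `overlaps` and `Demo` are hypothetical names, and the example deliberately uses an anonymous union rather than bitfields so that it compiles without the preview feature. Note that a check like this works at byte granularity, so with `.sizeof` yielding the size of the type, two bitfields sharing a storage unit would presumably be reported as overlapping.

```d
/// Byte offset of member `m` in `S`.
enum size_t startOf(S, string m) = __traits(getMember, S, m).offsetof;

/// One past the last byte occupied by member `m` in `S`.
enum size_t endOf(S, string m) =
    startOf!(S, m) + typeof(__traits(getMember, S, m)).sizeof;

/// True if the byte ranges of members `a` and `b` of `S` intersect.
enum bool overlaps(S, string a, string b) =
    !(endOf!(S, a) <= startOf!(S, b) || endOf!(S, b) <= startOf!(S, a));

struct Demo
{
    int x;
    union { int a; float b; }   // anonymous union: a and b share an offset
    int y;
}

static assert( overlaps!(Demo, "a", "b"));
static assert(!overlaps!(Demo, "x", "a"));
static assert(!overlaps!(Demo, "b", "y"));
```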
Apr 29
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/29/2024 5:04 AM, Timon Gehr wrote:
 You are breaking even simple patterns like
 `foreach(ref field;s.tupleof){ }`.
Let's see what happens.
```
import core.stdc.stdio;

struct S
{
    int a;
    int b;
    enum int c = 3;
    int d:3, e:4;
}

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof)
    {
        printf("%d\n", f);
    }
}
```
which prints:
```
1
2
3
4
```
What is going on here? foreach over a tuple is not really a loop. It's a shorthand for a sequence of statements that the compiler unrolls the "loop" into. When the compiler sees a 'ref' for something that cannot have its address taken, it ignores the 'ref'. This can also be seen with:
```
foreach(ref i; 0 .. 10) { }
```
which works. You can see this in action when compiling with -vasm. Additionally, for such unrolled loops the 'f' is not a loop variable. It is a new variable created for each unroll of the loop. You can see this with:
```
import core.stdc.stdio;

struct S
{
    int a;
    int b;
    enum int c = 3;
    int d:3, e:4;
}

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof)
    {
        foo(f);
    }
    foreach (ref f; s.tupleof)
    {
        printf("%d\n", f);
    }
}

void foo(ref int f)
{
    printf("f: %d\n", f);
    ++f;
}
```
where s.a and s.b get incremented, but s.d and s.e do not. I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work. Either way, what's done is done, and there doesn't seem to be much point in breaking it.
Apr 29
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, 
 but it was either a mistake or was done precisely to make generic code work.
You mean fail silently.
Apr 30
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/30/2024 3:00 AM, Timon Gehr wrote:
 On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, but it 
 was either a mistake or was done precisely to make generic code work.
You mean fail silently.
No, I did not mean that.
May 03
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 5/4/24 03:45, Walter Bright wrote:
 On 4/30/2024 3:00 AM, Timon Gehr wrote:
 On 4/30/24 05:14, Walter Bright wrote:
 I do not recall exactly why this `ref` behavior was done for foreach, 
 but it was either a mistake or was done precisely to make generic 
 code work.
You mean fail silently.
No, I did not mean that.
Well, that is what it will do if you assign a value to the `ref` iteration variable.
May 04
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking at .offsetof and 
 the type, they'll interpret the bitfields as a union (which, in a way, is 
 accurate).
 ...
No, it is not accurate.
Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union. std.bitmanip.bitfields also implements it as a union, because there is no other way. The CPU does not provide any instructions to access bit fields. (This is why atomics won't work on bitfields.) If the user of bitfields does not understand the underlying physical reality of bitfields, they will forever have problems with them. Just like programmers who do not understand the physical reality of pointers, floating point, 2s complement, etc., are always crippled and would probably be better off using Excel as their programming language :-/
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.
Well, you can't take a pointer to a bitfield.
Exactly what I meant!
 Well, so far everything in `.tupleof` had an address.
When you mentioned enums not having an address, I had assumed you were talking about __traits(allMembers). .tupleof skips over enums.
 It should at least be mentioned in the DIP, if nowhere else you should put it in
 the breaking language changes section.
I can mention it, sure.
 Well, if you are trying to deliberately make introspection unnecessarily 
 complicated, I guess that's your prerogative.
__traits has an ugly syntax. The idea was to provide the ability, and the user (or Phobos) would put a pretty face on it.
 All of those things are ugly hacks. This kind of brain teaser is how
 metaprogramming works (or increasingly: used to work) in C++, but I think it is
 not very wise to continue this tradition in D.
std.traits definitely continues the tradition. While I'm fine with ugly implementations in it, std.traits fails to document the behavior of the functions that supposedly put a pretty face on it. I've asked Adam Wilson to consider completely re-engineering std.traits. As long as it is possible to put a pretty face on it, I'm ok with an underlying ugliness in the service of not having N>1 diverse ways to do X.
Apr 29
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 4/30/24 05:30, Walter Bright wrote:
 On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking at 
 .offsetof and the type, they'll interpret the bitfields as a union 
 (which, in a way, is accurate).
 ...
No, it is not accurate.
Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union.
No, more than one bitfield is valid at a time even if they have the same offsetof. This is definitely breaking expectations that used to be true.
 std.bitmanip.bitfields also implements it as a union,
No, this is not correct. It implements it as a field with accessors for different groups of bits. The only reason why `union` appears in that file is to support bitfields inside a union. This again highlights that those are not the same thing.
 because there is 
 no other way. The CPU does not provide any instructions to access bit 
 fields. (This is why atomics won't work on bitfields.)
 ...
Sure! I guess this opens the question what happens with bitfields and type qualifiers. The DIP currently says you can have `int`, `uint`, `long` and `ulong` bitfields. Are e.g. `immutable(int)` bitfields allowed? I'd expect `shared(int)` bitfields are not allowed?
 If the user of bitfields does not understand the underlying physical 
 reality of bitfields, they will forever have problems with them. Just 
 like programmers who do not understand the physical reality of pointers, 
 floating point, 2s complement, etc.,
I understand the underlying reality of all of those concepts and I still disagree that interpreting bitfields as a union is correct. There are bitfields and there are unions.
 are always crippled and would 
 probably be better off using Excel as their programming language :-/
 ...
However, this seems like an exaggeration. I think there are programmers who are gainfully employed and fall into neither of those categories.
 
 Pointer to bitfields will work just the same as they do in C. I don't 
 understand what you're asking for.
Well, you can't take a pointer to a bitfield.
Exactly what I meant!
 Well, so far everything in `.tupleof` had an address.
When you mentioned enums not having an address, I had assumed you were talking about __traits(allMembers). .tupleof skips over enums. ...
But it will include bitfields, and not the underlying "physical" variables.
 
 It should at least be mentioned in the DIP, if nowhere else you should 
 put it in the breaking language changes section.
I can mention it, sure. ...
Thanks!
 
 Well, if you are trying to deliberately make introspection 
 unnecessarily complicated, I guess that's your prerogative.
__traits has an ugly syntax. The idea was to provide the ability, and the user (or Phobos) would put a pretty face on it. ...
In practice people often do use `__traits`, either because it is more efficient or wrapping is impossible. In any case, providing exactly what is needed is much simpler.
 
 All of those things are ugly hacks. This kind of brain teaser is how 
 metaprogramming works (or increasingly: used to work) in C++, but I 
 think it is not very wise to continue this tradition in D.
std.traits definitely continues the tradition. While I'm fine with ugly implementations in it, std.traits fails to document the behavior of the functions that supposedly put a pretty face on it. I've asked Adam Wilson to consider completely re-engineering std.traits. As long as it is possible to put a pretty face on it, I'm ok with an underlying ugliness in the service of not having N>1 diverse ways to do X.
I think the main thing is it should be immediately obvious to readers, as it is actually not hard.
Apr 30
next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development wrote:
 Sure! I guess this opens the question what happens with bitfields and
 type qualifiers. The DIP currently says you can have `int`, `uint`,
 `long` and `ulong` bitfields.

 Are e.g. `immutable(int)` bitfields allowed?
I don't see why not. You shouldn't be able to mutate them, but reading them should be fine, since they're not going to change.
 I'd expect `shared(int)` bitfields are not allowed?
There should be no problem with shared bitfields existing. However, it shouldn't be legal to read them or write them so long as they're shared. But with the preview switch to lock down shared, that's true of any type, including int. Accessing shared data while it's not protected is always a problem. Atomics shouldn't work with bitfields, since they can't, whereas if you protect them with a mutex, you can then temporarily cast away shared and operate on them just like you'd do with any other shared data. So, I don't see why bitfields would be at all special with regards to what needs to happen with shared. - Jonathan M Davis
Apr 30
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 4/30/24 18:42, Jonathan M Davis wrote:
 On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development wrote:
 Sure! I guess this opens the question what happens with bitfields and
 type qualifiers. The DIP currently says you can have `int`, `uint`,
 `long` and `ulong` bitfields.

 Are e.g. `immutable(int)` bitfields allowed?
I don't see why not. You shouldn't be able to mutate them, but reading them should be fine, since they're not going to change.
 I'd expect `shared(int)` bitfields are not allowed?
There should be no problem with shared bitfields existing. However, it shouldn't be legal to read them or write them so long as they're shared. But with the preview switch to lock down shared, that's true of any type, including int. Accessing shared data while it's not protected is always a problem. Atomics shouldn't work with bitfields, since they can't, whereas if you protect them with a mutex, you can then temporarily cast away shared and operate on them just like you'd do with any other shared data. So, I don't see why bitfields would be at all special with regards to what needs to happen with shared. - Jonathan M Davis
Well, I am bringing it up because the DIP draft ignores type qualifiers so far (and explicitly only lists unqualified types for support). What is happening with `shared` I think has not been fully pinned down, but last I heard the goal was to get implicit atomics.
Apr 30
next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, April 30, 2024 2:48:46 PM MDT Timon Gehr via dip.development wrote:
 Well, I am bringing it up because the DIP draft ignores type qualifiers
 so far (and explicitly only lists unqualified types for support). What
 is happening with `shared` I think has not been fully pinned down, but
 last I heard the goal was to get implicit atomics.
Atila was talking about possibly doing implicit atomics, and we may get that, but either way, for types that _can't_ use atomics (like bitfields), it's pretty clearly going to have to be the case that they can't be read or written to without casting if shared is actually going to protect against accessing shared data in a manner which isn't guaranteed to be thread-safe like it's theoretically supposed to and -preview=nosharedaccess is supposed to enforce. So, the normal rules for type qualifiers should apply to bitfields exactly like they would with any other type, and there shouldn't be any surprises here, but yes, if we want to be thorough about things, then the DIP should probably mention what happens with type qualifiers. - Jonathan M Davis
Apr 30
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/30/2024 1:48 PM, Timon Gehr wrote:
 Well, I am bringing it up because the DIP draft ignores type qualifiers so far 
 (and explicitly only lists unqualified types for support). What is happening 
 with `shared` I think has not been fully pinned down, but last I heard the goal
 was to get implicit atomics.
Since a bitfield can be part of a shared object, of course the shared type should exist for it. But since you cannot take a reference to a bitfield, you're going to have to dip into system code to manipulate it, and it's up to the programmer to figure out what to do. Immutable means read only. I don't see any issue with that, either.
May 03
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/30/2024 6:43 AM, Timon Gehr wrote:
 No, more than one bitfield is valid at a time even if they have the same 
 offsetof. This is definitely breaking expectations that used to be true.
Any expectations that there would be no two fields with the same offset were incorrect anyway, as that is what happens with anonymous unions.
 std.bitmanip.bitfields also implements it as a union,
No, this is not correct. It implements it as a field with accessors for different groups of bits. The only reason why `union` appears in that file is to support bitfields inside a union. This again highlights that those are not the same thing.
They are the same thing, it is not substantive what label is painted on it. Anonymous unions can lay fields on top of each other without explicitly labeling it as a union.
 I understand the underlying reality of all of those concepts and I still 
 disagree that interpreting bitfields as a union is correct. There are bitfields
 and there are unions.
There is no value to that distinction. As I replied to Jonathan in this thread, D can have fields laying over the top of each other right now without any unions or bitfields declared.
 But it will include bitfields, and not the underlying "physical" variables.
If the introspection code does not take into account anonymous unions, it's already broken anyway.
```
struct S
{
    union
    {
        int a;
        int b;
        int c:32;
    }
}
```
May 03
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 5/4/24 04:07, Walter Bright wrote:

 No, this is not correct. It implements it as a field with accessors 
 for different groups of bits. The only reason why `union` appears in 
 that file is to support bitfields inside a union. This again 
 highlights that those are not the same thing.
They are the same thing, it is not substantive what label is painted on it. Anonymous unions can lay fields on top of each other without explicitly labeling it as a union.
This is explicitly called a `union`, so I really do not see what you are trying to say. Bitfields do not imply union, but they can be in a union, e.g.:
```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
    union{
        struct {
            unsigned int x:1;
            unsigned int y:1;
        };
        unsigned int z:2;
    };
};

int main(){
    S s;
    s.z=3;
    cout<<s.x<<" "<<s.y<<endl; // 1 1
    s.x=0;
    cout<<s.y<<" "<<s.z<<endl; // 1 2
    s.y=0;
    s.x=1;
    cout<<s.z<<endl; // 1
}
```
Clearly bitfields are a new and distinct way fields can share the same `.offsetof` in D. Before bitfields, such fields would overlap given they both were of a type with a positive size. With bitfields there is no such overlap.
BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.
May 04
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 5/4/24 19:05, Timon Gehr wrote:
 On 5/4/24 04:07, Walter Bright wrote:

 No, this is not correct. It implements it as a field with accessors 
 for different groups of bits. The only reason why `union` appears in 
 that file is to support bitfields inside a union. This again 
 highlights that those are not the same thing.
They are the same thing, it is not substantive what label is painted on it. Anonymous unions can lay fields on top of each other without explicitly labeling it as a union.
This is explicitly called a `union`, so I really do not see what you are trying to say. Bitfields do not imply union, but they can be in a union, e.g.: ```C++ #include <bits/stdc++.h> using namespace std; struct S{     union{         struct {             unsigned int x:1;             unsigned int y:1;         };         unsigned int z:2;     }; }; int main(){     S s;     s.z=3;     cout<<s.x<<" "<<s.y<<endl; // 1 1     s.x=0;     cout<<s.y<<" "<<s.z<<endl; // 1 2     s.y=0;     s.x=1;     cout<<s.z<<endl; // 1 } ``` Clearly bitfields are a new and distinct way fields can share the same `.offsetof` in D. Before bitfields, such fields would overlap given they both were of a type with a positive size. With bitfields there is no such overlap. BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.
Another example:
```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
    union{
        unsigned int x:1;
        unsigned int y:1;
    };
};

int main(){
    S s;
    cout<<s.x<<" "<<s.y<<endl; // 0 0
    s.x=1;
    cout<<s.x<<" "<<s.y<<endl; // 1 1
}
```
May 04
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 5/4/24 19:21, Timon Gehr wrote:
      S s;
I guess this should have been `S s={};` to ensure the POD struct is initialized.
May 04
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/4/2024 10:05 AM, Timon Gehr wrote:
 This is explicitly called a `union`, so I really do not see what you are trying
 to say.
An anonymous union is simply a way to specify layout. No actual union is created, as there is no point to it. How would one refer to an anonymous union? The same goes for anonymous structs.
 Clearly bitfields are a new and distinct way fields can share the same 
 `.offsetof` in D. Before bitfields, such fields would overlap given they both 
 were of a type with a positive size. With bitfields there is no such overlap.
Bitfields do overlap, which is why they are accessed with shift and mask. Besides, the context here is with existing introspection. Existing introspection will treat them as overlapping fields.
 BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.
```
struct S { int b:3; }
pragma(msg, S.b.sizeof);
```
prints 4LU, as it applies to the type. To get the field width, use .max, as already discussed.
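To make the `.max` approach concrete, here is a small sketch of the arithmetic. It assumes, as described earlier in the thread, that `.max` on a bitfield gives the largest value the field can hold (7 for a signed 4-bit field, 15 for an unsigned one); `bitWidth` is a hypothetical helper name, and the caller passes in the value of `.max` and whether the field type is signed.

```d
/// Number of bits in a field whose largest representable value is `max`.
/// For signed fields one extra bit is added for the sign.
uint bitWidth(ulong max, bool signed)
{
    uint n = 0;
    while (max)         // count the bits up to the highest set bit
    {
        ++n;
        max >>= 1;
    }
    return n + (signed ? 1 : 0);
}

static assert(bitWidth(15, false) == 4); // unsigned 4-bit field
static assert(bitWidth(7, true) == 4);   // signed 4-bit field
static assert(bitWidth(3, true) == 3);   // e.g. `int b:3`
```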
May 04
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 5/4/24 22:17, Walter Bright wrote:
 On 5/4/2024 10:05 AM, Timon Gehr wrote:
 This is explicitly called a `union`, so I really do not see what you 
 are trying to say.
An anonymous union is simply a way to specify layout. No actual union is created, as there is no point to it. How would one refer to an anonymous union? The same goes for anonymous structs. ...
Well, now you are simply slicing the terminology in a weird way. `union` and `struct` are tools to lay out data, anonymous or otherwise, whether you generate typeinfo or otherwise.
 
 Clearly bitfields are a new and distinct way fields can share the same 
 `.offsetof` in D. Before bitfields, such fields would overlap given 
 they both were of a type with a positive size. With bitfields there is 
 no such overlap.
Bitfields do overlap, which is why they are accessed with shift and mask. ...
They do not overlap if not put in a union (anonymous or otherwise). Otherwise, changing the value of one bitfield would affect the value of another one. The fact that they occupy space in the same byte and that the processor can only address memory at byte granularity does not imply that the bitfields themselves overlap.
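A hand-written emulation may help separate the two claims here. The sketch below is not the DIP's implementation; `Packed` and its accessors are hypothetical and use plain shift and mask on one `uint`. Every write is a read-modify-write of the whole storage unit, yet the two fields occupy disjoint bit ranges, so assigning one does not change the value of the other.

```d
struct Packed
{
    uint storage;   // one storage unit shared by both fields

    // Hypothetical field x: bits 0..2
    @property uint x() const { return storage & 0x7; }
    @property void x(uint v) { storage = (storage & ~0x7u) | (v & 0x7); }

    // Hypothetical field y: bits 3..6
    @property uint y() const { return (storage >> 3) & 0xF; }
    @property void y(uint v)
    {
        storage = (storage & ~(0xFu << 3)) | ((v & 0xF) << 3);
    }
}

unittest
{
    Packed p;
    p.x = 5;
    p.y = 9;
    assert(p.x == 5 && p.y == 9);   // both values are live at once
    p.x = 0;
    assert(p.y == 9);               // writing x leaves y untouched
}
```

Fields of a real union behave differently: there, storing to one member changes what the other member reads back.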
 Besides, the context here is with existing introspection. Existing 
 introspection will treat them as overlapping fields.
 
 
 BTW: what about `sizeof`? I think in C++ this is disallowed on a 
 bitfield.
``` struct S { int b:3; } pragma(msg, S.b.sizeof); ``` prints 4LU, as it applies to the type. To get the field width, use .max, as already discussed.
Well, then code that is set up to work with data using `.tupleof`, `.offsetof` and `.sizeof` will silently break. Whether you acknowledge that or not, it's simply the truth. You are breaking the previous invariant that data in a struct lives at relative addresses `data.offsetof..data.offsetof+data.sizeof`.
May 04
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/4/2024 4:01 PM, Timon Gehr wrote:
 You are breaking the previous invariant that data in a struct lives at relative
 addresses `data.offsetof..data.offsetof+data.sizeof`.
The data.sizeof for a bitfield will always be the size of the memory object containing the field. The invariant is not broken.
May 04
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 5/5/24 04:11, Walter Bright wrote:
 On 5/4/2024 4:01 PM, Timon Gehr wrote:
 You are breaking the previous invariant that data in a struct lives at 
 relative addresses `data.offsetof..data.offsetof+data.sizeof`.
The data.sizeof for a bitfield will always be the size of the memory object containing the field. The invariant is not broken.
I do not understand. I thought bitfields are supposed to match the layout of the associated C compiler. Instead, you seem to now be arguing that there should actually be stringent layout guarantees.
GCC 11.4.0, clang 14.0.0:
```c
#include <stdio.h>
struct __attribute__((packed)) S{
    long long x:8;
};
int main(){
    printf("%ld\n",sizeof(long long)); // 8
    printf("%ld\n",sizeof(struct S)); // 1
}
```
It indeed seems `dmd 2.108.1` disagrees and gives `8` and `8`, but I guess this is a mistake. In any case, laying the struct out like this is in compliance with C standards even without the additional attribute:
 An implementation may allocate any addressable storage unit large enough to hold a
 bit-field. If enough space remains, a bit-field that immediately follows another
 bit-field in a structure shall be packed into adjacent bits of the same unit. If
 insufficient space remains, whether a bit-field that does not fit is put into the
 next unit or overlaps adjacent units is implementation-defined. The order of
 allocation of bit-fields within a unit (high-order to low-order or low-order to
 high-order) is implementation-defined.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
Maybe at least redefine `.sizeof` to give the size of the underlying storage unit for a bitfield. Otherwise, you exacerbate the risk of memory corruption due to invalid assumptions about layout.
May 05
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:
 
 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%ld\n",sizeof(long long)); // 8
      printf("%ld\n",sizeof(struct S)); // 1
 }
 ```
The sizeof there, in both cases, is giving the size in bytes of the memory object the field is a subset of.
 Maybe at least redefine `.sizeof` to give the size of the underlying storage 
 unit for a bitfield.
That's what it's doing in the example. BTW, I didn't implement packed bitfields in ImportC. It never occurred to me :-/ I suppose it should get a bugzilla issue. https://issues.dlang.org/show_bug.cgi?id=24538
May 06
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 5/6/24 09:14, Walter Bright wrote:
 On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:

 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%ld\n",sizeof(long long)); // 8
      printf("%ld\n",sizeof(struct S)); // 1
 }
 ```
The sizeof there, in both cases, is giving the size in bytes of the memory object the field is a subset of. ...
This is C and neither sizeof is on a memory object, they are both on types. sizeof on x gives a compile error. However, with the DIP, given that you implement packed bitfields in DMD, when importing an example like this one, `x.sizeof` would be eight times as big as the size of the `struct` it is a part of.
 
 Maybe at least redefine `.sizeof` to give the size of the underlying 
 storage unit for a bitfield.
That's what it's doing in the example. ...
Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`. Unless I misunderstand and `typeof(x)` is not `long long`, this should not be the case in this example, because a `long long` is longer than the memory location it is packed into in this case. (I think this is another broken invariant.)
 BTW, I didn't implement packed bitfields in ImportC. It never occurred 
 to me :-/ I suppose it should get a bugzilla issue.
 
 https://issues.dlang.org/show_bug.cgi?id=24538
Well, that will help, but the point was the C standard does not give the guarantees you assumed to hold earlier, and in practice it in fact does not hold, as in this example.
May 06
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/6/2024 1:37 PM, Timon Gehr wrote:
 On 5/6/24 09:14, Walter Bright wrote:
 On 5/5/2024 2:08 AM, Timon Gehr wrote:
 GCC 11.4.0, clang 14.0.0:

 ```c
 #include <stdio.h>
 struct __attribute__((packed)) S{
      long long x:8;
 };
 int main(){
      printf("%ld\n",sizeof(long long)); // 8
      printf("%ld\n",sizeof(struct S)); // 1
 }
 ```
The sizeof there, in both cases, is giving the size in bytes of the memory object the field is a subset of. ...
This is C and neither sizeof is on a memory object, they are both on types. sizeof on x gives a compile error. However, with the DIP, given that you implement packed bitfields in DMD, when importing an example like this one, `x.sizeof` would be eight times as big as the size of the `struct` it is a part of.
Since the memory object that x is in is 1 byte, the sizeof would be 1 byte (if I implemented the packed logic).
 Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`.
Yes, as I didn't know about the packed thing then.
 Well, that will help, but the point was the C standard does not give the 
 guarantees you assumed to hold earlier, and in practice it in fact does not 
 hold, as in this example.
The C standard says nothing about `__attribute__((packed))`, and C doesn't allow sizeof on bit fields, so we can make .sizeof work as we like. The most practical thing is to make it mean the size of the memory object the bitfield is a subset of. Unless (unimplemented) packed bitfields are used, the sizeof is the size of the type.
May 06
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 5/7/24 05:39, Walter Bright wrote:
 The most practical thing is to make it mean the size of the memory 
 object the bitfield is a subset of.
I agree that given the C-like bitfield design you seem to have set your mind on, and given the behavior of `.offsetof` this is a decent behavior for `.sizeof`. However, this is not what the DIP currently says, so it should be updated.
 Unless (unimplemented) packed 
 bitfields are used, the sizeof is the size of the type.
The size of the memory object has to be whatever the associated C compiler allocates, and according to the standard, it is in principle allowed to pack by default. There is no guarantee that the memory object is in fact at least as big as the type of the bitfield. I am not familiar with bitfield layout on all platforms that D supports via GDC and LDC, but I would not be surprised if on some of them this is actually an issue in practice.
May 07
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Tuesday, 30 April 2024 at 03:30:15 UTC, Walter Bright wrote:
 On 4/29/2024 5:04 AM, Timon Gehr wrote:
 If they are not checking for bitfields, but are just looking 
 at .offsetof and the type, they'll interpret the bitfields as 
 a union (which, in a way, is accurate).
 ...
No, it is not accurate.
Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union. std.bitmanip.bitfields also implements it as a union, because there is no other way. The CPU does not provide any instructions to access bit fields.
Not true. x86 provides BMI1 instructions, which have been present in x86 CPUs since at least 2013. ARM also provides bit field instructions, and quite a number of legacy CPUs also had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.). That doesn't change the issues with language bitfields.
May 03
next sibling parent reply user1234 <user1234 12.de> writes:
On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in 
 x86 CPUs at least since 2013.
 ARM also provides bit field instructions and quite a number of 
 legacy CPU's also had bitfield instructions (m68k, NEC V30, 
 Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields
About BMI/BMI2, it would be interesting to see whether optimizing compilers actually generate instructions from these extensions for C++ bitfields. I've tried for styx enum-sets. Sure, that's a bit of a special case of bitfields, but so far the only visible difference is a BMI2 `shlxl` being emitted. But once again, a very special case.
May 03
parent user1234 <user1234 12.de> writes:
On Friday, 3 May 2024 at 15:50:42 UTC, user1234 wrote:
 On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in 
 x86 CPUs at least since 2013.
 ARM also provides bit field instructions and quite a number of 
 legacy CPU's also had bitfield instructions (m68k, NEC V30, 
 Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields
About BMI/BMI2, it would be interesting to see whether optimizing compilers actually generate instructions from these extensions for C++ bitfields. I've tried for styx enum-sets. Sure, that's a bit of a special case of bitfields, but so far the only visible difference is a BMI2 `shlxl` being emitted. But once again, a very special case.
![](https://i.imgur.com/Uw6qZ1g.png)
May 03
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/3/2024 5:52 AM, Patrick Schluter wrote:
 Not true. x86 provides BMI1 instructions which are present in x86 CPUs at least 
 since 2013.
 ARM also provides bit field instructions and quite a number of legacy CPU's also 
 had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.).
 Doesn't change the issues with language bitfields
I'd be very surprised if it didn't work by reading the entire field first. https://www.felixcloutier.com/x86/bextr
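As a rough illustration of "reading the entire field first", here is a sketch in C of getting and setting a bitfield by operating on the whole 32-bit storage unit with shifts and masks. The function names are hypothetical, and this is not a claim about what any particular compiler or instruction does internally. It only shows the access pattern, and it assumes `width >= 1` and `shift + width <= 32`.

```c
#include <stdint.h>

/* Build a mask of `width` low bits (1 <= width <= 32) without shifting by 32. */
uint32_t low_mask(unsigned width)
{
    return (width >= 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
}

/* Read: load the whole unit, then shift and mask out the field. */
uint32_t field_get(uint32_t unit, unsigned shift, unsigned width)
{
    return (unit >> shift) & low_mask(width);
}

/* Write: read-modify-write of the whole unit. Bits outside the field are
   preserved; bits of `value` that do not fit in the field are discarded. */
uint32_t field_set(uint32_t unit, unsigned shift, unsigned width, uint32_t value)
{
    uint32_t mask = low_mask(width) << shift;
    return (unit & ~mask) | ((value << shift) & mask);
}
```

For example, `field_get(0xABCD1234u, 8, 4)` extracts the nibble `0x2`, and `field_set(0, 4, 3, 5)` yields `0x50`. In both cases the operand is the full storage unit, which is why overlapping fields behave much like members of a union.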
May 03
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, April 29, 2024 12:44:08 AM MDT Walter Bright via dip.development wrote:
 An enum is distinguished by it not being possible to use .offsetof with it.
I don't think that I have _ever_ seen anyone use offsetof to determine anything with type introspection other than the actual offset. Existing code will almost certainly be using & to determine whether a member is an enum or not. That being said, _usually_, code that cares whether a member is an enum when doing type introspection cares because it's looking for something else (e.g. whether the member is a static member variable), so I don't know whether suddenly having additional members that cannot have their address taken will break anything. However, any situation where there isn't a trait that outright tells you what you're looking for makes it highly likely that existing code which needed to figure it out did so by trying out a variety of checks, finding some combination of checks that had to be true and some that had to be false, and then doing enough testing to be reasonably sure that that combination told them what they needed to know. Even if they did get it right, because it's quite indirect, adding more categories of things which could affect introspection will ultimately run a pretty high risk of breaking _something_. There's only so much that we can do about that, but I do think that we need to be very careful about saying that X is the way to test for something and have any expectation that that's how folks are actually doing it unless that something is a specific trait from __traits or std.traits which checks for that exact thing. - Jonathan M Davis
Apr 29
prev sibling next sibling parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Randomly came across people talking about C bitfields, and how the non-defined bit layout is causing them problems. https://twitter.com/__phantomderp/status/1786628836953604201 Turns out even they think they need control over the layout including predictable LSB..MSB byte by byte definition. Making it default to C for the layout is not a good addition to the language.
May 03
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/3/2024 11:07 PM, Richard (Rikki) Andrew Cattermole wrote:
 On 23/04/2024 1:01 PM, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Randomly came across people talking about C bitfields, and how the non-defined bit layout is causing them problems. https://twitter.com/__phantomderp/status/1786628836953604201 Turns out even they think they need control over the layout including predictable LSB..MSB byte by byte definition. Making it default to C for the layout is not a good addition to the language.
All the tweet says is: ``` As they should. (But now it's time for C and C++ to give users explicit layout control, so that eventually we can use our chairs on other more heinous programming criminals.) ``` I've responded thoroughly to every complaint about the layout. The only substantive external one is Linus', which is linked to in the DIP, and I responded to that, too. If you don't like bitfields at this point, don't use them. If you need help getting a specific layout, post here and I can help you.
May 04
prev sibling next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of conformance with `.sizeof`?
May 06
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/6/2024 6:52 AM, Per Nordlöw wrote:
 Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of conformance 
 with `.sizeof`?
Sizeof is in bytes, so I use a different word for number of bits.
May 06
prev sibling parent Mike Parker <aldacron gmail.com> writes:
A thread for review of the third draft was opened subsequent to 
this one. Please leave further feedback there:

https://forum.dlang.org/post/v193hc$b9c$1 digitalmars.com

This thread is now closed.
May 21