digitalmars.dip.development - second draft: add Bitfields to D
- Walter Bright (1/1) Apr 22 https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa...
- Richard (Rikki) Andrew Cattermole (40/41) Apr 22 "The specific layout of bitfields in C is implementation-defined, and
- Steven Schveighoffer (18/19) Apr 26 Suffers from the same major problem as last time - nobody is
- Jonathan M Davis (24/37) Apr 27 C compatibility matters a lot for importC and for C bindings in general ...
- Walter Bright (12/15) Apr 27 D used to have its own function call ABI, because I thought I'd make a c...
- Jonathan M Davis (99/116) Apr 28 In this particular case, as I understand it, there are use cases that
- Richard (Rikki) Andrew Cattermole (10/16) Apr 28 I'm not sure that anyone cares what the default is.
- Walter Bright (11/11) Apr 28 I listed these 3 use cases in the second draft, and it seems we are most...
- Walter Bright (16/30) Apr 27 I am, as soon as they become available in the D bootstrap compiler. I do...
- Adam Wilson (3/4) Apr 27 I would approve this because we gain C compatibility and we can
- Jonathan M Davis (16/21) Apr 27 Actually, it doesn't fix the need for std.bitmanip.bitfields, though it ...
- Timon Gehr (31/41) Apr 28 Maybe only those cases should be allowed without `extern(C)`. I think
- Timon Gehr (5/17) Apr 28 I get that combinations of code that exist today won't break, but it
- Walter Bright (38/73) Apr 28 I doubt introspection libraries would break. If they are not checking fo...
- Timon Gehr (21/108) Apr 29 You are breaking even simple patterns like
- Timon Gehr (7/13) Apr 29 Forgot to fully answer this.
- Walter Bright (2/4) Apr 29 The fat pointer in D is a delegate, and that's how I'd do it.
- Jonathan M Davis (35/48) Apr 29 druntime and Phobos both specifically uses tupleof to look for the actua...
- Walter Bright (16/19) Apr 29 Using `is(T==union)` is incomplete because anonymous unions are not fiel...
- Walter Bright (55/57) Apr 29 Let's see what happens.
- Timon Gehr (2/4) Apr 30 You mean fail silently.
- Walter Bright (2/7) May 03 No, I did not mean that.
- Timon Gehr (3/12) May 04 Well, that is what it will do if you assign a value to the `ref`
- Walter Bright (22/39) Apr 29 Getting and setting bit fields reads/writes all the bits in the underlyi...
- Timon Gehr (24/85) Apr 30 No, more than one bitfield is valid at a time even if they have the same...
- Jonathan M Davis (15/20) Apr 30 I don't see why not. You shouldn't be able to mutate them, but reading t...
- Timon Gehr (5/34) Apr 30 Well, I am bringing it up because the DIP draft ignores type qualifiers
- Jonathan M Davis (14/18) Apr 30 Atila was talking about possibly doing implicit atomics, and we may get
- Walter Bright (6/10) May 03 Since a bitfield can be part of a shared object, of course the shared ty...
- Walter Bright (22/33) May 03 Any expectations that there would be no two fields with the same offset ...
- Timon Gehr (32/41) May 04 This is explicitly called a `union`, so I really do not see what you are...
- Timon Gehr (18/66) May 04 Another example:
- Timon Gehr (3/4) May 04 I guess this should have been `S s={};` to ensure the POD struct is
- Walter Bright (13/19) May 04 An anonymous union is simply a way to specify layout. No actual union is...
- Timon Gehr (14/43) May 04 Well, now you are simply slicing the terminology in a weird way. `union`...
- Walter Bright (3/5) May 04 The data.sizeof for a bitfield will always be the size of the memory obj...
- Timon Gehr (23/35) May 05 I do not understand. I thought bitfields are supposed to match the
- Walter Bright (7/21) May 06 The sizeof there, in both cases, is giving the size in bytes of the memo...
- Timon Gehr (14/41) May 06 This is C and neither sizeof is on a memory object, they are both on
- Walter Bright (9/36) May 06 Since the memory object that x is in is 1 byte, the sizeof would be 1 by...
- Timon Gehr (12/16) May 07 I agree that given the C-like bitfield design you seem to have set your
- Patrick Schluter (7/19) May 03 Not true. x86 provides BMI1 instructions which are present in x86
- user1234 (6/12) May 03 About BMI/BMI2 it would interesting to see if optimizing
- user1234 (2/15) May 03 ![](https://i.imgur.com/Uw6qZ1g.png)
- Walter Bright (3/8) May 03 I'd be very surprised if it didn't work by reading the entire field firs...
- Jonathan M Davis (26/27) Apr 29 I don't think that I have _ever_ seen anyone use offsetof to determine
- Richard (Rikki) Andrew Cattermole (8/9) May 03 Randomly came across people talking about C bitfields, and how the
- Walter Bright (12/24) May 04 All the tweet says is:
- Per Nordlöw (3/4) May 06 Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of
- Walter Bright (2/4) May 06 Sizeof is in bytes, so I use a different word for number of bits.
- Mike Parker (4/4) May 21 A thread for review of the third draft was opened subsequent to
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Apr 22
On 23/04/2024 1:01 PM, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

"The specific layout of bitfields in C is implementation-defined, and varies between the Digital Mars, Microsoft, and gdc/ldc compilers. gdc/lcd are lumped together because they produce identical results on the same platform."

s/lcd/ldc/

Worth mentioning here is that, as long as you don't use string mixins, attempting semantic analysis is actually a pretty cheap way to determine compilability. Now that I'm thinking about the fact that it's the same entry point internally.

Not ideal, will need an example in the specification on how to do this, if there is no trait.

But in saying that, you'll need to use a trait anyway, so...

```d
T t;
enum isBitField = !__traits(compiles, &__traits(getMember, t, member));
```

Not ideal.

```d
void main() {
    Foo t;
    enum isBitField = !__traits(compiles, &__traits(getMember, t, "member"));
    pragma(msg, isBitField);
}

struct Foo {
    // `member` is an enum, not a bitfield, yet its address cannot be taken either
    enum member;
}
```

Okay yes, not having the trait is a bad idea. It leaves D's introspection less able to determine what a symbol is.

I also mentioned this previously, but I want to see std.bitmanip.bitfields gone for PhobosV3. Anything that uses string mixins that the user interacts with makes tooling fail with it. This is not an acceptable solution to be recommending to people, we can do significantly better than that.

It also means that people have to remember and understand the two separate solutions that we are recommending, which are in no way comparable in how they are implemented.
Apr 22
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.

Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

This already happens with C. See for instance https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Adding more `__traits` is trivial, don't skimp here.

Still does not address `sizeof`.

The mechanism described to get the bit offset is... horrific. Please just add some `__traits`.

-Steve
Apr 26
On Friday, April 26, 2024 9:26:06 AM MDT Steven Schveighoffer via dip.development wrote:
> On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
>
> Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.

C compatibility matters a lot for importC and for C bindings in general - not that we have to have a bitfields feature for general D which matches that, but if we don't have a way to match what C does, then we have trouble creating bindings for C code that uses bitfields. extern(C++) code potentially needs the same thing.

Personally, binding to C is the primary way that I've ever had to deal with bitfields, and not having the ability to do that has made dealing with such bindings... interesting.

Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code, I'm perfectly fine with that, but I don't agree at all that binding to C doesn't matter. For me at least, that's the primary place that bitfields matter, particularly since I can use other solutions in D if need be, whereas if a C API is designed to use bitfields, then you kind of need support for that in D if you want the bindings to work correctly.

> Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

Completely aside from this specific issue, isn't it already the case that you can't mix code built with different D compilers? I didn't think that there was any guarantee of ABI compatibility across compilers, and I would fully expect there to be trouble if I built parts of my code with one compiler and other parts with another. I typically get linker errors at work if I fail to clean out the build files when switching between dmd and ldc.

- Jonathan M Davis
Apr 27
On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
> Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code

D used to have its own function call ABI, because I thought I'd make a clean and consistent one. It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says.

There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment.

All of the portability issues people have mentioned are easily dealt with. There is always writing functions that do shifts and masks as a last resort. (Shifts and masks are what the code generator does anyway, so this won't cost any performance.)
Apr 27
On Sunday, April 28, 2024 12:44:41 AM MDT Walter Bright via dip.development wrote:
> On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
>> Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code
>
> D used to have its own function call ABI, because I thought I'd make a clean and consistent one. It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says.
>
> There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment.
>
> All of the portability issues people have mentioned are easily dealt with. There is always writing functions that do shifts and masks as a last resort. (Shifts and masks is what the code generator does anyway, so this won't cost any performance.)

In this particular case, as I understand it, there are use cases that definitely need to be able to have a guaranteed bit layout (e.g. serialization code). So, I don't think that this is quite the same situation as something like the call ABI. Even if a particular call ABI might theoretically be better, it's not something that code normally cares about in practice so long as it works, whereas some code will actually care what the exact layout of bitfields is. The call ABI is largely a language implementation detail, whereas the layout of bitfields actually affects the behavior of the code.

It seems to me that we're dealing with three use cases here:

1. Code that is specifically binding to C bitfields. It needs to match what the C compiler does, or it won't work. That comes with whatever pros and cons the C layout has, but since the D code needs to match the C layout to work, we just have to deal with whatever the layout is, and realistically, the D code using it should not be written to care what the layout is, because it could differ across OSes and architectures.

2. Code that needs a guaranteed bit layout, because it's actually taking the integers that the bitfields are compacted into and storing them elsewhere (e.g. storing the data on disk or sending it across the network). What C does with bitfields for such code is utterly irrelevant, and it's undesirable to even attempt compatibility. The bits need to be laid out precisely in the way that the programmer indicates.

3. Code that just wants to store bits in a compact manner, and how that's done doesn't particularly matter as long as the code just operates on the individual bitfields and doesn't actually do anything with the integer values that they're compacted into where the layout would matter.
For the third use case, it's arguably the case that we'd be better off with a guaranteed bit layout so that it would be consistent across OSes and architectures, and anyone who accidentally wrote code that relied on the bit layout wouldn't have issues as a result (similar to how we make it so that long is guaranteed to be 64 bits across OSes and architectures regardless of what C does; we avoid whole classes of bugs that way). If I understand correctly, it's the issues that come from accidentally relying on the exact bit layout when it's not guaranteed which are why folks like Steven are arguing that it's a terrible idea to follow C's layout. However, it's also true that since such code in theory doesn't care what the bit layout is (since it's just using bitfields for compact storage and not for something like serialization), the third use case could be solved with either C-compatible bitfields or with bitfields which have a guaranteed layout. It would be less error-prone (and thus more desirable) if the bit layout were consistent, but as long as code doesn't accidentally depend on the layout, it shouldn't matter. completely incompatible, and we therefore need separate solutions for them. For C compatibility, the obvious solution is to have the compiler deal with it like this DIP is doing. It already has to deal with C compatibility for a variety of things, and it's just going to be far easier and cleaner to have the compiler set up to provide C-compatible bitfields than it is to try to provide a library solution. I wouldn't expect a library solution to cover all of the possible targets correctly, whereas it should be much more straightforward for the compiler to do it. to be guaranteed. 
I get the impression that you favor leaving the guaranteed bit layout to a library solution, since you don't think that that use case matters much, whereas you think that C compatibility matters a great deal, and you don't think that the issues with accidentally relying on the layout when it's not guaranteed are a big enough concern to avoid using C bitfields for code that just wants to compact the bits. On the other hand, a number of the folks in this thread don't think that C compatibility matters and don't want the bugs that come from accidentally relying on the bit layout when it's not guaranteed, so they're arguing for just making our bitfields have a guaranteed layout and not worrying about C. Personally, I'm inclined to argue that it would just be better to treat this like we do extern(C++). extern(C++) structs and classes have whatever tweaks are necessary to make them work with C++, whereas extern(D) code does what we want to do with D types. We can do the same with extern(C) bitfields and extern(D) bitfields. That way, we get C compatibility for the code that needs it and a guaranteed bit layout for the code that needs that. And since the guaranteed layout would be the default, we'd largely avoid bugs related to relying on the bit layout when it's not guaranteed. It would be like how D code in general uses long rather than c_long, so normal D code can rely on the size of long and avoid the bugs that come with the type's size varying depending on the target, whereas the code that actually needs C compatibility uses c_long and takes the risks that come with a variable integer size, because it has to. The issues with C bitfields would be restricted to the code that actually needs the compatibility. It would also make it cleaner to write code that has a guaranteed bit layout than it would be a with a library solution, since it could use the nice syntax too rather than treating it as a second-class citizen. 
However, in terms of what's actually necessary, I think that realistically, extern(C) bitfields need to be in the language like this DIP is proposing, since it's just too risky to do that with a library solution, whereas extern(D) bitfields _can_ be solved with a library solution like they are right now. I don't think that that's the best solution, but it's certainly better than what we have right now, since we don't have C-compatible bitfields anywhere at the moment (outside of a preview switch).

In any case, it seems like the core issue that's resulting in most of the debate over this DIP is how important some people think that it is to have a guaranteed bit layout by default so that bugs which come from relying on a layout that isn't guaranteed will be avoided. You don't seem to think that that's much of a concern, whereas some of the other folks think that it's a big concern.

Either way, I completely agree that we need a C-compatible solution in the language so that we can sanely bind to C code that uses bitfields.

- Jonathan M Davis
Apr 28
On 29/04/2024 1:32 AM, Jonathan M Davis wrote:
> In any case, it seems like the core issue that's resulting in most of the debate over this DIP is how important some people think that it is to have a guaranteed bit layout by default so that bugs which come from relying on a layout that isn't guaranteed will be avoided. You don't seem to think that that's much of a concern, whereas some of the other folks think that it's a big concern.

I'm not sure that anyone cares what the default is.

dealing with a binding or serialization that you care and each of those are specialized enough to opt-in to whatever strategy is appropriate.

But one thing that has been on my kill list for PhobosV3 is string mixins publicly introducing any new symbols like... bitfields. Simply because auto-completion cannot see it, and may never be able to see it due to the CTFE requirement.

https://github.com/LightBender/PhobosV3-Design/discussions/32
Apr 28
I listed these 3 use cases in the second draft, and it seems we are mostly in agreement.

Using bit fields to reduce memory consumption, and to be compatible with C code, is handled by default nicely with the proposal.

Conformance to an externally imposed layout sometimes is necessary, but it is much less common. It is almost always easily done with a minor bit of attention. The worst case is writing a shift/mask accessor function, very easy to do. I suspect these workarounds are even less effort than reading the spec on how to use special syntax for it.

Nobody is obliged to use std.bitmanip.bitfields to get the job done. I can help with any externally defined format anyone is having difficulty with.
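As a rough illustration of the shift/mask accessor approach mentioned above, here is a minimal sketch for an externally imposed layout. The `Header` struct, its member names, and the bit positions are all hypothetical: it assumes a 4-bit `kind` in bits 0..3 and a 5-bit `len` in bits 4..8 of a `uint`.

```d
/// Hypothetical packed header: `kind` in bits 0..3, `len` in bits 4..8.
struct Header
{
    uint raw; // the integer that is actually stored or transmitted

    enum uint kindShift = 0, kindMask = 0xF;  // 4 bits
    enum uint lenShift = 4, lenMask = 0x1F;   // 5 bits

    /// Read the 4-bit `kind` field.
    uint kind() const { return (raw >> kindShift) & kindMask; }

    /// Write the 4-bit `kind` field; bits outside the field are untouched.
    void kind(uint v)
    {
        raw = (raw & ~(kindMask << kindShift)) | ((v & kindMask) << kindShift);
    }

    /// Read the 5-bit `len` field.
    uint len() const { return (raw >> lenShift) & lenMask; }

    /// Write the 5-bit `len` field; excess high bits of `v` are discarded.
    void len(uint v)
    {
        raw = (raw & ~(lenMask << lenShift)) | ((v & lenMask) << lenShift);
    }
}
```

Because the positions are spelled out in the code, the value of `raw` does not depend on how any C compiler lays out bitfields; byte order when writing `raw` to disk or the network still has to be handled separately.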
Apr 28
On 4/26/2024 8:26 AM, Steven Schveighoffer wrote:
> Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D,

I am, as soon as they become available in the D bootstrap compiler. I don't much care for the ugly workarounds used currently.

> yet we are inheriting all the problems.

There aren't any problems if one is using bitfields for reducing memory consumption or for C compatibility.

> Keeping C compatibility is meaningless.

In the D compiler source code, it means gdc and ldc with their C++ backends won't have any issues with it.

> Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

I have considered that. dmd will pick the same layout as the associated C compiler, which is gcc (used by gdc), and clang (used by ldc).

> This already happens with C. See for instance https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Can you even mix/match object files between vc and gdc, or vc and ldc, anyway? dmd on Windows generates DMC layout for -m32, and VC layout for -m64 and -m32mscoff.

> Adding more `__traits` is trivial, don't skimp here.

Can be added later. The point is, the information is available.

> Still does not address `sizeof`.

Oops forgot that. It would return the size of the bitfield's type.

> The mechanism described to get the bit offset is... horrific. Please just add some `__traits`.

It can be added later. But in general it is not a good idea to add things that are deducible from existing things. In this case, it's a loop. A function could be written to do it.
Apr 27
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

I would approve this because we gain C compatibility and we can drop the `std.bitmanip.bitfields` type entirely from Phobos 3.
Apr 27
On Saturday, April 27, 2024 6:31:37 PM MDT Adam Wilson via dip.development wrote:
> On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
>
> I would approve this because we gain C compatibility and we can drop the `std.bitmanip.bitfields` type entirely from Phobos 3.

Actually, it doesn't fix the need for std.bitmanip.bitfields, though it does reduce it. Use cases that need a guaranteed layout (e.g. for serialization) won't work with C-compatible bitfields, because the layout could change depending on the target platform. So adding this feature to the language doesn't help them at all, and they'd still need something like the Phobos solution.

Of course, this DIP helps quite a bit with regards to C bindings (which the Phobos solution does not help with), because those cases need to match the C layout rather than guaranteeing a layout that will be the same across all OSes and architectures. This DIP could also be used in cases where you don't care what C is doing, but you also don't care exactly how the bitfields are laid out. So, it would reduce the need for a Phobos solution, but it doesn't replace it.

- Jonathan M Davis
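For readers unfamiliar with the Phobos solution being discussed, the following is a minimal sketch of `std.bitmanip.bitfields`. The `Flags` struct and its member names are hypothetical; the only library call used is the `bitfields` template, whose field widths must add up to 8, 16, 32, or 64 bits.

```d
import std.bitmanip : bitfields;

/// Hypothetical flags record packed into 8 bits via the Phobos string mixin.
struct Flags
{
    mixin(bitfields!(
        uint, "kind",  4,   // unsigned 4-bit field
        int,  "delta", 3,   // signed 3-bit field, range -4..3
        bool, "dirty", 1)); // single-bit flag
}

/// Round-trips a few values through the generated accessors.
bool roundTrips()
{
    Flags f;
    f.kind = 9;
    f.delta = -2;
    f.dirty = true;
    return f.kind == 9 && f.delta == -2 && f.dirty;
}
```

The accessors are generated as a string and mixed in, which is the property the tooling complaints elsewhere in this thread are about.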
Apr 27
On 4/23/24 03:01, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

> In practice, however, if one sticks to int, uint, long and ulong bitfields, they are laid out the same.

Maybe only those cases should be allowed without `extern(C)`. I think that might be an ok compromise.

However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.

> Symbolic Debug Info

This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.

> ["a", "b", "c"]
> ["a", "_b_c_d_bf", "b", "b_min", "b_max", "c", "c_min", "c_max", "d", "d_min", "d_max"]

I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.

You forgot to say what `.tupleof` will do for a struct with bitfields in it.

> There isn't a specific trait for "is this a bitfield".

I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.

> testing to see if the address of a field can be taken, enables discovery of a bitfield.

Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address can be taken, existing introspection code may break. It is better to be explicit.

> The values of .max or .min enable determining the number of bits in a bitfield.

I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?

> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.

I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
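To make the `.max`/`.min` route quoted above concrete, here is a rough sketch of how a bit count could be derived from a field's range. `bitWidthFromRange` is a hypothetical helper, not part of the DIP, and it assumes two's-complement ranges such as -8..7 or 0..15.

```d
/// Hypothetical helper: number of bits in a field whose value range is min..max.
/// Assumes max has the form 2^n - 1 and min is either 0 or -(max + 1).
uint bitWidthFromRange(long min, ulong max)
{
    uint bits = 0;
    // Count the significant bits of max: 7 -> 3, 15 -> 4.
    for (ulong m = max; m != 0; m >>= 1)
        ++bits;
    // A negative minimum means one additional sign bit.
    return min < 0 ? bits + 1 : bits;
}
```

With the proposed bitfields, a call might look like `bitWidthFromRange(S.a.min, S.a.max)`, which is the indirection a dedicated property such as `.bitlength` would make unnecessary.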
Apr 28
On 4/29/24 00:30, Timon Gehr wrote:
> On 4/23/24 03:01, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
>
> ...
> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
> ...

This also renders somewhat moot the following claims from the DIP:

> This is an additive feature and does not break any existing code. Its use is entirely optional.

I get that combinations of code that exist today won't break, but it still does break libraries that do "just works" serialization if new code uses that library with bitfields, and the breakage might be silent.
Apr 28
On 4/28/2024 3:30 PM, Timon Gehr wrote:
> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.

I doubt introspection libraries would break. If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).

>> Symbolic Debug Info
>
> This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.

I'm not. I'd follow the dwarf spec and it didn't work, because the only thing that was ever tested was apparently what the C compiler actually did. In order to get gdb to work, I wound up ignoring the spec and doing what gcc did. It's the same with object file formats. The spec is somewhat of a fairy tale, it's what the associated C compiler actually does that matters.

> I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.

Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.

> You forgot to say what `.tupleof` will do for a struct with bitfields in it.

They do exactly what you'd expect them to do:

```
import std.stdio;

struct S { int a:4, b:5; }

void main()
{
    S s;
    s.a = 7;
    s.b = 9;
    writeln(s.tupleof);
}
```

prints:

```
79
```

It's not necessary to specify this, because this behavior does not diverge from field access semantics. Only things that differ need to be specified. Specifying "it works like X except for A,B,C" is a lot more reliable and compact than reiterating everything X does.

> I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.

There isn't for other members, it's just "allMembers".

>> testing to see if the address of a field can be taken, enables discovery of a bitfield.
>
> Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address can be taken, existing introspection code may break. It is better to be explicit.

An enum is distinguished by it not being possible to use .offsetof with it.

>> The values of .max or .min enable determining the number of bits in a bitfield.
>
> I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?

I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this.

>> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.
>
> I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.

I overlooked that bitfields can have holes in them, so probably something like .bitoffsetof is probably necessary.
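The two indirect tests discussed here (can the address be taken, and does `.offsetof` work) can be sketched as follows. `Foo`, `isAddressable`, and `hasOffset` are hypothetical names for illustration. The sketch only exercises an ordinary field and an `enum` member; native bitfields are omitted because they are available only behind a preview switch. Per the discussion, a bitfield would be expected to fail the first test but pass the second.

```d
/// Hypothetical aggregate with one ordinary field and one enum member.
struct Foo
{
    int x;
    enum member = 3;
}

/// True if the address of member `name` can be taken on an instance of T.
enum bool isAddressable(T, string name) =
    __traits(compiles, (T t) { auto p = &__traits(getMember, t, name); });

/// True if `.offsetof` is available for member `name`, i.e. it has a place in T's layout.
enum bool hasOffset(T, string name) =
    __traits(compiles, __traits(getMember, T, name).offsetof);

/// The indirect classification: not addressable, yet laid out inside T.
enum bool looksLikeBitfield(T, string name) =
    !isAddressable!(T, name) && hasOffset!(T, name);
```

Combining both checks avoids the false positive for `enum` members, at the cost of the kind of indirection that a dedicated trait would remove.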
Apr 28
On 4/29/24 08:44, Walter Bright wrote:
> On 4/28/2024 3:30 PM, Timon Gehr wrote:
>> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
> I doubt introspection libraries would break.

You are breaking even simple patterns like `foreach(ref field;s.tupleof){ }`. It would be a miracle if libraries did not break.

> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
> ...

No, it is not accurate.

> ...
>> I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.
> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
> ...

Well, you can't take a pointer to a bitfield.

>> You forgot to say what `.tupleof` will do for a struct with bitfields in it.
> They do exactly what you'd expect them to do:
> ```
> import std.stdio;
>
> struct S { int a:4, b:5; }
>
> void main()
> {
>     S s;
>     s.a = 7;
>     s.b = 9;
>     writeln(s.tupleof);
> }
> ```
> prints:
> ```
> 79
> ```

Well, so far everything in `.tupleof` had an address.
It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.

> It's not necessary to specify this, because this behavior does not diverge from field access semantics.

There is a difference between a DIP (that can change the language) and the specification (that can indeed be written in a way that does not explicitly mention bitfields under the `.tupleof` documentation).

> ...
>> I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.
> There isn't for other members, it's just "allMembers".
> ...

Despite not being very relevant to what I was asking for, this is simply untrue. `allMembers` gives you the members, and `.tupleof` gives you the fields.

>>> testing to see if the address of a field can be taken, enables discovery of a bitfield.
>> Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address can be taken, existing introspection code may break. It is better to be explicit.
> An enum is distinguished by it not being possible to use .offsetof with it.
> ...

Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.

>>> The values of .max or .min enable determining the number of bits in a bitfield.
>> I do not like this a lot, it does not seem like the canonical way to determine it.
> I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this.
> ...

All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.

>> `.bitlength`?
> Sounds good.

>>> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.
>> I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
> I overlooked that bitfields can have holes in them, so probably something like .bitoffsetof is probably necessary.
Apr 29
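The `.max` approach discussed in this exchange ("7 and 15 are always going to be a 4 bit field") amounts to counting the bits of the largest representable value and adding one bit for a signed field. A minimal sketch in C, the language of the thread's own layout examples; `value_bits` and `width_from_max` are hypothetical helper names and are not part of the DIP or of any library:

```c
#include <assert.h>
#include <stdbool.h>

/* Number of bits needed to represent max (15 -> 4, 7 -> 3, 0 -> 0). */
unsigned value_bits(unsigned long long max)
{
    unsigned n = 0;
    while (max != 0) {
        ++n;
        max >>= 1;
    }
    return n;
}

/* Hypothetical helper: recover a bitfield's width from its largest value.
   A signed (two's complement) field spends one extra bit on the sign. */
unsigned width_from_max(unsigned long long max, bool is_signed)
{
    return value_bits(max) + (is_signed ? 1u : 0u);
}
```

With this, `width_from_max(15, false)` and `width_from_max(7, true)` both give 4. The sketch recovers only the width; it says nothing about where the field sits in its storage unit, which is the gap a direct `.bitoffsetof`-style property would close.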
On 4/29/24 14:04, Timon Gehr wrote:
>> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
>> ...
> Well, you can't take a pointer to a bitfield.

Forgot to fully answer this. I am asking for example code showing how you would implement a function that gives you a "fat pointer" to a bitfield that lets you read and write from that bitfield. It cannot be the same as in C, as I think this inherently requires introspection.
Apr 29
On 4/29/2024 5:07 AM, Timon Gehr wrote:
> I am asking for example code how you would implement a function that gives you a "fat pointer" to a bitfield that lets you read and write from that bitfield.

The fat pointer in D is a delegate, and that's how I'd do it.
Apr 29
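The answer above points at a D delegate. The alternative raised earlier in the thread, "a struct with a pointer plus bit offset and bit length", can be sketched in C as follows. This is only an illustration under stated assumptions: the storage unit is taken to be 32 bits, `shift + width` must not exceed 32, and the caller has to supply `shift` and `width` by hand, which is exactly the information the thread says is awkward to obtain by introspection. `BitRef` and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "fat pointer" to a run of bits inside a 32-bit storage unit. */
typedef struct {
    uint32_t *unit;   /* address of the containing storage unit */
    unsigned  shift;  /* position of the field's lowest bit, 0..31 */
    unsigned  width;  /* field width in bits, 1..32 */
} BitRef;

/* Mask with the low `width` bits set (not yet shifted into place). */
uint32_t bitref_mask(BitRef r)
{
    return (r.width >= 32) ? UINT32_MAX : ((UINT32_C(1) << r.width) - 1u);
}

/* Read: load the whole unit, then shift and mask. */
uint32_t bitref_get(BitRef r)
{
    return (*r.unit >> r.shift) & bitref_mask(r);
}

/* Write: read-modify-write of the whole unit; neighbouring bits are kept. */
void bitref_set(BitRef r, uint32_t value)
{
    uint32_t m = bitref_mask(r) << r.shift;
    *r.unit = (*r.unit & ~m) | ((value << r.shift) & m);
}
```

A `BitRef` of `{&unit, 0, 4}` and one of `{&unit, 4, 5}` then act like the `a:4, b:5` pair from the earlier example: each can be read and written through its handle without disturbing the other, and values wider than the field are truncated on write.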
On Monday, April 29, 2024 6:04:13 AM MDT Timon Gehr via dip.development wrote:
> On 4/29/24 08:44, Walter Bright wrote:
>> On 4/28/2024 3:30 PM, Timon Gehr wrote:
>>> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
>> I doubt introspection libraries would break.
> You are breaking even simple patterns like `foreach(ref field;s.tupleof){ }`. It would be a miracle if libraries did not break.

druntime and Phobos both specifically use tupleof to look for the actual members of a type which take up storage space in that type and whose address can be taken. Traits such as std.traits.Fields do that and document it as such. If bitfields show up as part of tupleof, I would fully expect that to cause problems with any type introspection that operates on the member variables of a type.

The breakage may be minimal in practice due to the fact that bitfields aren't currently part of the language, and it's only new code which would encounter this problem, but any existing type introspection code looking at fields is going to expect that all of those fields take up storage space and that their address can be taken, so if it's given a type which has bitfields, and those show up in tupleof, that code is not going to work correctly.

Such code does already need to take unions into account (and there is _some_ similarity between those and bitfields), but it's going to have done that by checking things like is(T == union), which won't help with bitfields at all. And really, even if bitfields matched that, you wouldn't necessarily get the right result anyway, because while both bitfields and unions have members which are not proper fields on their own, the way they behave and take up space in the type is completely different.

Maybe we should add a check for bitfields? Presumably, it would have to be something more like __traits(isBitfield, member), since unlike with a union, you can't check the type, and we're not adding a bitfields keyword, but regardless of how you'd check whether something is a bitfield, existing type introspection code is going to have to be updated in some fashion to take bitfields into account, or it's going to do the wrong thing when it's given a type that has bitfields.

There's no way that bitfields are going to just magically work correctly with code that does type introspection. It does make sense that __traits(allMembers, T) would give you the bitfields, but I don't think that it makes sense that tupleof would, since you cannot take their addresses, but either way, it _will_ break Phobos code if tupleof gives bitfields - and not in a way that would be easily detected, because doing so would require having tests that used bitfields, which of course, don't exist, because bitfields have to be added first.

- Jonathan M Davis
Apr 29
On 4/29/2024 8:54 AM, Jonathan M Davis wrote:
> Such code does already need to take unions into account (and there is _some_ similarity between those and bitfields), but it's going to have done that by checking things like is(T == union), which won't help with bitfields at all.

Using `is(T==union)` is incomplete because anonymous unions are not fields. The compiler doesn't do a union check internally for that reason. The correct check would be:
```
if (S.a.offsetof + typeof(S.a).sizeof <= S.b.offsetof ||
    S.b.offsetof + typeof(S.b).sizeof <= S.a.offsetof)
{
    // S.a and S.b do not overlap
}
else
{
    // S.a and S.b overlap
}
```
This will work without change for bitfields.
Apr 29
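The range test in the post above is ordinary interval arithmetic and can be tried out in C with `offsetof` and `sizeof`. The sketch below is an assumption-laden illustration: `fields_overlap` and `struct S` are hypothetical, the anonymous union needs C11, and standard C does not allow `offsetof` on a bitfield member, so it only exercises ordinary fields. Whether the same test behaves sensibly for bitfields is a question about the D draft that this sketch cannot settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct S {
    union {          /* anonymous union (C11): a and b share storage */
        int a;
        int b;
    };
    int c;           /* laid out after the union */
};

/* True when the byte ranges [off_a, off_a + size_a) and
   [off_b, off_b + size_b) intersect. */
bool fields_overlap(size_t off_a, size_t size_a, size_t off_b, size_t size_b)
{
    return !(off_a + size_a <= off_b || off_b + size_b <= off_a);
}
```

Applied to `a` and `b` the function reports an overlap, and applied to `a` and `c` it does not, even though no member of `S` has a union type that an `is(T == union)`-style check could detect.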
On 4/29/2024 5:04 AM, Timon Gehr wrote:
> You are breaking even simple patterns like `foreach(ref field;s.tupleof){ }`.

Let's see what happens.
```
import core.stdc.stdio;

struct S
{
    int a;
    int b;
    enum int c = 3;
    int d:3, e:4;
}

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof) { printf("%d\n", f); }
}
```
which prints:
```
1
2
3
4
```
What is going on here? foreach over a tuple is not really a loop. It's a shorthand for a sequence of statements that the compiler unrolls the "loop" into. When the compiler sees a 'ref' for something that cannot have its address taken, it ignores the 'ref'. This can also be seen with:
```
foreach(ref i; 0 .. 10) { }
```
which works. You can see this in action when compiling with -vasm.

Additionally, for such unrolled loops the 'f' is not a loop variable. It is a new variable created for each unroll of the loop. You can see this with:
```
import core.stdc.stdio;

struct S
{
    int a;
    int b;
    enum int c = 3;
    int d:3, e:4;
}

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof) { foo(f); }
    foreach (ref f; s.tupleof) { printf("%d\n", f); }
}

void foo(ref int f)
{
    printf("f: %d\n", f);
    ++f;
}
```
where s.a and s.b get incremented, but s.d and s.e do not.

I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work. Either way, what's done is done, and there doesn't seem to be much point in breaking it.
Apr 29
On 4/30/24 05:14, Walter Bright wrote:
> I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work.

You mean fail silently.
Apr 30
On 4/30/2024 3:00 AM, Timon Gehr wrote:
> On 4/30/24 05:14, Walter Bright wrote:
>> I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work.
> You mean fail silently.

No, I did not mean that.
May 03
On 5/4/24 03:45, Walter Bright wrote:
> On 4/30/2024 3:00 AM, Timon Gehr wrote:
>> On 4/30/24 05:14, Walter Bright wrote:
>>> I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work.
>> You mean fail silently.
> No, I did not mean that.

Well, that is what it will do if you assign a value to the `ref` iteration variable.
May 04
On 4/29/2024 5:04 AM, Timon Gehr wrote:
>> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
>> ...
> No, it is not accurate.

Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union. std.bitmanip.bitfields also implements it as a union, because there is no other way. The CPU does not provide any instructions to access bit fields. (This is why atomics won't work on bitfields.)

If the user of bitfields does not understand the underlying physical reality of bitfields, they will forever have problems with them. Just like programmers who do not understand the physical reality of pointers, floating point, 2s complement, etc., are always crippled and would probably be better off using Excel as their programming language :-/

>> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
> Well, you can't take a pointer to a bitfield.

Exactly what I meant!

> Well, so far everything in `.tupleof` had an address.

When you mentioned enums not having an address, I had assumed you were talking about __traits(allMembers). .tupleof skips over enums.

> It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.

I can mention it, sure.

> Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.

__traits has an ugly syntax. The idea was to provide the ability, and the user (or Phobos) would put a pretty face on it.

> All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.

std.traits definitely continues the tradition.
While I'm fine with ugly implementations in it, std.traits fails to document the behavior of the functions that supposedly put a pretty face on it. I've asked Adam Wilson to consider completely re-engineering std.traits. As long as it is possible to put a pretty face on it, I'm ok with an underlying ugliness in the service of not having N>1 diverse ways to do X.
Apr 29
On 4/30/24 05:30, Walter Bright wrote:
> On 4/29/2024 5:04 AM, Timon Gehr wrote:
>>> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
>>> ...
>> No, it is not accurate.
> Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union.

No, more than one bitfield is valid at a time even if they have the same offsetof. This is definitely breaking expectations that used to be true.

> std.bitmanip.bitfields also implements it as a union,

No, this is not correct. It implements it as a field with accessors for different groups of bits. The only reason why `union` appears in that file is to support bitfields inside a union. This again highlights that those are not the same thing.

> because there is no other way. The CPU does not provide any instructions to access bit fields. (This is why atomics won't work on bitfields.)
> ...

Sure! I guess this opens the question what happens with bitfields and type qualifiers. The DIP currently says you can have `int`, `uint`, `long` and `ulong` bitfields. Are e.g. `immutable(int)` bitfields allowed? I'd expect `shared(int)` bitfields are not allowed?

> If the user of bitfields does not understand the underlying physical reality of bitfields, they will forever have problems with them. Just like programmers who do not understand the physical reality of pointers, floating point, 2s complement, etc.,

I understand the underlying reality of all of those concepts and I still disagree that interpreting bitfields as a union is correct. There are bitfields and there are unions.

> are always crippled and would probably be better off using Excel as their programming language :-/
> ...

However, this seems like an exaggeration.
I think there are programmers who are gainfully employed and fall into neither of those categories.

>>> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
>> Well, you can't take a pointer to a bitfield.
> Exactly what I meant!

>> Well, so far everything in `.tupleof` had an address.
> When you mentioned enums not having an address, I had assumed you were talking about __traits(allMembers). .tupleof skips over enums.
> ...

But it will include bitfields, and not the underlying "physical" variables.

>> It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.
> I can mention it, sure.
> ...

Thanks!

>> Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.
> __traits has an ugly syntax. The idea was to provide the ability, and the user (or Phobos) would put a pretty face on it.
> ...

In practice people often do use `__traits`, either because it is more efficient or wrapping is impossible. In any case, providing exactly what is needed is much simpler.

>> All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.
> std.traits definitely continues the tradition. While I'm fine with ugly implementations in it, std.traits fails to document the behavior of the functions that supposedly put a pretty face on it. I've asked Adam Wilson to consider completely re-engineering std.traits. As long as it is possible to put a pretty face on it, I'm ok with an underlying ugliness in the service of not having N>1 diverse ways to do X.

I think the main thing is it should be immediately obvious to readers, as it is actually not hard.
Apr 30
On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development wrote:
> Sure! I guess this opens the question what happens with bitfields and type qualifiers. The DIP currently says you can have `int`, `uint`, `long` and `ulong` bitfields. Are e.g. `immutable(int)` bitfields allowed?

I don't see why not. You shouldn't be able to mutate them, but reading them should be fine, since they're not going to change.

> I'd expect `shared(int)` bitfields are not allowed?

There should be no problem with shared bitfields existing. However, it shouldn't be legal to read them or write them so long as they're shared. But with the preview switch to lock down shared, that's true of any type, including int. Accessing shared data while it's not protected is always a problem. Atomics shouldn't work with bitfields, since they can't, whereas if you protect them with a mutex, you can then temporarily cast away shared and operate on them just like you'd do with any other shared data. So, I don't see why bitfields would be at all special with regards to what needs to happen with shared.

- Jonathan M Davis
Apr 30
On 4/30/24 18:42, Jonathan M Davis wrote:
> On Tuesday, April 30, 2024 7:43:52 AM MDT Timon Gehr via dip.development wrote:
>> Are e.g. `immutable(int)` bitfields allowed?
> I don't see why not. You shouldn't be able to mutate them, but reading them should be fine, since they're not going to change.
>> I'd expect `shared(int)` bitfields are not allowed?
> There should be no problem with shared bitfields existing. However, it shouldn't be legal to read them or write them so long as they're shared.
> ...
> So, I don't see why bitfields would be at all special with regards to what needs to happen with shared.

Well, I am bringing it up because the DIP draft ignores type qualifiers so far (and explicitly only lists unqualified types for support). What is happening with `shared` I think has not been fully pinned down, but last I heard the goal was to get implicit atomics.
Apr 30
On Tuesday, April 30, 2024 2:48:46 PM MDT Timon Gehr via dip.development wrote:
> Well, I am bringing it up because the DIP draft ignores type qualifiers so far (and explicitly only lists unqualified types for support). What is happening with `shared` I think has not been fully pinned down, but last I heard the goal was to get implicit atomics.

Atila was talking about possibly doing implicit atomics, and we may get that, but either way, for types that _can't_ use atomics (like bitfields), it's pretty clearly going to have to be the case that they can't be read or written to without casting if shared is actually going to protect against accessing shared data in a manner which isn't guaranteed to be thread-safe like it's theoretically supposed to and -preview=nosharedaccess is supposed to enforce.

So, the normal rules for type qualifiers should apply to bitfields exactly like they would with any other type, and there shouldn't be any surprises here, but yes, if we want to be thorough about things, then the DIP should probably mention what happens with type qualifiers.

- Jonathan M Davis
Apr 30
On 4/30/2024 1:48 PM, Timon Gehr wrote:
> Well, I am bringing it up because the DIP draft ignores type qualifiers so far (and explicitly only lists unqualified types for support). What is happening with `shared` I think has not been fully pinned down, but last I heard the goal was to get implicit atomics.

Since a bitfield can be part of a shared object, of course the shared type should exist for it. But since you cannot take a reference to a bitfield, you're going to have to dip into system code to manipulate it, and it's up to the programmer to figure out what to do.

Immutable means read only. I don't see any issue with that, either.
May 03
On 4/30/2024 6:43 AM, Timon Gehr wrote:
> No, more than one bitfield is valid at a time even if they have the same offsetof. This is definitely breaking expectations that used to be true.

Any expectations that there would be no two fields with the same offset were incorrect anyway, as that is what happens with anonymous unions.

>> std.bitmanip.bitfields also implements it as a union,
> No, this is not correct. It implements it as a field with accessors for different groups of bits. The only reason why `union` appears in that file is to support bitfields inside a union. This again highlights that those are not the same thing.

They are the same thing, it is not substantive what label is painted on it. Anonymous unions can lay fields on top of each other without explicitly labeling it as a union.

> I understand the underlying reality of all of those concepts and I still disagree that interpreting bitfields as a union is correct. There are bitfields and there are unions.

There is no value to that distinction. As I replied to Jonathan in this thread, D can have fields laying over the top of each other right now without any unions or bitfields declared.

> But it will include bitfields, and not the underlying "physical" variables.

If the introspection code does not take into account anonymous unions, it's already broken anyway.
```
struct S
{
    union
    {
        int a;
        int b;
        int c:32;
    }
}
```
May 03
On 5/4/24 04:07, Walter Bright wrote:
>> No, this is not correct. It implements it as a field with accessors for different groups of bits. The only reason why `union` appears in that file is to support bitfields inside a union. This again highlights that those are not the same thing.
> They are the same thing, it is not substantive what label is painted on it. Anonymous unions can lay fields on top of each other without explicitly labeling it as a union.

This is explicitly called a `union`, so I really do not see what you are trying to say. Bitfields do not imply union, but they can be in a union, e.g.:
```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
    union{
        struct {
            unsigned int x:1;
            unsigned int y:1;
        };
        unsigned int z:2;
    };
};

int main(){
    S s;
    s.z=3;
    cout<<s.x<<" "<<s.y<<endl; // 1 1
    s.x=0;
    cout<<s.y<<" "<<s.z<<endl; // 1 2
    s.y=0;
    s.x=1;
    cout<<s.z<<endl; // 1
}
```
Clearly bitfields are a new and distinct way fields can share the same `.offsetof` in D. Before bitfields, such fields would overlap given they both were of a type with a positive size. With bitfields there is no such overlap.

BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.
May 04
On 5/4/24 19:05, Timon Gehr wrote:
> This is explicitly called a `union`, so I really do not see what you are trying to say. Bitfields do not imply union, but they can be in a union, e.g.:
> ...

Another example:
```C++
#include <bits/stdc++.h>
using namespace std;

struct S{
    union{
        unsigned int x:1;
        unsigned int y:1;
    };
};

int main(){
    S s;
    cout<<s.x<<" "<<s.y<<endl; // 0 0
    s.x=1;
    cout<<s.x<<" "<<s.y<<endl; // 1 1
}
```
May 04
On 5/4/24 19:21, Timon Gehr wrote:
> S s;

I guess this should have been `S s={};` to ensure the POD struct is initialized.
May 04
On 5/4/2024 10:05 AM, Timon Gehr wrote:
> This is explicitly called a `union`, so I really do not see what you are trying to say.

An anonymous union is simply a way to specify layout. No actual union is created, as there is no point to it. How would one refer to an anonymous union? The same goes for anonymous structs.

> Clearly bitfields are a new and distinct way fields can share the same `.offsetof` in D. Before bitfields, such fields would overlap given they both were of a type with a positive size. With bitfields there is no such overlap.

Bitfields do overlap, which is why they are accessed with shift and mask.

Besides, the context here is with existing introspection. Existing introspection will treat them as overlapping fields.

> BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.

```
struct S { int b:3; }
pragma(msg, S.b.sizeof);
```
prints 4LU, as it applies to the type. To get the field width, use .max, as already discussed.
May 04
On 5/4/24 22:17, Walter Bright wrote:
> On 5/4/2024 10:05 AM, Timon Gehr wrote:
>> This is explicitly called a `union`, so I really do not see what you are trying to say.
> An anonymous union is simply a way to specify layout. No actual union is created, as there is no point to it. How would one refer to an anonymous union? The same goes for anonymous structs.
> ...

Well, now you are simply slicing the terminology in a weird way. `union` and `struct` are tools to lay out data, anonymous or otherwise, whether you generate typeinfo or otherwise.

>> Clearly bitfields are a new and distinct way fields can share the same `.offsetof` in D. Before bitfields, such fields would overlap given they both were of a type with a positive size. With bitfields there is no such overlap.
> Bitfields do overlap, which is why they are accessed with shift and mask.
> ...

They do not overlap if not put in a union (anonymous or otherwise). Otherwise, changing the value of one bitfield would affect the value of another one. The fact that they occupy space in the same byte and that the processor can only address memory at byte granularity does not imply that the bitfields themselves overlap.

> Besides, the context here is with existing introspection. Existing introspection will treat them as overlapping fields.
>> BTW: what about `sizeof`? I think in C++ this is disallowed on a bitfield.
> ```
> struct S { int b:3; }
> pragma(msg, S.b.sizeof);
> ```
> prints 4LU, as it applies to the type. To get the field width, use .max, as already discussed.

Well, then code that is set up to work with data using `.tupleof`, `.offsetof` and `.sizeof` will silently break. Whether you acknowledge that or not, it's simply the truth. You are breaking the previous invariant that data in a struct lives at relative addresses `data.offsetof..data.offsetof+data.sizeof`.
May 04
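The disagreement over whether bitfields "overlap" can be made concrete with a small C sketch that mirrors the C++ examples above. Two adjacent bitfields in a struct share a storage unit, yet a write to one leaves the other unchanged, whereas two bitfields placed in a union alias the same bit. The type and function names are hypothetical, and the union result relies on the layout that GCC and Clang are commonly observed to use; the C standard leaves bitfield layout implementation-defined.

```c
#include <assert.h>

struct Flags {            /* two 1-bit fields packed into one storage unit */
    unsigned x : 1;
    unsigned y : 1;
};

union Alias {             /* two 1-bit fields laid on top of each other */
    unsigned x : 1;
    unsigned y : 1;
};

/* Value of y after writing x = 1 into a zero-initialized struct. */
unsigned struct_y_after_setting_x(void)
{
    struct Flags f = {0};
    f.x = 1;              /* read-modify-write of the shared unit */
    return f.y;           /* y keeps its own bit */
}

/* Value of y after writing x = 1 into a zero-initialized union. */
unsigned union_y_after_setting_x(void)
{
    union Alias u = {0};
    u.x = 1;
    return u.y;           /* same bit as x on typical compilers */
}
```

In the struct case `y` stays 0, and in the union case it reads back as 1. That difference is what code relying only on `.offsetof` and `.sizeof` cannot see, since both layouts report the same offset for `x` and `y`.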
On 5/4/2024 4:01 PM, Timon Gehr wrote:
> You are breaking the previous invariant that data in a struct lives at relative addresses `data.offsetof..data.offsetof+data.sizeof`.

The data.sizeof for a bitfield will always be the size of the memory object containing the field. The invariant is not broken.
May 04
On 5/5/24 04:11, Walter Bright wrote:
> On 5/4/2024 4:01 PM, Timon Gehr wrote:
>> You are breaking the previous invariant that data in a struct lives at relative addresses `data.offsetof..data.offsetof+data.sizeof`.
> The data.sizeof for a bitfield will always be the size of the memory object containing the field. The invariant is not broken.

I do not understand. I thought bitfields are supposed to match the layout of the associated C compiler. Instead, you seem to now be arguing that there should actually be stringent layout guarantees.

GCC 11.4.0, clang 14.0.0:
```c
#include <stdio.h>

struct __attribute__((packed)) S{
    long long x:8;
};

int main(){
    printf("%ld\n",sizeof(long long)); // 8
    printf("%ld\n",sizeof(struct S)); // 1
}
```
It indeed seems `dmd 2.108.1` disagrees and gives `8` and `8`, but I guess this is a mistake.

In any case, laying the struct out like this is in compliance with C standards even without the additional attribute:

> An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

Maybe at least redefine `.sizeof` to give the size of the underlying storage unit for a bitfield. Otherwise, you exacerbate the risk of memory corruption due to invalid assumptions about layout.
May 05
On 5/5/2024 2:08 AM, Timon Gehr wrote:
> GCC 11.4.0, clang 14.0.0:
> ```c
> #include <stdio.h>
>
> struct __attribute__((packed)) S{
>     long long x:8;
> };
>
> int main(){
>     printf("%ld\n",sizeof(long long)); // 8
>     printf("%ld\n",sizeof(struct S)); // 1
> }
> ```

The sizeof there, in both cases, is giving the size in bytes of the memory object the field is a subset of.

> Maybe at least redefine `.sizeof` to give the size of the underlying storage unit for a bitfield.

That's what it's doing in the example.

BTW, I didn't implement packed bitfields in ImportC. It never occurred to me :-/ I suppose it should get a bugzilla issue.

https://issues.dlang.org/show_bug.cgi?id=24538
May 06
On 5/6/24 09:14, Walter Bright wrote:
> The sizeof there, in both cases, is giving the size in bytes of the memory object the field is a subset of.
> ...

This is C and neither sizeof is on a memory object, they are both on types. sizeof on x gives a compile error. However, with the DIP, given that you implement packed bitfields in DMD, when importing an example like this one, `x.sizeof` would be eight times as big as the size of the `struct` it is a part of.

>> Maybe at least redefine `.sizeof` to give the size of the underlying storage unit for a bitfield.
> That's what it's doing in the example.
> ...

Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`. Unless I misunderstand and `typeof(x)` is not `long long`, this should not be the case in this example, because a `long long` is longer than the memory location it is packed into in this case. (I think this is another broken invariant.)

> BTW, I didn't implement packed bitfields in ImportC. It never occurred to me :-/ I suppose it should get a bugzilla issue.
> https://issues.dlang.org/show_bug.cgi?id=24538

Well, that will help, but the point was the C standard does not give the guarantees you assumed to hold earlier, and in practice it in fact does not hold, as in this example.
May 06
On 5/6/2024 1:37 PM, Timon Gehr wrote:
> This is C and neither sizeof is on a memory object, they are both on types. sizeof on x gives a compile error. However, with the DIP, given that you implement packed bitfields in DMD, when importing an example like this one, `x.sizeof` would be eight times as big as the size of the `struct` it is a part of.

Since the memory object that x is in is 1 byte, the sizeof would be 1 byte (if I implemented the packed logic).

> Well, the DIP now says `bitfield.sizeof` is `typeof(bitfield).sizeof`.

Yes, as I didn't know about the packed thing then.

> Well, that will help, but the point was the C standard does not give the guarantees you assumed to hold earlier, and in practice it in fact does not hold, as in this example.

The C standard says nothing about `__attribute__((packed))`, and C doesn't allow sizeof on bit fields, so we can make .sizeof work as we like. The most practical thing is to make it mean the size of the memory object the bitfield is a subset of. Unless (unimplemented) packed bitfields are used, the sizeof is the size of the type.
May 06
On 5/7/24 05:39, Walter Bright wrote:
> The most practical thing is to make it mean the size of the memory object the bitfield is a subset of.

I agree that given the C-like bitfield design you seem to have set your mind on, and given the behavior of `.offsetof` this is a decent behavior for `.sizeof`. However, this is not what the DIP currently says, so it should be updated.

> Unless (unimplemented) packed bitfields are used, the sizeof is the size of the type.

The size of the memory object has to be whatever the associated C compiler allocates, and according to the standard, it is in principle allowed to pack by default. There is no guarantee that the memory object is in fact at least as big as the type of the bitfield. I am not familiar with bitfield layout on all platforms that D supports via GDC and LDC, but I would not be surprised if on some of them this is actually an issue in practice.
May 07
On Tuesday, 30 April 2024 at 03:30:15 UTC, Walter Bright wrote:
> On 4/29/2024 5:04 AM, Timon Gehr wrote:
>>> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
>>> ...
>> No, it is not accurate.
> Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union. std.bitmanip.bitfields also implements it as a union, because there is no other way. The CPU does not provide any instructions to access bit fields.

Not true. x86 provides BMI1 instructions, which have been present in x86 CPUs since at least 2013. ARM also provides bit field instructions, and quite a number of legacy CPUs also had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.).

Doesn't change the issues with language bitfields.
May 03
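The claim quoted in the post above, that getting or setting a bit field reads or writes all the bits of the underlying field, can be sketched in C as a shift-and-mask read-modify-write. This is only an illustration: the helper names (`low_mask`, `field_get`, `field_set`) are hypothetical, and the layout (a field of `width` bits starting at bit `shift` of a 32-bit unit) is an assumption, not something any particular compiler guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set; width must be in 1..32. */
static uint32_t low_mask(unsigned width) {
    assert(width >= 1 && width <= 32);
    return width == 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
}

/* Read: take the whole unit, shift the field down, mask off its neighbours. */
static uint32_t field_get(uint32_t unit, unsigned shift, unsigned width) {
    assert(shift < 32 && width <= 32 - shift);
    return (unit >> shift) & low_mask(width);
}

/* Write: read-modify-write. Clear the field's bits in the unit,
   then OR in the new value, truncated to the field width. */
static uint32_t field_set(uint32_t unit, unsigned shift, unsigned width,
                          uint32_t value) {
    assert(shift < 32 && width <= 32 - shift);
    uint32_t mask = low_mask(width) << shift;
    return (unit & ~mask) | ((value << shift) & mask);
}
```

For example, `field_set(0xABCD1234u, 8, 8, 0xFFu)` returns `0xABCDFF34`: the neighbouring bits pass through the operation unchanged, but they are still read and written back as part of the unit.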
On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
Not true. x86 provides BMI1 instructions which are present in x86 CPUs at least since 2013. ARM also provides bit field instructions and quite a number of legacy CPUs also had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.). Doesn't change the issues with language bitfields.
About BMI/BMI2, it would be interesting to see if optimizing compilers actually generate instructions of these extensions for C++ bitfields. I've tried for styx enum-sets; sure, that's a bit of a special case of bitfields, but so far the only difference visible is a BMI2 `shlxl` emitted. But once again, a very special case.
May 03
On Friday, 3 May 2024 at 15:50:42 UTC, user1234 wrote:
On Friday, 3 May 2024 at 12:52:09 UTC, Patrick Schluter wrote:
Not true. x86 provides BMI1 instructions which are present in x86 CPUs at least since 2013. ARM also provides bit field instructions and quite a number of legacy CPUs also had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.). Doesn't change the issues with language bitfields.
About BMI/BMI2, it would be interesting to see if optimizing compilers actually generate instructions of these extensions for C++ bitfields. I've tried for styx enum-sets; sure, that's a bit of a special case of bitfields, but so far the only difference visible is a BMI2 `shlxl` emitted. But once again, a very special case.
![](https://i.imgur.com/Uw6qZ1g.png)
May 03
On 5/3/2024 5:52 AM, Patrick Schluter wrote:
Not true. x86 provides BMI1 instructions which are present in x86 CPUs at least since 2013. ARM also provides bit field instructions and quite a number of legacy CPUs also had bitfield instructions (m68k, NEC V30, Itanium, PowerPC, etc.). Doesn't change the issues with language bitfields.
I'd be very surprised if it didn't work by reading the entire field first. https://www.felixcloutier.com/x86/bextr
May 03
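To make the BEXTR reference above concrete, here is a rough software model of the 32-bit form as the linked instruction reference describes it: the start bit comes from bits 7:0 of the control operand, the length from bits 15:8, and the result is the selected bits zero-extended. The function name `bextr32_model` is hypothetical and this is a sketch of the documented semantics, not compiler output. Note that the source operand is a whole register or memory unit, which is consistent with the expectation that the entire field is read first.

```c
#include <assert.h>  /* only needed by callers that check results */
#include <stdint.h>

/* Software model of 32-bit BEXTR (assumed semantics, per the linked reference):
   start  = control[7:0]
   length = control[15:8]
   result = zero-extended bits src[start+length-1 : start] */
static uint32_t bextr32_model(uint32_t src, uint32_t control) {
    unsigned start = control & 0xFFu;
    unsigned len = (control >> 8) & 0xFFu;

    /* Nothing selected, or selection starts past the operand: result is 0. */
    if (start >= 32 || len == 0)
        return 0;

    uint32_t shifted = src >> start;   /* operates on the whole loaded unit */
    if (len >= 32)
        return shifted;                /* bits beyond the operand read as 0 */
    return shifted & ((UINT32_C(1) << len) - 1);
}
```

For example, `bextr32_model(0xABCD1234u, (8u << 8) | 8u)` extracts 8 bits starting at bit 8 and yields `0x12`.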
On Monday, April 29, 2024 12:44:08 AM MDT Walter Bright via dip.development wrote:
An enum is distinguished by it not being possible to use .offsetof with it.
I don't think that I have _ever_ seen anyone use offsetof to determine anything with type introspection other than the actual offset. Existing code will almost certainly be using & to determine whether a member is an enum or not.
That being said, _usually_, it's the case that code cares whether a member is an enum or not when doing type introspection because it's looking for something else (e.g. for whether the member is a static member variable), so I don't know whether suddenly having additional members that cannot have their address taken will break anything. But any situation where there isn't a trait that outright tells you what you're looking for makes it highly likely that any existing code which needed to figure it out did so by trying out a variety of checks, found some combination of things to check for being true and some combination of things to check for being false, and then did enough testing to be reasonably sure that that combination of checks told them what they needed to know. Even if they did get it right, because it's quite indirect, adding more categories of things which could affect introspection will ultimately run a pretty high risk of breaking _something_.
There's only so much that we can do about that, but I do think that we need to be very careful about saying that X is the way to test for something and have any expectation that that's how folks are actually doing it unless that something is a specific trait from __traits or std.traits which checks for that exact thing.
- Jonathan M Davis
Apr 29
On 23/04/2024 1:01 PM, Walter Bright wrote:
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Randomly came across people talking about C bitfields, and how the non-defined bit layout is causing them problems. https://twitter.com/__phantomderp/status/1786628836953604201 Turns out even they think they need control over the layout including predictable LSB..MSB byte by byte definition. Making it default to C for the layout is not a good addition to the language.
May 03
On 5/3/2024 11:07 PM, Richard (Rikki) Andrew Cattermole wrote:
On 23/04/2024 1:01 PM, Walter Bright wrote:
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Randomly came across people talking about C bitfields, and how the non-defined bit layout is causing them problems. https://twitter.com/__phantomderp/status/1786628836953604201 Turns out even they think they need control over the layout including predictable LSB..MSB byte by byte definition. Making it default to C for the layout is not a good addition to the language.
All the tweet says is:
```
As they should. (But now it's time for C and C++ to give users explicit layout control, so that eventually we can use our chairs on other more heinous programming criminals.)
```
I've responded thoroughly to every complaint about the layout. The only substantive external one is Linus', which is linked to in the DIP, and I responded to that, too. If you don't like bitfields at this point, don't use them. If you need help getting a specific layout, post here and I can help you.
May 04
On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of conformance with `.sizeof`?
May 06
On 5/6/2024 6:52 AM, Per Nordlöw wrote:
Why not use `.bitsizeof` instead of `.bitwidth`? For the sake of conformance with `.sizeof`?
Sizeof is in bytes, so I use a different word for number of bits.
May 06
A thread for review of the third draft was opened subsequent to this one. Please leave further feedback there: https://forum.dlang.org/post/v193hc$b9c$1 digitalmars.com This thread is now closed.
May 21