digitalmars.D - std.format and uninitialized elements in CTFE

reply berni44 <dlang d-ecke.de> writes:
std.format contains code that accesses the members of a struct 
via tupleof (which gives access to private members that might be 
intentionally not yet initialized) in order to print them. If members 
are not initialized, this produces a "used before initialized" error or 
something similar (see issue 19769 [1]).

What should be done about this? Should std.format check whether the 
members of a struct are initialized, and if so, what should it do when 
they are not? And how can it check them?

Here is an example:

```
import std.format;

struct Foo
{
     int a = void;
}

static x = format("%s", Foo());
```

Replace int by int[3] for a more complex version.

[1] https://issues.dlang.org/show_bug.cgi?id=19769
Dec 05 2019
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/5/19 1:37 PM, berni44 wrote:
 std.format contains code, that accesses the members of a struct via 
 tupleof (which gives access to private members, that might be 
 intentionally not yet initialized), to print them. If members are not 
 initialized it produces a "used before initialized" error or something 
 similar. (See issue 19769 [1]).
 
 What to do about this? Should std.format check if the members of a 
 struct are not initialized and do what in that case? And how to check them?
 
 Here an example:
 
 ```
 import std.format;
 
 struct Foo
 {
      int a = void;
 }
 
 static x = format("%s", Foo());
 ```
 
 Replace int by int[3] for a more complex version.
 
 [1] https://issues.dlang.org/show_bug.cgi?id=19769
What is the point of formatting items that aren't initialized? You would be printing garbage.

What if you just initialize the items you wish to print, or if you have items that aren't initialized (but you don't want to print), create a toString overload? That's how I would handle it.

-Steve
Dec 05 2019
parent reply berni44 <dlang d-ecke.de> writes:
On Thursday, 5 December 2019 at 19:45:27 UTC, Steven 
Schveighoffer wrote:
 What is the point of formatting items that aren't initialized? 
 You would be printing garbage.

 What if you just initialize the items you wish to print,
You are on the wrong track... I'm programming inside of Phobos. Such questions do not arise there. If the user provides a struct with (partially) uninitialized items, std.format has to cope with this somehow. We cannot really answer the question of why the user is doing this, nor can we make him initialize the items before printing.

Ali's example shows that this is a serious issue. IMHO my example should print `Foo(void)`, or `Foo([void, void, void])` with `int[3]`. With that, Ali's example will work.

For me, the question remains how to detect (at compile time) whether a variable is void. The best I could come up with so far is:

    int a = void;

    static if (!__traits(compiles, a?a:a))

But I'm not sure whether ?: can be applied to all conceivable types.
Dec 05 2019
next sibling parent Johan Engelen <j j.nl> writes:
Uninitialized member variables, or partially initialized structs, 
are needed for e.g. small-string optimized string types.
There are a number of bug reports about it, see: 
https://issues.dlang.org/show_bug.cgi?id=11331

I remember that at the end of my DConf talk in 2017, we got 
Walter's approval to treat `=void` initialized struct members as 
if they are initialized / skip all initialization of it.
- Discussion of =void: https://youtu.be/YL6Tp8Zb5aI?t=2959
- Confirmation of that we should do something about it: 
https://youtu.be/YL6Tp8Zb5aI?t=3084
- Approval :-)  https://youtu.be/YL6Tp8Zb5aI?t=3266

This is a fun project to work on for LDC: 
https://github.com/ldc-developers/ldc/issues/3249

-Johan
Dec 06 2019
prev sibling next sibling parent reply berni44 <dlang d-ecke.de> writes:
On Friday, 6 December 2019 at 06:49:41 UTC, berni44 wrote:
 int a = void;

 static if (!__traits(compiles, a?a:a))

 But I'm not sure if ?: can be applied to all thinkable types.
Even that doesn't work (I don't know why I got the impression this morning that it does). Is the following a bug in __traits?

```
struct Foo
{
    int a = void;
}

int foo(Foo f)
{
    static if (__traits(compiles, f.a == f.a))
    {
        if (f.a == f.a) return 1; // error
        return 2;
    }
    return 3;
}

static x = foo(Foo());
```
Dec 06 2019
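A small sketch of why the `static if` in the post above passes, reusing only the `Foo` from that post: `__traits(compiles, ...)` asks whether an expression type-checks and never evaluates it, so it cannot tell whether `a` currently holds a value. The lambdas and the string comparison are illustrative additions, not part of the original post.

```d
struct Foo
{
    int a = void;
}

// Type check only: comparing int with int is well-formed, so this holds
// regardless of whether `a` was ever given a value.
static assert(__traits(compiles, (Foo f) => f.a == f.a));

// What __traits(compiles) does reject is ill-typed code, e.g. comparing
// an int with a string (illustrative counter-example).
static assert(!__traits(compiles, (Foo f) => f.a == "text"));

void main() {}
```

The read of `f.a` only becomes an error later, when the CTFE interpreter actually executes the comparison.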
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 6 December 2019 at 12:47:14 UTC, berni44 wrote:
 On Friday, 6 December 2019 at 06:49:41 UTC, berni44 wrote:
 int a = void;

 static if (!__traits(compiles, a?a:a))

 But I'm not sure if ?: can be applied to all thinkable types.
Even that doesn't work (don't know, why I got this morning the impression, that it does). Is the following a bug in __traits?
Probably not, because foo(…) could have been separately compiled in a different file unit?

However, I think this and many other cases show that Walter is on the right track by using flow analysis, but that D as a language would be better served by embracing flow typing wholeheartedly and taking flow analysis to the next level rather than doing it here and there. E.g. tracking of constness/immutability/nullability/initialization/aliasing etc. can be done in new and interesting ways if D adds flow typing in a wholesome manner.

It would probably require a new IR and perhaps break too much, so you would end up with D3, but it would be great from a metaprogramming perspective.
Dec 06 2019
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/6/19 1:49 AM, berni44 wrote:
 On Thursday, 5 December 2019 at 19:45:27 UTC, Steven Schveighoffer wrote:
 What is the point of formatting items that aren't initialized? You 
 would be printing garbage.

 What if you just initialize the items you wish to print,
You are on the wrong track... I'm programming inside of phobos. Such questions do not arise there. If the user provides a struct with (partially) uninitialized items std.format has to cope with this somehow. We cannot really answer the question, why the user is doing this, nor can we make him initialize the items before printing.
The normal compiled code works correctly during normal execution. It prints garbage. What you are looking to do is do this at compile-time, which is not allowed. Again, just don't do that.
 
 Ali's example shows, that this is a serious issue. IMHO my example 
 should print `Foo(void)` or `Foo([void, void, void])` with `int[3]`. 
 With that, Ali's example will work.
Ali's example is different, because it's given an *actually initialized* structure, not one that is uninitialized, and the compiler complains. I'd consider that an actual bug (in Phobos I would guess, not the compiler). What you want is for CTFE to have some mechanism to deal with uninitialized data aside from an error (which is perfectly acceptable).
 
 For me, the question remains, how to detect (at compile time) if a 
 variable is void. The best I could come up with yet is:
 
 int a = void;
 
 static if (!__traits(compiles, a?a:a))
 
 But I'm not sure if ?: can be applied to all thinkable types.
You are thinking incorrectly about CTFE ;) It's executing an *already compiled* function, at compile time. There is no static determination of anything at compile-time for CTFE function parameters. This is why the __ctfe variable is a runtime variable, and not a compile-time one.

So because you can access the variable at runtime (runtime D doesn't reject accesses to uninitialized data, as it doesn't know that has happened), it compiles and will try to run in CTFE. In fact, you can pass it in uninitialized data, and as long as CTFE doesn't interpret any code that will read it, it will work.

For example, this works:

```
struct Foo
{
    int i = 5;
    int a = void;
    string toString()
    {
        import std.conv;
        return i.to!string;
    }
}

import std.format;

static x = format("%s", Foo());
```

It's hard to have discussions about abstract types that obviously are not useful. Maybe if you have a more likely example, you can get help finding the right course of action.

-Steve
Dec 08 2019
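A minimal sketch of the point about `__ctfe` made in the post above. The function name `whereAmI` is hypothetical. The same function body is compiled once; which branch runs depends only on whether the call is being interpreted at compile time or executed normally.

```d
// Hypothetical helper: reports how it is being executed.
int whereAmI()
{
    // __ctfe is tested with a plain `if`; `static if (__ctfe)` is rejected
    // because __ctfe is not a compile-time constant.
    if (__ctfe)
        return 1; // taken while the CTFE interpreter runs the function
    return 2;     // taken during normal execution
}

// Initializing an enum forces the call to be evaluated in CTFE.
enum atCompileTime = whereAmI();
static assert(atCompileTime == 1);

void main()
{
    // The very same function, called at run time.
    assert(whereAmI() == 2);
}
```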
parent reply berni44 <dlang d-ecke.de> writes:
On Sunday, 8 December 2019 at 18:16:29 UTC, Steven Schveighoffer 
wrote:
 Ali's example shows, that this is a serious issue. IMHO my 
 example should print `Foo(void)` or `Foo([void, void, void])` 
 with `int[3]`. With that, Ali's example will work.
Ali's example is different, because it's given an *actually initialized* structure, not one that is uninitialized, and the compiler complains. I'd consider that an actual bug (in Phobos I would guess, not the compiler).
Well actually format uses only the *type* S of the second parameter and uses S.init to check if that can be printed (if it would use the real parameter, that could have unwanted side effects I guess). It cannot print it, because S.init contains an uninitialized member. This produces the error "cannot read uninitialized variable i in CTFE". So, at that time, the code is somehow able to recognize, that the variable "i" is uninitialized, else it could not produce this error. What I'm looking for is a way, to catch that. That would solve the issue.
 You are thinking incorrectly about CTFE ;) It's executing an 
 *already compiled* function, at compile time.
That's indeed something, I did not know. :-)
 It's hard to have discussions about abstract types that 
 obviously are not useful. Maybe if you have a more likely 
 example, you can get help finding the right course of action.
As I wrote above, I was looking into issue 19769. This is the test given there:

```
import std.uni;
import std.format;
static immutable x = format("%s", cast(const)"test".asCapitalized);
```

This doesn't compile, because asCapitalized returns a struct (ToCapitalizerImpl) which contains uninitialized data (dchar[3] buf = void;), and format tries to print that. It first tries to print it as a range, but isInputRange somehow misses that this struct is a range (probably because of the cast(const), but I don't know exactly), and therefore formatValueImpl tries to print the members of that struct directly. As buf is not initialized, that fails. The result is a strange error message (about uninitialized variables, where everything looks initialized).

This bug could be fixed by making isInputRange recognize the struct as a range. But while I was looking at what happens, I found out that printing structs with uninitialized data fails (though with an error message saying that this data is uninitialized). I thought it would be nice if format would just print <void> instead of producing an error, hence my question whether doing so is possible.
Dec 08 2019
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/8/19 2:41 PM, berni44 wrote:
 As I wrote above, I was looking into issue 19769. This is the test given 
 there:
 
 ```
 import std.uni;
 import std.format;
 static immutable x = format("%s", cast(const)"test".asCapitalized);
 ```
 
 This doesn't compile, because asCapitalized returns a struct 
 (ToCapitalizerImpl), which contains uninitialized data (dchar[3] buf = 
 void;) and format tries to print that. It first tries to print it as a 
 range, but isInputRange misses somehow, that this struct is a range 
 (probably because of the cast(const), but I don't know exactly) and 
 therefore formatValueImpl tries to print the members of that struct 
 directly, but as buf is not initialized, that fails. And the result is a 
 strange error message (about uninitialized variables, where everything 
 looks initialized).
It definitely is the cast(const). Removing it works. And it makes sense -- a const type cannot be a range, as popFront can never be const.

So one thing you CAN do, in cases like this is just initialize the damn void stuff in CTFE. i.e. here:

https://github.com/dlang/phobos/blob/a24888e533adfe8d141eb598be22a50df5e26a66/std/uni.d#L9390

change to:

    auto result = ToCapitalizerImpl(str);
    if(__ctfe) result.buf[] = dchar.init;
    return result;
 
 This bug could be fixed by making isInputRange recognize the struct as a 
 range. But while I looked, what happens, I found out, that printing 
 structs with uninitialized data fails (but with an errormessage, that 
 says, that this data is uninitialized). I thought, it would be nice, if 
 format would just print <void> instead of throwing an error and hence my 
 question if doing so is possible.
OK, I see what you mean. Confusing error messages can be indeed a huge problem.

But let's consider the fix I outline above. In this case, instead of "Test", like he expects, he's going to get x equal to (yes, I did it at runtime to see what it would be):

    const(ToCapitalizerImpl)(const(Result)(const(ByCodeUnitImpl)("test"), 4294967295), const(ToCaserImpl)(const(Result)(const(ByCodeUnitImpl)(""), 4294967295), 0, "\0\0\0"), false, "\0\0\0", 0)

Which is going to result in a *different* confusion. Even if it said "<void>" like you want, it's still a pile of gibberish that's far from what you expect.

So which is better? A compiler error saying essentially in a long and confusing way, "dude, you really don't want to do that", or a resulting binary that has, um... that other thing instead of "Test"?

-Steve
Dec 08 2019
parent reply berni44 <dlang d-ecke.de> writes:
On Sunday, 8 December 2019 at 20:18:23 UTC, Steven Schveighoffer 
wrote:
 It definitely is the cast(const). Removing it works. And it 
 makes sense -- a const type cannot be a range, as popFront can 
 never be const.
Nit-picking: An infinite range that always produces the same item could be....
 So one thing you CAN do, in cases like this is just initialize 
 the damn void stuff in CTFE.
Hmm. Feels a little bit like treating symptoms. We would need to identify all such places in Phobos. I'm not sure this is what I'd like to see. (And it doesn't fix Ali's examples.) But I haven't got a better idea yet.
 const(ToCapitalizerImpl)(const(Result)(const(ByCodeUnitImpl)("test"),
4294967295), const(ToCaserImpl)(const(Result)(const(ByCodeUnitImpl)(""),
4294967295), 0, "\0\0\0"), false, "\0\0\0", 0)
Indeed, not really, what one wants to have...
 Which is going to result in a *different* confusion. Even if it 
 said "<void>" like you want, it's still a pile of gibberish 
 that's far from what you expect.
So, what would we expect? Ideally, it would note that it's not a good idea to make a range const. But I think that neither the compiler (does not know about ranges) nor Phobos (may not even be called) can do that.

So far I've got no clue what's best here...
Dec 08 2019
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/9/19 1:10 AM, berni44 wrote:

 So, what would we expect? Ideally, it would note, that it's not a good 
 idea to make a range const. But I think, that neither the compiler (does 
 not know about ranges) nor Phobos (may not even be called) can do that.
 
 Yet I've got no clue, what's best here...
This is a frequent problem when building generic "do the best you can" libraries. Those libraries usually go something like this:

    static if(someFeatureWorks!T)
    {
        useFeature(t);
    }
    else
    {
        fallback(t);
    }

Generally the "someFeatureWorks" (e.g. isRange) works on the assumption that you can compile something. The problem is that there are various reasons why you can't compile it, not all of which mean it wasn't *intended* to compile. For example, having an error in one of your functions makes it all of a sudden not a range!

One thing we could do (well, actually, we can't do this because it would be too disruptive) is change the check that you can compile a call to popFront into hasMember!(T, "popFront"). What this does is capture the intention -- if the user put in a popFront method, I can't imagine they wanted to declare anything but a range type. Then the code tries to use it as a range and it fails because you didn't do something correctly (i.e., you made it const, or you have a bug in your popFront function, etc.).

I've been meaning to write a blog post on this, but haven't had any time for it.

-Steve
Dec 09 2019
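A rough sketch of the two kinds of check contrasted in the post above, using a made-up `Counter` range and a made-up `describe` helper (neither is Phobos code). `isInputRange` asks "does range code compile for this type?", while `hasMember` only asks "did the author write a `popFront`?". For a const-qualified type the two answers differ, which is the situation from issue 19769.

```d
import std.range.primitives : isInputRange;
import std.traits : hasMember;

// Hypothetical minimal input range counting 0, 1, 2.
struct Counter
{
    int n;
    @property bool empty() const { return n >= 3; }
    @property int front() const { return n; }
    void popFront() { ++n; } // mutates, so it cannot be called on a const Counter
}

// Hypothetical dispatcher in the "do the best you can" style.
string describe(T)()
{
    static if (isInputRange!T)
        return "range";
    else static if (hasMember!(T, "popFront"))
        return "meant to be a range, but does not compile as one";
    else
        return "not a range";
}

static assert(describe!Counter() == "range");
static assert(describe!(const Counter)() ==
    "meant to be a range, but does not compile as one");
static assert(describe!int() == "not a range");

void main() {}
```

A library using only the first branch and a generic fallback would silently treat `const Counter` as a plain struct; the middle branch is where a targeted diagnostic could go.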
parent berni44 <dlang d-ecke.de> writes:
On Monday, 9 December 2019 at 16:26:10 UTC, Steven Schveighoffer 
wrote:
 Then the code tries to use it as a range and it fails because 
 you didn't do something correctly (i.e., you made it const, or 
 you have a bug in your popFront function, etc.).
Yeah, the disadvantage of ducktyping...
Dec 09 2019
prev sibling next sibling parent reply kinke <noone nowhere.com> writes:
On Thursday, 5 December 2019 at 18:37:03 UTC, berni44 wrote:
 Should std.format check if the members of a struct are not 
 initialized and do what in that case?
It shouldn't, and there's no such thing as uninitialized struct/class *members*, there are only uninitialized whole struct instances. There are surely bugzillas wrt. struct S { int bla = void; } having no effect (correct way: `S s = void`). IMO, void member initializers should be invalid to make that clear.
Dec 05 2019
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/5/19 2:56 PM, kinke wrote:
 On Thursday, 5 December 2019 at 18:37:03 UTC, berni44 wrote:
 Should std.format check if the members of a struct are not initialized 
 and do what in that case?
It shouldn't, and there's no such thing as uninitialized struct/class *members*, there are only uninitialized whole struct instances. There are surely bugzillas wrt. struct S { int bla = void; } having no effect (correct way: `S s = void`). IMO, void member initializers should be invalid to make that clear.
Well, as long as all members are void initialized, then it should have an effect right? I know that the compiler blits the whole struct at once, but if it's all void initialized, it doesn't have to. -Steve
Dec 05 2019
parent kinke <noone nowhere.com> writes:
On Thursday, 5 December 2019 at 20:47:02 UTC, Steven 
Schveighoffer wrote:
 Well, as long as all members are void initialized, then it 
 should have an effect right? I know that the compiler blits the 
 whole struct at once, but if it's all void initialized, it 
 doesn't have to.
LDC and DMD still pre-initialize it (optimizer disabled) in that case (and druntime etc. via generic `= T.init` code).

I doubt there's a good use case for a struct defining all of its fields as uninitialized, without being able to be sensibly default-constructed (no default ctor). These cases are probably better handled outside the struct, i.e., where the dangerous uninitialized allocation resides, and can be handled in some little factory function returning a void-initialized stack instance.

E.g., this attempt at manually eliminating the init blit/memset before the ctor call:

    struct S
    {
        int[32] data = void;
        bool isDirty = void;
        this() @disable;
        this(const int[] data) { this.data[] = data[]; isDirty = false; }
    }

can be expressed (and actually works) like this, maintaining sensible default-constructability at the price of a somewhat more verbose `S.get()` for manually optimized construction:

    struct S
    {
        int[32] data;
        bool isDirty;
        static S get(const int[] data)
        {
            S r = void;
            r.data[] = data[];
            r.isDirty = false;
            return r;
        }
    }
Dec 05 2019
prev sibling parent mipri <mipri minimaltype.com> writes:
On Thursday, 5 December 2019 at 19:56:35 UTC, kinke wrote:
 there's no such thing as uninitialized struct/class *members*, 
 there are only uninitialized whole struct instances.
Is that true in CTFE? Consider:

    import std.format;

    struct Foo
    {
        int a = void;
        int b;
    }

    static x = format("%s", Foo().a);
    static y = format("%s", Foo().b);

The second format succeeds on its own. The first fails with

    Error: cannot read uninitialized variable a in CTFE
Dec 05 2019
prev sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 12/5/19 10:37 AM, berni44 wrote:
 std.format contains code, that accesses the members of a struct via 
 tupleof (which gives access to private members, that might be 
 intentionally not yet initialized), to print them. If members are not 
 initialized it produces a "used before initialized" error or something 
 similar. (See issue 19769 [1]).
 
 What to do about this? Should std.format check if the members of a 
 struct are not initialized and do what in that case? And how to check them?
 
 Here an example:
 
 ```
 import std.format;
 
 struct Foo
 {
      int a = void;
 }
 
 static x = format("%s", Foo());
 ```
 
 Replace int by int[3] for a more complex version.
 
 [1] https://issues.dlang.org/show_bug.cgi?id=19769
Similar:

    import std.string;

    struct S
    {
        int i = void;
    }

    void main()
    {
        auto s = S(42);
        auto f = format!"%s"(s);
    }

    Error: cannot read uninitialized variable `i` in CTFE

Note that the user is not doing anything at compile time. Only format is attempting to catch errors in the format string.

Ali
Dec 05 2019
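For contrast, a short sketch of the compile-time-checked `format!"..."` form from the post above when every member is initialized. The `Point` struct is a hypothetical example type. Because the format string is a template argument, it is validated against the argument types during compilation, which is why CTFE is involved even though the program itself formats only at run time.

```d
import std.format : format;

// Hypothetical struct with ordinary, default-initialized members.
struct Point
{
    int x;
    int y;
}

void main()
{
    // The format string is checked against (string, int) at compile time.
    auto s = format!"%s-%d"("a", 1);
    assert(s == "a-1");

    // A struct without toString is printed member by member.
    assert(format!"%s"(Point(1, 2)) == "Point(1, 2)");

    // A mismatch such as format!"%d"("a") is meant to be rejected while
    // compiling rather than throwing at run time.
}
```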