digitalmars.D - How about some __initialize magic?
- Stanislav Blinov (114/114) Nov 27 2021 D lacks syntax for initializing the uninitialized. We can do this:
- kinke (20/28) Nov 27 2021 It's already removed in master.
- russhy (13/13) Nov 27 2021 I would love to be able to do:
- Stanislav Blinov (9/14) Nov 28 2021 This is orthogonal to this discussion. Even if concise
- russhy (17/33) Nov 28 2021 this is the exact same issue
- Stanislav Blinov (6/17) Nov 28 2021 No, it isn't.
- russhy (5/6) Nov 28 2021 It does, you just don't understand what "we could improve it"
- Stanislav Blinov (26/32) Nov 29 2021 Oh I have no doubt that there is indeed some lack of
- russhy (5/6) Nov 29 2021 Oh i see.., thanks for being patient with me and providing me
- russhy (6/8) Nov 28 2021 let's improve it then, let's play more with it
- Stanislav Blinov (39/69) Nov 28 2021 I'd rather not leave this to "try". Not only because it's work
- Per Nordlöw (10/20) Dec 13 2021 What about taking inspiration from C++'s
- Tejas (6/9) Dec 19 2021 Think this can help with that `void` initialization problem? We
- vit (39/40) Jan 04 2022 Nice idea, placement new really missing from D.
D lacks syntax for initializing the uninitialized. We can do this:

```d
T stuff = T(args); // or new T(args);
```

but this?..

```d
T* ptr = allocateForT();
// now what?.. Can't just do *ptr = T(args) - that's an assignment, not initialization!
// is T a struct? A union? A class? An int?.. Is it even a constructor call?..
```

This is, uh, "solved", using library functions - `emplaceInitializer`, `emplace`, `copyEmplace`, `moveEmplace`. The fact that there are __four__ functions to do this should already ring a bell, but if one were to look at how e.g. `emplace` is implemented, there's lots and lots more to it - classes or structs? Constructor or no constructor? Postblit? Copy?.. And all the delegation... A single call to `emplace` may copy the bits around more than once. Talk about initializing a static array...

Or look at `emplaceInitializer`, which the other three all depend upon: it is, currently, built on a hack just to avoid blowing up the stack (which is, ostensibly, what the previous, less hacky hack led to). Upcoming `__traits(initSymbol)` would help in removing the hack, but won't help CTFE any.

At various points of their lives, these things even explicitly called `memcpy`, which is just... argh! And some still do (`copyEmplace`, I'm looking at you). Call into CRT to blit an 8-byte struct? With statically known size and alignment? Just to sidestep the type system? Eh??? Much fun for copying arrays!

...And still, none of them would work in CTFE for many types, due to various implementation quirks (which include those very calls to `memcpy`, or reinterpret casts). This one could, potentially, be solved with more barbed wire and swear words, that is, code, but...

Thing is, all those functions are re-implementing what the compiler can already do, but in a library. Or rather, come very close to doing that, but still don't really get there. C++ with its library solution does this better!
What if the language specified a "magic" function, called, say, `__initialize`, that would just do the right thing (tm)? Given an lvalue, it would instruct the compiler to generate code writing the initializer, blitting, copying, or calling the appropriate constructor with the arguments. And most importantly, it would work in CTFE regardless of type, and not require weird dances around T.init, dummy types involving extra argument copies, or manual fieldwise and elementwise blits (which is what one would have to do in order to e.g. make `copyEmplace` CTFE-able). I.e.:

```d
// Write .init
T* raw0 = allocateForT();
// currently - emplaceInitializer(raw0);
(*raw0).__initialize;

// Initialize fields or call constructor, whichever is applicable for T(arg1, arg2)
T* raw1 = allocateForT();
// currently - raw1.emplace(forward!(arg1, arg2));
(*raw1).__initialize(forward!(arg1, arg2));

// Copy
T* raw2 = allocateForT();
// currently - copyEmplace(*raw1, *raw2);
(*raw2).__initialize(*raw1);

// Move
T* raw3 = allocateForT();
// currently - moveEmplace(*raw2, *raw3);
(*raw3).__initialize(move(*raw2));

// Could be called at runtime or during CTFE
auto createArray()
{
    // big array, don't initialize
    const(T)[1000] result = void;
    // exception handling omitted for brevity
    foreach (i, ref it; result)
    {
        // currently - `emplace`, which may fail to compile in CTFE
        it.__initialize(createIthElement(i));
    }
    return result;
}

// CTFE use case:
static auto array = createArray();
```

The wins are obvious - unified syntax, better error messages, CTFE support, less library voodoo failing at mimicking the compiler. The losses? I don't see any.

Note that I am not talking about yet another library function. This would not be a symbol in druntime, this would be compiler magic. Having that, `emplaceInitializer`, `emplace` and `copyEmplace` could be re-implemented in terms of `__initialize`, and eventually deprecated and removed. `moveEmplace` could linger until DIP1040 is implemented, tried, and proven.
The `move` example, verbatim, would be pessimized compared to `moveEmplace` due to moving twice, which hopefully DIP1040 could solve.

I'm a bit hesitant to suggest how this should interact with `@safe`. On one hand, the established precedent is in `emplace` - it infers, and I'm leaning towards that, even though it can potentially invalidate existing state. On the other hand, because it can indeed invalidate existing state, it should be `@system`. But then it would require some additional facility just for inference, so it could be called `@trusted` correctly, otherwise it'd be useless. And that facility, whatever it is, better not be another library reincarnation of all required semantics. For example, something like a `__traits(isSafeToInitWith, T, args)`. Whichever the approach, it should definitely infer all other attributes.

There are undoubtedly other things to consider. For example - classes. On one hand, it would seem prudent for this hypothetical `__initialize` to be calling class ctors. On the other, a reference itself is just a POD, and generic code might indeed want to write null as opposed to attempting to call a default constructor. Then again, generic code still would have to specialize for classes... Thoughts welcome.

What do you think? DIP this, yay or nay? Suggestions?..
Nov 27 2021
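For reference, the library routes named in the opening post can be sketched as follows. This is a minimal illustration, not a statement about how druntime implements them: `Point` and `allocateForPoint` are hypothetical stand-ins for the post's `T` and `allocateForT`, the `.init` case goes through the public argument-less `emplace` overload rather than the internal `emplaceInitializer`, and the `malloc` null check is omitted for brevity.

```d
import core.lifetime : emplace, copyEmplace, moveEmplace;
import core.stdc.stdlib : malloc, free;

// Hypothetical example type standing in for the post's T.
struct Point
{
    int x = 1;
    int y = 2;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical stand-in for allocateForT(): raw, uninitialized memory.
Point* allocateForPoint()
{
    return cast(Point*) malloc(Point.sizeof);
}

// Initializes four raw chunks in four different ways and
// returns the x field observed after each step.
int[4] initializeFourWays()
{
    Point* raw0 = allocateForPoint();
    Point* raw1 = allocateForPoint();
    Point* raw2 = allocateForPoint();
    Point* raw3 = allocateForPoint();
    scope (exit)
    {
        free(raw0);
        free(raw1);
        free(raw2);
        free(raw3);
    }

    emplace(raw0);              // writes Point.init
    emplace(raw1, 3, 4);        // calls Point's constructor in place
    copyEmplace(*raw1, *raw2);  // copy-initializes raw2 from raw1
    const copiedX = raw2.x;     // read before raw2 is moved from
    moveEmplace(*raw2, *raw3);  // move-initializes raw3 from raw2

    return [raw0.x, raw1.x, copiedX, raw3.x];
}

void main()
{
    import core.stdc.stdio : printf;

    const xs = initializeFourWays();
    printf("%d %d %d %d\n", xs[0], xs[1], xs[2], xs[3]);
}
```

Each of the four cases needs a differently named function with its own argument order, which is the inconsistency the proposed single `__initialize` is meant to remove.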
On Saturday, 27 November 2021 at 21:56:05 UTC, Stanislav Blinov wrote:
> [...] Upcoming `__traits(initSymbol)` would help in removing the hack,

It's already removed in master.

> but won't help CTFE any. At various points of their lives, these things even explicitly called `memcpy`, which is just... argh! And some still do (`copyEmplace`, I'm looking at you). Call into CRT to blit a 8-byte struct? With statically known size and alignment? Just to sidestep type system? Eh???

1. Most optimizers recognize a memcpy call and its semantics, and try to avoid the lib call accordingly.
2. A slice copy (`source[] = target[]` with e.g. void[]-typed slices) is a memcpy with additional potential checks for matching length and no overlap (with enabled bounds checks IIRC), so memcpy avoids that overhead. It also works with -betterC; e.g., the aforementioned checks are implemented as a druntime helper function for LDC and so not available with -betterC.
3. I haven't checked, but if memcpy is the only real CTFE blocker for emplace at the moment, I guess one option would be extending the CTFE interpreter by a memcpy builtin, in order not to have to further uglify the existing library code.

> What do you think? DIP this, yay or nay? Suggestions?..

I'm not convinced I'm afraid. :) - I've been thinking in the other direction, treating core.lifetime.{move,forward} as builtins for codegen (possibly restricted to function call argument expressions), in order to save work for the optimizer and less bloat for debug builds.
Nov 27 2021
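The two blitting styles compared in point 2 above can be sketched side by side. This is an illustration under assumptions: `Pair` is a hypothetical type, and the slice version uses `ubyte[]` slices rather than the `void[]` slices mentioned in the post. Whether the compiler actually emits length or overlap checks for the slice copy depends on the compiler and its flags.

```d
import core.stdc.string : memcpy;

// Hypothetical plain-data type used only for this illustration.
struct Pair
{
    int id;
    double weight;
}

// Blit through the C runtime: no length or overlap checks are involved.
void blitWithMemcpy(ref Pair target, ref const Pair source)
{
    memcpy(&target, &source, Pair.sizeof);
}

// Blit through a slice copy: the bytes of both objects are viewed as
// ubyte slices, and the copy may carry length/overlap checks.
void blitWithSliceCopy(ref Pair target, ref const Pair source)
{
    auto dst = (cast(ubyte*) &target)[0 .. Pair.sizeof];
    auto src = (cast(const(ubyte)*) &source)[0 .. Pair.sizeof];
    dst[] = src[];
}

void main()
{
    auto original = Pair(7, 2.5);
    Pair viaMemcpy;
    Pair viaSlice;

    blitWithMemcpy(viaMemcpy, original);
    blitWithSliceCopy(viaSlice, original);

    assert(viaMemcpy == original);
    assert(viaSlice == original);
}
```

Both variants reinterpret a typed object as raw bytes through a pointer cast, so neither sidesteps the type-system concern raised in the opening post.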
I would love to be able to do:

```D
T* t = alloc();
(*t) = .{};
// or better
t.* = .{};
// then we could also go ahead and be able to do like:
t.* = .{ field_a: 1, field_2: 2 }
```

Basically relaxing that rule: https://dlang.org/spec/struct.html#static_struct_init

Other languages do that, and i love them.

Don't let us stay behind because we refuse to move forward!
Nov 27 2021
On Sunday, 28 November 2021 at 03:19:49 UTC, russhy wrote:
> I would love to be able to do:

This is orthogonal to this discussion. Even if concise initializer syntax that you suggest was allowed...

> ```D
> T* t = alloc();
> (*t) = .{};
> ```

...that's an assignment. I.e. that would lower down to `uninitializedGarbage.opAssign(T.init);`. Destructing garbage and/or calling operators on garbage isn't exactly the way to success :) Which is the crux of the problem in question, and why things like `emplace` exist in the first place.
Nov 28 2021
On Sunday, 28 November 2021 at 08:54:39 UTC, Stanislav Blinov wrote:
> On Sunday, 28 November 2021 at 03:19:49 UTC, russhy wrote:
>> I would love to be able to do:
>
> This is orthogonal to this discussion. Even if concise initializer syntax that you suggest was allowed...
>
>> ```D
>> T* t = alloc();
>> (*t) = .{};
>> ```
>
> ...that's an assignment. I.e. that would lower down to `uninitializedGarbage.opAssign(T.init);`. Destructing garbage and/or calling operators on garbage isn't exactly the way to success :) Which is the crux of the problem in question, and why things like `emplace` exist in the first place.

this is the exact same issue

this is exactly why i mentioned it

emplace is a library, it doesn't solve anything, it solves people's addiction to "import" things

if you tell people they need to import a package to do initialization, then the language is a failure

``.{}`` wins over ``__initialize``

there needs to be a movement to stop making syntax such a pain to write, and make things overall consistent

It's the same with enums

``MyEnumDoingThings myEnumThatINeed = MyEnumDoingThings.SOMETHING_IS_NOT_RIGHT;``

And now you want the same for everything else

``(*raw1).__initialize(forward!(arg1, arg2));``

more typing! templates!! more long lines!!! more slowness!!!!
Nov 28 2021
On Sunday, 28 November 2021 at 16:36:05 UTC, russhy wrote:
> this is the exact same issue

No, it isn't.

> It's the same with enums

No, it isn't.

> And now you want the same for everything else

No, I don't.

> ``(*raw1).__initialize(forward!(arg1, arg2));`` more typing! templates!! more long lines!!! more slowness!!!!

Way off mark here.

>> This is orthogonal to this discussion. Even if concise initializer syntax that you suggest was allowed
>
> let's improve it then, let's play more with it instead of introducing new functions/templates i feel like this is the perfect place to have such improvements take place

This topic has nothing to do with what you're talking about.
Nov 28 2021
On Sunday, 28 November 2021 at 19:30:11 UTC, Stanislav Blinov wrote:
> This topic has nothing to do with what you're talking about.

It does, you just don't understand what "we could improve it" means: relaxing its rules, and reusing the syntax for doing what you ask for
Nov 28 2021
On Sunday, 28 November 2021 at 22:00:05 UTC, russhy wrote:
> On Sunday, 28 November 2021 at 19:30:11 UTC, Stanislav Blinov wrote:
>> This topic has nothing to do with what you're talking about.
>
> It does, you just don't understand what "we could improve it" mean; relaxing its rules, and reusing the syntax for doing what you ask for

Oh I have no doubt that there is indeed some lack of understanding here. So I'm going to try one last time. The problem in question lies in the assignment operator, __not__ whatever's on the right hand side of it. It's absolutely irrelevant here how you spell the initializer.

First please understand the difference between initialization and assignment. Then read up on https://dlang.org/spec/declaration.html#void_init and then try to understand that assigning to uninitialized structs that have an explicit or implicit `opAssign` defined would involve using uninitialized values, which may lead to UB. And that is just one of the problems that existing library solutions address. The rest is spelled out in the first post.

Have fun with this little program:

```d
import std.stdio;

void main()
{
    File file = void;
    file = File.init; // File.init, .{}, BANANAS - doesn't matter, it's UB
}
```

So once again, if you want to discuss initializer syntax, feel free to create a topic for that as that is not what's in question here.
Nov 29 2021
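The difference between assignment and initialization described above can be made observable with counters. This is a sketch under assumptions: `Tracked` is a hypothetical type whose `opAssign` deliberately never reads the old state of `this`, so the demonstration itself stays well-defined. A realistic `opAssign` (or the compiler-generated one for a type with a destructor) would inspect or release the old state, which is exactly what goes wrong on uninitialized memory.

```d
import core.lifetime : emplace;

// Hypothetical type that counts how it gets written to.
struct Tracked
{
    static int constructions; // times the constructor ran
    static int assignments;   // times opAssign ran
    int value;

    this(int value)
    {
        this.value = value;
        ++constructions;
    }

    void opAssign(Tracked rhs)
    {
        // Deliberately does not read this.value; a real opAssign
        // usually would, and on `= void` storage that is garbage.
        ++assignments;
        value = rhs.value;
    }
}

// `t = Tracked(42)` on existing storage is an assignment: opAssign runs.
int[2] countsAfterAssignment()
{
    Tracked.constructions = 0;
    Tracked.assignments = 0;

    Tracked t = void;
    t = Tracked(42);

    return [Tracked.constructions, Tracked.assignments];
}

// emplace constructs in place: the constructor runs, opAssign does not.
int[2] countsAfterEmplace()
{
    Tracked.constructions = 0;
    Tracked.assignments = 0;

    Tracked t = void;
    emplace(&t, 42);

    return [Tracked.constructions, Tracked.assignments];
}

void main()
{
    assert(countsAfterAssignment() == [1, 1]);
    assert(countsAfterEmplace() == [1, 0]);
}
```

However the right-hand side is spelled, the `=` form still routes through `opAssign` on whatever happens to be in the target, while the in-place form never touches the old contents.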
On Monday, 29 November 2021 at 08:35:06 UTC, Stanislav Blinov wrote:
> (..)

Oh i see.., thanks for being patient with me and providing me links where i can read more about it! sorry for derailing the post!
Nov 29 2021
On Sunday, 28 November 2021 at 08:54:39 UTC, Stanislav Blinov wrote:
> This is orthogonal to this discussion. Even if concise initializer syntax that you suggest was allowed

let's improve it then, let's play more with it instead of introducing new functions/templates

i feel like this is the perfect place to have such improvements take place
Nov 28 2021
On Sunday, 28 November 2021 at 02:15:37 UTC, kinke wrote:
> On Saturday, 27 November 2021 at 21:56:05 UTC, Stanislav Blinov wrote:
>> [...] Upcoming `__traits(initSymbol)` would help in removing the hack,
>
> It's already removed in master.

Cool!

>> but won't help CTFE any. At various points of their lives, these things even explicitly called `memcpy`, which is just... argh! And some still do (`copyEmplace`, I'm looking at you). Call into CRT to blit a 8-byte struct? With statically known size and alignment? Just to sidestep type system? Eh???
>
> 1. Most optimizers recognize a memcpy call and its semantics, and try to avoid the lib call accordingly.

I'd rather not leave this to "try". Not only because it's work that needn't be done, but also for debug performance. Exactly the stuff you talk about at the end of your post :D

> 2. A slice copy (`source[] = target[]` with e.g. void[]-typed slices) is a memcpy with additional potential checks for matching length and no overlap (with enabled bounds checks IIRC), so memcpy avoids that overhead. It also works with -betterC; e.g., the aforementioned checks are implemented as a druntime helper function for LDC and so not available with -betterC.

Slice copies aren't needed :) Nor would they work in CTFE, as that requires reinterpret-casting a T to a slice.

> 3. I haven't checked, but if memcpy is the only real CTFE blocker for emplace at the moment, I guess one option would be extending the CTFE interpreter by a memcpy builtin, in order not to have to further uglify the existing library code.

`emplace` is also deficient: https://github.com/dlang/druntime/blob/2b7873da09c63761fe6e69dc4dd225c0844ed4e9/src/core/internal/lifetime.d#L31-L59

Also note that that's already one call down from `emplace`, and potentially could `move` the bits or copy the argument(s) again (to call the fake struct ctor), and then, of course, again, in the implementation of that fake ctor. Same goes for the actual non-fake struct `__ctor` version. Initializing large structs or those having expensive copy ctors is no fun. An `-O` build may help with some of that, of course, but again I'd rather this didn't need to be in the first place.

`emplaceInitializer` also may not work in all cases. The current one would fail on that mangling business, the upcoming one - because `__traits(initSymbol)` gives you a `void[]`, meaning a reinterpret cast is needed somewhere, meaning no dice for CTFE. And that means none of these guys would work when an initializer is required, since everyone in the `emplace` family is dependent on `emplaceInitializer`. So a CTFE-able implementation would be back to union fun. Except, of course, for classes, which is... questionable.

Making `mem*` functions available to CTFE would be a big improvement for sure, but it only solves half the problem (the other being reinterpret casts). `emplace` in CTFE should fail for one reason only - if the ctor is not CTFE-able (i.e. that's the caller's responsibility). So far, it may fail for reasons that are down to language plumbing :(

>> What do you think? DIP this, yay or nay? Suggestions?..
>
> I'm not convinced I'm afraid. :) - I've been thinking in the other direction, treating core.lifetime.{move,forward} as builtins for codegen (possibly restricted to function call argument expressions), in order to save work for the optimizer and less bloat for debug builds.

A compiler extension? Wouldn't that require semantics to be the same? Surely you wouldn't want to artificially limit their implementation in the compiler just because library versions are deficient? I mean, I'm not against this idea, but AFAIUI that route mandates we make library versions more robust. Then again, why have four builtins where one can suffice? ;)
Nov 28 2021
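The point about `__traits(initSymbol)` handing back untyped bytes can be sketched as follows. This assumes a compiler recent enough to provide `__traits(initSymbol)`; `writeInitializer`, `Config` and `Zeroed` are hypothetical names for illustration, not druntime's `emplaceInitializer`. The trait yields a `const(void)[]` whose pointer is null for all-zero initializers, so writing it into a typed object goes through `memcpy` or `memset` on a reinterpreted pointer, the kind of operation the post says CTFE rejects.

```d
import core.stdc.string : memcpy, memset;

// Hypothetical type with a non-zero initializer.
struct Config
{
    int retries = 3;
    double timeout = 1.5;
}

// Hypothetical type whose initializer is all zero bits.
struct Zeroed
{
    int a;
    int b;
}

// Hypothetical helper: writes T.init into `chunk` without assigning,
// using the raw initializer bytes exposed by the compiler.
void writeInitializer(T)(ref T chunk)
if (is(T == struct))
{
    const(void)[] initBytes = __traits(initSymbol, T);

    if (initBytes.ptr is null)
        memset(&chunk, 0, T.sizeof);                    // zero-initialized type
    else
        memcpy(&chunk, initBytes.ptr, initBytes.length); // blit the init image
}

Config freshConfig()
{
    Config c = void;
    writeInitializer(c);
    return c;
}

Zeroed freshZeroed()
{
    Zeroed z = void;
    writeInitializer(z);
    return z;
}

void main()
{
    assert(freshConfig() == Config.init);
    assert(freshZeroed() == Zeroed.init);
}
```

This works at runtime, but it only moves the untyped blit into user code, which is why a CTFE-capable version would still need a different mechanism.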
On Saturday, 27 November 2021 at 21:56:05 UTC, Stanislav Blinov wrote:
> ```d
> T stuff = T(args); // or new T(args);
> ```
>
> but this?..
>
> ```d
> T* ptr = allocateForT();
> // now what?.. Can't just do *ptr = T(args) - that's an assignment, not initialization!
> // is T a struct? A union? A class? An int?.. Is it even a constructor call?..
> ```

What about taking inspiration from C++'s https://en.cppreference.com/w/cpp/language/new#Placement_new ?

Allocators could be supported by either

- passing the allocator to `new()` as the first normal or template parameter, or
- having `new` be definable as a member function of an allocator (or any struct or class)

?
Dec 13 2021
On Saturday, 27 November 2021 at 21:56:05 UTC, Stanislav Blinov wrote:
> D lacks syntax for initializing the uninitialized. We can do this: [...]

Think this can help with that `void` initialization problem? We can have a node attribute `isInitialized` in the frontend, depending on which we can decide whether to call the destructor or not at the end of the scope?
Dec 19 2021
On Saturday, 27 November 2021 at 21:56:05 UTC, Stanislav Blinov wrote:
> ...

Nice idea, placement new is really missing from D.

I have the reverse problem with emplace than you. emplaceRef wrongly infers the `@safe` attribute for non-CTFE code if assignment is `@system`, because of this:

```d
//emplaceRef:
if (__ctfe)
{
    ///...
    chunk = forward!(args[0]);
    ///...
}
```

Example:

```d
struct Foo{
    this(scope ref const typeof(this) rhs) @safe{}
    void opAssign(scope ref const typeof(this) rhs) @system{}
}

void main() @safe{
    import core.lifetime : emplace;

    Foo foo;
    {
        const Foo* ptr;
        emplace(ptr, foo); //OK: __ctfe path doesn't exist
    }
    {
        Foo* ptr;
        emplace(ptr, foo); //ERROR: __ctfe path exists and calls @system opAssign
    }
}
```

Error: `@safe` function `D main` cannot call `@system` function `core.lifetime.emplace!(Foo, Foo).emplace`

D has one good thing: you can create a custom emplace which runs your own code between emplaceInitializer and the ctor. You can initialize your own vptr before the ctor.
Jan 04 2022