www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Lowerings to strongly pure _d_arrayctor trigger warnings or risk being

reply Teodor Dutu <teodor.dutu gmail.com> writes:
Hi,

I've been working on [this 
PR](https://github.com/dlang/dmd/pull/13116) for a while now and 
after seeing it fail some tests in phobos (for example, [this 
one](https://cirrus-ci.com/task/5609501660282880?logs=test_phobos#L116)), my
mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out
that, when lowered using `const` or `immutable` arguments, `_d_arrayctor`
becomes a strongly pure function. For instance, in the code snippet below
```d
struct S {};
const S[2] b;
const S[2] a = b;
```
the line `const S[2] a = b` is lowered to `_d_arrayctor(a, b)`, 
which is the intended behaviour.

However, since, in this case, `_d_arrayctor` is **strongly pure** 
and since its return value is ignored, the compiler issues the 
warning in the failed test above. In addition, its strong purity 
might also cause calls to `_d_arrayctor` to be removed by the 
compiler as part of the optimisation phase.

In order to avoid such unwanted events, the only solution we 
could come up with was to force `_d_arrayctor` to be **weakly 
pure** instead. We achieved this by adding a third pointer-type 
parameter, as implemented in [this 
PR](https://github.com/dlang/druntime/pull/3587). But this is 
merely a stop-gap solution, because it acts against the language, 
by denying one of its properties: purity.

We also tried changing the type of either `to` or the `from` 
parameters to a mutable `void[]`, but in this case the compiler 
was unable to instantiate the function's template correctly. So 
this solution didn't work.

Have you faced this issue before? What were your solutions?

Thanks,
Teodor
Oct 12 2021
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
 Hi,

 I've been working on [this 
 PR](https://github.com/dlang/dmd/pull/13116) for a while now 
 and after seeing it fail some tests in phobos (for example, 
 [this 
 one](https://cirrus-ci.com/task/5609501660282880?logs=test_phobos#L116)), my
mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out
that, when lowered using `const` or `immutable` arguments, `_d_arrayctor`
becomes a strongly pure function. For instance, in the code snippet below
 ```d
 struct S {};
 const S[2] b;
 const S[2] a = b;
 ```
 the line `const S[2] a = b` is lowered to `_d_arrayctor(a, b)`, 
 which is the intended behaviour.
I think the fundamental problem is that what `_d_arrayctor` is doing here is undefined behavior, according to the language spec:
 Note that casting away a const qualifier and then mutating is 
 undefined behavior, too, even when the referenced data is 
 mutable.
Source: https://dlang.org/spec/const3.html#removing_with_cast The spec makes [a special exception][1] for `struct` and `class` constructors, which allows them to write to non-mutable memory once without invoking UB, but there is no corresponding exception for variables that are not part of a `struct` or `class`. In order to make progress on this, it will be necessary to change both the language spec and the compiler to allow initialization of non-mutable memory by library code. These changes will, presumably, render the optimization you are currently fighting against invalid. [1]: https://dlang.org/spec/struct.html#field-init
Oct 12 2021
parent reply Teodor Dutu <teodor.dutu gmail.com> writes:
On Tuesday, 12 October 2021 at 16:46:00 UTC, Paul Backus wrote:
 On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
 ```d
 struct S {};
 const S[2] b;
 const S[2] a = b;
 ```
I think the fundamental problem is that what `_d_arrayctor` is doing here is undefined behavior, according to the language spec:
 Note that casting away a const qualifier and then mutating is 
 undefined behavior, too, even when the referenced data is 
 mutable.
Source: https://dlang.org/spec/const3.html#removing_with_cast The spec makes [a special exception][1] for `struct` and `class` constructors, which allows them to write to non-mutable memory once without invoking UB, but there is no corresponding exception for variables that are not part of a `struct` or `class`. [1]: https://dlang.org/spec/struct.html#field-init
I have read the documentation at the links you left, but I can't see why the usage of `_d_arrayctor` would be an undefined behaviour. This lowering only occurs when initialising static arrays or slices, whereas the removal of the `immutable` or `const` qualifiers that you mentioned is showcased like this: ```d immutable int* p = ...; int* q = cast(int*)p; ``` And this has nothing to do with array initialisations, so I think I'm in the clear with using `_d_arrayctor`. Do you think I'm missing anything?
Oct 12 2021
parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:
 I have read the documentation at the links you left, but I 
 can't see why the usage of `_d_arrayctor` would be an undefined 
 behaviour.
`_d_arrayctor` casts away `immutable` internally--either [here][1], directly, or [here][2] indirectly via `copyEmplace` (which casts it away [here][3]). It then uses `memcpy` to mutate the memory that was originally typed as `immutable`. This is undefined behavior, according to the language spec. [1]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L77 [2]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L59 [3]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/lifetime.d#L1239
 This lowering only occurs when initialising static arrays or 
 slices
The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.
Oct 12 2021
parent reply Teodor Dutu <teodor.dutu gmail.com> writes:
On Tuesday, 12 October 2021 at 18:27:53 UTC, Paul Backus wrote:
 On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:
 I have read the documentation at the links you left, but I 
 can't see why the usage of `_d_arrayctor` would be an 
 undefined behaviour.
`_d_arrayctor` casts away `immutable` internally--either [here][1], directly, or [here][2] indirectly via `copyEmplace` (which casts it away [here][3]). It then uses `memcpy` to mutate the memory that was originally typed as `immutable`. This is undefined behavior, according to the language spec. [1]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L77 [2]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L59 [3]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/lifetime.d#L1239
 This lowering only occurs when initialising static arrays or 
 slices
The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.
I see your point now. But this behaviour is not new. The current implementation uses [this hook](https://github.com/dlang/druntime/blob/a17bb23b418405e1ce8e4a317651039758013f39/src/rt/ rrayassign.d#L170), which suffers from the same shortcomings. Furthermore, the hook uses `TypeInfo`, which makes it slower than what I'm trying to achieve with the templated approach. And secondly, bear in mind that the lowering only occurs when the `lhs` is being constructed, which means that two things can happen: - the compiler can perform constant propagation by replacing the `rhs` variable with an array literal, which will **not** be lowered to `_d_arrayctor`: https://github.com/dlang/dmd/blob/3aa786cfb6990ff5f9c5085d0efd3421c3adc956/src/dmd/expressionsem.d#L9778 - the compiler doesn't perform this optimisation, which means that some data has to be written into the `const`/`immutable` array (`lhs`). I don't see how the second scenario can be avoided, with or without `_d_arrayctor`.
Oct 12 2021
parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 12 October 2021 at 19:31:34 UTC, Teodor Dutu wrote:
 On Tuesday, 12 October 2021 at 18:27:53 UTC, Paul Backus wrote:
 On Tuesday, 12 October 2021 at 18:04:21 UTC, Teodor Dutu wrote:
 I have read the documentation at the links you left, but I 
 can't see why the usage of `_d_arrayctor` would be an 
 undefined behaviour.
`_d_arrayctor` casts away `immutable` internally--either [here][1], directly, or [here][2] indirectly via `copyEmplace` (which casts it away [here][3]). It then uses `memcpy` to mutate the memory that was originally typed as `immutable`. This is undefined behavior, according to the language spec. [1]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L77 [2]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/internal/array/construction.d#L59 [3]: https://github.com/dlang/druntime/blob/16c836a7b084b935504427579e2d646c462ee6e4/src/core/lifetime.d#L1239
 This lowering only occurs when initialising static arrays or 
 slices
The spec does not currently make an exception for initializing static arrays or slices, so the fact that it only occurs in that context does not make any difference to whether the spec considers it UB.
I see your point now. But this behaviour is not new. The current implementation uses [this hook](https://github.com/dlang/druntime/blob/a17bb23b418405e1ce8e4a317651039758013f39/src/rt/ rrayassign.d#L170), which suffers from the same shortcomings. Furthermore, the hook uses `TypeInfo`, which makes it slower than what I'm trying to achieve with the templated approach. And secondly, bear in mind that the lowering only occurs when the `lhs` is being constructed, which means that two things can happen: - the compiler can perform constant propagation by replacing the `rhs` variable with an array literal, which will **not** be lowered to `_d_arrayctor`: https://github.com/dlang/dmd/blob/3aa786cfb6990ff5f9c5085d0efd3421c3adc956/src/dmd/expressionsem.d#L9778 - the compiler doesn't perform this optimisation, which means that some data has to be written into the `const`/`immutable` array (`lhs`). I don't see how the second scenario can be avoided, with or without `_d_arrayctor`.
Yes, I understand that the compiler has to initialize the array somehow, and that in the general case this requires writing to memory typed as `immutable`. My point is not that `_d_arrayctor` should not do its job, but that the current language spec makes it impossible to implement `_d_arrayctor` in D (as opposed to "in compiler magic") without invoking UB. This means that if you want to implement `_d_arrayctor` in D, you will have to start by changing the language spec. If you don't, any workaround you come up with is still going to involve UB, which means it will be at risk of falling apart as soon as it is exposed to a more powerful optimizer.
Oct 12 2021
prev sibling parent reply Johan <j j.nl> writes:
On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
 Hi,

 I've been working on [this 
 PR](https://github.com/dlang/dmd/pull/13116) for a while now 
 and after seeing it fail some tests in phobos (for example, 
 [this 
 one](https://cirrus-ci.com/task/5609501660282880?logs=test_phobos#L116)), my
mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out
that, when lowered using `const` or `immutable` arguments, `_d_arrayctor`
becomes a strongly pure function.
Why would you want to instantiate `_d_arrayctor` multiple times for different constness of its arguments? You can choose the exact details of the lowering, so would your problem not be solved by simply lower to calls without const/immutable parameter types? -Johan
Oct 12 2021
parent RazvanN <razvan.nitu1305 gmail.com> writes:
On Tuesday, 12 October 2021 at 16:46:10 UTC, Johan wrote:
 On Tuesday, 12 October 2021 at 16:27:51 UTC, Teodor Dutu wrote:
 Hi,

 I've been working on [this 
 PR](https://github.com/dlang/dmd/pull/13116) for a while now 
 and after seeing it fail some tests in phobos (for example, 
 [this 
 one](https://cirrus-ci.com/task/5609501660282880?logs=test_phobos#L116)), my
mentors for SAoC 2021, Razvan Nitu and Eduard Staniloiu, and I figured out
that, when lowered using `const` or `immutable` arguments, `_d_arrayctor`
becomes a strongly pure function.
Why would you want to instantiate `_d_arrayctor` multiple times for different constness of its arguments? You can choose the exact details of the lowering, so would your problem not be solved by simply lower to calls without const/immutable parameter types? -Johan
Because you want to call the correct copy constructor. Consider: struct S { this(ref S rhs) immutable {} } void main() { S[2] a; immutable S[2] b = a; }
Oct 14 2021