digitalmars.D - -preview=in might break code

reply Steven Schveighoffer <schveiguy gmail.com> writes:
Is there a way to prevent this?

import std.stdio;
struct S(size_t elems)
{
     int[elems] data;
}

void foo(T)(in T constdata, ref T normaldata)
{
     normaldata.data[0] = 1;
     writeln(constdata.data[0]);
}
void main()
{
     S!1 smallval;
     foo(smallval, smallval);
     S!100 largeval;
     foo(largeval, largeval);
}


Compile without -preview=in, it prints:

0
0

Compile with -preview=in, it prints:

0
1

-Steve
Oct 02 2020
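A minimal sketch of the two behaviours the example above can exhibit, written without `-preview=in` so the outcome does not depend on what the compiler picks. The names `byValue`, `byRef`, `demoByValue` and `demoByRef` are hypothetical; they only spell out by hand the two ways an `in` parameter may end up being passed.

```d
import std.stdio;

struct S(size_t elems)
{
    int[elems] data;
}

// Explicit by-value: the callee reads its own copy of the argument.
int byValue(T)(const T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

// Explicit by-reference: the callee reads through an alias of the argument.
int byRef(T)(ref const T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

int demoByValue()
{
    S!100 val;
    return byValue(val, val);
}

int demoByRef()
{
    S!100 val;
    return byRef(val, val);
}

void main()
{
    writeln(demoByValue()); // 0: the copy was made before the write
    writeln(demoByRef());   // 1: the write is visible through the alias
}
```

In the output reported above, `foo` under `-preview=in` matches `byValue` for `S!1` and `byRef` for `S!100`.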
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 10:08 AM, Steven Schveighoffer wrote:
 Is there a way to prevent this?
Or at least warn about it? -Steve
Oct 02 2020
prev sibling next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Friday, 2 October 2020 at 14:08:29 UTC, Steven Schveighoffer 
wrote:
 Is there a way to prevent this?

 [...]
Hmm, that doesn't look good 🤔
Oct 02 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 10/2/20 10:08 AM, Steven Schveighoffer wrote:
 Is there a way to prevent this?
 
 [...]
Finally, my "told you so" moment has come! :o)

https://forum.dlang.org/post/rhmst4$1vmc$1@digitalmars.com
Oct 02 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 10:32 AM, Andrei Alexandrescu wrote:
 On 10/2/20 10:08 AM, Steven Schveighoffer wrote:
 Is there a way to prevent this?

 [...]
Finally, my "told you so" moment has come! :o) https://forum.dlang.org/post/rhmst4$1vmc$1 digitalmars.com
My problem with it isn't necessarily that it uses references in some cases vs. copies in others, it's that the decision is arbitrary and implementation defined. Legitimately, you can have code that compiles fine on one compiler and breaks subtly on others.

But good to see this was at least discussed. However, I'm not sure the result is what should have happened...

Just noticed too, if you want to *force* ref by using in ref, you get this message:

Error: attribute ref is redundant with previously-applied in

That is... not good.

I think in should always mean ref. If you want to pass not by ref, use const.

-Steve
Oct 02 2020
next sibling parent reply Mathias LANG <geod24 gmail.com> writes:
 Is there a way to prevent this?
In the general case, no. You can have two distinct pointers with the same value, and there's nothing the frontend can do to detect it.

This scenario has been brought up during the review. I doubt it will, in practice, be an issue though. This is not a common pattern, nor does it seem useful. It rather looks like a code smell.

Bear in mind that in D, `const` data can change under your feet. The following case is similar to your example:
```
char[16] buffer;
foo(buffer, buffer);
void foo (const(char)[] data, char[] buff);
```
And yet, no one complains that it breaks `const`. The fact that `in` sometimes means by value, and sometimes by `ref`, seems to be the problem. I think, in our explanation of `in`, we ought to phrase things this way: `in` passes by `ref` *unless* it is more efficient to pass by value. With the added note that mutating the parameter through an indirection should not be relied on.

On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer wrote:
 My problem with it isn't necessarily that it uses references in 
 some cases vs. copies in others, it's that the decision is 
 arbitrary and implementation defined.
Not *quite* arbitrary. It's only passed by value if it's relatively efficient to do so. That means that anything that'd trigger a copy constructor, dtor, postblit, etc... is guaranteed to be passed by `ref`.
 Just noticed too, if you want to *force* ref by using in ref, 
 you get this message:

 Error: attribute ref is redundant with previously-applied in

 That is... not good.

 I think in should always mean ref. If you want to pass not by 
 ref, use const.
This limitation was, I think, the main pain point for people during the review. The main reason for disallowing it is that allowing it would open the door for overloading based on `in`, which is *definitely* not something we want, since it would mean people relying on by-value passing of `in`.

While that was the main reason why I went this way (remember I experimented with several designs), there were two additional benefits that cemented the conviction that it was the way to go.
First, it creates a nice separation between the three parameter storage classes: `in`, `ref`, `out`, hopefully simplifying the language and making it easier to explain.
Second, it allowed me to bake in a little trick: When `ref` is inferred for `in`, the parameter *actually* mangles as `in ref`. The point of doing that was to prevent code compiled with different compilers, or versions of the same compiler, from linking if they had a mismatch in their inference rules.

Regarding the "in should always be `ref`", I followed Kinke's advice here, which seems sensible: it should be up to the ABI to decide what is the most efficient way to pass a parameter. But there is also a reason why you don't want everything to be `ref`. Consider the following two types:
```
alias A = void delegate(in char[]);
alias B = void delegate(const(char)[]);
```
I wanted `A` to be implicitly convertible to `B`, because not only does it make sense, but it avoids a lot of the code breakage I was seeing in druntime.
Oct 02 2020
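The `const` slice case from the post above can be made runnable as follows. This is only a sketch: `peekAfterWrite` and `demoSliceAlias` are hypothetical names standing in for the post's `foo`, and the body (write through one view, read through the other) is an assumption about what such a function might do.

```d
import std.stdio;

// `data` is a read-only view and `buff` a mutable view; nothing stops
// both from referring to the same memory.
char peekAfterWrite(const(char)[] data, char[] buff)
{
    buff[0] = 'x';
    return data[0];
}

char demoSliceAlias()
{
    char[16] buffer = 'a'; // every element starts as 'a'
    return peekAfterWrite(buffer[], buffer[]);
}

void main()
{
    writeln(demoSliceAlias()); // x: the const view observed the write
}
```

Here the aliasing is visible in the types, since a slice is always a reference to its elements; the behaviour does not vary with the compiler or the size of the data.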
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 1:01 PM, Mathias LANG wrote:
 Is there a way to prevent this?
In the general case, no. You can have two distinct pointers with the same value, and there's nothing the frontend can do to detect it. This scenario has been brought up during the review. I doubt it will, in practice, be an issue though. This is not a common pattern, nor does it seem useful. It rather looks like a code smell.
Of course, the exact sample that I wrote is not what happens. What happens is something more convoluted. But it will happen.
 Bear in mind that in D, `const` data can change under your feet. The 
 following case is similar to your example:
 ```
 char[16] buffer;
 foo(buffer, buffer);
 void foo (const(char)[] data, char[] buff);
 ```
 And yet, no one complains that it breaks `const`. The fact that `in` 
 sometimes means by value, and sometimes by `ref`, seems to be the 
 problem. I think, in our explanation of `in`, we ought to phrase things 
 this way: `in` passes by `ref` *unless* it is more efficient to pass by 
 value. With the added note that mutating the parameter through an 
 indirection should not be relied on.
Yes, the problem is the "sometimes ref". Because ref changes the semantics. I read it as, in means by ref, unless the compiler can prove it's the same to pass by value, and that is more efficient. But if it's for *optimization*, it shouldn't change the effective semantics. The optimizer should be invisible. In practice, I don't think the compiler can prove that.
 
 
 On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer wrote:
 My problem with it isn't necessarily that it uses references in some 
 cases vs. copies in others, it's that the decision is arbitrary and 
 implementation defined.
Not *quite* arbitrary. It's only passed by value if it's relatively efficient to do so. That means that anything that'd trigger a copy constructor, dtor, postblit, etc... is guaranteed to be passed by `ref`.
One point of decision is arbitrary -- is it big enough to be worth it to pass by ref. That "big enough" decision is clearly specified to depend on compiler/ABI. Quoting from the changelog: "Otherwise, if the type's size requires it, it will be passed by reference. Currently, types which are over twice the machine word size will be passed by reference, however this is controlled by the backend and can be changed based on the platform's ABI."
 Just noticed too, if you want to *force* ref by using in ref, you get 
 this message:

 Error: attribute ref is redundant with previously-applied in

 That is... not good.

 I think in should always mean ref. If you want to pass not by ref, use 
 const.
This limitation was, I think, the main pain point for people during the review. The main reason for disallowing it is that allowing it would open the door for overloading based on `in`, which is *definitely* not something we want, since it would mean people relying on by-value passing of `in`.
Why would it be any different? What I mean is pass by ref always, but still allow binding to lvalues and rvalues.
 While that was the main reason why I went this way (remember I 
 experimented with several designs), there were two additional benefits 
 that cemented the conviction that it was the way to go.
 First, it creates a nice separation between the three parameter storage 
 classes: `in`, `ref`, `out`, hopefully simplifying the language and 
 making it easier to explain.
All three of them are consistent, if `in` always means `by reference`.
 Second, it allowed me to bake in a little trick: When `ref` is inferred 
 for `in`, the parameter *actually* mangles as `in ref`. The point of 
 doing that was to prevent code compiled with different compilers, or 
 versions of the same compiler, from linking if they had a mismatch in their 
 inference rules.
So you like the situation that 2 compilers will not be compatible? Instead of they just work because everyone does the same thing?
 
 Regarding the "in should always be `ref`", I followed Kinke's advice 
 here, which seems sensible: it should be up to the ABI to decide what is 
 the most efficient way to pass a parameter.
This is NOT about how to pass a parameter. The ABI does not make a decision on by value vs. by ref. The semantics are different. The only way it can make this decision and still be sane is if passing by ref does not change the semantics of the resulting code.
 But there is also a reason 
 why you don't want everything to be `ref`. Consider the following two 
 types:
 ```
 alias A = void delegate(in char[]);
 alias B = void delegate(const(char)[]);
 ```
 I wanted `A` to be implicitly convertible to `B`, because not only does 
 it make sense, but it avoids a lot of the code breakage I was seeing in 
 druntime.
And this might not be true on a different compiler. -Steve
Oct 02 2020
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 17:31:06 UTC, Steven Schveighoffer 
wrote:
 Yes, the problem is the "sometimes ref". Because ref changes 
 the semantics.

 I read it as, in means by ref, unless the compiler can prove 
 it's the same to pass by value, and that is more efficient. But 
 if it's for *optimization*, it shouldn't change the effective 
 semantics. The optimizer should be invisible.

 In practice, I don't think the compiler can prove that.
A good backend most certainly can? This ought to be a pure backend issue and not affect the frontend at all if the separation front/back is as it should be.

Calling conventions do not belong in a language spec...
Oct 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 2:33 PM, Ola Fosheim Grøstad wrote:
 On Friday, 2 October 2020 at 17:31:06 UTC, Steven Schveighoffer wrote:
 Yes, the problem is the "sometimes ref". Because ref changes the 
 semantics.

 I read it as, in means by ref, unless the compiler can prove it's the 
 same to pass by value, and that is more efficient. But if it's for 
 *optimization*, it shouldn't change the effective semantics. The 
 optimizer should be invisible.

 In practice, I don't think the compiler can prove that.
A good backend most certainly can? This ought to be a pure backend issue and not affect the frontend at all if the separation front/back is as it should be.
How does the compiler prove that passing by value or by reference is not going to affect the resulting code? I mean, it could potentially say, there are no references in all the mutable parameters, and so I can pass by value. But that's kind of a wide net.
 
 Calling conventions do not belong in a language spec...
 
You mean by ref or by value isn't part of the language spec? I don't understand the point. -Steve
Oct 02 2020
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 19:03:52 UTC, Steven Schveighoffer 
wrote:
 On 10/2/20 2:33 PM, Ola Fosheim Grøstad wrote:
 How does the compiler prove that passing by value or by 
 reference is not going to affect the resulting code? I mean, it 
 could potentially say, there are no references in all the 
 mutable parameters, and so I can pass by value. But that's kind 
 of a wide net.
It can do bookkeeping of how one param can influence another. If you have potential aliasing you can analyse if the affected const ref is read after a mutation is possible. If it is read before then there is no issue. It may track ownership internally and determine that the reference is isolated (the only one). Then track it as if it is effectively immutable.
 You mean by ref or by value isn't part of the language spec? I 
 don't understand the point.
The spec should list undefined behaviour and required observable effects, but not parameter passing strategies. You can call something pass-by-value, but the compiler can still pass a reference as long as it cannot be observed unless undefined behaviour has been triggered by the programmer (by stepping outside the language).
Oct 02 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 3:20 PM, Ola Fosheim Grøstad wrote:

 You can call something pass-by-value, but the compiler can still pass a 
 reference as long as it cannot be observed unless undefined behaviour 
 has been triggered by the programmer (by stepping outside the language).
 
That's just the thing. The -preview=in feature does not define whether the parameter is pass by value or by reference. It says, "up to the compiler". So code will work differently on different compilers. I should be able to expect one behavior, and then if the compiler can pass the parameter differently but I can't tell, then that's fine. But this isn't that. -Steve
Oct 02 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 19:37:43 UTC, Steven Schveighoffer 
wrote:
 On 10/2/20 3:20 PM, Ola Fosheim Grøstad wrote:
 That's just the thing. The -preview=in feature does not define 
 whether the parameter is pass by value or by reference. It 
 says, "up to the compiler".
Yes, I agree with you. The language spec should stick to the proper theoretical concept, e.g. pass by reference. It could also require no-aliasing of in-parameters and define aliasing as undefined behaviour that the compiler may or may not detect. Disallowing aliasing can give better codegen (faster). Then it could put in a footnote that some compilers optimize such and such as values in registers (for those that read disassembled code), but it should be no more than a footnote.
 I should be able to expect one behavior, and then if the 
 compiler can pass the parameter differently but I can't tell, 
 then that's fine. But this isn't that.
True.
Oct 02 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
Oct 02 2020
next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Friday, 2 October 2020 at 22:11:01 UTC, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
It might be. We need to look into this, and once and for all come to a conclusion.
Oct 02 2020
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 6:11 PM, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
I was wrong about this case, because it's a dynamic array (which is specifically carved out as an exception). Indeed, there seem to be quite a few exceptions carved out, presumably to make existing code behave reasonably.

But this could be true for another type that isn't in the exception categories. Potentially, a compiler can decide to make the following not true, while others say it is true:

alias A = void delegate(in long);
alias B = void delegate(const long);

static assert(is(A : B));

-Steve
Oct 02 2020
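One way to see what a given toolchain decides for the two delegate types above is to print the result of the `is` expression rather than assert it. This is a sketch only; `aConvertsToB` is a hypothetical name, and no particular result is claimed here, since under `-preview=in` the answer is exactly what the post says may vary between compilers.

```d
import std.stdio;

alias A = void delegate(in long);
alias B = void delegate(const long);

// Evaluated by whichever compiler, flags, and target build this file.
enum bool aConvertsToB = is(A : B);

void main()
{
    writeln("A converts to B: ", aConvertsToB);
}
```

Building the same file with and without `-preview=in`, or with different compilers, would show whether they agree.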
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/2/2020 4:03 PM, Steven Schveighoffer wrote:
 On 10/2/20 6:11 PM, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
I was wrong about this case, because it's a dynamic array (which is specifically carved out as an exception). Indeed, there seem to be quite a few exceptions carved out, presumably to make existing code behave reasonably.
Having a struct wrap a dynamic array comes to mind:

struct A { int[] a; }

vs:

int[] b;

Having these have different function calling ABIs is something I've tried to avoid. I.e. a wrapped type should have the same ABI as the type. Before this, it was true.
 
 But this could be true for another type that isn't in the exception categories.
 Potentially, a compiler can decide to make the following not true, while others
 say it is true:
 
 alias A = void delegate(in long);
 alias B = void delegate(const long);
 
 static assert(is(A : B));
It's not looking good.
Oct 02 2020
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2020-10-03 01:03, Steven Schveighoffer wrote:

 But this could be true for another type that isn't in the exception 
 categories. Potentially, a compiler can decide to make the following not 
 true, while others say it is true:
 
 alias A = void delegate(in long);
 alias B = void delegate(const long);
 
 static assert(is(A : B));
FYI, same problem with `__traits(isReturnOnStack)` [1].

[1] https://dlang.org/spec/traits.html#isReturnOnStack

-- 
/Jacob Carlborg
Oct 03 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/2/20 6:11 PM, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
They say one should choose one's fights wisely, so I spent some time pondering. I could just let this thread scroll by and not think twice about it. By next week, I may as well forget.

But then I realized this is /exactly/ the kind of crap that we'll all scratch our heads over six months from now: "How did this ever pass review? Who approved this? How in the world did a group of competent, well-intended people look at this and say - yep, good idea. Let's." ???

This glib take is EXTREMELY concerning:
 In the general case, no. You can have two distinct pointers with the
 same value, and there's nothing the frontend can do to detect it.
 
 This scenario has been brought up during the review. I doubt it will,
 in practice, be an issue though. This is not a common pattern, nor
 does it seem useful. It rather looks like a code smell.
Wait a SECOND! Are we really in the market of developing and deploying language features that come unglued at the slightest and subtlest misuse? We most certainly shouldn't.

I sincerely congratulated Mathias and the other participants for working on this. It's an important topic. Knowing that all those involved are very good at what they do, and without having looked closely, I was sure they got something really nice going that avoids this absolutely blatant semantic grenade. And now I see this is exactly it - we got a -preview of a grenade. How is this possible? How can we sleep at night now?

Again: take a step back and reconsider, why did this pass muster?

This is important, folks. It's really important as parameter passing goes to the core of what the virtual machine does. You can't say, meh, let's just fudge it here, and whatever is surprising it's on the user.

Please, we really need to put back the toothpaste in the tube here. I count on everybody's clear head here to reconsider this.
Oct 02 2020
next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu 
wrote:
 On 10/2/20 6:11 PM, Walter Bright wrote:
 [...]
They say one should choose one's fights wisely, so I spent some time pondering. I could just let this thread scroll by and not think twice about it. By next week, I may as well forget. [...]
This. We have to be more stringent
Oct 03 2020
prev sibling next sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu 
wrote:
 [...]

 Wait a SECOND! Are we really in the market of developing and 
 deploying language features that come unglued at the slightest 
 and subtlest misuse? We most certainly shouldn't.
I agree. What I don't agree with is that this aliasing is a "slight and subtle misuse". I went through the 64 projects and their dependencies that are on Buildkite, and didn't see a hint of this pattern emerging. Nor did it in any other code I've surveyed in the almost 5 months the PR was open.
 I sincerely congratulated Mathias and the other participants 
 for working on this. It's an important topic. Knowing that all 
 those involved are very good at what they do, and without 
 having looked closely, I was sure they got something really 
 nice going that avoids this absolutely blatant semantic 
 grenade. And now I see this is exactly it - we got a -preview 
 of a grenade. How is this possible? How can we sleep at night 
 now?

 Again: take a step back and reconsider, why did this pass 
 muster?
Maybe because the people that want to use this actually know this is nowhere near as big of an issue as some people try to make it look. Also because some of the criticism is based on the rather loose definition of the promotion to value at the moment. That definition is intentionally loose, not because it needs to be, but because the feature is in `-preview` and the rules need to take into account all platforms that D supports, something that cannot be done in DMD alone.
 This is important, folks. It's really important as parameter 
 passing goes to the core of what the virtual machine does. You 
 can't say, meh, let's just fudge it here, and whatever is 
 surprising it's on the user.

 Please, we really need to put back the toothpaste in the tube 
 here. I count on everybody's clear head here to reconsider this.
Perhaps you don't know this, but the very first implementation, the one I had when I opened the PR on April 3rd actually always used `ref`. It had quite a few issues. Kinke suggested an alternative, and that alternative brought many benefits with it, for a very minor downside, which can easily be mitigated: if your function's semantics really depend on mutation through an alias propagating (or not) to an `in` parameter, then you can use `__traits(isRef, paramname)` to check it.

Later on, on July 31st, I brought the topic to the forum, which pressed on another suggestion Kinke had made, using the full function type instead of just the parameter type. At the time, you also made mention of this issue, and it was considered.

I wonder, did you try to adapt one of your projects to `in`? Not merely throwing the switch, but also adding a couple `in` here and there, where it makes sense.

Note that this `-preview` is the only one that has ever been usable from release day (perhaps `markdown` was as well, but it only affected documentation generation). All the other previews would fail on your project because druntime and Phobos were never adapted after the switch was merged. `-preview=in` not only works with druntime/phobos, but many libraries from Buildkite were also adapted to work with or without it (https://github.com/dlang/dmd/pull/11632). I was expecting this would increase user engagement, allow us to gather feedback, use cases, and lead to a healthy discussion, just like it allowed us to refine the implementation. But I wasn't expecting FUD to be spread over a `-preview` before people even tried it.
Oct 03 2020
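The `__traits(isRef, paramname)` mitigation mentioned in the post above might look roughly like the following. This is a sketch under assumptions: `readAfterWrite`, `demoSmall` and `demoLarge` are hypothetical names, and taking a snapshot is just one possible reaction to finding out the parameter is a reference.

```d
import std.stdio;

struct S(size_t elems)
{
    int[elems] data;
}

int readAfterWrite(T)(in T constdata, ref T normaldata)
{
    static if (__traits(isRef, constdata))
    {
        // Passed by reference: snapshot before mutating the other parameter.
        const T snapshot = constdata;
        normaldata.data[0] = 1;
        return snapshot.data[0];
    }
    else
    {
        // Passed by value: the parameter is already an independent copy.
        normaldata.data[0] = 1;
        return constdata.data[0];
    }
}

int demoSmall() { S!1 v; return readAfterWrite(v, v); }
int demoLarge() { S!100 v; return readAfterWrite(v, v); }

void main()
{
    writeln(demoSmall()); // 0
    writeln(demoLarge()); // 0
}
```

Both calls read the value from before the write, whichever way the compiler passes `constdata`. The cost is that the by-reference branch copies the data, which gives back the saving that passing by reference was meant to provide.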
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 12:09:34 UTC, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei 
 Alexandrescu wrote:
 [...]

 Wait a SECOND! Are we really in the market of developing and 
 deploying language features that come unglued at the slightest 
 and subtlest misuse? We most certainly shouldn't.
I agree. What I don't agree with is that this aliasing is a "slight and subtle misuse". I went through the 64 projects and their dependencies that are on Buildkite, and didn't see a hint of this pattern emerging. Nor did it in any other code I've surveyed in the almost 5 months the PR was open.
The codebases you look at are too small. If you look at the design rationale for Ada SPARK (which does not allow aliasing) you will see that aliasing is considered to be a problem that leads to very serious bugs in running systems. This is well established irrespective of language.
Oct 03 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/3/20 8:09 AM, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu wrote:
 [...]

 Wait a SECOND! Are we really in the market of developing and deploying 
 language features that come unglued at the slightest and subtlest 
 misuse? We most certainly shouldn't.
I agree. What I don't agree with is that this aliasing is a "slight and subtle misuse". I went through the 64 projects and their dependencies that are on Buildkite, and didn't see a hint of this pattern emerging. Nor did it in any other code I've surveyed in the almost 5 months the PR was open.
First off, thanks for engaging. Having taken a look at existing projects is a wise thing to do. There are a few counterpoints here:

* Proving a negative is inherently difficult.

* It's difficult to look at millions of lines of code, much of which is templated (and so can be subject to problems in the future), and make an assessment. (Clearly it's very impressive that you did.) That is not a guarantee and does not cover future uses, however.

* The fallacy of numbers comes to mind. You wouldn't want to introduce, for example, some subtle concurrency problem on account of not having seen it in existing projects.

* This has been discussed in C++ circles a number of times, and aliasing has always been a concern. If /C++/ deemed that too dangerous... <insert broadside>. A much more explicit solution has been implemented in https://www.boost.org/doc/libs/1_66_0/libs/utility/call_traits.htm.
 I sincerely congratulated Mathias and the other participants for 
 working on this. It's an important topic. Knowing that all those 
 involved are very good at what they do, and without having looked 
 closely, I was sure they got something really nice going that avoids 
 this absolutely blatant semantic grenade. And now I see this is 
 exactly it - we got a -preview of a grenade. How is this possible? How 
 can we sleep at night now?

 Again: take a step back and reconsider, why did this pass muster?
Maybe because the people that want to use this actually know this is nowhere near as big of an issue as some people try to make it look.
Sorry if it looks like I'm pushing an agenda here - I most certainly am not. I'm acting on the perception that this looks distinctly like one of the mistakes of the past that now we're wondering how they made it and how to undo them. At the very minimum a lot more scrutiny is necessary.
 Perhaps you don't know this, but the very first implementation, the one 
 I had when I opened the PR on April 3rd actually always used `ref`. It 
 had quite a few issues.  Kinke suggested an alternative, and that 
 alternative brought many benefits with it, for a very minor downside, 
 which can easily be mitigated: if your function's semantics really depend 
 on mutation through an alias propagating (or not) to an `in` parameter, 
 then you can use `__traits(isRef, paramname)` to check it.
 Later on, on July 31st, I brought the topic to the forum, which pressed 
 on another suggestion  Kinke has made, using the full function type 
 instead of just the parameter type. At the time, you also made mention 
 of this issue, and it was considered.
Thank you for the context.
 I wonder, did you try to adapt one of your projects to `in` ? Not merely 
 throwing the switch, but also adding a couple `in` here and there, where 
 it makes sense.
I have no doubt it will work in many instances. What I'm worried about is that when it doesn't, the user has very little indication and tooling to help with the most puzzling (and platform-dependent!) behavior.
 Note that this `-preview` is the only one that has ever been usable from 
 release day (perhaps `markdown` was as well, but it only affected 
 documentation generation). All the other previews would fail on your 
 project because druntime and Phobos were never adapted after the switch 
 was merged. `-preview=in` not only works with druntime/phobos, but many 
 libraries from Buildkite were also adapted to work with or without it 
 (https://github.com/dlang/dmd/pull/11632). I was expecting this would 
 increase user engagement, allow to gather feedback, use cases, and lead 
 to a healthy discussion, just like it allowed to refine the 
 implementation. But I wasn't expecting FUD to be spread over a 
 `-preview` before people even tried it.
Again, indeed sorry if it looks like I'm trying to spread FUD and thanks for engaging. All I'm doing is to try to be sensible.
Oct 03 2020
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 3 October 2020 at 13:05:43 UTC, Andrei Alexandrescu 
wrote:
 On 10/3/20 8:09 AM, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei 
 Alexandrescu wrote:
But I wasn't
 expecting FUD to be spread over a `-preview` before people 
 even tried it.
Again, indeed sorry if it looks like I'm trying to spread FUD and thanks for engaging. All I'm doing is to try to be sensible.
It looked like FUD to me as well. Thanks for clarifying that it "most certainly" was not.
Oct 03 2020
prev sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Saturday, 3 October 2020 at 13:05:43 UTC, Andrei Alexandrescu 
wrote:
 [...]

 * This has been discussed in C++ circles a number of times, and 
 aliasing has always been a concern. If /C++/ deemed that too 
 dangerous... <insert broadside>. A much more explicit solution 
 has been implemented in 
 https://www.boost.org/doc/libs/1_66_0/libs/utility/call_traits.htm.
I don't deny that aliasing can create issues that could be very hard to debug. But the problem of aliasing is not limited to `in`: code that uses `const ref` (or a `const T` where `T` has indirection) can already misbehave if it doesn't take into account the possibility of parameter aliasing. To put it differently: Why is `auto ref` acceptable but `in` is not ?
Oct 03 2020
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 3 October 2020 at 16:56:06 UTC, Mathias LANG wrote:
 To put it differently: Why is `auto ref` acceptable but `in` is 
  not ?
Aren't `auto ref` parameters more intuitive in their behaviour, though? That the parameter will be passed by ref if it's an lvalue, and by value if it's an rvalue (in which case the only way it can mutate under your feet is if it wraps by reference some other app state)? Unless I'm missing something, that's much more predictable and shouldn't suffer from implementation-dependent differences in the way the preview `in` design does.
Oct 03 2020
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Saturday, 3 October 2020 at 16:56:06 UTC, Mathias LANG wrote:
 To put it differently: Why is `auto ref` acceptable but `in` is 
 not ?
The issue with `in`, compared to `auto ref`, is that, because its behavior is implementation-defined, it invites programmers to write code that "works on their machine," but is not portable to other environments (including future versions of the same compiler). It's the same issue that C has with features like variable-sized integer types and implementation-defined signedness of `char`. Yes, *technically* it's your fault if you write C code that relies on an `int` being 32 bits, or a `char` being unsigned, just like it would *technically* be your fault if you wrote D code that relied on an `in` parameter being passed by reference. But making these things implementation-defined in the first place is setting the programmer up for failure.
Oct 03 2020
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/3/20 2:36 PM, Paul Backus wrote:
 On Saturday, 3 October 2020 at 16:56:06 UTC, Mathias LANG wrote:
 To put it differently: Why is `auto ref` acceptable but `in` is not ?
The issue with `in`, compared to `auto ref`, is that, because its behavior is implementation-defined, it invites programmers to write code that "works on their machine," but is not portable to other environments (including future versions of the same compiler). It's the same issue that C has with features like variable-sized integer types and implementation-defined signedness of `char`. Yes, *technically* it's your fault if you write C code that relies on an `int` being 32 bits, or a `char` being unsigned, just like it would *technically* be your fault if you wrote D code that relied on an `in` parameter being passed by reference. But making these things implementation-defined in the first place is setting the programmer up for failure.
That's a very good comparison, thank you.
Oct 03 2020
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/3/20 12:56 PM, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 13:05:43 UTC, Andrei Alexandrescu wrote:
 [...]

 * This has been discussed in C++ circles a number of times, and 
 aliasing has always been a concern. If /C++/ deemed that too 
 dangerous... <insert broadside>. A much more explicit solution has 
 been implemented in 
 https://www.boost.org/doc/libs/1_66_0/libs/utility/call_traits.htm.
I don't deny that aliasing can create issues that could be very hard to debug.
Cool. I hope we now agree there's evidence that such situations are not just hypothetical.
 But the problem of aliasing is not limited to `in`: code that uses 
 `const ref` (or a  `const T` where `T` has indirection) can already 
 misbehave if it doesn't take into account the possibility of parameter 
 aliasing.
 
 To put it differently: Why is `auto ref` acceptable but `in` is not ?
A good point. I can tell for myself. First, binding to reference vs. value is always reproducible regardless of platform particulars. Granted, maintenance is liable to introduce puzzlers but more of a first-order nature (change a call, get different results) as opposed to a long-distance issue (add a field to the object, suddenly unrelated code breaks - not to mention changes in compiler heuristics). Second, the keyword "ref" is in there, which is a clear giveaway the implementation code needs to expect that.
Oct 03 2020
parent reply Mathias LANG <geod24 gmail.com> writes:
On Saturday, 3 October 2020 at 18:36:57 UTC, Andrei Alexandrescu 
wrote:
 [...]

 A good point. I can tell for myself. First, binding to 
 reference vs. value is always reproducible the same regardless 
 of platform particulars. Granted, maintenance is liable to 
 introduce puzzlers but more of a first-order nature (change a 
 call, get different results) as opposed to a long-distance issue 
 (add a field to the object, suddenly unrelated code breaks - 
 not to mention changes in compiler heuristics). Second, the 
 keyword "ref" is in there, which is a clear giveaway the 
 implementation code needs to expect that.
This is missing the point: In an `auto ref` function, *the implementer of the function* not only cannot rely on the `ref`-ness of the parameter(s), but must plan for both, since it's almost a certainty that the function will be instantiated with both ref and non-ref. With `in`, only one state is possible at a time. For anything that isn't a POD, that `ref` state will be stable. So, while both `auto ref` and `in` need to plan for both `ref` states if they appear on a template parameter without constraint, `in` will always behave the same for non-POD types, while `auto ref` will not.

From the caller's point of view, it's also simpler with `in`. The same function will always be called, regardless of the lvalue-ness of the arguments provided. Hence, a slight change at the call site, or in the surrounding context, that could affect lvalue-ness, will not lead to a different function being called, as it would for `auto ref`.

A platform dependent change of lvalue-ness is trivial to craft using only `auto ref`:
```
void foo () (auto ref size_t value)
{
    pragma(msg, __traits(isRef, value));
}
auto ref size_t platform (uint* param) { return *param; }
extern(C) void main ()
{
    uint value;
    foo(platform(&value));
}
```

Tested on Linux64:
```
% dmd -betterC -run previn.d
false
% dmd -betterC -m32 -run previn.d
true
```
Oct 03 2020
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/3/20 5:36 PM, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 18:36:57 UTC, Andrei Alexandrescu wrote:
 [...]

 A good point. I can tell for myself. First, binding to reference vs. 
 value is always reproducible the same regardless of platform 
 particulars. Granted, maintenance is liable to introduce puzzlers but 
 more of a first-order nature (change a call, get different results) as 
 opposed to a long-distance issue (add a field to the object, suddenly 
 unrelated code breaks - not to mention changes in compiler 
 heuristics). Second, the keyword "ref" is in there, which is a clear 
 giveaway the implementation code needs to expect that.
This is missing the point: In an `auto ref` function, *the implementer of the function* not only cannot rely on the `ref`-ness of the parameter(s), but must plan for both, since it's almost a certainty that the function will be instantiated with both ref and non-ref. With `in`, only one state is possible at a time. For anything that isn't a POD, that `ref` state will be stable. So, while both `auto ref` and `in` need to plan for both `ref` states if they appear on a template parameter without constraint, `in` will always behave the same for non-POD types, while `auto ref` will not. From the caller's point of view, it's also simpler with `in`. The same function will always be called
Not across long distance changes and platform particulars. This is a very important detail.
Oct 03 2020
parent Mathias LANG <geod24 gmail.com> writes:
On Saturday, 3 October 2020 at 22:55:36 UTC, Andrei Alexandrescu 
wrote:
 On 10/3/20 5:36 PM, Mathias LANG wrote:
 [...]
 
  From the caller's point of view, it's also simpler with `in`. 
 The same function will always be called
Not across long distance changes and platform particulars. This is a very important detail.
This merely sidesteps the question without actually answering the point raised. Very little, if anything, can resist long distance changes and platform particulars. We don't ban `size_t` from code because adding two values might have a different result depending on the platform. We don't ban `extern(C)` because someone might use a name that happens to be a D symbol and break completely unrelated things. A templated function with an `auto ref` parameter can lead to a different function being called based on your architecture, as I demonstrated in my previous message.

Regarding long distance change, we would have to define what a degree is, and how many degrees is long distance. But unless "change" is bound to a very specific meaning made to overfit what happens with `in` and not `auto ref`, rest assured that both (and probably many other language features) will be affected just the same.

I did a bit of digging on `auto ref`, to supplement this conversation. There was one post from Jonathan M Davis that phrased it well:
 With auto ref, you're specifically saying that you don't care 
 whether the function is given an lvalue or rvalue. You just 
 want it to avoid unnecessary copies. That's very different. And 
 auto ref then not only then protects you from cases of passing 
 an rvalue to a function when it needs an lvalue, but it makes 
 it clear in the function signature which is expected.
https://forum.dlang.org/post/mailman.3031.1356562349.5162.digitalmars-d@puremagic.com

I found a few other discussions, including this: https://forum.dlang.org/thread/rehsmhmeexpusjwkfnoy@forum.dlang.org (page 6 was quite relevant), and of course https://github.com/dlang/dmd/pull/4717

Much to my surprise, the challenge of parameter aliasing was never brought up in any of those topics, because, as quoted before, "you don't care whether the function is given an lvalue or an rvalue", which conversely means "you don't care if your function receives an lvalue or an rvalue". For this to be possible without affecting the observable behavior of the function, one has to rule out mutation via aliasing.
Oct 05 2020
prev sibling next sibling parent Paul Backus <snarwin gmail.com> writes:
On Saturday, 3 October 2020 at 21:36:00 UTC, Mathias LANG wrote:
 A platform dependent change of lvalue-ness is trivial to craft 
 using only `auto ref`:
 ```
 void foo () (auto ref size_t value)
 {
     pragma(msg, __traits(isRef, value));
 }
 auto ref size_t platform (uint* param) { return *param; }
 extern(C) void main ()
 {
     uint value;
     foo(platform(&value));
 }
 ```

 Tested on Linux64:
 ```
 % dmd -betterC -run previn.d
 false
 % dmd -betterC -m32 -run previn.d
 true
 ```
This is a great illustration of the gotchas inherent in using variable-sized integer types like `size_t`.
Oct 03 2020
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 3 October 2020 at 21:36:00 UTC, Mathias LANG wrote:
 A platform dependent change of lvalue-ness is trivial to craft 
 using only `auto ref`:
 ```
 void foo () (auto ref size_t value)
 {
     pragma(msg, __traits(isRef, value));
 }
 auto ref size_t platform (uint* param) { return *param; }
 extern(C) void main ()
 {
     uint value;
     foo(platform(&value));
 }
 ```

 Tested on Linux64:
 ```
 % dmd -betterC -run previn.d
 false
 % dmd -betterC -m32 -run previn.d
 true
 ```
Platform-dependent, yes -- but the behavioural difference here can be 100% anticipated from the language rules (and hence, from the code), no? Isn't the issue with `-preview="in"` as currently implemented that _there is no way_ for the reader of the code to anticipate when ref will be used and when not?
Oct 04 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 09:15:36 UTC, Joseph Rushton 
Wakeling wrote:
 Platform-dependent, yes -- but the behavioural difference here 
 can be 100% anticipated from the language rules (and hence, 
 from the code), no?

 Isn't the issue with `-preview="in"` as currently implemented 
 that _there is no way_ for the reader of the code to anticipate 
 when ref will be used and when not?
As it is an optimization, I think that's best left to the interpretation of the compiler/compiler author, however optimizations should never break the semantic guarantee.

```
void fun1(const long[16] a, ref long[16] b, in long[16] c, out long[16] d);
fun1(var, var, var, var);

void fun2(in long[16] a, int b);
fun2(var, var[0]);
```

In the above, all parameters are passed in memory as per ABI requirements. The difference lies in whether `var` is passed by reference directly or via a temporary copy.

When it comes to `in` parameters, I'm of the opinion that a temporary should be passed if the type is trivially copyable. But as an optimization, if we look at the signature and determine that it's not possible for any parameters to be aliasing each other, then why not pass the `in` parameter as `ref`?
Oct 04 2020
next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 14:58:13 UTC, Iain Buclaw wrote:
 void fun2(in long[16] a, int b);
 fun2(var, var[0]);
 ```

 In the above, all parameters are passed in memory as per ABI 
 requirements.
(Except the `int`/`var[0]` parameter that I added in last minute ;-)
Oct 04 2020
prev sibling next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 4 October 2020 at 14:58:13 UTC, Iain Buclaw wrote:
 As it is an optimization, I think that's best left to the 
 interpretation of the compiler/compiler author, however 
 optimizations should never break the semantic guarantee.
What is the intended semantic guarantee in simple words that makes sense to an end user? From what you say I assume it is something along the lines of:

«Has const scope semantics. Values may be passed as references. Parameter passing will not trigger side effects on the actual or formal parameters.»

However, do you also have this constraint:

«The caller is responsible for ensuring that the actual parameter does not alias with other parameters or globals available to the called function.»

And if not, what do you have?

The language spec is way too convoluted. This is C++ level of convolutedness:

«The parameter is an input to the function. Input parameters behaves as if they have the const scope storage classes. Input parameters may be passed by reference by the compiler. Unlike ref parameters, in parameters can bind to both lvalues and rvalues (such as literals). Types that would trigger a side effect if passed by value (such as types with postblit, copy constructor, or destructor), and types which cannot be copied, e.g. if their copy constructor is marked as @disable, will always be passed by reference. Dynamic arrays, classes, associative arrays, function pointers, and delegates will always be passed by value, to allow for covariance. If the type of the parameter does not fall in one of those categories, whether or not it is passed by reference is implementation defined, and the backend is free to choose the method that will best fit the ABI of the platform.»

Stuff like this ought to be an implementor's note in an appendix.
Oct 04 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 16:10:42 UTC, Ola Fosheim Grøstad 
wrote:
 On Sunday, 4 October 2020 at 14:58:13 UTC, Iain Buclaw wrote:
 As it is an optimization, I think that's best left to the 
 interpretation of the compiler/compiler author, however 
 optimizations should never break the semantic guarantee.
What is the intended semantic guarantee in simple words that makes sense to an end user? From what you say I assume it is something along the lines of: «Has const scope semantics. Values may be passed as references. Parameter passing will not trigger side effects on the actual or formal parameters.» However, do you also have this constraint: «The caller is responsible for ensuring that the actual parameter does not alias with other parameters or globals available to the called function.» And if not, what do you have?
The spec only explicitly states that non-POD types must *always be passed by reference*. So the only semantic guarantee is that copy constructors are elided. Derived and user-defined data types (excluding structs and static arrays) are explicitly *always passed by value*. They might still be passed in memory by ABI, but in that case, it results in a copy-to-temp. As for everything else, unless there is a language requirement to pass by reference then the conservative approach would be to say "why bother?". I don't immediately see a reason to do copy elision for trivial types unless it is provable that there'd be no difference in observable program behaviour.
Oct 04 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 19:26:56 UTC, Iain Buclaw wrote:
 On Sunday, 4 October 2020 at 16:10:42 UTC, Ola Fosheim Grøstad 
 wrote:
 On Sunday, 4 October 2020 at 14:58:13 UTC, Iain Buclaw wrote:
 As it is an optimization, I think that's best left to the 
 interpretation of the compiler/compiler author, however 
 optimizations should never break the semantic guarantee.
What is the intended semantic guarantee in simple words that makes sense to an end user? From what you say I assume it is something along the lines of: «Has const scope semantics. Values may be passed as references. Parameter passing will not trigger side effects on the actual or formal parameters.» However, do you also have this constraint: «The caller is responsible for ensuring that the actual parameter does not alias with other parameters or globals available to the called function.» And if not, what do you have?
The spec only explicitly states that non-POD types must *always be passed by reference*. So the only semantic guarantee is that copy constructors are elided.
Of course, I'm forgetting the other side of `in` due to everyone only focusing on the `ref` part. :-) The answer is yes on your mention of const scope semantics. If I'd be pressed to bullet point it, I'd put down the following.

When a parameter is annotated with `in`:
- The parameter is not modifiable.
- All memory reachable from the parameter can not be clobbered (overwritten).
- The parameter does not escape.
- Copy constructors are elided by passing by-ref.
- Other forms of copy elision may occur if doing so does not change program behaviour.

Is that simple enough for an end-user?
Oct 04 2020
parent reply kinke <noone nowhere.com> writes:
On Sunday, 4 October 2020 at 20:06:49 UTC, Iain Buclaw wrote:
 If I'd be pressed to bullet point it, I'd put down the 
 following.

 When a parameter is annotated with `in`:
 - The parameter is not modifiable.
 - All memory reachable from the parameter can not be clobbered 
 (overwritten).
 - The parameter does not escape.
 - Copy constructors are elided by passing by-ref.
 - Other forms of copy elision may occur if doing so does not 
 change program behaviour.

 Is that simple enough for an end-user?
I think that's a pretty good summary, although I'd loosen point 2 to the object itself, not all memory reachable from it.

The main problem people seem to be focusing on in this thread is point 2, the aliasing issue. Firstly, I think people need to realize `-preview=in` is absolutely not about being compatible with previous semantics, but about entirely new semantics for `in`. Secondly, as the potential aliasing problem regards the callers, not the callee, a simple rule of thumb should be something along the lines of: "If you wouldn't pass the lvalue arg to a `const scope ref` parameter due to aliasing concerns, then don't pass it to `in` either without an explicit copy."
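To make that rule of thumb concrete, here is a minimal sketch loosely modelled on the snippet that opened the thread. The names `Big` and `readAfterWrite` are made up for illustration and are not part of any library. The point is only that an explicit copy at the call site removes the aliasing, so the result should no longer depend on whether the compiler passes the `in` parameter by value or by reference.

```d
import std.stdio : writeln;

// Hypothetical type, large enough that an implementation might pass it by ref.
struct Big
{
    int[100] data;
}

// Hypothetical callee: writes through the ref parameter, then reads the `in` one.
int readAfterWrite(in Big constdata, ref Big normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

void main()
{
    Big value;

    // Explicit copy: `snapshot` and `value` are distinct objects, so the
    // write through `normaldata` cannot be observed through `constdata`.
    Big snapshot = value;
    writeln(readAfterWrite(snapshot, value)); // prints 0
}
```

Calling `readAfterWrite(value, value)` instead is exactly the case where the outcome may differ between compilers, targets and flags.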
Oct 04 2020
next sibling parent kinke <noone nowhere.com> writes:
As for the callee, the only significant change OTOH (except for 
non-PODs not getting copy-constructed/postblitted and destructed) 
would be that it cannot make any assumptions about the address of 
an `in` param (but there's `__traits(isRef)` in the presumably 
extremely rare cases one needs to know whether it was passed by 
ref or value):

void foo(T)(in T a, in T b)
{
     assert(&a != &b);
}

void test(T)()
{
     T x;
     foo(x, x);
}

void main()
{
     test!int(); // succeeds - the tiny int is passed as 2 distinct values
     test!(int[64])(); // fails - the same large array is passed by ref
}

---

Once a program conforms to these new semantics, especially wrt. 
aliasing as mentioned earlier, the compiler- and 
target-specific implementation details and differences don't 
matter.
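
For the rare cases where the callee does need to know, a small sketch of the `__traits(isRef, ...)` check mentioned earlier in the thread might look like the following. `Small`, `Large` and `passedByRef` are hypothetical names used only for this example, and what gets printed is deliberately not stated here, since it depends on the compiler, the target and whether `-preview=in` is enabled.

```d
import std.stdio : writeln;

// Hypothetical types of very different sizes.
struct Small { int[1] data; }
struct Large { int[100] data; }

// Reports how this particular instantiation received its `in` parameter.
// `__traits(isRef, param)` is evaluated at compile time for each T.
bool passedByRef(T)(in T param)
{
    return __traits(isRef, param);
}

void main()
{
    Small s;
    Large l;
    // The results are implementation-specific, so no output is assumed here.
    writeln("Small passed by ref: ", passedByRef(s));
    writeln("Large passed by ref: ", passedByRef(l));
}
```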
Oct 04 2020
prev sibling next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 4 October 2020 at 20:42:17 UTC, kinke wrote:
 On Sunday, 4 October 2020 at 20:06:49 UTC, Iain Buclaw wrote:
 `in`. Secondly, as the potential aliasing problem regards the 
 callers, not the callee, a simple rule of thumb should be 
 something along the lines of: "If you wouldn't pass the lvalue 
 arg to a `const scope ref` parameter due to aliasing concerns, 
 then don't pass it to `in` either without an explicit
You need to know the implementation as parameters can alias with globals accessed by the function. So not caller only. The docs should list everything you need to be mindful of in order to get consistent behaviour from any compliant compiler.
Oct 04 2020
next sibling parent reply kinke <noone nowhere.com> writes:
On Sunday, 4 October 2020 at 23:06:58 UTC, Ola Fosheim Grøstad 
wrote:
 You need to know the implementation as parameters can alias 
 with globals accessed by the function. So not caller only.
As-is with all refs today:

struct S { int a; }

int* p;

int foo(const scope ref S s)
{
    *p = 123;
    return s.a;
}

void main()
{
    S s;
    p = &s.a;
    assert(foo(s) == 0); // oops
}
Oct 04 2020
parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 4 October 2020 at 23:17:03 UTC, kinke wrote:
 On Sunday, 4 October 2020 at 23:06:58 UTC, Ola Fosheim Grøstad 
 wrote:
 You need to know the implementation as parameters can alias 
 with globals accessed by the function. So not caller only.
As-is with all refs today:
The context for that statement was that value/ref is implementation dependent. Steven also gave another example related to type testing. All such anomalies should be listed.
Oct 04 2020
parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 4 October 2020 at 23:31:03 UTC, Ola Fosheim Grøstad 
wrote:
 The context for that statement was that value/ref is 
 implementation dependent.
It might help some if compilers would run unit tests 3 times with different 'in' implementations.

1 mixed value/ref
2 value
3 ref
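
Until a compiler offers such modes, something similar can be approximated by hand in user code. The sketch below is only an assumption about how one might do it, not an existing compiler feature: it uses an `auto ref` template so that the same body runs once with the argument bound by reference (lvalue) and once by value (rvalue), and flags a difference. `S`, `readAfterWrite`, `copyOf` and `aliasingSensitive` are hypothetical names.

```d
import std.stdio : writeln;

struct S { int[4] data; }

// Same body, instantiated by-ref for lvalues and by-value for rvalues.
int readAfterWrite()(auto ref const S constdata, ref S normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

// Returns its argument by value, which yields an rvalue at the call site.
S copyOf(S s) { return s; }

// True if the by-ref and by-value runs disagree, i.e. the call pattern
// is sensitive to how the first argument is passed.
bool aliasingSensitive()
{
    S a;
    immutable viaRef = readAfterWrite(a, a);           // aliased, by ref
    S b;
    immutable viaValue = readAfterWrite(copyOf(b), b); // independent copy
    return viaRef != viaValue;
}

void main()
{
    writeln(aliasingSensitive()); // prints true
}
```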
Oct 04 2020
parent reply kinke <noone nowhere.com> writes:
On Sunday, 4 October 2020 at 23:40:29 UTC, Ola Fosheim Grøstad 
wrote:
 It might help some if compilers would run unit tests 3 times 
 with different 'in' implementations.

 1 mixed value/ref
 2 value
 3 ref
I was about to propose something like this (restricted to POD types), possibly augmented by some indeterministic fuzzing. Code ported to the new `in` semantics could then even show some previously unintended aliasing issues; getting to the root of the problem would probably still be non-trivial though. A compiler mode enforcing by-ref, coupled with some sort of runtime sanitizer detecting invalid writes to live in-params, would probably be a very valuable tool for validation & troubleshooting.
Oct 04 2020
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Monday, 5 October 2020 at 00:31:33 UTC, kinke wrote:
 On Sunday, 4 October 2020 at 23:40:29 UTC, Ola Fosheim Grøstad 
 wrote:
 It might help some if compilers would run unit tests 3 times 
 with different 'in' implementations.

 1 mixed value/ref
 2 value
 3 ref
I was about to propose something like this (restricted to POD types), possibly augmented by some indeterministic fuzzing. Code ported to the new `in` semantics could then even show some previously unintended aliasing issues; getting to the root of the problem would probably still be non-trivial though. A compiler mode enforcing by-ref, coupled with some sort of runtime sanitizer detecting invalid writes to live in-params, would probably be a very valuable tool for validation & troubleshooting.
You could do this: make it pass by value as the default, then add compiler switches that have increasingly aggressive optimizations up to assuming no aliasing. That ought to be innocent enough.
Oct 05 2020
prev sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
http://archive.adaic.com/standards/83rat/html/ratl-08-02.html

Very entertaining read as they choose completely different 
semantics.
Oct 05 2020
prev sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 20:42:17 UTC, kinke wrote:
 On Sunday, 4 October 2020 at 20:06:49 UTC, Iain Buclaw wrote:
 If I'd be pressed to bullet point it, I'd put down the 
 following.

 When a parameter is annotated with `in`:
 - The parameter is not modifiable.
 - All memory reachable from the parameter can not be clobbered 
 (overwritten).
 - The parameter does not escape.
 - Copy constructors are elided by passing by-ref.
 - Other forms of copy elision may occur if doing so does not 
 change program behaviour.

 Is that simple enough for an end-user?
I think that's a pretty good summary, although I'd loosen point 2 to the object itself, not all memory reachable from it.
2. Assumes memory reachable from the parameter will not be clobbered (overwritten). Basically, this:

s.a = 0xacce55ed;
fun(s);
assert(s.a == 0xacce55ed);

The caller can and will optimize based on this assumption never being false.
Oct 04 2020
prev sibling parent reply Johan <j j.nl> writes:
On Sunday, 4 October 2020 at 14:58:13 UTC, Iain Buclaw wrote:
 On Sunday, 4 October 2020 at 09:15:36 UTC, Joseph Rushton 
 Wakeling wrote:
 Platform-dependent, yes -- but the behavioural difference here 
 can be 100% anticipated from the language rules (and hence, 
 from the code), no?

 Isn't the issue with `-preview="in"` as currently implemented 
 that _there is no way_ for the reader of the code to 
 anticipate when ref will be used and when not?
As it is an optimization, I think that's best left to the interpretation of the compiler/compiler author, however optimizations should never break the semantic guarantee.
I agree.

Throughout this thread, I notice the confounding of two different concepts which are both referred to as "passing by reference". Please separate _very clearly_ (A) the concept of passing by reference as per D language semantics, from (B) the concept of passing by reference on machine instruction level.

It would be horrible if A becomes platform/compiler dependent for `in` parameters. But B is an implementation detail that is at the discretion of the compiler implementer about which the D language user has no say (as Iain already mentioned, there is no D ABI spec and DMD, LDC, and GDC all make different ABI choices).

My worry from reading this thread is that `-preview=in` is making A platform/compiler dependent, to give users the feel that they get some control over B. I very much hope I am wrong.

-Johan
Oct 04 2020
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Sunday, 4 October 2020 at 18:02:12 UTC, Johan wrote:
 My worry from reading this thread is that `-preview=in` is 
 making A platform/compiler dependent, to give users the feel 
 that they get some control over B. I very much hope I am wrong.
What is missing is the details of what the user must and must not do in order to ensure that all compliant compilers (all possible compilers that adhere to the spec) produce executables that generate the same output. That shouldn't be guesswork, but explicit.
Oct 04 2020
prev sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 18:02:12 UTC, Johan wrote:
 My worry from reading this thread is that `-preview=in` is 
 making A platform/compiler dependent, to give users the feel 
 that they get some control over B. I very much hope I am wrong.
If any bugs get raised against LDC saying as much, then it is well within your right to close them as wontfix, then raise a bug against DMD for a wrong-code issue.
Oct 04 2020
prev sibling next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Saturday, 3 October 2020 at 12:09:34 UTC, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei 
 Alexandrescu wrote:
 [...]
I agree. What I don't agree with is that this aliasing is a "slight and subtle misuse". I went through the 64 projects and their dependencies that are on Buildkite, and didn't see a hint of this pattern emerging. Nor did it in any other code I've surveyed in the almost 5 months the PR was open. [...]
You have obviously done your homework. I just want to say one thing to the audience: Isn't this exactly why -preview exists? So we can try out features and report any problems? I think the process is working as intended, and we should all be thankful for all attempts to improve D. Thanks
Oct 03 2020
prev sibling next sibling parent reply kinke <noone nowhere.com> writes:
On Saturday, 3 October 2020 at 12:09:34 UTC, Mathias LANG wrote:
 Perhaps you don't know this, but the very first implementation, 
 the one I had when I opened the PR on April 3rd actually always 
 used `ref`. It had quite a few issues.  Kinke suggested an 
 alternative, and that alternative brought many benefits with 
 it, for a very minor downside, which can easily be mitigated: 
 if your function's semantic really depend on mutation through 
 an alias propagating (or not) to an `in` parameter, then you 
 can use `__traits(isRef, paramname)` to check it.
The idea is that `in` is explicit, and users of a function with `in` params should think of such params as `const T& __restrict__` in C++ terms, which should clarify the potential aliasing problem. Whether the thing is optimized to pass-by-value then shouldn't make any observable difference for the caller.
 Later on, on July 31st, I brought the topic to the forum, which 
 pressed on another suggestion  Kinke has made, using the full 
 function type instead of just the parameter type.
I don't recall that, are you sure it wasn't someone else? I'd rather have it based on the parameter type alone.
Oct 03 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/3/2020 7:08 AM, kinke wrote:
 The idea is that `in` is explicit, and users of a function with `in` params 
 should think of such params as `const T& __restrict__` in C++ terms, which 
 should clarify the potential aliasing problem. Whether the thing is optimized to 
 pass-by-value then shouldn't make any observable difference for the caller.
__restrict__ is a C feature only and never made it into the C++ Standard. I implemented it as a no-op in Digital Mars C. The reason is because it relaxes the rules on what optimizations can be performed in ways that can subtly break code. Very, very few C programmers understand exactly what is going on with it, and sensibly avoid it.

If such a user uses __restrict__ with a compiler that ignores it, then uses mine that enforces it and breaks his code, what happens is that *I* get the blame. I can quote the Standard to him all day, but he'll inevitably say "it works with Microsoft C, it doesn't work with yours, you are wrong." I know this because I've had these conversations with customers that relied on bugs in Microsoft C. It's hopeless, so I implement Microsoft C's bugs.

The problem with __restrict__ is that the C compiler is fundamentally incapable of detecting buggy uses of it at compile time. Whether those bugs exhibit at runtime or not is implementation-defined. It's not a surprise that the C++ Standard did not adopt it.

`in` is a nice, friendly looking construct. It looks like pass-by-value, which we're all familiar with, and in fact we've trained users to regard it as pass-by-value because that's the existing behavior for 20 years. Changing its behavior so it *may* introduce memory corruption in very un-obvious and un-checkable ways is a very serious problem.
Oct 04 2020
next sibling parent reply kinke <noone nowhere.com> writes:
On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
 The problem with __restrict__ is that the C compiler is 
 fundamentally incapable of detecting buggy uses of it at 
 compile time. Whether those bugs exhibit at runtime or not is 
 implementation-defined. It's not a surprise that the C++ 
 Standard did not adopt it.
And yet all major C++ compilers support it one way or another (AFAIK, just for more aggressive optimizations though, no need to check for overlaps).

Anyway, I probably shouldn't have brought up __restrict__ - simply because every D dev should already know about aliasing pitfalls wrt. regular refs:

int foo(const ref int a, ref int b)
{
    b = 123;
    return a;
}

void main()
{
    int a = 0;
    assert(foo(a, a) == 0); // oops, aliasing
}
 `in` is a nice, friendly looking construct. It looks like 
 pass-by-value, which we're all familiar with, and in fact we've 
 trained users to regard it as pass-by-value because that's the 
 existing behavior for 20 years. Changing its behavior so it 
 *may* introduce memory corruption in very un-obvious and 
 un-checkable ways is a very serious problem.
I've been unhappy about the wasted potential for `in` ever since working on the ABI details of LDC, and new `-preview=in` is something I've wanted for years. Again, the aliasing issue should be quite easily avoided by imagining `in` params as `const scope ref` at the call sites. For those who find that too complicated, well, just don't opt into the preview feature. We'll see how the adoption goes.
Oct 04 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 3:27 PM, kinke wrote:
 And yet all major C++ compilers support it one way or another (AFAIK, just for 
 more aggressive optimizations though, no need to check for overlaps).
Ask a random C++ programmer what __restrict__ means. At least it has an ugly syntax, which people who don't know what it means will avoid. `in` is a gingerbread house.
 Anyway, I probably shouldn't have brought up __restrict__ - simply because every 
 D dev should already know about aliasing pitfalls wrt. regular refs:
 
 int foo(const ref int a, ref int b)
 {
      b = 123;
      return a;
 }
 
 void main()
 {
      int a = 0;
      assert(foo(a, a) == 0); // oops, aliasing
 }
That's right, but they *do not* know about `in` introducing such, and there's no way to reliably detect it.
 `in` is a nice, friendly looking construct. It looks like pass-by-value, which 
 we're all familiar with, and in fact we've trained users to regard it as 
 pass-by-value because that's the existing behavior for 20 years. Changing its 
 behavior so it *may* introduce memory corruption in very un-obvious and 
 un-checkable ways is a very serious problem.
I've been unhappy about the wasted potential for `in` ever since working on the ABI details of LDC, and new `-preview=in` is something I've wanted for years. Again, the aliasing issue should be quite easily avoided by imagining `in` params as `const scope ref` at the call sites.
I'd be fine calling it `const scope ref`. At least it says what it is doing, and its behavior is not implementation defined.
 For those who find that too 
 complicated, well, just don't opt into the preview feature. We'll see how the 
 adoption goes.
Few if any users will notice a problem caused by it. But they're going to hate us when they do have a problem in the field with it. The thing is, I've been working for years on making D as memory safe as we can. This feature is a big step backwards.
Oct 04 2020
parent reply kinke <noone nowhere.com> writes:
On Monday, 5 October 2020 at 01:59:58 UTC, Walter Bright wrote:
 On 10/4/2020 3:27 PM, kinke wrote:
 Again, the aliasing issue should be quite easily avoided by 
 imagining `in` params as `const scope ref` at the call sites.
I'd be fine calling it `const scope ref`. At least it says what it is doing, and its behavior is not implementation defined.
But that's not the whole picture. It's about optimizing small-POD cases such as `in float4` - no, I don't want (extremely verbose) `const scope ref float4` to dump the argument to stack, pass a pointer to it in a GP register, and then have the callee load the 4 floats from the address in that GP register into an XMM register again - I want to pass the argument directly in an XMM register. Similar thing for an int - why waste a GP register for an address to something that fits into that register directly? With `-preview=in`, this works beautifully for generic templated code too - non-PODs and large PODs are passed by ref, small PODs by value, something C++ can only dream of.
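For illustration, a small probe can show which convention a given compiler picks for a given type. This is only a sketch: `passedByRef`, `Small` and `Large` are made-up names, and no output is stated on purpose, because without `-preview=in` an `in` parameter is an ordinary by-value `const` parameter, and with the switch the choice is left to the implementation.

```d
import std.stdio : writeln;

struct Small { int x; }          // 4 bytes
struct Large { int[100] data; }  // 400 bytes

// Hypothetical helper: reports whether the `in` parameter of this
// instantiation is passed by reference.
bool passedByRef(T)(in T value)
{
    return __traits(isRef, value);
}

unittest
{
    Small s;
    Large l;
    // The printed values depend on the compiler, the target and the
    // presence of -preview=in, so they are not listed here.
    writeln("Small passed by ref: ", passedByRef(s));
    writeln("Large passed by ref: ", passedByRef(l));
}
```

Building with `-unittest -main`, once with and once without `-preview=in`, should make any difference visible.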
 The thing is, I've been working for years on making D as memory 
 safe as we can. This feature is a big step backwards.
But you're solely focusing on static analysis. Static analysis is great but quite obviously limited. Runtime sanitizers are a necessary supplement, and easy to integrate with LLVM for example - adding support for existing language-independent address and thread sanitizers for LDC was pretty simple, and excluding tests, probably amounted to a few hundred lines, mostly wrt. copying and correctly linking against prebuilt libs. Wrt. aliasing, I think it's more of a general problem, and I guess a `const ref` param being mutated while executing that function is almost always an unintended bug. -preview=in would make that definitely a bug for each `in` param.
Oct 04 2020
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/4/20 10:34 PM, kinke wrote:
 But you're solely focusing on static analysis. Static analysis is great 
 but quite obviously limited. Runtime sanitizers are a necessary supplement
It doesn't sit right to design language features that assume sanitizers.
Oct 04 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 7:34 PM, kinke wrote:
 But that's not the whole picture. It's about optimizing small-POD cases such as 
 `in float4` - no, I don't want (extremely verbose) `const scope ref float4` to 
 dump the argument to stack, pass a pointer to it in a GP register, and then have 
 the callee load the 4 floats from the address in that GP register into an XMM 
 register again - I want to pass the argument directly in an XMM register. 
 Similar thing for an int - why waste a GP register for an address to something 
 that fits into that register directly?
 With `-preview=in`, this works beautifully for generic templated code too - 
 non-PODs and large PODs are passed by ref, small PODs by value, something C++ 
 can only dream of.
I do understand what it is for.
 The thing is, I've been working for years on making D as memory safe as we 
 can. This feature is a big step backwards.
But you're solely focusing on static analysis. Static analysis is great but quite obviously limited. Runtime sanitizers are a necessary supplement, and easy to integrate with LLVM for example - adding support for existing language-independent address and thread sanitizers for LDC was pretty simple, and excluding tests, probably amounted to a few hundred lines, mostly wrt. copying and correctly linking against prebuilt libs.
Static analysis, especially when it is part of the language (not an add-on) is vastly superior to runtime checking. (Runtime checking, such as array bounds overflow checks, can never prove an overflow is not possible. Static analysis can.) So yes, I very much am focused on static analysis.
 Wrt. aliasing, I think it's more of a general problem, and I guess a `const ref` 
 param being mutated while executing that function is almost always an unintended 
 bug. -preview=in would make that definitely a bug for each `in` param.
See my upcoming talk at #dconf2020 !
Oct 05 2020
prev sibling next sibling parent reply kinke <noone nowhere.com> writes:
On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
 `in` is a nice, friendly looking construct. It looks like 
 pass-by-value
Only because it used to be by-value as you've pointed out. :) - From a tabula rasa syntax standpoint, possibly-by-ref `in` IMO fits nicely as sort-of counterpart to by-ref `out`. Changing it to `in ref` would IMO be too verbose, still slightly confusing (not always a ref - but that'd be in line with `auto ref`... :]) and `in` alone (as `const scope`) wouldn't make much sense anymore, except for backwards-compatibility.
Oct 04 2020
parent foobar <foo bar.com> writes:
On Sunday, 4 October 2020 at 22:53:49 UTC, kinke wrote:
 On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
 `in` is a nice, friendly looking construct. It looks like 
 pass-by-value
Only because it used to be by-value as you've pointed out. :)
ref gives it away.
Oct 04 2020
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
 `in` is a nice, friendly looking construct. It looks like 
 pass-by-value, which we're all familiar with
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/in-parameter-modifier
Oct 04 2020
prev sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Sunday, 4 October 2020 at 21:45:03 UTC, Walter Bright wrote:
 On 10/3/2020 7:08 AM, kinke wrote:
 The idea is that `in` is explicit, and users of a function 
 with `in` params should think of such params as `const T& 
 __restrict__` in C++ terms, which should clarify the potential 
 aliasing problem. Whether the thing is optimized to 
 pass-by-value then shouldn't make any observable difference 
 for the caller.
__restrict__ is a C feature only and never made it into the C++ Standard. I implemented it as a no-op in Digital Mars C. The reason is because it relaxes the rules on what optimizations can be performed in ways that can subtly break code. Very, very few C programmers understand exactly what is going on with it, and sensibly avoid it.
I don't think __restrict__ is a good way to reason with expected behaviour. The spec only makes three things clear: No clobber; No escape; Copy elision if possible. In is not ref, and is not restrict. In is in - a value goes in and doesn't come out.
Oct 04 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 11:37 PM, Iain Buclaw wrote:
 I don't think __restrict__ is a good way to reason with expected behaviour. The 
 spec only makes three things clear: No clobber; No escape; Copy elision if 
 possible.
 
 In is not ref, and is not restrict.  In is in - a value goes in and doesn't come 
 out.
https://dlang.org/spec/function.html#parameters says: "in The parameter is an input to the function. Input parameters behaves as if they have the const scope storage classes. Input parameters may be passed by reference by the compiler. Unlike ref parameters, in parameters can bind to both lvalues and rvalues (such as literals). Types that would trigger a side effect if passed by value (such as types with postblit, copy constructor, or destructor), and types which cannot be copied, e.g. if their copy constructor is marked as disable, will always be passed by reference. Dynamic arrays, classes, associative arrays, function pointers, and delegates will always be passed by value, to allow for covariance. If the type of the parameter does not fall in one of those categories, whether or not it is passed by reference is implementation defined, and the backend is free to choose the method that will best fit the ABI of the platform." The salient points are "may be passed by reference", and "whether or not it is passed by reference is implementation defined". The trouble with passing by reference is when there are other live mutable references to the same memory object. Whether mutating through those references mutates the in argument is "implementation defined". That's the problem with `in`. It's not an issue of a shortcoming in DMD.
Oct 05 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 09:49:21 UTC, Walter Bright wrote:
 On 10/4/2020 11:37 PM, Iain Buclaw wrote:
 I don't think __restrict__ is a good way to reason with 
 expected behaviour.  The spec only makes three things clear: 
 No clobber; No escape; Copy elision if possible.
 
 In is not ref, and is not restrict.  In is in - a value goes 
 in and doesn't come out.
https://dlang.org/spec/function.html#parameters says: "in The parameter is an input to the function. Input parameters behaves as if they have the const scope storage classes. Input parameters may be passed by reference by the compiler. Unlike ref parameters, in parameters can bind to both lvalues and rvalues (such as literals). Types that would trigger a side effect if passed by value (such as types with postblit, copy constructor, or destructor), and types which cannot be copied, e.g. if their copy constructor is marked as disable, will always be passed by reference. Dynamic arrays, classes, associative arrays, function pointers, and delegates will always be passed by value, to allow for covariance. If the type of the parameter does not fall in one of those categories, whether or not it is passed by reference is implementation defined, and the backend is free to choose the method that will best fit the ABI of the platform." The salient points are "may be passed by reference", and "whether or not it is passed by reference is implementation defined". The trouble with passing by reference is when there are other live mutable references to the same memory object. Whether mutating through those references mutates the in argument is "implementation defined". That's the problem with `in`. It's not an issue of a shortcoming in DMD.
I think we can all agree that the wording needs to be improved. Regarding the last sentence (If the type does not fall in one of those categories...), if there's no explicit saying so, I take that to mean do nothing unless you can guarantee that there'd be no change in program behaviour. Which may as well be as good as being equal to do nothing. There's no advantage to passing the remaining types not explicitly named in the spec using ref semantics anyway. "In" parameters could be forced in memory, but again with a dubious benefit of doing so.
Oct 05 2020
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 4:52 AM, Iain Buclaw wrote:
 I think we can all agree that the wording needs to be improved.
 
 Regarding the last sentence (If the type does not fall in one of those 
 categories...), if there's no explicit saying so, I take that to mean do nothing 
 unless you can guarantee that there'd be no change in program behaviour. Which 
 may as well be as good as being equal to do nothing.
 
 There's no advantage to passing the remaining types not explicitly named in the 
 spec using ref semantics anyway.  "In" parameters could be forced in memory, but 
 again with a dubious benefit of doing so.
POD types that "wrap" a basic type need to work in the ABI like the basic type. An obvious example is using: struct Array { size_t length; void* ptr; } to match dynamic arrays. Another is: T, struct S { T t; }, T[1] should all pass the same way.
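A rough sketch of the layout side of this point follows. The names `Wrapped` and `asRaw` are invented for the example, and the checks only cover size and field order as documented in the D ABI for dynamic arrays (length first, then pointer); they say nothing about which registers a particular backend uses when passing these types.

```d
// Mirror of a slice, with the field order given in the post above.
struct Array { size_t length; void* ptr; }

// Hypothetical wrapper around a basic type.
struct Wrapped { int t; }

// Size checks only; they do not prove identical parameter passing.
static assert(Array.sizeof == (int[]).sizeof);
static assert(Wrapped.sizeof == int.sizeof);
static assert((int[1]).sizeof == int.sizeof);

// Reinterprets a slice as the mirror struct, relying on the documented
// layout of dynamic arrays.
Array asRaw(int[] slice)
{
    return *cast(Array*) &slice;
}

unittest
{
    int[] a = [1, 2, 3];
    auto raw = asRaw(a);
    assert(raw.length == 3);
    assert(raw.ptr == cast(void*) a.ptr);
}
```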
Oct 05 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/3/20 8:09 AM, Mathias LANG wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu wrote:
 [...]

 Wait a SECOND! Are we really in the market of developing and deploying 
 language features that come unglued at the slightest and subtlest 
 misuse? We most certainly shouldn't.
I agree. What I don't agree with is that this aliasing is a "slight and subtle misuse". I went through the 64 projects and their dependencies that are on Buildkite, and didn't see a hint of this pattern emerging. Nor did it in any other code I've surveyed in the almost 5 months the PR was open.
First, I want to say, I think `in` being adjusted is a good initiative. And I think there are some ways to get back to sanity here.

But the problem *IS* that it is *very uncommon* that you will run into issues. An issue that is common is a problem in its own right. An issue that is subtle and very uncommon (so uncommon that you haven't found a case of it yet) is a different kind of problem. I don't believe that less frequent means we have a better situation, I think it's worse.
 Also because some of the criticism is based on the rather loose 
 definition of the promotion to value at the moment. That definition is 
 intentionally loose, not because it needs to be, but because the feature 
 is in `-preview` and the rules need to take into account all platforms 
 that D supports, something that cannot be done in DMD alone.
If we can make this solidified, then I think we can be OK with it. One option is to always pass by reference. Another option is to tag an `in` value as "I'm OK if this is passed by reference". like `in ref` or something (that still binds to rvalues). Then an optimization can say "actually, I'm going to pass this by value", and nobody cares. This is different in that you have to *declare* you're ok with a reference, not the other way around. What I want is for the system to behave in a reliable way, so I can write code that works regardless of optimizations.
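As a point of comparison, here is one way a callee could get the same result under either passing convention today: take an explicit snapshot before touching anything that might alias. This is a sketch built on the `S` and `foo` from the first post, changed to return the value instead of printing it. It pays for the very copy the feature is meant to avoid, so it illustrates the trade-off rather than recommending a fix.

```d
struct S(size_t elems)
{
    int[elems] data;
}

// Variant of the original `foo` that returns what it read.
int foo(T)(in T constdata, ref T normaldata)
{
    // Explicit copy: from here on the function no longer cares
    // whether `constdata` arrived by value or by reference.
    const T snapshot = constdata;
    normaldata.data[0] = 1;
    return snapshot.data[0];
}

unittest
{
    S!1 smallval;
    assert(foo(smallval, smallval) == 0);

    S!100 largeval;
    assert(foo(largeval, largeval) == 0);
    assert(largeval.data[0] == 1); // the write itself still happened
}
```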
 Perhaps you don't know this, but the very first implementation, the one 
 I had when I opened the PR on April 3rd actually always used `ref`. It 
 had quite a few issues.
What were those issues?
  Kinke suggested an alternative, and that 
 alternative brought many benefits with it, for a very minor downside, 
 which can easily be mitigated: if your function's semantics really depend 
 on mutation through an alias propagating (or not) to an `in` parameter, 
 then you can use `__traits(isRef, paramname)` to check it.
But this isn't what happens. What happens is, I write code that assumes some `in` parameter won't change, and it actually doesn't. It works great, because the backend decides to pass by value.

Then one day, the backend decides, "you know what, I found that passing this size data by reference is more efficient". Now, because I didn't *proactively* put in some __traits(isRef, paramname) check, my code breaks. And only in one call on one user's code base. And the result is memory corruption or some other subtle off-by-one thing that is difficult to trace. Maybe it results in a thread deadlock. I know because I've run into stuff like this, and it takes weeks to debug.

If all `in` parameters have to be checked with static assert(!__traits(isRef, paramname)) and I have to review all functions with `in` parameters that aren't ref with this relationship in mind (is it possible for the data to change while I'm using the in parameter), then `in` soon becomes a feature that is rejected at code review (it's too complex to be worth it).
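To make that cost concrete, a guarded function might look like the sketch below. `Point` and `sumAfterUpdate` are invented for the example. The `static assert` turns a silent change of passing convention into a compile error, so whenever the function compiles, `snapshot` is a private copy and the result does not depend on aliasing.

```d
struct Point { int x, y; }

// Hypothetical function that relies on `snapshot` being a copy.
int sumAfterUpdate(in Point snapshot, ref Point live)
{
    // Refuse to compile if this parameter is ever passed by reference.
    static assert(!__traits(isRef, snapshot),
        "sumAfterUpdate assumes `snapshot` is passed by value");

    live.x += 10;
    return snapshot.x + snapshot.y; // reads the pre-update values
}

unittest
{
    auto p = Point(1, 2);
    assert(sumAfterUpdate(p, p) == 3); // aliasing is harmless here
    assert(p.x == 11);
}
```

Every function with an `in` parameter that makes the same assumption would need the same line, which is the review burden described above.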
 Note that this `-preview` is the only one that has ever been usable from 
 release day (perhaps `markdown` was as well, but it only affected 
 documentation generation). All the other previews would fail on your 
 project because druntime and Phobos were never adapted after the switch 
 was merged. `-preview=in` not only works with druntime/phobos, but many 
 libraries from Buildkite were also adapted to work with or without it 
 (https://github.com/dlang/dmd/pull/11632). I was expecting this would 
 increase user engagement, allow to gather feedback, use cases, and lead 
 to a healthy discussion, just like it allowed to refine the 
 implementation. But I wasn't expecting FUD to be spread over a 
 `-preview` before people even tried it.
This is a bit ridiculous. If I was trying to sell you a gun, which in 0.001% of cases explodes when you pull the trigger, and I say "Yeah that's true, but that almost never happens. You aren't even trying it!!!" does that make you feel better? I don't want to try it, have it work, and then later on down the road explode in my face. The term FUD has connotations that the problems we are talking about aren't real. They rely on the person's ignorance of the actual issues, or on things that aren't provable. This is not that. -Steve
Oct 03 2020
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Saturday, 3 October 2020 at 14:10:34 UTC, Steven Schveighoffer 
wrote:
 I don't want to try it, have it work, and then later on down 
 the road explode in my face.

 The term FUD has connotations that the problems we are talking 
 about aren't real. They rely on the person's ignorance of the 
 actual issues, or on things that aren't provable. This is not 
 that.
While I don't like -preview=in because it complicates the ABI somewhat, I have to agree that it is just a preview for now. If it doesn't prove itself, we can pretend it never existed. We could even have a revert switch once it has proven itself. (I do think that should be a common practice for new feature additions from now on, i.e. allow me to turn them off and get back to an older version of D with a new compiler.)
Oct 03 2020
prev sibling next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 14:10:34 UTC, Steven Schveighoffer 
wrote:
 This is a bit ridiculous. If I was trying to sell you a gun, 
 which in 0.001% of cases explodes when you pull the trigger, 
 and I say "Yeah that's true, but that almost never happens. You 
 aren't even trying it!!!" does that make you feel better?
Probability is so difficult to reason about.

Example 1: It is very improbable that you catch covid-19 if you are careful and conservative in your actions. It is probable that you catch covid-19 if you regularly interact with other people that are not as careful as yourself, or if you simply aren't aware that they were recently on a long vacation trip. E.g. larger teams, or library authors that might only use one compiler on one platform.

Example 2: It is probable that you find the source of a bug in a well structured program (no code smell). It is improbable that a program will remain well structured over time (older programs always smell).
Oct 03 2020
prev sibling next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 3 October 2020 at 14:10:34 UTC, Steven Schveighoffer 
wrote:
 What happens is, I write code that assumes some `in` parameter 
 won't change
That code is kinda already buggy since in just means you won't change it, but somebody else might. I know it is weird when looking at something that is typically a value copy, but in is still based on const, not immutable, so you must keep some expectation that it might be changed by someone else. (BTW speaking of ref optimizations, any immutable could prolly be passed by reference implicitly as well.....) But maybe like you said later, the spec should say any `in` is treated at the language level as a `ref` (except for rvalue issues of course) just optimized to value in defined ABI places. That'd probably be good enough.
Oct 03 2020
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Saturday, 3 October 2020 at 14:48:48 UTC, Adam D. Ruppe wrote:
 On Saturday, 3 October 2020 at 14:10:34 UTC, Steven 
 Schveighoffer wrote:
 What happens is, I write code that assumes some `in` parameter 
 won't change
That code is kinda already buggy since in just means you won't change it, but somebody else might. I know it is weird when looking at something that is typically a value copy, but in is still based on const, not immutable, so you must keep some expectation that it might be changed by someone else. (BTW speaking of ref optimizations, any immutable could prolly be passed by reference implicitly as well.....) But maybe like you said later, the spec should say any `in` is treated at the language level as a `ref` (except for rvalue issues of course) just optimized to value in defined ABI places. That'd probably be good enough.
The real problem here is that existing code, which was developed and tested under one set of assumptions, will now have those assumptions silently changed underneath it, in a way that is impossible to detect until and unless it manifests as a bug. (If this sounds familiar to anyone, it may be because it's the same issue that safe-by-default had with extern(C) functions.) If the decision is ever made to make `-preview=in` the default, the existing meaning of `in` should first be deprecated, and eventually made into an error.
Oct 03 2020
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 3 October 2020 at 15:07:05 UTC, Paul Backus wrote:
 If the decision is ever made to make `-preview=in` the default, 
 the existing meaning of `in` should first be deprecated, and 
 eventually made into an error.
That has already been in progress over this last year, that's why `in` is now eligible for modifications. The meaning has changed twice in the last year. One of them caused non-complying (already broken) code to fail to compile, so a second change came to ease up on it, but this was never going to be a permanent solution. The -preview switch's purpose is to see what final form it is going to take. There probably will be a formal deprecation of it over the following year before the -preview is actually solidified. But the changes are already in flux.
Oct 03 2020
prev sibling next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 14:48:48 UTC, Adam D. Ruppe wrote:
 On Saturday, 3 October 2020 at 14:10:34 UTC, Steven 
 Schveighoffer wrote:
 What happens is, I write code that assumes some `in` parameter 
 won't change
That code is kinda already buggy since in just means you won't change it, but somebody else might.
How come? I thought the current (modern) D semantics is that if you cast away shared the compiler can assume that no other contexts (threads/IRQs) modifies the object?
Oct 03 2020
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 3 October 2020 at 15:21:47 UTC, Ola Fosheim Grøstad 
wrote:
 How come? I thought the current (modern) D semantics is that if 
 you cast away shared the compiler can assume that no other 
 contexts (threads/IRQs) modifies the object?
I don't know the rules for shared... I don't think anyone does. But the rule for const vs immutable is well known. Passing the same thing const on one side and mutable on the other doesn't break const, even though it changes - that's exactly expected.
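A minimal sketch of that rule, with invented names `observe` and `needsImmutable`: a `const ref` view only promises that this function will not write through it, so a write through an aliasing mutable reference is visible, while `immutable` cannot be bound to mutable data in the first place.

```d
// `view` is read-only for this function, not frozen for everyone.
int observe(const ref int view, ref int target)
{
    const before = view;
    target += 1;           // legal: `target` is a mutable reference
    return view - before;  // nonzero when `view` and `target` alias
}

// Hypothetical function that demands truly unchanging data.
void needsImmutable(ref immutable int value) {}

unittest
{
    int x = 5;
    assert(observe(x, x) == 1); // the const view saw the write

    int y = 5;
    assert(observe(x, y) == 0); // no aliasing, no visible change

    // A mutable lvalue cannot bind to a `ref immutable` parameter.
    static assert(!__traits(compiles, needsImmutable(x)));

    immutable int z = 7;
    static assert(__traits(compiles, needsImmutable(z)));
}
```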
Oct 03 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/3/2020 7:48 AM, Adam D. Ruppe wrote:
 But maybe like you said later, the spec should say any `in` is treated at the 
 language level as a `ref` (except for rvalue issues of course) just optimized to 
 value in defined ABI places. That'd probably be good enough.
But if the user *expected* to change that const ref view of the data, and relied on it, then that behavior randomly breaks.
Oct 04 2020
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/3/20 10:10 AM, Steven Schveighoffer wrote:
 Another option is to tag an `in` value as "I'm OK if this is passed by 
 reference". like `in ref` or something (that still binds to rvalues). 
 Then an optimization can say "actually, I'm going to pass this by 
 value", and nobody cares. This is different in that you have to 
 *declare* you're ok with a reference, not the other way around.
Thinking about this more, it seems my major problem is that `in` as it is without the switch is not a reference. `in ref` is a reference, and it's OK if we make this not a reference in practice, because it's const. And code that takes something via `in ref` can already expect possible changes via other references, but should be OK if it doesn't change also. Can we just change -preview=in so it affects `in ref` instead of `in`? -Steve
Oct 03 2020
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 14:49:03 UTC, Steven Schveighoffer 
wrote:
 `in ref` is a reference, and it's OK if we make this not a 
 reference in practice, because it's const. And code that takes 
 something via `in ref` can already expect possible changes via 
 other references, but should be OK if it doesn't change also.
You either support aliasing or not. If you support aliasing then you should be able to write code where aliasing has the expected outcome. Let me refer to ADA. According to the ADA manual you can specify that an integer is aliased, that means that it is guaranteed to exist in memory (and not in a register). Then you use 'access' to reference it. If a language construct says "ref" I would expect 100% support for aliasing. It is not like aliasing is always undesired.
Oct 03 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/3/20 11:30 AM, Ola Fosheim Grøstad wrote:
 On Saturday, 3 October 2020 at 14:49:03 UTC, Steven Schveighoffer wrote:
 `in ref` is a reference, and it's OK if we make this not a reference 
 in practice, because it's const. And code that takes something via `in 
 ref` can already expect possible changes via other references, but 
 should be OK if it doesn't change also.
You either support aliasing or not. If you support aliasing then you should be able to write code where aliasing has the expected outcome. Let me refer to ADA. According to the ADA manual you can specify that an integer is aliased, that means that it is guaranteed to exist in memory (and not in a register). Then you use 'access' to reference it. If a language construct says "ref" I would expect 100% support for aliasing. It is not like aliasing is always undesired.
Given that it's a parameter, and the parameter is const, it can only change through another reference. And this means, the function has to deal with the possibility that it can change, but ALSO cannot depend on or enforce being able to change it on purpose. On that, I think I agree with the concept of being able to switch to a value. What I don't agree with is the idea that one can write code expecting something is passed by value, and then have the compiler later switch it to a reference. `in` means by value in all code today. The fact that we tried -preview=in on a bunch of projects and they "didn't break" is not reassuring. -Steve
Oct 03 2020
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/3/20 11:58 AM, Steven Schveighoffer wrote:
 What I don't agree with is the idea that one can write code expecting 
 something is passed by value, and then have the compiler later switch it 
 to a reference. `in` means by value in all code today. The fact that we 
 tried -preview=in on a bunch of projects and they "didn't break" is not 
 reassuring.
Agreed. Sadly I found (actually remembered) a smoking gun. Over the years I've worked on a few pieces of STL-related code (such as flex_string and fbvector). I've once had a really difficult bug related to one or both of these functions:

http://www.cplusplus.com/reference/string/string/replace/
https://en.cppreference.com/w/cpp/algorithm/replace

The code looked correct and everything, I looked at it for hours. STL implementation subtleties are not really something to google about, but I asked a colleague and he started chuckling. He pointed out that you always must assume that your parameters may alias part of your container. (Sometimes wrapped in a different kind of iterator.) This sort of thing is well known and feared in STL implementer circles, so the rest of us sleep soundly at night. Consider for example:

template< class ForwardIt, class T >
constexpr void replace( ForwardIt first, ForwardIt last,
                        const T& old_value, const T& new_value );

That may as well be called like this:

vector<Widget> v;
...
size_t i = ..., j = ...;
replace(v.begin(), v.end(), v[i], v[j]);

Only one is needed to be a reference inside the vector. Two is a worst case of sorts. You don't know what town you're in after debugging this.

At least in the STL this is reproducible with some ease, because STL always passes references. Now consider that this happens only on certain definitions of Widget (possibly maintenance increases its size and... boom!) and on certain platforms.

So I ask again: is this the kind of feature we want for the D language?
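The same hazard can be reproduced in D with explicit references. The sketch below is not the STL algorithm, just a hand-written analogue with invented names (`replaceByRef`, `replaceByValue`): when the old and new values are references into the array being rewritten, the first replacement changes what is being searched for.

```d
// By-reference version: `oldValue` and `newValue` may point into `arr`.
void replaceByRef(int[] arr, const ref int oldValue, const ref int newValue)
{
    foreach (ref e; arr)
        if (e == oldValue)
            e = newValue;
}

// By-value version: the arguments are copied before the loop starts.
void replaceByValue(int[] arr, int oldValue, int newValue)
{
    foreach (ref e; arr)
        if (e == oldValue)
            e = newValue;
}

unittest
{
    int[] a = [1, 2, 3, 1];
    replaceByRef(a, a[0], a[2]);
    // Writing a[0] also rewrote `oldValue`, so the trailing 1 is missed.
    assert(a == [3, 2, 3, 1]);

    int[] b = [1, 2, 3, 1];
    replaceByValue(b, b[0], b[2]);
    assert(b == [3, 2, 3, 3]);
}
```

With explicit `ref` the surprising result is at least the same on every platform. An `in` parameter that switches between the two versions depending on type size or target would not give that guarantee.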
Oct 03 2020
prev sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 15:58:53 UTC, Steven Schveighoffer 
wrote:
 Given that it's a parameter, and the parameter is const, it can 
 only change through another reference. And this means, the 
 function has to deal with the possibility that it can change, 
 but ALSO cannot depend on or enforce being able to change it on 
 purpose. On that, I think I agree with the concept of being 
 able to switch to a value.
But you can expect it to not change in parallel as it is not shared!? It can change if you call another function or in the context of a coroutine (assuming that coroutines cannot move to other threads).

My key point was this, I've never seen "ref" mean anything else than a live view of an object. If D is going to be an easy to learn language anything named "ref" has to retain that expectation.

In the context of parallel programming I believe that Chapel has various parameter transfer types that might be worth looking at (I don't remember the details).
 What I don't agree with is the idea that one can write code 
 expecting something is passed by value, and then have the 
 compiler later switch it to a reference. `in` means by value in 
 all code today. The fact that we tried -preview=in on a bunch 
 of projects and they "didn't break" is not reassuring.
Well, it is common for compilers (e.g. Ada/SPARK) to optimize by-value as a reference, but it should not be observable.

You could get around this by making the by-value parameter transfer "no-alias" with associated undefined behaviour (or "__restricted__" in C++ as Kinke pointed out). This would be great actually, except...

"in" looks very innocent to a newbie so it should have simple semantics... Advanced features ought to look advanced (at least more advanced than "in" or "out").

"in", "out", "in out" should be as simple to use as in SPARK, but that is difficult to achieve without constrained semantics (which would involve a lot more than a simple DIP). SPARK's approach to this looks really great though, but I've never used SPARK so I can't speak from experience. But, it is the kind of semantics that makes me more eager to give it a spin, for sure.

It might be helpful to play a bit with languages like Chapel and SPARK to get ideas.
Oct 03 2020
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 16:49:28 UTC, Ola Fosheim Grøstad 
wrote:
 (which would involve a lot more than a simple DIP). SPARK's 
 approach to this looks really great though, but I've never used 
 SPARK so I can't speak from experience. But, it is the kind of 
 semantics that makes me more eager to give it a spin, for sure.
As far as I understand you can choose to write parts of an Ada program in the constrained SPARK subset. Which basically means that D might be able to do something similar. That could be very powerful. Write complicated functions in a restricted verified language subset, but most of the bread-and-butter code remains in the more flexible full language.
Oct 03 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/3/20 12:49 PM, Ola Fosheim Grøstad wrote:
 On Saturday, 3 October 2020 at 15:58:53 UTC, Steven Schveighoffer wrote:
 Given that it's a parameter, and the parameter is const, it can only 
 change through another reference. And this means, the function has to 
 deal with the possibility that it can change, but ALSO cannot depend 
 on or enforce being able to change it on purpose. On that, I think I 
 agree with the concept of being able to switch to a value.
But you can expect it to not change in parallel as it is not shared!? It can change if you call another function or in the context of a coroutine (assuming that coroutines cannot move to other threads).
You can expect it to change but due to the way it enters your function, you can't rely on that expectation, even today. For example:

void foo(const ref int x, ref int y)
{
    auto z = x;
    bar(); // might change x, but doesn't necessarily
    y = 5; // might change x, but doesn't necessarily
}

So given that it *might* change x, but isn't *guaranteed* to change x, you can reason that the function needs to deal with both of these possibilities. There isn't a way to say "parameter which is an alias of this other parameter".

In that sense, altering the function to actually accept x by value doesn't change what the function needs to deal with.

On the other hand, if the compiler normally passes x by value, and you rely on that current definition, and the definition changes later to mean pass by reference, then you now have code that may have had a correct assumption before, but doesn't now.
 
 My key point was this, I've never seen "ref" mean anything else than a 
 live view of an object. If D is going to be an easy to learn language 
 anything named "ref" has to retain that expectation.
And in a sense, you can rely on that. At a function level, you can't tell whether mutating other data is going to affect `in ref` or `const ref` data. You have to assume in some cases it can, and in some cases it cannot.
 What I don't agree with is the idea that one can write code expecting 
 something is passed by value, and then have the compiler later switch 
 it to a reference. `in` means by value in all code today. The fact 
 that we tried -preview=in on a bunch of projects and they "didn't 
 break" is not reassuring.
Well, it is common for compilers (e.g. Ada/SPARK) to optimize by-value as a reference, but it should not be observable.
If we could have this, it would be useful as well, but doesn't need a language change. You might be able to do this in pure functions.
 You could get around this by making the by-value parameter transfer 
 "no-alias" with associated undefined behaviour (or "__restricted__" in 
 C++ as Kinke pointed out). This would be great actually, except...
 
 "in" looks very innocent to a newbie so it should have simple 
 semantics... Advanced features ought to look advanced (at least more 
 advanced than "in" or "out").
Yeah, that's why I think `in ref` can be used (or even `const ref`).

That being said, if `in` didn't have the definition it already has, this would not be as controversial.

-Steve
Oct 03 2020
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 17:05:12 UTC, Steven Schveighoffer 
wrote:
 On 10/3/20 12:49 PM, Ola Fosheim Grøstad wrote:
 On Saturday, 3 October 2020 at 15:58:53 UTC, Steven 
 Schveighoffer wrote:
 Given that it's a parameter, and the parameter is const, it 
 can only change through another reference. And this means, 
 the function has to deal with the possibility that it can 
 change, but ALSO cannot depend on or enforce being able to 
 change it on purpose. On that, I think I agree with the 
 concept of being able to switch to a value.
But you can expect it to not change in parallel as it is not shared!? It can change if you call another function or in the context of a coroutine (assuming that coroutines cannot move to other threads).
You can expect it to change but due to the way it enters your function, you can't rely on that expectation, even today. For example:

void foo(const ref int x, ref int y)
{
    auto z = x;
    bar(); // might change x, but doesn't necessarily
    y = 5; // might change x, but doesn't necessarily
}

So given that it *might* change x, but isn't *guaranteed* to change x, you can reason that the function needs to deal with both of these possibilities. There isn't a way to say "parameter which is an alias of this other parameter".
 In that sense, altering the function to actually accept x by 
 value doesn't change what the function needs to deal with.
I am happy that we seem to agree on the principles, but I am a bit perplexed by this statement as we seem to draw different conclusions from the same principles... :-D

I think maybe we have different use-cases in mind. So let me give you one. Assume that many SimulationObject instances form a graph accessible through a Simulation instance:

void run_and_print(const ref SimulationObject objview, ref Simulation world)
{
    auto old_state = objview.get_state();
    world.run_simulation();
    print_difference(old_state, objview.get_state());
}

If "objview" is turned into a value, nothing changes. objview is a view of one object in the "world" graph. A deliberate aliasing reference. I don't want this behaviour from something named "ref", const or not const.
 And in a sense, you can rely on that. At a function level, you 
 can't tell whether mutating other data is going to affect `in 
 ref` or `const ref` data. You have to assume in some cases it 
 can, and in some cases it cannot.
Do you mean the compiler or the programmer? As a programmer I most certainly can know this, which is why "__restricted__" semantics would be acceptable for me (although not newbie friendly).
 That being said, if `in` didn't have the definition it already 
 has, this would not be as controversial.
I think it would be interesting if "in", "in out" and "out" had 100% compatible semantics with SPARK. D could get verification for free as the SPARK verifier is a separate tool. All you have to do is find a mapping from D to SPARK and generate the verification-condition output. (or just transpile to SPARK). I think that could be a very powerful upgrade that could make D more favourable for embedded programming.
Oct 03 2020
parent reply Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Saturday, 3 October 2020 at 19:36:43 UTC, Ola Fosheim Grøstad 
wrote:
 On Saturday, 3 October 2020 at 17:05:12 UTC, Steven 
 Schveighoffer wrote:
[...]
 [...]
I am happy that we seem to agree on the principles, but I am a bit perplexed by this statement as we seem to draw different conclusions from the same principles... :-D [...]
Wow. This would be very cool if it could be done (verification through SPARK)
Oct 03 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 19:47:43 UTC, Imperatorn wrote:
 Wow. This would be very cool if it could be done (verification 
 through SPARK)
It would only be a small subset of D mapping to a subset of SPARK, but since the implementation effort would be reasonable if the basic semantics (parameter passing in particular) were mappable it could be interesting yes. (But over time, maybe more of the language could be covered.)
Oct 03 2020
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 3 October 2020 at 14:49:03 UTC, Steven Schveighoffer 
wrote:
 `in ref` is a reference, and it's OK if we make this not a 
 reference in practice, because it's const. And code that takes 
 something via `in ref` can already expect possible changes via 
 other references, but should be OK if it doesn't change also.
Is that still OK in a concurrent or multithreaded context?

void foo (in ref bar)
{
    // does something which may yield, and
    // another context can change value
    // underlying `bar`
    ...
    // result of this writeln will now depend
    // on how the compiler treated the `in ref`
    writeln(bar)
}
Oct 03 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/3/20 11:41 AM, Joseph Rushton Wakeling wrote:
 On Saturday, 3 October 2020 at 14:49:03 UTC, Steven Schveighoffer wrote:
 `in ref` is a reference, and it's OK if we make this not a reference 
 in practice, because it's const. And code that takes something via `in 
 ref` can already expect possible changes via other references, but 
 should be OK if it doesn't change also.
Is that still OK in a concurrent or multithreaded context?

void foo (in ref bar)
{
    // does something which may yield, and
    // another context can change value
    // underlying `bar`
    ...
    // result of this writeln will now depend
    // on how the compiler treated the `in ref`
    writeln(bar)
}
This is not any different than calling a function which has a reference to the data elsewhere. In other words, it's not necessarily the function itself that changes the data, it could be changed outside the function. You don't need concurrency to do it.

But it's not impossible to define this:

"when accepting a parameter by `in ref`, one cannot depend on the value remaining constant, as other references to the data may change it. The compiler can also decide to pass an `in ref` parameter by value for optimization reasons, so one cannot depend on the parameter changing through a different alias."

That's essentially what `in` means in this preview switch. But the problem really is that `in` means something else today, and there is already a lot of code that expects that meaning.

-Steve
Oct 03 2020
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 3 October 2020 at 16:08:46 UTC, Steven Schveighoffer 
wrote:
 This is not any different than calling a function which has a 
 reference to the data elsewhere. In other words, it's not 
 necessarily the function itself that changes the data, it could 
 be changed outside the function. You don't need concurrency to 
 do it.
Sure. Concurrency was just one example of how the implementation-dependent behaviour could arise.
 But it's not impossible to define this:

 "when accepting a parameter by `in ref`, one cannot depend on 
 the value remaining constant, as other references to the data 
 may change it. The compiler can also decide to pass an `in ref` 
 parameter by value for optimization reasons, so one cannot 
 depend on the parameter changing through a different alias."
OK, but that feels rather like it's imposing a cognitive burden on the developer as a way to work around the fact that the feature itself isn't working in an intuitive way. It feels as unintuitive that a parameter marked `ref` could fail (in an implementation dependent way) to display reference semantics, as it does that a non-reference-type parameter _not_ marked `ref` could display them.
Oct 03 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/3/2020 7:49 AM, Steven Schveighoffer wrote:
 `in ref` is a reference, and it's OK if we make this not a reference in 
 practice, because it's const.
No, it is not. Because a `const ref` can be changed by another mutable reference to the same memory object. This is defined behavior. This suggestion turns defined behavior into implementation-defined behavior, meaning it will break existing code written in good faith in unpredictable, unreliable ways.
Oct 04 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/4/20 10:08 PM, Walter Bright wrote:
 On 10/3/2020 7:49 AM, Steven Schveighoffer wrote:
 `in ref` is a reference, and it's OK if we make this not a reference 
 in practice, because it's const.
No, it is not. Because a `const ref` can be changed by another mutable reference to the same memory object. This is defined behavior.
My logic in thinking about it is that the function can't know at all that a ref-to-value optimization happened, simply because it cannot infer or prove that another argument or a global, or whatever, is aliased to that ref (and it can't modify the data via the ref, unlike C++).

But actually, that's not the whole story, and I'm thinking that really we can't do that. A *caller* can know that two parameters are aliased (trivially -- pass the same parameter twice), and in that case, expect a certain outcome.

So yeah, we can't do this. Just make `in` always ref, and bind to rvalues.

In a response a few posts up, Mathias says "Perhaps you don't know this, but the very first implementation, the one I had when I opened the PR on April 3rd actually always used `ref`. It had quite a few issues."

What were those issues? Why can't `in` mean ref always? Or maybe just bind `in ref` to rvalues?

-Steve
Oct 04 2020
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 7:31 PM, Steven Schveighoffer wrote:
 A *caller* can know that two parameters are aliased (trivially -- pass the same 
 parameter twice), and in that case, expect a certain outcome.
Unfortunately, only in the trivial case can the caller detect it.
Oct 05 2020
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/4/20 10:08 PM, Walter Bright wrote:
 On 10/3/2020 7:49 AM, Steven Schveighoffer wrote:
 `in ref` is a reference, and it's OK if we make this not a reference 
 in practice, because it's const.
No, it is not. Because a `const ref` can be changed by another mutable reference to the same memory object. This is defined behavior. This suggestion turns defined behavior into implementation-defined behavior, meaning it will break existing code written in good faith in unpredictable, unreliable ways.
I will add that C++'s std::min and std::max have had this perennial problem that still bites users. Check https://en.cppreference.com/w/cpp/algorithm/min:

"Warning

Capturing the result of std::min by reference produces a dangling reference if one of the parameters is a temporary and that parameter is returned:

int n = 1;
const int& r = std::min(n-1, n+1); // r is dangling
"

C++ compilers have gotten increasingly adept at detecting and warning about such situations, but voluntarily adding a problematic feature to the D language to then work on fixing its aftermath doesn't seem a wise thing to do.
Oct 04 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/3/2020 7:10 AM, Steven Schveighoffer wrote:
 But the problem *IS* that it is *very uncommon* that you will run into issues. 
 An issue that is common is a problem in its own right. An issue that is subtle 
 and very uncommon (so uncommon that you haven't found a case of it yet) is a 
 different kind of problem. I don't believe that less frequent means we have a 
 better situation, I think it's worse.
This. The rarity (and unpredictability) of it causing a problem is the problem. When I have a bug in my code, I want it to fail hard as soon, as obviously, and as often as possible, so it won't escape into a released product.
Oct 04 2020
prev sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu 
wrote:
 On 10/2/20 6:11 PM, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
They say one should choose one's fights wisely, so I spent some time pondering. I could just let this thread scroll by and not think twice about it. By next week, I may as well forget. But then I realized this is /exactly/ the kind of crap that we'll all scratch our heads six months from now, "How did this ever pass review?
I take offense to that. I'd prefer if you'd moderate your tone please.
 Who approved this? How in the world did a group of competent,
 well-intended people, looked at this and said - yep, good idea. 
 Let's."

 ???
*You* approved it. https://github.com/dlang/dmd/pull/11000#issuecomment-675605193
 This glib take is EXTREMELY concerning:

 In the general case, no. You can have two distinct pointers 
 with the
 same value, and there's nothing the frontend can do to detect 
 it.
 
 This scenario has been brought up during the review. I doubt 
 it will,
 in practice, be an issue though. This is not a common pattern, 
 nor
 does it seems useful. It rather looks like a code smell.
Wait a SECOND! Are we really in the market of developing and deploying language features that come unglued at the slightest and subtlest misuse? We most certainly shouldn't.

I sincerely congratulated Mathias and the other participants for working on this. It's an important topic. Knowing that all those involved are very good at what they do, and without having looked closely, I was sure they got something really nice going that avoids this absolutely blatant semantic grenade. And now I see this is exactly it - we got a -preview of a grenade.

How is this possible? How can we sleep at night now? Again: take a step back and reconsider, why did this pass muster?

This is important, folks. It's really important as parameter passing goes to the core of what the virtual machine does. You can't say, meh, let's just fudge it here, and whatever is surprising it's on the user.

Please, we really need to put back the toothpaste in the tube here. I count on everybody's clear head here to reconsider this.
Frankly, I think you are making a mountain out of a molehill here. You are imagining a problem that doesn't exist; and if one does find an issue, the fault lies with the DMD compiler and not the D language specification. Though evidently having clearer wording in the spec benefits all.

If you read nothing more of this reply, at least finish up until the end of this paragraph. Please hold fire until GDC and LDC have implemented this feature, then we can discuss the pitfalls that we've encountered with it. Basing decisions on behaviors observed with DMD is not the right approach, and if you are currently finding the situation to be a mess, it is a mess of DMD's own doing.

Plucking a fitting example from a Phobos unittest that demonstrates the kind of things DMD lets people get away with:
```
RefCounted!int* p;
{
    auto rc1 = RefCounted!int(5);
    p = &rc1;
    assert(rc1 == 5);
    assert(rc1._refCounted._store._count == 1);
    auto rc2 = rc1;
    assert(rc1._refCounted._store._count == 2);
}
assert(p._refCounted._store == null);
```
Why is this not a compile-time error? DMD just isn't punishing users enough for the buggy code they've written.

- - -

I don't really have the heart to go through and unpick all points raised in this thread, but I think it's worth sharing the three key conclusions I took away from reviewing the pull request:

1. This is behind a -preview flag, and so should be treated as experimental. Nothing breaks by having it there. Nothing breaks if it were to be suddenly removed without any deprecation cycle.

2. Aliasing was raised multiple times throughout the review, I even gave this example at the time to demonstrate my concerns:
```
void bar(in int a, out int b) { }
int a = 42;
bar(a, a);
```
But ultimately, worrying about this is missing the point, as the problem is already present in the compiler even without `-preview=in`, and Walter is working on fixing it. In the meantime, these sorts of cases are relatively trivial to pick up and can be added as warnings in GDC and LDC until the front-end implements the semantic guarantees.

3. There is no danger of ref/non-ref mismatches in ABI, because `in` parameters that are inferred `ref` are going to be passed in memory anyway.

In the following:
```
alias A = void delegate(in long);
alias B = void delegate(const long);
```
Either `long` always gets passed in memory, or always gets passed in registers, in both cases, they are always going to be covariant. The same remains true whatever type you replace `long` with.

The one place where `in` parameters start to get interesting is rather at the call site. I think it is best illustrated with the following:
```
struct Foo { this(this); }
void fun1(const Foo f) { }
void fun2(in Foo f) { }
```
The compiler already ensures that non-trivially copyable types never end up in a situation where a temporary is needed. So `Foo` is always passed by ref, otherwise there'd be a double copy done at the call site. So once again, both are covariant as far as the callee is concerned. But in the case of `fun2`, the caller does something different. The copy constructor is elided entirely, and that is the crux of the optimization that is at play here. If you've assumed anything else, your assumptions are sorely misplaced.

- - -

I've also skimmed past a passing concern that the ABI between compilers would be different. Well, let me assure you that DMD, GDC and LDC have never been compatible in the first place, so there's no point worrying about that now. Though DMD is really the one at fault for all incompatibilities by choosing to have a non-standard calling convention for extern(D) code...
Oct 04 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/4/20 10:19 AM, Iain Buclaw wrote:
 3. There is no danger of ref/non-ref mismatches in ABI, because `in` 
 parameters that are inferred `ref` are going to be passed in memory anyway.
 
 In the following:
 ```
 alias A = void delegate(in long);
 alias B = void delegate(const long);
 ```
 Either `long` always gets passed in memory, or always gets passed in 
 registers, in both cases, they are always going to be covariant.  The 
 same remains true whatever type you replace `long` with.
As the change says, it's up to the back end to decide but is currently types over 2 machine word size. So this means on 32-bit systems, 80-bit reals would be passed by reference for example. In practice, I don't know what this means for future or other platform compilers.

But as a trivial counter-example, "the same remains true whatever type you replace `long` with" isn't correct:

struct S
{
    size_t[3] data;
}

alias A = void delegate(in S);
alias B = void delegate(const S);

static assert(is(A : B)); // fails with -preview=in

One can imagine this being a sticking point: code that builds fine on 64-bit systems, because of the "lucky chance" that a struct can be passed by value there, fails to build on a 32-bit system. I just found it odd that this was touted as a benefit.

-Steve
Oct 04 2020
parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 02:21:01 UTC, Steven Schveighoffer 
wrote:
 On 10/4/20 10:19 AM, Iain Buclaw wrote:
 3. There is no danger of ref/non-ref mismatches in ABI, 
 because `in` parameters that are inferred `ref` are going to 
 be passed in memory anyway.
 
 In the following:
 ```
 alias A = void delegate(in long);
 alias B = void delegate(const long);
 ```
 Either `long` always gets passed in memory, or always gets 
 passed in registers, in both cases, they are always going to 
 be covariant.  The same remains true whatever type you replace 
 `long` with.
As the change says, it's up to the back end to decide but is currently types over 2 machine word size. So this means on 32-bit systems, 80-bit reals would be passed by reference for example. In practice, I don't know what this means for future or other platform compilers.

But as a trivial counter-example, "the same remains true whatever type you replace `long` with" isn't correct:

struct S
{
    size_t[3] data;
}

alias A = void delegate(in S);
alias B = void delegate(const S);

static assert(is(A : B)); // fails with -preview=in
What did I say about the fault lying with the DMD compiler and not the D language specification? S can't be both passed in memory and passed in registers at the same time. Your example is not a problem with '-preview=in', and shouldn't be construed as one.
Oct 04 2020
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/4/20 10:19 AM, Iain Buclaw wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei Alexandrescu wrote:
 On 10/2/20 6:11 PM, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
They say one should choose one's fights wisely, so I spent some time pondering. I could just let this thread scroll by and not think twice about it. By next week, I may as well forget. But then I realized this is /exactly/ the kind of crap that we'll all scratch our heads six months from now, "How did this ever pass review?
I take offense to that.  I'd prefer if you'd moderate your tone please.
Of course. Please accept my apologies.
 Who approved this? How in the world did a group of competent,
 well-intended people, looked at this and said - yep, good idea. Let's."

 ???
*You* approved it. https://github.com/dlang/dmd/pull/11000#issuecomment-675605193
Oi. Touché.
 Please, we really need to put back the toothpaste in the tube here. I 
 count on everybody's clear head here to reconsider this.
Frankly, I think you are making a mountain out of a molehill here.  You are imagining a problem that doesn't exist; and if one does find an issue, the fault lies with the DMD compiler and not the D language specification.  Though evidently having clearer wording in the spec benefits all.
I think my STL examples have put the narrative that confusing aliasing is rare to rest.
 If you read nothing more of this reply, at least finish up until the end 
 of this paragraph.  Please hold fire until GDC and LDC have implemented 
 this feature, then we can discuss the pitfalls that we've encountered 
 with it.  Basing decisions on behaviors observed with DMD is not the 
 right approach, and if you are currently finding the situation to be a 
 mess, it is a mess of DMD's own doing.
Implementation details of dmd are not of concern here, and in fact the more different ldc/gdc/dmd are from one another, the more problematic the entire matter is.
 2. Aliasing was raised multiple times throughout the review, I even gave 
 this example at time to demonstrate my concerns:
 ```
 void bar(in int a, out int b) { }
 int a = 42;
 bar(a, a);
 ```
 But ultimately, worrying about this is missing the point, as the problem 
 is already present in the compiler even without `-preview=in`, and 
 Walter is working on fixing it.  In the meantime, these sorts of cases 
 are relatively trivial to pick up and can be added as warnings in GDC 
 and LDC until the front-end implements the semantic guarantees.
That's nice. If the "in" feature is paired appropriately with some sort of verification, my concerns are entirely allayed. "Entirely" of course if the problem is resolved entirely as well. Thanks.
Oct 04 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 02:46:42 UTC, Andrei Alexandrescu 
wrote:
 On 10/4/20 10:19 AM, Iain Buclaw wrote:
 On Saturday, 3 October 2020 at 05:02:36 UTC, Andrei 
 Alexandrescu wrote:
 Who approved this? How in the world did a group of competent,
 well-intended people, looked at this and said - yep, good 
 idea. Let's."

 ???
*You* approved it. https://github.com/dlang/dmd/pull/11000#issuecomment-675605193
Oi. Touché.
I know the feeling all too well. I have on a few occasions run into issues in the D front-end (from GDC) and asked "who on Earth wrote or approved this?", only to discover that it was Me, two months ago. :-)
 Please, we really need to put back the toothpaste in the tube 
 here. I count on everybody's clear head here to reconsider 
 this.
Frankly, I think you are making a mountain out of a molehill here.  You are imagining a problem that doesn't exist; and if one does find an issue, the fault lies with the DMD compiler and not the D language specification.  Though evidently having clearer wording in the spec benefits all.
I think my STL examples have put the narrative that confusing aliasing is rare to rest.
The spec as is currently written does seem to be wide open to interpretation. But I think that trying to reason 'in' in terms of 'ref' and 'restrict' semantics should be left at the door. Any thought-problems that arise from aliasing are hard to justify in my view because in practice I just can't see D having strict aliasing rules so long as it continues to not be enforced in some way.

Actually, I think there is zero mention of aliasing in the language spec, so the following can only be interpreted as being valid and precisely defined to work in D.
---
float f = 1.0;
bool *bptr = cast(bool*)&f;
bptr[2] = false;
assert(f == 0.5);
---
If this gets addressed, then we can use aliasing rules as a measure for how we treat -preview=in.

If you are interested in defining some aliasing rules for D, I'd be more than happy to spin off a new thread to discuss them, and I will implement that in GDC and report back the success/failures of applying such rules. :-)
 If you read nothing more of this reply, at least finish up 
 until the end of this paragraph.  Please hold fire until GDC 
 and LDC have implemented this feature, then we can discuss the 
 pitfalls that we've encountered with it.  Basing decisions on 
 behaviors observed with DMD is not the right approach, and if 
 you are currently finding the situation to be a mess, it is a 
 mess of DMD's own doing.
Implementation details of dmd are not of concern here, and in fact the more different ldc/gdc/dmd are from one another, the more problematic the entire matter is.
As this is an experimental feature, a bit of deviation in implementations can be seen as a good thing. Convergence can come later once we work out just who has got it right. Correct me if I'm wrong, but it looks like we'll have three competing implementations:

DMD: `const scope`, with `ref` applied on types usually passed in memory.
LDC: `const scope ref @restrict`
GDC: `const scope` with `ref` applied on types usually passed by invisible reference.
Oct 05 2020
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05.10.20 09:56, Iain Buclaw wrote:
 
 
 Correct me if I'm wrong, but it looks like we'll have three competing 
 implementations:
 
 DMD: `const scope`, with `ref` applied on types usually passed in memory.
 LDC: `const scope ref @restrict`
 GDC: `const scope` with `ref` applied on types usually passed by 
 invisible reference.
Weren't there different rules for non-POD types? Do those differ between backends? Also, it looks like in this case `__traits(isRef, ...)` will return different results with different compiler backends and aliasing `in` parameters is UB with LDC? Is `@restrict` wrongly treated as `@safe` or does the `@safe`ty of `in` parameters depend on the compiler backend?
Oct 05 2020
prev sibling next sibling parent reply kinke <noone nowhere.com> writes:
On Monday, 5 October 2020 at 07:56:00 UTC, Iain Buclaw wrote:
 Correct me if I'm wrong, but it looks like we'll have three 
 competing implementations:

 DMD: `const scope`, with `ref` applied on types usually passed 
 in memory.
 LDC: `const scope ref @restrict`
 GDC: `const scope` with `ref` applied on types usually passed 
 by invisible reference.
Non-PODs are always by-ref, so wrt. PODs only:

DMD: Currently (2.094) `ref` for all types > 2 machine words, with a little exception for x87 `real` on Win64 (ref). To be improved.
LDC: With https://github.com/ldc-developers/ldc/pull/3578, `ref` for all types which would be passed by invisible ref (Win64, AArch64), or passed on the stack and larger than 2 machine words (Posix x86_64, 32-bit x86). For all other ABIs (I'm not really familiar with): `ref` if larger than 2 machine words. No `@restrict` (IIUC, the LLVM semantics are more strict and wouldn't allow the same arg to be passed as 2 `@restrict` refs, even if both are const).

---

Wrt. the concerns about differing ref/value decisions for PODs across compilers/platforms and thus implementation-dependent potential aliasing issues for lvalue args: a possible approach could be leaving everything as-is ABI-wise, but have the compiler create and pass a temporary in @safe callers if the callee takes a ref, unless it can prove there's no way the arg can be aliased. E.g., assuming x87 `real` for Win64:

void callee(in real x); // e.g., by-ref for Win64, by-value for Posix x86_64

void safeCaller1(ref real x) @safe
{
    callee(x); // x might be aliased by global state
    // for Win64:    auto tmp = x, callee(tmp);
    // Posix x86_64: by-value, so simply `callee(x)`
}

void safeCaller2() @safe
{
    real x = 1;
    callee(x); // x cannot be aliased, fine to pass directly by-ref for Win64
}

void safeCaller3(in real x) @safe
{
    callee(x); // safe to forward directly by-ref for Win64
}
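What such a caller-side temporary buys can be sketched by hand. This is only an illustration, not the proposed compiler change: an explicit `ref const` parameter stands in for an `in` parameter that the compiler decided to pass by reference, and `Big`, `globalState`, `directCall` and `defensiveCall` are hypothetical names.

```d
import std.stdio;

struct Big
{
    int[64] data; // assumed large enough that `in` would plausibly become a reference
}

Big globalState; // module-level state that the callee can reach on its own

// Explicit `ref const` stands in for an `in` parameter passed by reference.
int callee(ref const Big x) @safe
{
    globalState.data[0] = 42; // may alias `x`
    return x.data[0];
}

// Forwards the reference directly: the callee sees its own write.
int directCall(ref Big x) @safe
{
    return callee(x);
}

// Copies first, as the proposal would do for @safe callers: the callee sees a snapshot.
int defensiveCall(ref Big x) @safe
{
    auto tmp = x;
    return callee(tmp);
}

void main()
{
    globalState = Big.init;
    writeln(directCall(globalState));    // prints 42
    globalState = Big.init;
    writeln(defensiveCall(globalState)); // prints 0
}
```

The copy restores by-value behaviour at the cost of the very copy that `in` was meant to elide, which is why the proposal limits it to cases where aliasing cannot be ruled out.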
Oct 05 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 13:27:00 UTC, kinke wrote:
 Wrt. the concerns about differing ref/value decisions for PODs 
 across compilers/platforms and thus implementation-dependent 
 potential aliasing issues for lvalue args: a possible approach 
 could be leaving everything as-is ABI-wise, but have the 
 compiler create and pass a temporary in @safe callers if the 
 callee takes a ref, unless it can prove there's no way the arg 
 can be aliased. E.g., assuming x87 `real` for Win64:

 void callee(in real x); // e.g., by-ref for Win64, by-value for 
 Posix x86_64

 void safeCaller1(ref x) @safe
 {
     callee(x); // x might be aliased by global state
     // for Win64:    auto tmp = x, callee(tmp);
     // Posix x86_64: by-value, so simply `callee(x)`
 }
So then `in` would come with its own semantic, that requires new code to handle, rather than piggy-backing off of `ref`?
Oct 05 2020
parent kinke <noone nowhere.com> writes:
On Monday, 5 October 2020 at 15:25:15 UTC, Iain Buclaw wrote:
 On Monday, 5 October 2020 at 13:27:00 UTC, kinke wrote:
 Wrt. the concerns about differing ref/value decisions for PODs 
 across compilers/platforms and thus implementation-dependent 
 potential aliasing issues for lvalue args: a possible approach 
 could be leaving everything as-is ABI-wise, but have the 
 compiler create and pass a temporary in @safe callers if the 
 callee takes a ref, unless it can prove there's no way the arg 
 can be aliased. E.g., assuming x87 `real` for Win64:

 void callee(in real x); // e.g., by-ref for Win64, by-value 
 for Posix x86_64

 void safeCaller1(ref x) @safe
 {
     callee(x); // x might be aliased by global state
     // for Win64:    auto tmp = x, callee(tmp);
     // Posix x86_64: by-value, so simply `callee(x)`
 }
So then `in` would come with its own semantic, that requires new code to handle, rather than piggy-backing off of `ref`?
It already has its own semantic with -preview=in, so this would be a concession for all those raising concerns about implementation-dependent aliasing issues. It would just reduce new `in` copy elisions for PODs in @safe code and prevent all related aliasing trouble (again, PODs only - aliasing could still be an issue for non-PODs, but that's not implementation-dependent). @safe is already slower due to enabled bounds checks even with `-release`, so I could live with it.
Oct 05 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 12:56 AM, Iain Buclaw wrote:
 Actually, I think there is zero mention of aliasing in the language spec, so the 
 following can only be interpreted as being valid and precisely defined to work 
 in D.
 ---
 float f = 1.0;
 bool *bptr = cast(bool*)&f;
 bptr[2] = false;
 assert(f == 0.5);
 ---
@safe code won't allow such a cast. In @system code, the result will depend on the layout of the memory, more specifically big/little endianness. I think we can agree that the `in` semantics cannot be restricted to @system code only.
 If this gets addressed, then we can use aliasing rules as a measure for how we 
 treat -preview=in.  If you are interested in defining some aliasing rules for D, 
 I'd be more than happy to spin off a new thread to discuss them, and I will 
 implement that in GDC and report back the success/failures of applying such 
 rules. :-)
The plan for aliasing rules is @live.
 Correct me if I'm wrong, but it looks like we'll have three competing 
 implementations:
 
 DMD: `const scope`, with `ref` applied on types usually passed in memory.
 LDC: `const scope ref @restrict`
 GDC: `const scope` with `ref` applied on types usually passed by invisible 
 reference.
The problems come from:

1. the user not knowing if `in` is passing by ref or not

2. being "implementation defined" meaning that the user simply cannot know (1) because it can change from version to version, or with changes in compiler switch settings. Not just in switching from one compiler to another (although that's bad enough)

3. the user would have to look at the disassembly to determine (1) or not, and this is unreasonable

4. if the function is a template with `in T t` as a parameter, the user cannot know if `t` is passed by ref or not
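Point 4 can be illustrated with a variant of the example from the first post of this thread. This is a sketch, with `readAfterWrite` as a hypothetical name; whether the second call observes the write depends on the compiler, the target, and whether `-preview=in` is passed, which is exactly the concern.

```d
import std.stdio;

struct S(size_t elems)
{
    int[elems] data;
}

// Writes through the mutable reference, then reads through the `in` parameter.
// The result reveals whether `constData` aliases `mutableData`.
int readAfterWrite(T)(in T constData, ref T mutableData)
{
    mutableData.data[0] = 1;
    return constData.data[0];
}

void main()
{
    S!1 small;   // one int: small enough to be passed by value
    S!100 large; // 100 ints: a candidate for by-ref passing under -preview=in

    writeln(readAfterWrite(small, small));
    writeln(readAfterWrite(large, large));
}
```

According to the first post, DMD prints 0 and 0 without `-preview=in` and 0 and 1 with it, so the same template body behaves differently purely as a function of `T`.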
Oct 05 2020
next sibling parent ag0aep6g <anonymous example.com> writes:
On 05.10.20 19:11, Walter Bright wrote:
 On 10/5/2020 12:56 AM, Iain Buclaw wrote:
 Actually, I think there is zero mention of aliasing in the language 
 spec, so the following can only be interpreted as being valid and 
 precisely defined to work in D.
 ---
 float f = 1.0;
 bool *bptr = cast(bool*)&f;
 bptr[2] = false;
 assert(f == 0.5);
 ---
@safe code won't allow such a cast.
@safe allows the cast just fine. It doesn't allow the pointer arithmetic, but that can easily be worked around:
----
void main() @safe
{
    float f = 1.0;
    bool[4]* bptr = cast(bool[4]*) &f;
    (*bptr)[2] = false;
    assert(f == 0.5);
}
----
(Compile with `-preview=dip1000`, because `f` is on the stack.)
Oct 05 2020
prev sibling parent Mathias LANG <geod24 gmail.com> writes:
On Monday, 5 October 2020 at 17:11:52 UTC, Walter Bright wrote:
 The problems come from:

 1. the user not knowing if `in` is passing by ref or not

 2. being "implementation defined" meaning that the user simply 
 cannot know (1) because it can change from version to version, 
 or with changes in compiler switch settings. Not just in 
 switching from one compiler to another (although that's bad 
 enough)

 3. the user would have to look at the disassembly to determine 
 (1) or not, and this is unreasonable

 4. if the function is a template with `in T t` as a parameter, 
 the user cannot know if `t` is passed by ref or not
1. Just like `auto ref`, `in` should be used when the user doesn't care whether he gets an lvalue or an rvalue. This means that the user either expects no aliasing, or that aliasing does not affect the observed behavior.

2. Answered in (1)

3. `__traits(isRef)` works perfectly fine, no need to look at the assembly. It's not currently tested though, I'll add it to the test suite.

4. The user can know using `__traits(isRef)`. In general, the user shouldn't care (see 1), but the ability is there.
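A minimal sketch of such a check, assuming `__traits(isRef, ...)` reports the decision for `in` parameters as described above; `Small`, `Large` and `inIsRef` are hypothetical names, and the result for `Large` is deliberately not stated because it depends on compiler, target and switches.

```d
import std.stdio;

struct Small { int value; }
struct Large { int[64] data; }

// Reports whether the `in` parameter ended up as a reference for this T.
bool inIsRef(T)(in T x)
{
    return __traits(isRef, x);
}

void main()
{
    Small s;
    Large l;
    writeln("Small passed by ref: ", inIsRef(s));
    writeln("Large passed by ref: ", inIsRef(l));
}
```

Inside a template the same trait could drive a `static if` or a `static assert` when the code genuinely depends on one passing mode.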
Oct 05 2020
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 Plucking a fitting example from a Phobos unittest that demonstrates the kind of 
 things DMD lets people get away with:
 ```
 RefCounted!int* p;
 {
      auto rc1 = RefCounted!int(5);
      p = &rc1;
      assert(rc1 == 5);
      assert(rc1._refCounted._store._count == 1);
      auto rc2 = rc1;
      assert(rc1._refCounted._store._count == 2);
 }
 assert(p._refCounted._store == null);
 ```
 Why is this not a compile-time error?  DMD just isn't punishing users enough for 
 the buggy code they've written.
It's not allowed in @safe code.
Oct 05 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 I've also skimmed past a passing concern that the ABI between compilers would be 
 different.  Well, let me rest assure you that DMD, GDC and LDC have never been 
 compatible in the first place, so there's no point worrying about that now.
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Oct 05 2020
next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 I've also skimmed past a passing concern that the ABI between 
 compilers would be different.  Well, let me rest assure you 
 that DMD, GDC and LDC have never been compatible in the first 
 place, so there's no point worrying about that now.
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Granted this is a new preview feature, I can only see it being healthy if each vendor tries something different and reports back on how much success they had with it. So let other vendors implement as they interpret the spec, and we can converge later based on success/failings of our given decisions. I imagine we are 10 releases of DMD away from even considering whether or not this should come out of `-preview`.
Oct 05 2020
prev sibling next sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 I've also skimmed past a passing concern that the ABI between 
 compilers would be different.  Well, let me rest assure you 
 that DMD, GDC and LDC have never been compatible in the first 
 place, so there's no point worrying about that now.
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Furthermore, NRVO is implementation defined, which results in the same dmd/gdc/ldc making different decisions for "to ref or not to ref", and yet I see no one complaining about that.
Oct 05 2020
next sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Monday, 5 October 2020 at 08:34:37 UTC, Iain Buclaw wrote:
 On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 I've also skimmed past a passing concern that the ABI between 
 compilers would be different.  Well, let me rest assure you 
 that DMD, GDC and LDC have never been compatible in the first 
 place, so there's no point worrying about that now.
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Furthermore, NRVO is implementation defined, which results in the same dmd/gdc/ldc making different decisions for "to ref or not to ref", and yet I see no one complaining about that.
I cannot resist the bait: https://issues.dlang.org/show_bug.cgi?id=20752
Oct 05 2020
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 1:42 AM, Mathias LANG wrote:
 I cannot resist the bait: https://issues.dlang.org/show_bug.cgi?id=20752
And you shouldn't. If there's a memory corruption issue in the language, file a bug report.
Oct 05 2020
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 08:42:21 UTC, Mathias LANG wrote:
 On Monday, 5 October 2020 at 08:34:37 UTC, Iain Buclaw wrote:
 On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 [...]
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Furthermore, NRVO is implementation defined, which results in the same dmd/gdc/ldc making different decisions for "to ref or not to ref", and yet I see no one complaining about that.
I cannot resist the bait: https://issues.dlang.org/show_bug.cgi?id=20752
Yeah, but isReturnOnStack and NRVO are two different things. NRVO avoids a copy before passing. :-)
Oct 05 2020
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/5/20 4:42 AM, Mathias LANG wrote:
 On Monday, 5 October 2020 at 08:34:37 UTC, Iain Buclaw wrote:
 On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 On 10/4/2020 7:19 AM, Iain Buclaw wrote:
 I've also skimmed past a passing concern that the ABI between 
 compilers would be different.  Well, let me rest assure you that 
 DMD, GDC and LDC have never been compatible in the first place, so 
 there's no point worrying about that now.
The problem is not dmd compiled code calling gdc/ldc compiled code. The problem is code that works with one compiler fails with another, because the "to ref or not to ref" decision is *implementation defined*.
Furthermore, NRVO is implementation defined, which results in the same dmd/gdc/ldc making different decisions for "to ref or not to ref", and yet I see no one complaining about that.
I cannot resist the bait: https://issues.dlang.org/show_bug.cgi?id=20752
So now it's up to us to decide whether we want less of that or more of that.
Oct 05 2020
parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 12:04:51 UTC, Andrei Alexandrescu 
wrote:
 So now it's up to us to decide whether we want less of that or 
 more of that.
There are a number of NRVO tests that I've fixed up in the testsuite because one type that is passed in memory on x86_64 is passed in registers on SPARC64. Or the DMD x86_64 ABI was at the time incomplete and wrongly passed a certain type in memory. :-)
Oct 05 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 1:34 AM, Iain Buclaw wrote:
 Furthermore, NRVO is implementation defined, which results in the same 
 dmd/gdc/ldc making different decisions for "to ref or not to ref", and yet I see 
 no one complaining about that.
I don't see that as a to ref or not decision. It is an issue of how many copies are made, there shouldn't be dangling references to those copies, or it shouldn't NRVO it. I don't recall any case of NRVO breaking code since I invented it 30 years ago, other than something that relied on the number of copies.
Oct 05 2020
parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 09:53:37 UTC, Walter Bright wrote:
 On 10/5/2020 1:34 AM, Iain Buclaw wrote:
 Furthermore, NRVO is implementation defined, which results in 
 the same dmd/gdc/ldc making different decisions for "to ref or 
 not to ref", and yet I see no one complaining about that.
I don't see that as a to ref or not decision. It is an issue of how many copies are made, there shouldn't be dangling references to those copies, or it shouldn't NRVO it. I don't recall any case of NRVO breaking code since I invented it 30 years ago, other than something that relied on the number of copies.
I don't consider there to be any difference between the two as far as parameter passing is concerned. As I understood from the review, the point of ref passing is to elide copies. Because this is allowed as an optimization only, none of what it does should spill out into user code. If people notice then something has gone wrong in the implementation.
Oct 05 2020
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 4:32 AM, Iain Buclaw wrote:
 I don't consider there to be any difference between the two as far as parameter 
 passing is concerned.  As I understood from the review, the point of ref passing 
 is to elide copies.
I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.
 Because this is allowed as an optimization only, none of
 what it does should spill out into user code.  If people notice then something
 has gone wrong in the implementation.
The examples posted here show it DOES. If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.
Oct 05 2020
next sibling parent reply Iain Buclaw <ibuclaw gdcproject.org> writes:
On Monday, 5 October 2020 at 17:18:57 UTC, Walter Bright wrote:
 On 10/5/2020 4:32 AM, Iain Buclaw wrote:
 I don't consider there to be any difference between the two as 
 far as parameter passing is concerned.  As I understood from 
 the review, the point of ref passing is to elide copies.
I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.
 Because this is allowed as an optimization only, none of
 what it does should spill out into user code.  If people notice then something
 has gone wrong in the implementation.
The examples posted here show it DOES. If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.
None of the posted examples I've seen here affect the implementation being trialed in GDC. Though I've already said that I'm likely being more conservative than DMD.
Oct 05 2020
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/5/2020 10:46 AM, Iain Buclaw wrote:
 None of the posted examples I've seen here affect the implementation being 
 trialed in GDC.  Though I've already said that I'm likely being more 
 conservative than DMD.
I could implement it in DMD by whenever the spec allows it to be by value, doing it by value. Then the problem would never happen. (This is how I implemented __restrict__ in Digital Mars C.) But that would defeat the purpose of the feature. And since the specification allows it to be ref, but does not require it, I would be forced to disallow `in` in any code under my purview.
Oct 05 2020
prev sibling parent Mathias LANG <geod24 gmail.com> writes:
On Monday, 5 October 2020 at 17:18:57 UTC, Walter Bright wrote:
 On 10/5/2020 4:32 AM, Iain Buclaw wrote:
 I don't consider there to be any difference between the two as 
 far as parameter passing is concerned.  As I understood from 
 the review, the point of ref passing is to elide copies.
I see a major difference, as relying on the number of copies is not the same as memory corruption. Eliding copies is the bread and butter of optimizers, btw.
 Because this is allowed as an optimization only, none of
 what it does should spill out into user code.  If people notice then something
 has gone wrong in the implementation.
The examples posted here show it DOES. If an `in` passes by `const ref`, and another mutable reference to the same memory object decides to free the memory, the `in` reference now is a live dangling pointer.
The complaints seem to be about observable difference, not memory corruption. It was suggested a few times that `in` should just be `ref`. If the current status is really unworkable (but again, I recommend everyone give it a try first), that would be my preferred course of action. The issue of freeing live data is not specific to `in`, it shows up with `ref` and pointers as well.
Oct 06 2020
prev sibling parent reply IGotD- <nise nise.com> writes:
On Monday, 5 October 2020 at 07:55:57 UTC, Walter Bright wrote:
 The problem is not dmd compiled code calling gdc/ldc compiled 
 code. The problem is code that works with one compiler fails 
 with another, because the "to ref or not to ref" decision is 
 *implementation defined*.
Ouch, so this means that how 'in' works must be a defined standard that all D compilers adhere to. Also this standard must be defined for each CPU architecture.
Oct 05 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Monday, 5 October 2020 at 10:33:46 UTC, IGotD- wrote:
 Ouch, so this mean that how 'in' works must be a defined 
 standard that all D compilers adhere to. Also this standard 
 must be defined for each CPU architecture.
It breaks for one compiler too because of templating. If all unit tests only trigger ref passing, and it is used with a smaller template param then it will run with value passing...
Oct 05 2020
prev sibling parent Atila Neves <atila.neves gmail.com> writes:
On Friday, 2 October 2020 at 22:11:01 UTC, Walter Bright wrote:
 On 10/2/2020 10:31 AM, Steven Schveighoffer wrote:
 And this might not be true on a different compiler.
This is looking like a serious problem.
I agree. I did mention at the time that how parameters are actually passed should be left to the backend.
Oct 07 2020
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer 
wrote:
 I think in should always mean ref. If you want to pass not by 
 ref, use const.
Please do not kill this optimization. There is currently no other reasonable way available in the D language to express the idea of "pass by reference or by value, whichever is faster". But, there is an easy way to require that a parameter always be passed in a specific way: just don't use `in` and use `scope const` or `scope const ref` directly. Implementing this optimization manually is quite painful in D; the possibilities are complex and very ugly, and require detailed knowledge of the platform's ABI to implement correctly in the general case. D is a systems programming language, and should not kill useful features just because they have potential pitfalls. At most, problematic features may be blocked by @safe/linters/style guides, but please don't remove them entirely.
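For illustration, a hand-written version of "by reference or by value, whichever is faster" might look like the sketch below. The two-machine-word threshold is borrowed from kinke's description of DMD 2.094 elsewhere in this thread and is only a stand-in for real ABI knowledge; `preferRef` and `process` are hypothetical names.

```d
import std.stdio;

struct Small { int value; }
struct Large { int[64] data; }

// Assumed heuristic: anything larger than two machine words goes by reference.
// A faithful version would have to encode each platform's ABI rules.
enum bool preferRef(T) = T.sizeof > 2 * size_t.sizeof;

// By-value overload for small types.
bool process(T)(scope const(T) x) if (!preferRef!T)
{
    return __traits(isRef, x); // false: x is a copy
}

// By-reference overload for large types.
bool process(T)(scope ref const(T) x) if (preferRef!T)
{
    return __traits(isRef, x); // true: x refers to the caller's object
}

void main()
{
    Small s;
    Large l;
    writeln(process(s)); // prints false
    writeln(process(l)); // prints true
}
```

Every such function needs two overloads with duplicated bodies, and the by-reference overload only accepts lvalues unless further overloads are added, which gives a sense of why doing this by hand is unattractive.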
Oct 02 2020
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 19:54:35 UTC, tsbockman wrote:
 On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer 
 wrote:
 D is a systems programming language, and should not kill useful 
 features just because they have potential pitfalls.
Systems programming implies explicitness, this feature is the opposite.
Oct 02 2020
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 2 October 2020 at 20:00:46 UTC, Ola Fosheim Grøstad 
wrote:
 Systems programming implies explicitness, this feature is the 
 opposite.
Systems programming requires explicit control to be *available*. But, the programmer shouldn't be *forced* to explicitly specify every tedious detail to get the desired result.
Oct 02 2020
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 22:04:14 UTC, tsbockman wrote:
 On Friday, 2 October 2020 at 20:00:46 UTC, Ola Fosheim Grøstad 
 wrote:
 Systems programming implies explicitness, this feature is the 
 opposite.
Systems programming requires explicit control to be *available*. But, the programmer shouldn't be *forced* to explicitly specify every tedious detail to get the desired result.
Yes, but if you want pass-by-value for small structs all the time then you need to define enabling constraints as undefined behaviour. The only viable alternative is to only get the optimization when it provably has the same effect. Those are the only possible options (even in theory).
Oct 02 2020
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 2 October 2020 at 22:27:33 UTC, Ola Fosheim Grøstad 
wrote:
 Yes, but if you want pass-by-value for small structs all the time 
 then you need to define enabling constraints as undefined 
 behaviour.
That's fine. I never suggested otherwise.
 The only viable alternative is to only get the optimization 
 when it provably has the same effect.
As I said earlier in this thread, if people are very worried about the undefined behavior, then perhaps the cases which are not proven by the compiler to have the same effect should be @system. There are essential features in D already that cannot be verified by the compiler; that's what @system is for.
Oct 02 2020
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 22:46:11 UTC, tsbockman wrote:
 On Friday, 2 October 2020 at 22:27:33 UTC, Ola Fosheim Grøstad 
 wrote:
 As I said earlier in this thread, if people are very worried 
 about the undefined behavior, then perhaps the cases which are 
 not proven by the compiler to have the same effect should be 
 @system. There are essential features in D already that cannot 
 be verified by the compiler; that's what @system is for.
I have no issues with undefined behaviour as long as it is easy to understand and explain, like requiring 'in' params to be nonaliased in the function body. It has to be easy to grok and remember. However the D community tends to want D to distinguish itself from C++ in this regard.
Oct 02 2020
next sibling parent reply Guillaume Piolat <first.name gmail.com> writes:
On Friday, 2 October 2020 at 23:03:49 UTC, Ola Fosheim Grøstad 
wrote:
 I have no issues with undefined behaviour as long as it is easy 
 to understand and explain, like requiring 'in' params to be 
 nonaliased in the function body. It has to be easy to grok and 
 remember.
Or the even simpler rule: don't use 'in' :)
Oct 02 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 00:35:24 UTC, Guillaume Piolat 
wrote:
 On Friday, 2 October 2020 at 23:03:49 UTC, Ola Fosheim Grøstad 
 wrote:
 I have no issues with undefined behaviour as long as it is 
 easy to understand and explain, like requiring 'in' params to 
 be nonaliased in the function body. It has to be easy to grok 
 and remember.
Or the even simpler rule: don't use 'in' :)
Actually, if "in" implies non-aliased you might want to use it for DSP buffers because modern backends can then more easily generate SIMD code when the buffer is non-aliased. :-)
Oct 03 2020
prev sibling parent reply Daniel N <no public.email> writes:
On Friday, 2 October 2020 at 23:03:49 UTC, Ola Fosheim Grøstad 
wrote:
 On Friday, 2 October 2020 at 22:46:11 UTC, tsbockman wrote:
 On Friday, 2 October 2020 at 22:27:33 UTC, Ola Fosheim Grøstad 
 wrote:
 As I said earlier in this thread, if people are very worried 
 about the undefined behavior, then perhaps the cases which are 
 not proven by the compiler to have the same effect should be 
 @system. There are essential features in D already that cannot 
 be verified by the compiler; that's what @system is for.
I have no issues with undefined behaviour as long as it is easy to understand and explain, like requiring 'in' params to be nonaliased in the function body. It has to be easy to grok and remember. However the D community tends to want D to distinguish itself from C++ in this regard.
How about this? You can freely mix any number of 'in' and 'out' parameters and keep the optimized behavior. As soon as you add any 'ref' parameter 'in' will pass by value.
Oct 03 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 08:51:26 UTC, Daniel N wrote:
 You can freely mix any number of 'in' and 'out' parameters and 
 keep the optimized behavior.

 As soon as you add any 'ref' parameter 'in' will pass by value.
I don't know what would be the best mix. :-) But it is interesting to think about various options that could work. There are many ways to make it a little better, but very difficult to come up with something that feels right. For instance it is possible to use the type system to prove that in some cases aliasing is not possible simply because the types of objects that are reachable from parameter 1 are all different from the types reachable from parameter 2. Then you know that there can be no aliasing between the parameters. Unfortunately that excludes way too many valid cases where this does not hold. And well, you also need to consider aliasing with globals that are accessed within the function.
Oct 03 2020
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 3:54 PM, tsbockman wrote:
 On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer wrote:
 I think in should always mean ref. If you want to pass not by ref, use 
 const.
Please do not kill this optimization. There is currently no other reasonable way available in the D language to express the idea of "pass by reference or by value, whichever is faster". But, there is an easy way to require that a parameter always be passed in a specific way: just don't use `in` and use `scope const` or `scope const ref` directly. Implementing this optimization manually is quite painful in D; the possibilities are complex and very ugly, and require detailed knowledge of the platform's ABI to implement correctly in the general case. D is a systems programming language, and should not kill useful features just because they have potential pitfalls. At most, problematic features may be blocked by @safe/linters/style guides, but please don't remove them entirely.
The only way this feature can stay the way it is is to add to the undefined behavior of the language: "It is undefined behavior to pass a mutable argument to an `in` parameter, and to read from that parameter after that mutable data has been modified while executing the function." If everyone is OK with that, then it can stay the way it is. Otherwise, the spec has to be more specific about whether it's expected to be ref or not ref. -Steve
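For comparison, here is a minimal sketch of the two explicit spellings mentioned above, `scope const` and `scope const ref`. The function names `readByValue` and `readByRef` are made up for illustration. Since neither uses `in`, the result should not depend on -preview=in or on the size of the type:

```d
import std.stdio;

struct S(size_t elems)
{
    int[elems] data;
}

// Explicit pass by value: `constdata` is a private copy, so the write
// through `normaldata` is not visible through it.
int readByValue(T)(scope const T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

// Explicit pass by reference: `constdata` refers to the caller's object,
// so the write through `normaldata` is visible through it.
int readByRef(T)(scope const ref T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    return constdata.data[0];
}

void main()
{
    S!1 small1, small2;
    S!100 large1, large2;

    writeln(readByValue(small1, small1)); // 0
    writeln(readByValue(large1, large1)); // 0
    writeln(readByRef(small2, small2));   // 1
    writeln(readByRef(large2, large2));   // 1
}
```

Fresh variables are used for each call so that an earlier write does not influence a later read.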
Oct 02 2020
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 10/2/20 4:06 PM, Steven Schveighoffer wrote:
 "It is undefined behavior to pass a mutable argument to an `in` 
 parameter, and to read from that parameter after that mutable data has 
 been modified while executing the function."
It should actually be "a mutable argument that has no destructor, no postblit, no copy constructor, is copyable, and is not a reference type", since all the other cases are explicitly defined by the spec to be ref or not ref. -Steve
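As a rough sketch, that category of types can be approximated with `std.traits`. The names `isReferenceLike` and `passingLeftToCompiler` are hypothetical helpers, not part of Phobos or the spec, and the list of "reference types" used here (classes, interfaces, pointers, dynamic arrays, associative arrays) is an assumption about what the spec means:

```d
import std.stdio : writeln;
import std.traits : hasElaborateCopyConstructor, hasElaborateDestructor,
    isAssociativeArray, isCopyable, isDynamicArray, isPointer;

// Hypothetical helper: types assumed to count as "reference types" here.
enum bool isReferenceLike(T) = is(T == class) || is(T == interface)
    || isPointer!T || isDynamicArray!T || isAssociativeArray!T;

// Hypothetical helper: true when none of the conditions that pin down
// ref vs. non-ref applies, i.e. the choice is left to the compiler.
enum bool passingLeftToCompiler(T) =
    !hasElaborateDestructor!T
    && !hasElaborateCopyConstructor!T
    && isCopyable!T
    && !isReferenceLike!T;

struct Plain        { int[100] data; }
struct WithDtor     { int x; ~this() {} }
struct WithPostblit { int x; this(this) {} }
struct NoCopy       { int x; @disable this(this); }
class  Klass        {}

void main()
{
    writeln(passingLeftToCompiler!Plain);        // true
    writeln(passingLeftToCompiler!int);          // true
    writeln(passingLeftToCompiler!WithDtor);     // false
    writeln(passingLeftToCompiler!WithPostblit); // false
    writeln(passingLeftToCompiler!NoCopy);       // false
    writeln(passingLeftToCompiler!Klass);        // false
    writeln(passingLeftToCompiler!(int[]));      // false
}
```

Only the types for which this yields `true` would be affected by the aliasing problem from the original example.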
Oct 02 2020
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 02.10.20 22:06, Steven Schveighoffer wrote:
 On 10/2/20 3:54 PM, tsbockman wrote:
 On Friday, 2 October 2020 at 14:48:18 UTC, Steven Schveighoffer wrote:
 I think in should always mean ref. If you want to pass not by ref, 
 use const.
Please do not kill this optimization. There is currently no other reasonable way available in the D language to express the idea of "pass by reference or by value, whichever is faster". But, there is an easy way to require that a parameter always be passed in a specific way: just don't use `in` and use `scope const` or `scope const ref` directly. Implementing this optimization manually is quite painful in D; the possibilities are complex and very ugly, and require detailed knowledge of the platform's ABI to implement correctly in the general case. D is a systems programming language, and should not kill useful features just because they have potential pitfalls. At most, problematic features may be blocked by @safe/linters/style guides, but please don't remove them entirely.
The only way this feature can stay the way it is is to add to the undefined behavior of the language: ...
No. UB means demons may fly out of your nose. It's not that. You just get one of two behaviors, one is pass-by-reference, the other is pass-by-value.
 "It is undefined behavior to pass a mutable argument to an `in` 
 parameter, and to read from that parameter after that mutable data has 
 been modified while executing the function."
 
 If everyone is OK with that, then it can stay the way it is. Otherwise, 
 the spec has to be more specific about whether it's expected to be ref 
 or not ref.
 
 -Steve
There's a difference between "the behavior may be either A or B" and "the behavior may be anything you like"...
Oct 03 2020
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 11:04:48 UTC, Timon Gehr wrote:
 No. UB means demons may fly out of your nose. It's not that. 
 You just get one of two behaviors, one is pass-by-reference, 
 the other is pass-by-value.
UB just means that it is left out of the language. UB does not mean that implementors cannot specify what will happen. That is a complete misunderstanding of the term. The fact that Clang exploits UB to achieve higher performance in the optimizer is a deliberate choice they made. It is not a consequence of UB in the language spec per se. It is a consequence of deliberate optimization efforts.
Oct 03 2020
prev sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Saturday, 3 October 2020 at 11:04:48 UTC, Timon Gehr wrote:
 There's a difference between "the behavior may be either A or 
 B" and "the behavior may be anything you like"...
And just to be clear on this: the example Steven gave showed code that should be defined to be outside of the language. It is not code that anyone would want to pass the compilation stage if the compiler could be made smart enough to detect it. The moment you say the undesirable behaviour is within the language, you have effectively made the example valid, and thus an improved compiler cannot reject it. If it is within the language, an improved compiler would be a breaking change. The value-implementation of reference passing is different, because such implementation-specific behaviour may be desirable! But the code Steven provided was not desirable to allow as a "valid text" in the language. (A language spec can provide details of how the runtime should behave when the compiler fails to detect whether code is within the language or not. Then you just label "undefined behaviour" as "illegal constructs", but it is in essence the same thing.)
Oct 03 2020
prev sibling next sibling parent reply Max Haughton <maxhaton gmail.com> writes:
On Friday, 2 October 2020 at 14:32:50 UTC, Andrei Alexandrescu 
wrote:
 On 10/2/20 10:08 AM, Steven Schveighoffer wrote:
 Is there a way to prevent this?
 
 import std.stdio;
 struct S(size_t elems)
 {
      int[elems] data;
 }
 
 void foo(T)(in T constdata, ref T normaldata)
 {
      normaldata.data[0] = 1;
      writeln(constdata.data[0]);
 }
 void main()
 {
      S!1 smallval;
      foo(smallval, smallval);
      S!100 largeval;
      foo(largeval, largeval);
 }
 
 
 Compile without -preview=in, it prints:
 
 0
 0
 
 Compile with -preview=in, it prints:
 
 0
 1
Finally, my "told you so" moment has come! :o) https://forum.dlang.org/post/rhmst4$1vmc$1@digitalmars.com
Could we be ambitious and aim to have ownership taken to the max and catch this statically? This particular case is relatively low hanging fruit but having the in parameter work this way would be nice if it was safe.
Oct 02 2020
parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 2 October 2020 at 17:16:46 UTC, Max Haughton wrote:
 Could we be ambitious and aim to have ownership taken to the 
 max and catch this statically? This particular case is 
 relatively low hanging fruit but having the in parameter work 
 this way would be nice if it was safe.
For which parameter types should such ownership checking be performed? All kinds of reference types, including ref params, classes and pointers, or a subset of them?
Oct 02 2020
parent Max Haughton <maxhaton gmail.com> writes:
On Friday, 2 October 2020 at 18:31:20 UTC, Per Nordlöw wrote:
 On Friday, 2 October 2020 at 17:16:46 UTC, Max Haughton wrote:
 Could we be ambitious and aim to have ownership taken to the 
 max and catch this statically? This particular case is 
 relatively low hanging fruit but having the in parameter work 
 this way would be nice if it was safe.
For which parameter types should such ownership checking be performed? All kinds of reference types, including ref params, classes and pointers, or a subset of them?
I think for any scheme to be successful it would (within @safe code) have to cover pretty much everything, from the parameter all the way up to the allocation (be that a class, pointer to struct etc.). A more limited ownership system could still be very useful (what exists now is getting there), but to *guarantee* safety it must go further. Obviously this would be a huge task, not impossible (we have some very clever people) but big, and if it happens at some point it would need to be very thoroughly planned, i.e. from the "basics" like how to handle malloc and free (or your allocator of choice) to the more subtle issues like where to make the surgical cuts to the language's design. D only sort of has move semantics at the moment, which (recall that assignment in Rust is move by default) at a glance makes a provably safe system more difficult (especially wrt types that own pointers, I think). On a more meta level I think dmd needs to be carefully structured to separate the AST from the analysis. I'm aware that talk is cheap, but it could get very ugly if done wrong.
Oct 02 2020
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/2/2020 7:32 AM, Andrei Alexandrescu wrote:
 Finally, my "told you so" moment has come! :o)
 
 https://forum.dlang.org/post/rhmst4$1vmc$1@digitalmars.com
This should be diagnosed by -preview=dip1021. However, DIP1021 will only stop the obvious cases of it. It can be more subtle, and those cases can only be stopped by @live. https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
Oct 02 2020
prev sibling next sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 2 October 2020 at 14:08:29 UTC, Steven Schveighoffer 
wrote:
 Compile without -preview=in, it prints:
 ...
 -Steve
I wonder how Ada handles multiple aliasing of the same argument to a set of reference (`aliased`) parameters [1]. Does it have the same ownership and borrowing rules as Rust? I need to check...

[1] https://en.wikibooks.org/wiki/Ada_Programming/Subprograms: "Explicitly aliased parameters and access parameters specify pass by reference."
Oct 02 2020
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 20:44:17 UTC, Per Nordlöw wrote:
 On Friday, 2 October 2020 at 14:08:29 UTC, Steven Schveighoffer 
 wrote:
 Compile without -preview=in, it prints:
 ...
 -Steve
I wonder how Ada handles multiple aliasing of the same argument to a set of reference (`aliased`) parameters [1]. Does it have the same ownership and borrowing rules as Rust? I need to check...
SPARK verifies that 'in out' parameters do not alias other parameters or globals, IIRC.
Oct 02 2020
prev sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Friday, 2 October 2020 at 14:08:29 UTC, Steven Schveighoffer 
wrote:
 Is there a way to prevent this?

 import std.stdio;
 struct S(size_t elems)
 {
     int[elems] data;
 }

 void foo(T)(in T constdata, ref T normaldata)
 {
     normaldata.data[0] = 1;
     writeln(constdata.data[0]);
 }
 void main()
 {
     S!1 smallval;
     foo(smallval, smallval);
     S!100 largeval;
     foo(largeval, largeval);
 }


 Compile without -preview=in, it prints:

 0
 0

 Compile with -preview=in, it prints:

 0
 1

 -Steve
So what do we make of the following?

```
import std.stdio;
struct S(size_t elems)
{
    int[elems] data;
    S copy () { return this; }
}

void foo(T)(auto ref T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    writeln(constdata.data[0]);
}
void main()
{
    S!1 smallval;
    foo(smallval.copy, smallval);
    foo(smallval, smallval);
}
```

To me, the `in` approach is better. From the point of view of `foo`'s implementer, you get a stable `ref` or non-`ref` (not both!). You can also rely on it being `ref` if you know more about the type (e.g. if it has a destructor). From the point of view of the caller, calling the function with the same values will always behave the same, no matter if you pass rvalues or lvalues.
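To make the `auto ref` behaviour visible, here is a small variation on that example. It is only a sketch: the `Observation` struct is a made-up helper, and `foo` returns its result instead of printing it. `__traits(isRef, ...)` reports which of the two instantiations was picked for a given call:

```d
import std.stdio;

struct S(size_t elems)
{
    int[elems] data;
    S copy() { return this; }
}

// Hypothetical helper type: how `constdata` was bound, and what was read.
struct Observation
{
    bool byRef;
    int value;
}

Observation foo(T)(auto ref T constdata, ref T normaldata)
{
    normaldata.data[0] = 1;
    // With `auto ref`, lvalue arguments bind by ref and rvalues by value.
    return Observation(__traits(isRef, constdata), constdata.data[0]);
}

void main()
{
    S!1 a;
    writeln(foo(a.copy, a)); // rvalue argument: byRef == false, value == 0

    S!1 b;
    writeln(foo(b, b));      // lvalue argument: byRef == true, value == 1
}
```

So with `auto ref` the observed value depends on whether the caller passes an rvalue or an lvalue, independent of the size of `S`.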
Oct 02 2020
parent reply Daniel N <no public.email> writes:
On Friday, 2 October 2020 at 21:13:29 UTC, Mathias LANG wrote:
 To me, the `in` approach is better. From the point of view of 
 `foo`'s implementer, you get a stable `ref` or non-`ref` (not 
 both!). You can also rely on it being `ref` if you know more 
 about the type (e.g. if it has a destructor).
 From the point of view of the caller, calling the function with 
 the same values will always behave the same, no matter if you 
 pass rvalues or lvalues.
I totally agree. Also see the formally accepted DIP1021! https://github.com/dlang/DIPs/blob/148001a963f5d6e090bb6beef5caf9854372d0bc/DIPs/accepted/DIP1021.md "Therefore, if more than one reference to the same data is passed to a function, they must all be const."
Oct 02 2020
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 2 October 2020 at 21:57:06 UTC, Daniel N wrote:
 "Therefore, if more than one reference to the same data is 
 passed to a function, they must all be const."
Impossible to prove at compile time in the general case. It is also way too expensive to check at runtime, as the aliasing can happen deep down in a graph. So you need to specify it as undefined behaviour.
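To illustrate the runtime side, here is a sketch of a shallow check that is cheap but misses exactly the deep case. `overlaps` and `Node` are hypothetical names made up for this example. The check only compares the memory ranges of the two top-level objects:

```d
import std.stdio;

// Hypothetical helper: true if the two objects occupy overlapping memory.
// This is a shallow check; it does not follow pointers.
bool overlaps(A, B)(ref const A a, ref const B b)
{
    const size_t pa = cast(size_t) &a;
    const size_t pb = cast(size_t) &b;
    return pa < pb + B.sizeof && pb < pa + A.sizeof;
}

struct Node
{
    int value;
    Node* next;
}

void main()
{
    Node a, b;
    writeln(overlaps(a, a)); // true: the very same object
    writeln(overlaps(a, b)); // false: two distinct objects

    // Two distinct nodes sharing one tail: the shallow check reports no
    // overlap, yet both arguments can reach the same mutable data.
    Node tail;
    Node x = Node(1, &tail);
    Node y = Node(2, &tail);
    writeln(overlaps(x, y));     // false
    writeln(x.next is y.next);   // true
}
```

Catching the last case would require walking everything reachable from both arguments, which is the expensive part.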
Oct 02 2020
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/2/2020 2:57 PM, Daniel N wrote:
 Also see the formally accepted DIP1021!
 https://github.com/dlang/DIPs/blob/148001a963f5d6e090bb6beef5caf9854372d0bc/DIPs/accepted/DIP1021.md
That picks up the obvious cases, but the more subtle ones require @live's Data Flow Analysis to find.
Oct 04 2020
parent Daniel N <no public.email> writes:
On Sunday, 4 October 2020 at 08:57:01 UTC, Walter Bright wrote:
 On 10/2/2020 2:57 PM, Daniel N wrote:
 Also see the formally accepted DIP1021!
 https://github.com/dlang/DIPs/blob/148001a963f5d6e090bb6beef5caf9854372d0bc/DIPs/accepted/DIP1021.md
That picks up the obvious cases, but the more subtle ones require @live's Data Flow Analysis to find.
Yes, that's true. My point is that we are already working towards solving the very same issue in a different context. DIP1021 is the first step, @live is the final step. There's nothing inherently wrong with "-preview=in", but as currently designed it depends on the solution provided by @live. It's also possible to solve this in simpler ways: since "-preview=in" is just an optimization, one could simply disable it whenever it's not trivial to deduce that passing by ref is safe. With every new advance in the compiler it could be made more clever...
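A library-level sketch of the simplest such rule ("fall back to by-value as soon as there is a mutable `ref` parameter") could look like the following. This only illustrates the decision rule, not how the compiler implements `in`. `hasMutableRefParam` is a hypothetical helper, and the functions deliberately spell out `const ref` instead of using `in`:

```d
import std.stdio : writeln;
import std.traits : Parameters, ParameterStorageClass,
    ParameterStorageClassTuple;

struct S { int[100] data; }

// Hypothetical helper: true if `fn` takes at least one parameter by `ref`
// whose type is neither const nor immutable, i.e. a parameter through
// which the function could modify the caller's data.
bool hasMutableRefParam(alias fn)()
{
    alias stcs = ParameterStorageClassTuple!fn;
    alias Params = Parameters!fn;
    bool found = false;
    static foreach (i; 0 .. Params.length)
    {
        static if ((stcs[i] & ParameterStorageClass.ref_) != 0
                   && !is(Params[i] == const) && !is(Params[i] == immutable))
            found = true;
    }
    return found;
}

void onlyReads(const ref S a, const ref S b) {}
void mayAlias(const ref S a, ref S b) {}

void main()
{
    // No mutable ref parameter: passing by reference cannot be observed.
    writeln(hasMutableRefParam!onlyReads()); // false
    // A mutable ref parameter: aliasing could be observed, so under this
    // rule the read-only parameter would be passed by value instead.
    writeln(hasMutableRefParam!mayAlias());  // true
}
```

Such a rule ignores aliasing through globals or through pointers inside the arguments, so it would only be a conservative first step.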
Oct 04 2020