digitalmars.D - D-ark corners - algorithms, ranges, and auto ref

digitalmars.D - D-ark corners - algorithms, ranges, and auto ref - best practices

aliak (62/62) Jul 23 2019 Hi, I recently ran in to a problem with stack corruption [0]. I

ag0aep6g (8/20) Jul 23 2019 [...]

aliak (26/48) Jul 24 2019 Hmm, if the spec is not wrong, and return should be inferred,

Timon Gehr (3/8) Jul 23 2019 I think in this case the specification is at fault. It's perfectly fine

aliak (3/13) Jul 24 2019 Void would not be a ref so return would not be inferred

Timon Gehr (2/16) Jul 24 2019 No? The spec says that inout ref parameters imply the return attribute.

Aliak (2/20) Jul 24 2019 Ah right! Parameters.

H. S. Teoh (14/19) Jul 23 2019 [...]

aliak (4/17) Jul 24 2019 Not compiling with those no.

Sebastiaan Koppe (11/14) Jul 24 2019 You probably should.

aliak (8/15) Jul 24 2019 You're probably right :)

Sebastiaan Koppe (21/30) Jul 24 2019 It is not a bug. It is just a result of compiler flags, inferred

H. S. Teoh (16/24) Jul 24 2019 [...]

aliak (35/60) Jul 25 2019 Unfortunately, the code does break because of dependencies.

aliak <something something.com> writes:

Hi, I recently ran in to a problem with stack corruption [0]. I 
basically used up more time than I'd like to admit trying to 
figure out heisenbugs - segfaults, out of memory exceptions, 
thread crashes, and once I saw an overlapping memory access error 
(which I'd never seen before, and never saw again).

Anyway, this is what I narrowed it down to eventually in client 
code.

 safe:
auto foundValue = makeRange("one")
   .algorithm;
useValue(foundValue); // BOOM

With some help from the forums, the essence of the problem was 
the following:

1. algorithm returned by auto ref.
2. the range's front was "ref front() {...}"

What was happening was that madeRange made a temporary and passed 
it on to algorithm, which also saw a temporary. The return type 
on algorithm was auto ref, and since front on the range also 
returned a ref, I was returning a ref to a value in a temporary. 
Algorithm was in a dependency, and the range type was inside a 
different dependency (confession - I wrote both those 
dependencies but, heh, anyway)

The solutions were to either:

1. slap a return on the front function: ref T front() return 
{...} (I think - not even sure)
2. remove auto ref from algorithm and return by value

There are a number of issues, and also a number of questions.

- The problem with solution 1 is that you usually do not have 
control over libraries and their ranges. They may have been 
declared with return, they may have not. I went through Phobos 
and saw a lot of ref front() {...}
- The problem with solution 2 is that you want to return by ref 
so you can to avoid copying.

(Side note: the specification for return ref - 
https://dlang.org/spec/function.html#return-ref-parameters - says 
that "inout ref parameters imply the return attribute.", but I 
got corruption with "ref inout(T) front() inout {...}". Is that a 
bug with inout?)

The access pattern is safe when:
- The object that is passed in to algorithm is not a temporary.
- The algorithm returns by copy

Best practices seem to be:
- If you are returning by ref from any object accessor, always 
use the return qualifier. Is there any reason not to?

The only reliable way to solve this from an algorithm author 
point of view seems to be to have the algorithm return by value. 
But then you sacrifice efficiency. Are there any other ways? Can 
it ever be safe to return by auto ref from a generic algorithm?

And finally, I've been keeping note of various gotchas in D, and 
traps, etc [1] and I'm wondering if anyone else takes note of 
these things? Can you post links to them here if you do?

I know about the D-idioms thing [2] (great info!!). But it feels 
that when you write libraries for D, there're a lot of nuanced 
and non-obvious gotchas like these and they're not documented 
anywhere that I know of. Should we start doing this somewhere - 
the name of which is in the title of the post :)

Cheers,
- Ali

[0]: 
https://forum.dlang.org/thread/hhxabgmrwvqvmezezyym forum.dlang.org
[1]: https://github.com/aliak00/d-isms/tree/master/da-faq
[2]: https://p0nce.github.io/d-idioms/

Jul 23 2019

ag0aep6g <anonymous example.com> writes:

On 23.07.19 15:51, aliak wrote:
 Hi, I recently ran in to a problem with stack corruption [0].

[...]
 Anyway, this is what I narrowed it down to eventually in client code.
 
  safe:
 auto foundValue = makeRange("one")
    .algorithm;
 useValue(foundValue); // BOOM

[...]
 And finally, I've been keeping note of various gotchas in D, and traps, 
 etc [1] and I'm wondering if anyone else takes note of these things? Can 
 you post links to them here if you do?

[...]
 [0]: https://forum.dlang.org/thread/hhxabgmrwvqvmezezyym forum.dlang.org
 [1]: https://github.com/aliak00/d-isms/tree/master/da-faq

If that code leads to memory corruption, it shouldn't compile as  safe. 
You're looking at a compiler bug, not a gotcha.

Might already be filed here:
https://issues.dlang.org/show_bug.cgi?id=17927

Jul 23 2019

aliak <something something.com> writes:

On Tuesday, 23 July 2019 at 15:22:37 UTC, ag0aep6g wrote:
 On 23.07.19 15:51, aliak wrote:
 Hi, I recently ran in to a problem with stack corruption [0].

 [...]
 Anyway, this is what I narrowed it down to eventually in 
 client code.
 
  safe:
 auto foundValue = makeRange("one")
    .algorithm;
 useValue(foundValue); // BOOM

 [...]
 And finally, I've been keeping note of various gotchas in D, 
 and traps, etc [1] and I'm wondering if anyone else takes note 
 of these things? Can you post links to them here if you do?

 [...]
 [0]: 
 https://forum.dlang.org/thread/hhxabgmrwvqvmezezyym forum.dlang.org
 [1]: https://github.com/aliak00/d-isms/tree/master/da-faq

 If that code leads to memory corruption, it shouldn't compile 
 as  safe. You're looking at a compiler bug, not a gotcha.

 Might already be filed here:
 https://issues.dlang.org/show_bug.cgi?id=17927

Hmm, if the spec is not wrong, and return should be inferred, 
then could it be an ldc bug? If you run this with the llvm 
sanitizer (on my system at least) you'll get an ASAN stack 
corruption assert with details.

import std;

struct W(T) {
     T value;
     ref inout(T) front() inout { return value; }
}

auto ref get(T)(W!T value) {
     return value.front;
}

struct S {
     string a;
}

void main()  safe {
     S[] arr;
     arr ~= W!S(S("one")).get;
     writeln(arr[0]);
}

Compiler/run with: ldc2 -fsanitize=address -g -disable-fp-elim 
temp.d && ./temp

But then again, I have no idea if the dmd version is passing "by 
luck" or if for some reason dmd properly infers it as return ref 
and somehow ldc messes that up.

Jul 24 2019

Timon Gehr <timon.gehr gmx.ch> writes:

On 23.07.19 15:51, aliak wrote:
 
 (Side note: the specification for return ref - 
 https://dlang.org/spec/function.html#return-ref-parameters - says that 
 "inout ref parameters imply the return attribute.", but I got corruption 
 with "ref inout(T) front() inout {...}". Is that a bug with inout?)

I think in this case the specification is at fault. It's perfectly fine 
(and useful) to have an `inout` function that returns `void`.

Jul 23 2019

aliak <something something.com> writes:

On Tuesday, 23 July 2019 at 15:40:42 UTC, Timon Gehr wrote:
 On 23.07.19 15:51, aliak wrote:
 
 (Side note: the specification for return ref - 
 https://dlang.org/spec/function.html#return-ref-parameters - 
 says that "inout ref parameters imply the return attribute.", 
 but I got corruption with "ref inout(T) front() inout {...}". 
 Is that a bug with inout?)

 I think in this case the specification is at fault. It's 
 perfectly fine (and useful) to have an `inout` function that 
 returns `void`.

Void would not be a ref so return would not be inferred 
(according to the spec)?

Jul 24 2019

Timon Gehr <timon.gehr gmx.ch> writes:

On 24.07.19 09:18, aliak wrote:
 On Tuesday, 23 July 2019 at 15:40:42 UTC, Timon Gehr wrote:
 On 23.07.19 15:51, aliak wrote:
 (Side note: the specification for return ref - 
 https://dlang.org/spec/function.html#return-ref-parameters - says 
 that "inout ref parameters imply the return attribute.", but I got 
 corruption with "ref inout(T) front() inout {...}". Is that a bug 
 with inout?)

 I think in this case the specification is at fault. It's perfectly 
 fine (and useful) to have an `inout` function that returns `void`.

 
 Void would not be a ref so return would not be inferred (according to 
 the spec)?

No? The spec says that inout ref parameters imply the return attribute.

Jul 24 2019

Aliak <something something.com> writes:

On Wednesday, 24 July 2019 at 13:40:10 UTC, Timon Gehr wrote:
 On 24.07.19 09:18, aliak wrote:
 On Tuesday, 23 July 2019 at 15:40:42 UTC, Timon Gehr wrote:
 On 23.07.19 15:51, aliak wrote:
 (Side note: the specification for return ref - 
 https://dlang.org/spec/function.html#return-ref-parameters - 
 says that "inout ref parameters imply the return 
 attribute.", but I got corruption with "ref inout(T) front() 
 inout {...}". Is that a bug with inout?)

 I think in this case the specification is at fault. It's 
 perfectly fine (and useful) to have an `inout` function that 
 returns `void`.

 
 Void would not be a ref so return would not be inferred 
 (according to the spec)?

 No? The spec says that inout ref parameters imply the return 
 attribute.

Ah right! Parameters.

Jul 24 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jul 23, 2019 at 01:51:42PM +0000, aliak via Digitalmars-d wrote:
[...]
 With some help from the forums, the essence of the problem was the
 following:
 
 1. algorithm returned by auto ref.
 2. the range's front was "ref front() {...}"

[...]

Isn't this precisely the sort of problem dip25 / dip1000 are supposed to
catch?  Are you compiling with -dip25 -dip1000?  Are DIP25/DIP1000 (or
their current implementations) not catching this particular case for
some reason?


T

-- 
I've been around long enough to have seen an endless parade of magic new
techniques du jour, most of which purport to remove the necessity of
thought about your programming problem.  In the end they wind up
contributing one or two pieces to the collective wisdom, and fade away
in the rearview mirror. -- Walter Bright

Jul 23 2019

aliak <something something.com> writes:

On Tuesday, 23 July 2019 at 17:17:25 UTC, H. S. Teoh wrote:
 On Tue, Jul 23, 2019 at 01:51:42PM +0000, aliak via 
 Digitalmars-d wrote: [...]
 With some help from the forums, the essence of the problem was 
 the following:
 
 1. algorithm returned by auto ref.
 2. the range's front was "ref front() {...}"

 [...]

 Isn't this precisely the sort of problem dip25 / dip1000 are 
 supposed to catch?  Are you compiling with -dip25 -dip1000?  
 Are DIP25/DIP1000 (or their current implementations) not 
 catching this particular case for some reason?


 T

Not compiling with those no.

Is  safe supposed to solve all memory corruption or is it only if 
dip1000 and dip25 is on. So default  safe is not  safe?

Jul 24 2019

Sebastiaan Koppe <mail skoppe.eu> writes:

On Wednesday, 24 July 2019 at 07:06:07 UTC, aliak wrote:
 Not compiling with those no.

You probably should.

There is one issue with dub that I know of. When you use dip1000 
and use dependencies that don't, things in the dependency might 
mangle differently between the lib build and your project build 
if they have an extra attribute inferred. See e.g. 
https://github.com/dlang-community/stdx-allocator/pull/27#is
uecomment-505000664 for more details.

I should probably create a bug somewhere.

 Is  safe supposed to solve all memory corruption or is it only 
 if dip1000 and dip25 is on. So default  safe is not  safe?

 From what I understood, after the current transition period, 
dip1000/dip25 will be always on. Right now,  safe is like 
 semi_safe

Jul 24 2019

aliak <something something.com> writes:

On Wednesday, 24 July 2019 at 07:54:05 UTC, Sebastiaan Koppe 
wrote:
 On Wednesday, 24 July 2019 at 07:06:07 UTC, aliak wrote:
 Not compiling with those no.

 You probably should.

You're probably right :)

 There is one issue with dub that I know of. When you use 
 dip1000 and use dependencies that don't, things in the 
 dependency might mangle differently between the lib build and 
 your project build if they have an extra attribute inferred.

These are the kinds of things I'm afraid of actually. Is that a 
dub issue though?
Also if the depdendency is not dip1000/dip25 compatible (Which i 
assume - maybe incorreclty - that most are not) then what? Have 
you encountered much of that?

Jul 24 2019

Sebastiaan Koppe <mail skoppe.eu> writes:

On Wednesday, 24 July 2019 at 08:46:18 UTC, aliak wrote:
 There is one issue with dub that I know of. When you use 
 dip1000 and use dependencies that don't, things in the 
 dependency might mangle differently between the lib build and 
 your project build if they have an extra attribute inferred.

 These are the kinds of things I'm afraid of actually. Is that a 
 dub issue though?

It is not a bug. It is just a result of compiler flags, inferred 
attributes, templates and separate compilation.

The problem comes about because a template from a dependency - 
which is compiled without dip1000/25 - is instantiated in your 
project - which is compiled with dip1000/25 - and then that 
template calls code from the dependency, where different 
attributes are inferred in each compilation. (an attribute 
changes the mangling).

The easiest solution is to use the sourceLibrary target type for 
the dependency, if the dependency is dip1000/25 compatible. Which 
it really should be. Unless it is doing complex stuff, in which 
case, it really should be.

I don't know. I suppose dub needs to propagate dip flags to all 
dependencies. But there are downsides to that as well.

 Also if the depdendency is not dip1000/dip25 compatible (Which 
 i assume - maybe incorreclty - that most are not) then what? 
 Have you encountered much of that?

If you aren't using templates from the library that also call 
into the library, there is no issue. Otherwise, either 
pull-requests to the library or no dip1000/25.

I guess we just need to get dip1000/25 on by default as quickly 
as possible. But the problem is going to surface again when there 
are other feature flags that (in)directly influence mangling.

Jul 24 2019

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jul 24, 2019 at 07:06:07AM +0000, aliak via Digitalmars-d wrote:
 On Tuesday, 23 July 2019 at 17:17:25 UTC, H. S. Teoh wrote:

[...]
 Isn't this precisely the sort of problem dip25 / dip1000 are
 supposed to catch?  Are you compiling with -dip25 -dip1000?  Are
 DIP25/DIP1000 (or their current implementations) not catching this
 particular case for some reason?


[...]
 Not compiling with those no.

You probably should, if you're relying on  safe in any significant way.


 Is  safe supposed to solve all memory corruption or is it only if
 dip1000 and dip25 is on. So default  safe is not  safe?

 safe is *intended* to prevent memory corruption.  But there are some
loopholes in the current implementation, which are addressed by DIP1000
and DIP25.  AIUI, the intention is that -dip25 and -dip1000 will
eventually be the default behaviour; we're just in the transition period
right now because they could potentially break existing code (as in,
they will cause compilation errors to otherwise working code).  So in
the meantime, if your code doesn't break with -dip25 -dip1000, you
should probably just compile with them by default in order to get the
benefits of a better  safe implementation.


T

-- 
"Holy war is an oxymoron." -- Lazarus Long

Jul 24 2019

aliak <something something.com> writes:

On Wednesday, 24 July 2019 at 16:23:47 UTC, H. S. Teoh wrote:
 On Wed, Jul 24, 2019 at 07:06:07AM +0000, aliak via 
 Digitalmars-d wrote:
 On Tuesday, 23 July 2019 at 17:17:25 UTC, H. S. Teoh wrote:

 [...]
 Isn't this precisely the sort of problem dip25 / dip1000 are 
 supposed to catch?  Are you compiling with -dip25 -dip1000?  
 Are DIP25/DIP1000 (or their current implementations) not 
 catching this particular case for some reason?


 [...]
 Not compiling with those no.

 You probably should, if you're relying on  safe in any 
 significant way.


 Is  safe supposed to solve all memory corruption or is it only 
 if dip1000 and dip25 is on. So default  safe is not  safe?

  safe is *intended* to prevent memory corruption.  But there 
 are some loopholes in the current implementation, which are 
 addressed by DIP1000 and DIP25.  AIUI, the intention is that 
 -dip25 and -dip1000 will eventually be the default behaviour; 
 we're just in the transition period right now because they 
 could potentially break existing code (as in, they will cause 
 compilation errors to otherwise working code).  So in the 
 meantime, if your code doesn't break with -dip25 -dip1000, you 
 should probably just compile with them by default in order to 
 get the benefits of a better  safe implementation.


 T

Unfortunately, the code does break because of dependencies.

But anyway, I did some dip1000 experiments, and it does not seem 
to catch the auto ref issue:

 ("Sanitary check")
unittest {
     auto ref get(T)(auto ref Optional!T value) {
         return value.front;
     }
     assert(get(some(1) == 1);
}

some is defined as:
public auto some(T)(auto ref T value) {
     return Optional!T(value);
}

And Optional.front is:

 property ref inout(T) front() return inout {
         assert(!empty, "Attempting to fetch the front of an empty 
optional.");
         return this._value;
     }

I've been running with ldc and ASAN enabled and:

==68300==ERROR: AddressSanitizer: stack-use-after-return on 
address 0x0001033a2d60 at pc 0x0001012d5006 bp 0x7ffeee967fb0 sp 
0x7ffeee967fa8
READ of size 4 at 0x0001033a2d60 thread T0

optional.d:520




Address 0x0001033a2d60 is located in stack of thread T0 at offset 
32 in frame

_D5tests8optional18__unittest_L512_C1FZ__T3getTiZQhMFNaNbNcNiNfSQCeQC
__T8OptionalTiZQmZi optional.d:514

Jul 25 2019

D Programming

C/C++ Programming

Other

digitalmars.D - D-ark corners - algorithms, ranges, and auto ref - best practices