digitalmars.D - pure functions cannot be removed, actually: pure vs. total

FeepingCreature (18/18) Jun 05 2018 I'm just posting to clear up the misunderstanding that a call to

Stefan Koch (4/12) Jun 05 2018 In that instance you can have false negatives. catgorizing total
Steven Schveighoffer (36/59) Jun 05 2018 I'll repeat what I said in the PR where you made this similar comment, I...

FeepingCreature (12/42) Jun 05 2018 That's correct, this optimization is invalid. The only

Steven Schveighoffer (15/61) Jun 05 2018 I think Haskell would disagree with you:

Johan Engelen (13/21) Jun 07 2018 This is not a valid program, in my opinion. (Still only my

Jonathan M Davis (21/42) Jun 07 2018 I would be _very_ surprised if the D spec said anything like this simply

Steven Schveighoffer (39/75) Jun 07 2018 OK, so FeepingCreature has come up with an interesting case that's

Stefan Koch (17/18) Jun 07 2018 That a function could return does not mean it will.

Steven Schveighoffer (15/36) Jun 07 2018 I'm not talking about the compiler looking at the code to figure out if

Timon Gehr (10/20) Jun 11 2018 FeepingCreature's example does not really qualify as a do-nothing loop,

Timon Gehr (5/8) Jun 11 2018 Hm, ok. The problem is actually there, but it is not that the 'find'

Steven Schveighoffer (13/23) Jun 11 2018 Actually, one thing that can be determined statically is that it will

Timon Gehr (4/33) Jun 11 2018 Well, if the language definition is reasonable, then if calls to `void

Steven Schveighoffer (9/42) Jun 11 2018 Yes, it does seem that way.

FeepingCreature <feepingcreature gmail.com> writes:

I'm just posting to clear up the misunderstanding that a call to 
a pure function can be removed. Actually, even calls to strongly 
pure functions cannot always be removed. This is because there is 
one thing that a pure function can do that will change program 
behavior if it is removed, even if it does not change any global 
state whatsoever: it can simply never return.

void foo() pure { while (true) { } }

By the way, this led to an amusing Phobos bug involving a pure 
function (not) returning a struct with an enum member: 
https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and
assert(false.repeat.any == true); :)

When a strongly pure function is called multiple times with the 
same parameter, all but the first call can be removed; this is 
because if it was going to not return, it would already have 
inf-looped the first time. pure lets you go from n to 1, but not 
from 1 to 0.

A pure function that returns a value for every possible parameter 
is called a total function. Unfortunately, there is no way to 
enforce totality in the compiler, due to the halting problem.

Jun 05 2018

Stefan Koch <uplink.coder googlemail.com> writes:

On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
 I'm just posting to clear up the misunderstanding that a call 
 to a pure function can be removed. Actually, even calls to 
 strongly pure functions cannot always be removed. This is 
 because there is one thing that a pure function can do that 
 will change program behavior if it is removed, even if it does 
 not change any global state whatsoever: it can simply never 
 return.

 [...]

In that instance you can have false negatives. catgorizing total 
functions an non total.

But no false positives afaics.

Jun 05 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/5/18 10:48 AM, FeepingCreature wrote:
 I'm just posting to clear up the misunderstanding that a call to a pure 
 function can be removed. Actually, even calls to strongly pure functions 
 cannot always be removed. This is because there is one thing that a pure 
 function can do that will change program behavior if it is removed, even 
 if it does not change any global state whatsoever: it can simply never 
 return.
 
 void foo() pure { while (true) { } }
 
 By the way, this led to an amusing Phobos bug involving a pure function 
 (not) returning a struct with an enum member: 
 https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and 
 assert(false.repeat.any == true); :)
 
 When a strongly pure function is called multiple times with the same 
 parameter, all but the first call can be removed; this is because if it 
 was going to not return, it would already have inf-looped the first 
 time. pure lets you go from n to 1, but not from 1 to 0.
 
 A pure function that returns a value for every possible parameter is 
 called a total function. Unfortunately, there is no way to enforce 
 totality in the compiler, due to the halting problem.
 

I'll repeat what I said in the PR where you made this similar comment, I 
don't think it is important to ensure a pure function that never returns 
is always called. Can you explain the benefit of such a thing?

We've all heard of optimizers that reduce "code that does nothing" down 
to just a return statement, foiling people who are expecting benchmarks 
to run properly, why is this any different?

I'm asking because it seems like we give up a really easy optimization 
for pure functions for the sake of pure infinite loop programs, which I 
suppose have some value, but not much value. Getting around the infinite 
loop elision would be pretty simple, just return an int.

On the flip side, if the compiler can figure out via introspection that 
some template function is strong pure and returns void, therefore it 
doesn't need to call it, that's a much bigger win than preserving the 
possible infinite loopiness that likely is a bug anyway.

Another observation: under the "infinite loops are important observable 
behavior" world-view, pure functions cannot be lazily evaluated either:

pure int foo() { /*infinite loop */}

void main(string[] args)
{
    auto a = foo;
    writeln("hi");
    if(args[1] == "printa")
       writeln(a);
}

With some optimizers, this can be rewritten:

writeln("hi");
if(args[1] == "printa")
    writeln(foo);

Which if foo is a *normally returning* function and not an infinite loop 
one, then this can save cycles in certain cases.

But under your world-view, the optimization is invalid, because foo 
might have an infinite loop, and then the observable behavior changes 
(instead of printing nothing and infinite looping, "hi" is printed, and 
infinite loops).

-Steve

Jun 05 2018

FeepingCreature <feepingcreature gmail.de> writes:

On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer 
wrote:
 Another observation: under the "infinite loops are important 
 observable behavior" world-view, pure functions cannot be 
 lazily evaluated either:

 pure int foo() { /*infinite loop */}

 void main(string[] args)
 {
    auto a = foo;
    writeln("hi");
    if(args[1] == "printa")
       writeln(a);
 }

 With some optimizers, this can be rewritten:

 writeln("hi");
 if(args[1] == "printa")
    writeln(foo);

 Which if foo is a *normally returning* function and not an 
 infinite loop one, then this can save cycles in certain cases.

 But under your world-view, the optimization is invalid, because 
 foo might have an infinite loop, and then the observable 
 behavior changes (instead of printing nothing and infinite 
 looping, "hi" is printed, and infinite loops).

That's correct, this optimization is invalid. The only 
optimization that can arise from foo being pure is *subsequent* 
calls to foo being removed.

 -Steve

On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer 
wrote:
 I'll repeat what I said in the PR where you made this similar 
 comment, I don't think it is important to ensure a pure 
 function that never returns is always called. Can you explain 
 the benefit of such a thing?

 We've all heard of optimizers that reduce "code that does 
 nothing" down to just a return statement, foiling people who 
 are expecting benchmarks to run properly, why is this any 
 different?

Frankly speaking, we should not implement optimizations merely on 
the basis that we cannot immediately think of a case where they 
fail. For instance, a practical function that loops forever would 
be a find() call on an infinite range, such as a range returned 
by .repeat or .generate.

Jun 05 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/5/18 5:03 PM, FeepingCreature wrote:
 On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
 Another observation: under the "infinite loops are important 
 observable behavior" world-view, pure functions cannot be lazily 
 evaluated either:

 pure int foo() { /*infinite loop */}

 void main(string[] args)
 {
    auto a = foo;
    writeln("hi");
    if(args[1] == "printa")
       writeln(a);
 }

 With some optimizers, this can be rewritten:

 writeln("hi");
 if(args[1] == "printa")
    writeln(foo);

 Which if foo is a *normally returning* function and not an infinite 
 loop one, then this can save cycles in certain cases.

 But under your world-view, the optimization is invalid, because foo 
 might have an infinite loop, and then the observable behavior changes 
 (instead of printing nothing and infinite looping, "hi" is printed, 
 and infinite loops).

 
 That's correct, this optimization is invalid. The only optimization that 
 can arise from foo being pure is *subsequent* calls to foo being removed.

I think Haskell would disagree with you: 
https://wiki.haskell.org/Lazy_evaluation

 On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:
 I'll repeat what I said in the PR where you made this similar comment, 
 I don't think it is important to ensure a pure function that never 
 returns is always called. Can you explain the benefit of such a thing?

 We've all heard of optimizers that reduce "code that does nothing" 
 down to just a return statement, foiling people who are expecting 
 benchmarks to run properly, why is this any different?

 
 Frankly speaking, we should not implement optimizations merely on the 
 basis that we cannot immediately think of a case where they fail. For 
 instance, a practical function that loops forever would be a find() call 
 on an infinite range, such as a range returned by .repeat or .generate.

But a call to find doesn't "do nothing". It takes a range and returns a 
range.

We are specifically talking about strong-pure functions that return 
void, or even strong pure functions whose return value is ignored.

And yes, we can actually prove that calls to pure functions do nothing 
based on the rules of pure functions, which is why the optimization is 
easy to prove correct. It's one of the reasons pure optimizations are 
much easier to reason about.

However, if we have a wrinkle of "we have to make sure infinite loops 
execute their thing", then many pure optimizations get thrown out the 
window.

-Steve

Jun 05 2018

Johan Engelen <j j.nl> writes:

On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
 I'm just posting to clear up the misunderstanding that a call 
 to a pure function can be removed. Actually, even calls to 
 strongly pure functions cannot always be removed. This is 
 because there is one thing that a pure function can do that 
 will change program behavior if it is removed, even if it does 
 not change any global state whatsoever: it can simply never 
 return.

 void foo() pure { while (true) { } }

This is not a valid program, in my opinion. (Still only my 
opinion, because I have not found it in the D spec, so needs 
adding.)
A valid C++ program must make progress in each thread. C++ spec 
states:
"The implementation may assume that any thread will eventually do 
one of the following:
- terminate,
- make a call to a library I/O function,
- access or modify a volatile object, or
- perform a synchronization operation or an atomic operation."

-Johan

Jun 07 2018

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:
 On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
 I'm just posting to clear up the misunderstanding that a call
 to a pure function can be removed. Actually, even calls to
 strongly pure functions cannot always be removed. This is
 because there is one thing that a pure function can do that
 will change program behavior if it is removed, even if it does
 not change any global state whatsoever: it can simply never
 return.

 void foo() pure { while (true) { } }

 This is not a valid program, in my opinion. (Still only my
 opinion, because I have not found it in the D spec, so needs
 adding.)
 A valid C++ program must make progress in each thread. C++ spec
 states:
 "The implementation may assume that any thread will eventually do
 one of the following:
 - terminate,
 - make a call to a library I/O function,
 - access or modify a volatile object, or
 - perform a synchronization operation or an atomic operation."

I would be _very_ surprised if the D spec said anything like this simply
because it doesn't do a lot of talking about conceptual stuff like this and
talks a lot more about what the compiler actually does (or is supposed to
do) - though that certainly doesn't mean that the spec shouldn't say
something like that, and Walter has previously stated that anything that
needs to be in the spec but isn't should be reported in bugzilla. So, if you
have anything like this that you really think should be in the spec, please
report it in bugzilla.

Regardless, I see zero value in supporting something like

void foo() pure { while(true) {} }

The compiler doesn't actually give an error when using a function like this
(though it does give an error when calling a strongly pure function which
returns a value, and the return value isn't used), but I could easily see it
doing so on the grounds that the function cannot possible be doing any
actual work. And I see _zero_ reason to support something like this just so
that you can run a loop or do other work that gets thrown away. I suppose
that it could be argued that it would be useful for benchmarks, but there
are better ways to handle that that don't require allowing functions that
clearly do no real work.

- Jonathan M Davis

Jun 07 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/7/18 5:04 PM, Jonathan M Davis wrote:
 On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:
 On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:
 I'm just posting to clear up the misunderstanding that a call
 to a pure function can be removed. Actually, even calls to
 strongly pure functions cannot always be removed. This is
 because there is one thing that a pure function can do that
 will change program behavior if it is removed, even if it does
 not change any global state whatsoever: it can simply never
 return.

 void foo() pure { while (true) { } }

 This is not a valid program, in my opinion. (Still only my
 opinion, because I have not found it in the D spec, so needs
 adding.)
 A valid C++ program must make progress in each thread. C++ spec
 states:
 "The implementation may assume that any thread will eventually do
 one of the following:
 - terminate,
 - make a call to a library I/O function,
 - access or modify a volatile object, or
 - perform a synchronization operation or an atomic operation."

 
 I would be _very_ surprised if the D spec said anything like this simply
 because it doesn't do a lot of talking about conceptual stuff like this and
 talks a lot more about what the compiler actually does (or is supposed to
 do) - though that certainly doesn't mean that the spec shouldn't say
 something like that, and Walter has previously stated that anything that
 needs to be in the spec but isn't should be reported in bugzilla. So, if you
 have anything like this that you really think should be in the spec, please
 report it in bugzilla.
 
 Regardless, I see zero value in supporting something like
 
 void foo() pure { while(true) {} }

OK, so FeepingCreature has come up with an interesting case that's 
*similar* to this, and I think it makes things a little less cut and dry.

Let's say you have an infinite range that returns a single value. But 
instead of that value being stored in the range itself, it's an enum.

Maybe something like this:

struct Range
{
    enum empty = false;
    enum front = true;
    void popFront() pure nothrow {}
}

Now, let's say you ran find on it:

auto result = Range.init.find(false);

This *should* do an infinite loop. If it returned, it means that find 
has found where `false` is!

But let's dissect the call to `find`:

* It's a template, and everything that it calls, given this range is 
pure nothrow, so it is inferred pure nothrow.
* It accepts a type that has no mutable data.
* It returns a type that has no mutable data.
* The return type can be proven to not depend on the input.
* Therefore, it is a strong-pure function.

So logically, the compiler could assume if it thinks infinite loops are 
invalid, that it must return. Since it doesn't really need to call the 
function to see what it returns, it can just replace the code with:

auto result = Range.init;

And here, we have found `false` in an infinite range that only contains 
`true`.

So the danger of assuming no infinite loops is that some convoluted case 
like true.repeat.find(false) might incorrectly return something. Note 
that the repeat call wouldn't cause this problem, since it stores a 
mutable element inside it.

I still think that for void-returning functions we can simply assume to 
not do anything. Literally, the only thing it can do is return or 
infinite loop.

But I'm definitely not as sure as I was when I first responded to this 
thread.

-Steve

Jun 07 2018

Stefan Koch <uplink.coder googlemail.com> writes:

On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer 
wrote:
 {...}

That a function could return does not mean it will.

int fn (int arg) /*strongly*/ pure
{
   if (arg == 42)
     return 42;
   else (arg < 43)
     return fn(--arg);
   else
     return fn(++arg);
}

do you see the problem?
If not you would you expect the compiler to ?

note: I would expected that the clang static analyzer
would be able to determine that depending on the value of arg
this function might never return.

Jun 07 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/7/18 6:57 PM, Stefan Koch wrote:
 On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer wrote:
 {...}

 
 That a function could return does not mean it will.
 
 int fn (int arg) /*strongly*/ pure
 {
    if (arg == 42)
      return 42;
    else (arg < 43)
      return fn(--arg);
    else
      return fn(++arg);
 }
 
 do you see the problem?
 If not you would you expect the compiler to ?
 
 note: I would expected that the clang static analyzer
 would be able to determine that depending on the value of arg
 this function might never return.

I'm not talking about the compiler looking at the code to figure out if 
it will return.

I'm talking about this:

struct S
{
}

S fn() pure;

No matter what is inside fn, it will return S.init, if it returns. The 
question to answer is, can we assume all functions don't enter infinite 
loops, and if we can assume this, then why couldn't the compiler simply 
replace a call to fn with S.init?

And the case FeepingCreature gives is pretty compelling, because it's a 
real example, not a toy example.

-Steve

Jun 07 2018

Timon Gehr <timon.gehr gmx.ch> writes:

On 08.06.2018 01:14, Steven Schveighoffer wrote:
 
 No matter what is inside fn, it will return S.init, if it returns. The 
 question to answer is, can we assume all functions don't enter infinite 
 loops, and if we can assume this, then why couldn't the compiler simply 
 replace a call to fn with S.init?
 
 And the case FeepingCreature gives is pretty compelling, because it's a 
 real example, not a toy example.
 
 -Steve

FeepingCreature's example does not really qualify as a do-nothing loop, 
because the loop produces a value that you (presumably) later access.

In any case, I don't think the right answer is to make programs that 
enter infinite loops that do nothing undefined behavior (like in C++), 
but they should possibly be allowed to be removed, though it can be 
surprising, see: https://blog.regehr.org/archives/140
(However, the "Fermat" example is a bit contrived, because presumably 
one would also want to print the specific counterexample, breaking the 
"optimization".)

Jun 11 2018

Timon Gehr <timon.gehr gmx.ch> writes:

On 11.06.2018 16:39, Timon Gehr wrote:
 
 FeepingCreature's example does not really qualify as a do-nothing loop, 
 because the loop produces a value that you (presumably) later access.

Hm, ok. The problem is actually there, but it is not that the 'find' 
call will get removed, it is that the loop within find's implementation 
will be seen as doing nothing and producing nothing after inlining, 
breaking find's postcondition.

Jun 11 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/11/18 10:43 AM, Timon Gehr wrote:
 On 11.06.2018 16:39, Timon Gehr wrote:
 FeepingCreature's example does not really qualify as a do-nothing 
 loop, because the loop produces a value that you (presumably) later 
 access.

 
 Hm, ok. The problem is actually there, but it is not that the 'find' 
 call will get removed, it is that the loop within find's implementation 
 will be seen as doing nothing and producing nothing after inlining, 
 breaking find's postcondition.

Actually, one thing that can be determined statically is that it will 
never return. Essentially, this particular find function (with predicate 
inlining) will fold to:

while(true) {}

Since the final return outside the loop is obviously never reached, and 
the return inside the loop cannot be reached.

So I would assume this would properly be translated to an infinite loop.

I was not expecting that the compiler might eliminate infinite loops 
that it can prove are infinite loops, more like it should eliminate 
calls that trivially can be proven to do nothing based on the signature 
of the function.

-Steve

Jun 11 2018

Timon Gehr <timon.gehr gmx.ch> writes:

On 11.06.2018 20:15, Steven Schveighoffer wrote:
 On 6/11/18 10:43 AM, Timon Gehr wrote:
 On 11.06.2018 16:39, Timon Gehr wrote:
 FeepingCreature's example does not really qualify as a do-nothing 
 loop, because the loop produces a value that you (presumably) later 
 access.

 Hm, ok. The problem is actually there, but it is not that the 'find' 
 call will get removed, it is that the loop within find's 
 implementation will be seen as doing nothing and producing nothing 
 after inlining, breaking find's postcondition.

 
 Actually, one thing that can be determined statically is that it will 
 never return. Essentially, this particular find function (with predicate 
 inlining) will fold to:
 
 while(true) {}
 
 Since the final return outside the loop is obviously never reached, and 
 the return inside the loop cannot be reached.
 
 So I would assume this would properly be translated to an infinite loop.
 
 I was not expecting that the compiler might eliminate infinite loops 
 that it can prove are infinite loops, more like it should eliminate 
 calls that trivially can be proven to do nothing based on the signature 
 of the function.
 
 -Steve

Well, if the language definition is reasonable, then if calls to `void 
foo()pure  nothrow { while(true){} }` can be elided, so can the 
`while(true){}` loop itself.

Jun 11 2018

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 6/11/18 2:23 PM, Timon Gehr wrote:
 On 11.06.2018 20:15, Steven Schveighoffer wrote:
 On 6/11/18 10:43 AM, Timon Gehr wrote:
 On 11.06.2018 16:39, Timon Gehr wrote:
 FeepingCreature's example does not really qualify as a do-nothing 
 loop, because the loop produces a value that you (presumably) later 
 access.

 Hm, ok. The problem is actually there, but it is not that the 'find' 
 call will get removed, it is that the loop within find's 
 implementation will be seen as doing nothing and producing nothing 
 after inlining, breaking find's postcondition.

 Actually, one thing that can be determined statically is that it will 
 never return. Essentially, this particular find function (with 
 predicate inlining) will fold to:

 while(true) {}

 Since the final return outside the loop is obviously never reached, 
 and the return inside the loop cannot be reached.

 So I would assume this would properly be translated to an infinite loop.

 I was not expecting that the compiler might eliminate infinite loops 
 that it can prove are infinite loops, more like it should eliminate 
 calls that trivially can be proven to do nothing based on the 
 signature of the function.

 
 Well, if the language definition is reasonable, then if calls to `void 
 foo()pure  nothrow { while(true){} }` can be elided, so can the 
 `while(true){}` loop itself.

Yes, it does seem that way.

Eliding (or moving) opaque calls in general if you can't see what the 
call is actually doing seems to be a minefield that probably cannot 
reasonably be navigated. The only reasonable elision seems to be for 
pure calls that have already been made so we know they actually return.

I'm curious if Haskell can properly navigate this with its lazy calls, 
or if it is subject to such oddities.

-Steve

Jun 11 2018

D Programming

C/C++ Programming

Other

digitalmars.D - pure functions cannot be removed, actually: pure vs. total