digitalmars.D - pure functions cannot be removed, actually: pure vs. total
- FeepingCreature (18/18) Jun 05 2018 I'm just posting to clear up the misunderstanding that a call to
- Stefan Koch (4/12) Jun 05 2018 In that instance you can have false negatives. catgorizing total
- Steven Schveighoffer (36/59) Jun 05 2018 I'll repeat what I said in the PR where you made this similar comment, I...
- FeepingCreature (12/42) Jun 05 2018 That's correct, this optimization is invalid. The only
- Steven Schveighoffer (15/61) Jun 05 2018 I think Haskell would disagree with you:
- Johan Engelen (13/21) Jun 07 2018 This is not a valid program, in my opinion. (Still only my
- Jonathan M Davis (21/42) Jun 07 2018 I would be _very_ surprised if the D spec said anything like this simply
- Steven Schveighoffer (39/75) Jun 07 2018 OK, so FeepingCreature has come up with an interesting case that's
- Stefan Koch (17/18) Jun 07 2018 That a function could return does not mean it will.
- Steven Schveighoffer (15/36) Jun 07 2018 I'm not talking about the compiler looking at the code to figure out if
- Timon Gehr (10/20) Jun 11 2018 FeepingCreature's example does not really qualify as a do-nothing loop,
- Timon Gehr (5/8) Jun 11 2018 Hm, ok. The problem is actually there, but it is not that the 'find'
- Steven Schveighoffer (13/23) Jun 11 2018 Actually, one thing that can be determined statically is that it will
- Timon Gehr (4/33) Jun 11 2018 Well, if the language definition is reasonable, then if calls to `void
- Steven Schveighoffer (9/42) Jun 11 2018 Yes, it does seem that way.
I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. void foo() pure { while (true) { } } By the way, this led to an amusing Phobos bug involving a pure function (not) returning a struct with an enum member: https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and assert(false.repeat.any == true); :) When a strongly pure function is called multiple times with the same parameter, all but the first call can be removed; this is because if it was going to not return, it would already have inf-looped the first time. pure lets you go from n to 1, but not from 1 to 0. A pure function that returns a value for every possible parameter is called a total function. Unfortunately, there is no way to enforce totality in the compiler, due to the halting problem.
Jun 05 2018
On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. [...]In that instance you can have false negatives. catgorizing total functions an non total. But no false positives afaics.
Jun 05 2018
On 6/5/18 10:48 AM, FeepingCreature wrote:I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. void foo() pure { while (true) { } } By the way, this led to an amusing Phobos bug involving a pure function (not) returning a struct with an enum member: https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and assert(false.repeat.any == true); :) When a strongly pure function is called multiple times with the same parameter, all but the first call can be removed; this is because if it was going to not return, it would already have inf-looped the first time. pure lets you go from n to 1, but not from 1 to 0. A pure function that returns a value for every possible parameter is called a total function. Unfortunately, there is no way to enforce totality in the compiler, due to the halting problem.I'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing? We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different? I'm asking because it seems like we give up a really easy optimization for pure functions for the sake of pure infinite loop programs, which I suppose have some value, but not much value. Getting around the infinite loop elision would be pretty simple, just return an int. On the flip side, if the compiler can figure out via introspection that some template function is strong pure and returns void, therefore it doesn't need to call it, that's a much bigger win than preserving the possible infinite loopiness that likely is a bug anyway. Another observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either: pure int foo() { /*infinite loop */} void main(string[] args) { auto a = foo; writeln("hi"); if(args[1] == "printa") writeln(a); } With some optimizers, this can be rewritten: writeln("hi"); if(args[1] == "printa") writeln(foo); Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases. But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops). -Steve
Jun 05 2018
On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:Another observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either: pure int foo() { /*infinite loop */} void main(string[] args) { auto a = foo; writeln("hi"); if(args[1] == "printa") writeln(a); } With some optimizers, this can be rewritten: writeln("hi"); if(args[1] == "printa") writeln(foo); Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases. But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops).That's correct, this optimization is invalid. The only optimization that can arise from foo being pure is *subsequent* calls to foo being removed.-SteveOn Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:I'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing? We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different?Frankly speaking, we should not implement optimizations merely on the basis that we cannot immediately think of a case where they fail. For instance, a practical function that loops forever would be a find() call on an infinite range, such as a range returned by .repeat or .generate.
Jun 05 2018
On 6/5/18 5:03 PM, FeepingCreature wrote:On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:I think Haskell would disagree with you: https://wiki.haskell.org/Lazy_evaluationAnother observation: under the "infinite loops are important observable behavior" world-view, pure functions cannot be lazily evaluated either: pure int foo() { /*infinite loop */} void main(string[] args) { auto a = foo; writeln("hi"); if(args[1] == "printa") writeln(a); } With some optimizers, this can be rewritten: writeln("hi"); if(args[1] == "printa") writeln(foo); Which if foo is a *normally returning* function and not an infinite loop one, then this can save cycles in certain cases. But under your world-view, the optimization is invalid, because foo might have an infinite loop, and then the observable behavior changes (instead of printing nothing and infinite looping, "hi" is printed, and infinite loops).That's correct, this optimization is invalid. The only optimization that can arise from foo being pure is *subsequent* calls to foo being removed.On Tuesday, 5 June 2018 at 17:47:15 UTC, Steven Schveighoffer wrote:But a call to find doesn't "do nothing". It takes a range and returns a range. We are specifically talking about strong-pure functions that return void, or even strong pure functions whose return value is ignored. And yes, we can actually prove that calls to pure functions do nothing based on the rules of pure functions, which is why the optimization is easy to prove correct. It's one of the reasons pure optimizations are much easier to reason about. However, if we have a wrinkle of "we have to make sure infinite loops execute their thing", then many pure optimizations get thrown out the window. -SteveI'll repeat what I said in the PR where you made this similar comment, I don't think it is important to ensure a pure function that never returns is always called. Can you explain the benefit of such a thing? We've all heard of optimizers that reduce "code that does nothing" down to just a return statement, foiling people who are expecting benchmarks to run properly, why is this any different?Frankly speaking, we should not implement optimizations merely on the basis that we cannot immediately think of a case where they fail. For instance, a practical function that loops forever would be a find() call on an infinite range, such as a range returned by .repeat or .generate.
Jun 05 2018
On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. void foo() pure { while (true) { } }This is not a valid program, in my opinion. (Still only my opinion, because I have not found it in the D spec, so needs adding.) A valid C++ program must make progress in each thread. C++ spec states: "The implementation may assume that any thread will eventually do one of the following: - terminate, - make a call to a library I/O function, - access or modify a volatile object, or - perform a synchronization operation or an atomic operation." -Johan
Jun 07 2018
On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:On Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:I would be _very_ surprised if the D spec said anything like this simply because it doesn't do a lot of talking about conceptual stuff like this and talks a lot more about what the compiler actually does (or is supposed to do) - though that certainly doesn't mean that the spec shouldn't say something like that, and Walter has previously stated that anything that needs to be in the spec but isn't should be reported in bugzilla. So, if you have anything like this that you really think should be in the spec, please report it in bugzilla. Regardless, I see zero value in supporting something like void foo() pure { while(true) {} } The compiler doesn't actually give an error when using a function like this (though it does give an error when calling a strongly pure function which returns a value, and the return value isn't used), but I could easily see it doing so on the grounds that the function cannot possible be doing any actual work. And I see _zero_ reason to support something like this just so that you can run a loop or do other work that gets thrown away. I suppose that it could be argued that it would be useful for benchmarks, but there are better ways to handle that that don't require allowing functions that clearly do no real work. - Jonathan M DavisI'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. void foo() pure { while (true) { } }This is not a valid program, in my opinion. (Still only my opinion, because I have not found it in the D spec, so needs adding.) A valid C++ program must make progress in each thread. C++ spec states: "The implementation may assume that any thread will eventually do one of the following: - terminate, - make a call to a library I/O function, - access or modify a volatile object, or - perform a synchronization operation or an atomic operation."
Jun 07 2018
On 6/7/18 5:04 PM, Jonathan M Davis wrote:On Thursday, June 07, 2018 20:14:19 Johan Engelen via Digitalmars-d wrote:OK, so FeepingCreature has come up with an interesting case that's *similar* to this, and I think it makes things a little less cut and dry. Let's say you have an infinite range that returns a single value. But instead of that value being stored in the range itself, it's an enum. Maybe something like this: struct Range { enum empty = false; enum front = true; void popFront() pure nothrow {} } Now, let's say you ran find on it: auto result = Range.init.find(false); This *should* do an infinite loop. If it returned, it means that find has found where `false` is! But let's dissect the call to `find`: * It's a template, and everything that it calls, given this range is pure nothrow, so it is inferred pure nothrow. * It accepts a type that has no mutable data. * It returns a type that has no mutable data. * The return type can be proven to not depend on the input. * Therefore, it is a strong-pure function. So logically, the compiler could assume if it thinks infinite loops are invalid, that it must return. Since it doesn't really need to call the function to see what it returns, it can just replace the code with: auto result = Range.init; And here, we have found `false` in an infinite range that only contains `true`. So the danger of assuming no infinite loops is that some convoluted case like true.repeat.find(false) might incorrectly return something. Note that the repeat call wouldn't cause this problem, since it stores a mutable element inside it. I still think that for void-returning functions we can simply assume to not do anything. Literally, the only thing it can do is return or infinite loop. But I'm definitely not as sure as I was when I first responded to this thread. -SteveOn Tuesday, 5 June 2018 at 14:48:23 UTC, FeepingCreature wrote:I would be _very_ surprised if the D spec said anything like this simply because it doesn't do a lot of talking about conceptual stuff like this and talks a lot more about what the compiler actually does (or is supposed to do) - though that certainly doesn't mean that the spec shouldn't say something like that, and Walter has previously stated that anything that needs to be in the spec but isn't should be reported in bugzilla. So, if you have anything like this that you really think should be in the spec, please report it in bugzilla. Regardless, I see zero value in supporting something like void foo() pure { while(true) {} }I'm just posting to clear up the misunderstanding that a call to a pure function can be removed. Actually, even calls to strongly pure functions cannot always be removed. This is because there is one thing that a pure function can do that will change program behavior if it is removed, even if it does not change any global state whatsoever: it can simply never return. void foo() pure { while (true) { } }This is not a valid program, in my opinion. (Still only my opinion, because I have not found it in the D spec, so needs adding.) A valid C++ program must make progress in each thread. C++ spec states: "The implementation may assume that any thread will eventually do one of the following: - terminate, - make a call to a library I/O function, - access or modify a volatile object, or - perform a synchronization operation or an atomic operation."
Jun 07 2018
On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer wrote:{...}That a function could return does not mean it will. int fn (int arg) /*strongly*/ pure { if (arg == 42) return 42; else (arg < 43) return fn(--arg); else return fn(++arg); } do you see the problem? If not you would you expect the compiler to ? note: I would expected that the clang static analyzer would be able to determine that depending on the value of arg this function might never return.
Jun 07 2018
On 6/7/18 6:57 PM, Stefan Koch wrote:On Thursday, 7 June 2018 at 22:23:09 UTC, Steven Schveighoffer wrote:I'm not talking about the compiler looking at the code to figure out if it will return. I'm talking about this: struct S { } S fn() pure; No matter what is inside fn, it will return S.init, if it returns. The question to answer is, can we assume all functions don't enter infinite loops, and if we can assume this, then why couldn't the compiler simply replace a call to fn with S.init? And the case FeepingCreature gives is pretty compelling, because it's a real example, not a toy example. -Steve{...}That a function could return does not mean it will. int fn (int arg) /*strongly*/ pure { if (arg == 42) return 42; else (arg < 43) return fn(--arg); else return fn(++arg); } do you see the problem? If not you would you expect the compiler to ? note: I would expected that the clang static analyzer would be able to determine that depending on the value of arg this function might never return.
Jun 07 2018
On 08.06.2018 01:14, Steven Schveighoffer wrote:No matter what is inside fn, it will return S.init, if it returns. The question to answer is, can we assume all functions don't enter infinite loops, and if we can assume this, then why couldn't the compiler simply replace a call to fn with S.init? And the case FeepingCreature gives is pretty compelling, because it's a real example, not a toy example. -SteveFeepingCreature's example does not really qualify as a do-nothing loop, because the loop produces a value that you (presumably) later access. In any case, I don't think the right answer is to make programs that enter infinite loops that do nothing undefined behavior (like in C++), but they should possibly be allowed to be removed, though it can be surprising, see: https://blog.regehr.org/archives/140 (However, the "Fermat" example is a bit contrived, because presumably one would also want to print the specific counterexample, breaking the "optimization".)
Jun 11 2018
On 11.06.2018 16:39, Timon Gehr wrote:FeepingCreature's example does not really qualify as a do-nothing loop, because the loop produces a value that you (presumably) later access.Hm, ok. The problem is actually there, but it is not that the 'find' call will get removed, it is that the loop within find's implementation will be seen as doing nothing and producing nothing after inlining, breaking find's postcondition.
Jun 11 2018
On 6/11/18 10:43 AM, Timon Gehr wrote:On 11.06.2018 16:39, Timon Gehr wrote:Actually, one thing that can be determined statically is that it will never return. Essentially, this particular find function (with predicate inlining) will fold to: while(true) {} Since the final return outside the loop is obviously never reached, and the return inside the loop cannot be reached. So I would assume this would properly be translated to an infinite loop. I was not expecting that the compiler might eliminate infinite loops that it can prove are infinite loops, more like it should eliminate calls that trivially can be proven to do nothing based on the signature of the function. -SteveFeepingCreature's example does not really qualify as a do-nothing loop, because the loop produces a value that you (presumably) later access.Hm, ok. The problem is actually there, but it is not that the 'find' call will get removed, it is that the loop within find's implementation will be seen as doing nothing and producing nothing after inlining, breaking find's postcondition.
Jun 11 2018
On 11.06.2018 20:15, Steven Schveighoffer wrote:On 6/11/18 10:43 AM, Timon Gehr wrote:Well, if the language definition is reasonable, then if calls to `void foo()pure nothrow { while(true){} }` can be elided, so can the `while(true){}` loop itself.On 11.06.2018 16:39, Timon Gehr wrote:Actually, one thing that can be determined statically is that it will never return. Essentially, this particular find function (with predicate inlining) will fold to: while(true) {} Since the final return outside the loop is obviously never reached, and the return inside the loop cannot be reached. So I would assume this would properly be translated to an infinite loop. I was not expecting that the compiler might eliminate infinite loops that it can prove are infinite loops, more like it should eliminate calls that trivially can be proven to do nothing based on the signature of the function. -SteveFeepingCreature's example does not really qualify as a do-nothing loop, because the loop produces a value that you (presumably) later access.Hm, ok. The problem is actually there, but it is not that the 'find' call will get removed, it is that the loop within find's implementation will be seen as doing nothing and producing nothing after inlining, breaking find's postcondition.
Jun 11 2018
On 6/11/18 2:23 PM, Timon Gehr wrote:On 11.06.2018 20:15, Steven Schveighoffer wrote:Yes, it does seem that way. Eliding (or moving) opaque calls in general if you can't see what the call is actually doing seems to be a minefield that probably cannot reasonably be navigated. The only reasonable elision seems to be for pure calls that have already been made so we know they actually return. I'm curious if Haskell can properly navigate this with its lazy calls, or if it is subject to such oddities. -SteveOn 6/11/18 10:43 AM, Timon Gehr wrote:Well, if the language definition is reasonable, then if calls to `void foo()pure nothrow { while(true){} }` can be elided, so can the `while(true){}` loop itself.On 11.06.2018 16:39, Timon Gehr wrote:Actually, one thing that can be determined statically is that it will never return. Essentially, this particular find function (with predicate inlining) will fold to: while(true) {} Since the final return outside the loop is obviously never reached, and the return inside the loop cannot be reached. So I would assume this would properly be translated to an infinite loop. I was not expecting that the compiler might eliminate infinite loops that it can prove are infinite loops, more like it should eliminate calls that trivially can be proven to do nothing based on the signature of the function.FeepingCreature's example does not really qualify as a do-nothing loop, because the loop produces a value that you (presumably) later access.Hm, ok. The problem is actually there, but it is not that the 'find' call will get removed, it is that the loop within find's implementation will be seen as doing nothing and producing nothing after inlining, breaking find's postcondition.
Jun 11 2018