digitalmars.D - Forcing weak-pure
- Steven Schveighoffer (51/51) Mar 25 2014 There has been an increased effort as of late to make as much as possibl...
- monarch_dodra (13/17) Mar 25 2014 Maybe instead of using null, we could use global junk?
- Steven Schveighoffer (7/23) Mar 25 2014 I think again, a sufficiently intelligent compiler could see that
- Joseph Rushton Wakeling (5/6) Mar 25 2014 That really seems like a nasty workaround which could break at any momen...
- Artur Skawina (10/22) Mar 25 2014 Functions returning 'void' and w/o mutable args cannot be logically pure...
- Steven Schveighoffer (17/46) Mar 25 2014 This could not be elided, because you are using the return value. It wou...
- Artur Skawina (23/55) Mar 25 2014 It's ok to treat allocator and factory functions as pure, because those ...
- Steven Schveighoffer (9/74) Mar 25 2014 You what I wasn't thinking of? parallelization of pure function calls!
- Kagamin (3/8) Mar 26 2014 Allocators don't return unique results, GC being the primary
- Artur Skawina (11/17) Mar 26 2014 I'm not really sure what your point is. I was saying that "allocator fun...
- Steven Schveighoffer (6/13) Mar 26 2014 I think you misunderstand. The result is uniquely referenced, meaning no...
- Artur Skawina (11/15) Apr 05 2014 That was wrong; treating these funcs as pure won't work. While allocator...
- Steven Schveighoffer (5/22) Apr 07 2014 I think this still works, even if makeS is pure, since it would be
There has been an increased effort as of late to make as much as possible inside the runtime safe, pure, and nothrow. The goal is to make it so it's not as nasty writing such code, because the library will allow it. Everything that we mark properly down below will allow higher-level constructs to be marked as well. Of course naturally, with all things this low-level, we have to make decisions as to what is *provably* safe/pure/nothrow, and what is *logically* safe/pure/nothrow. The most obvious example is memory allocation -- It is technically not pure as it accesses global state (the heap), but we can rationalize making it pure since we have more logical knowledge of the situation than the compiler (i.e. locking the global heap, and realizing that if a block is unallocated, it is uniquely accessed with the allocation of it). One issue, however, is modification of heap metadata for immutable data. The D compiler has the (correct) rationalization that for a pure function that has only immutable or implicitly convertible to immutable parameters, it is strong-pure. This means, it can optimize away calls to such functions. But in the case of modifying heap metadata, the call is actually weak-pure, since it is making modifications that you *don't* want to optimize away. For this reason, functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example: auto str = "hello".idup; str = str[0..1]; str.assumeSafeAppend(); str ~= "iya"; The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable. GC.setAttr has similar issues. How can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call. Another possibility is to pass the array/pointer by reference. This causes its own problems with rvalues, and does not help when the array/pointer itself is immutable. Do we need a way (maybe a pragma) to make a function inside druntime "forced weak" pure? Is there an option any of you wonderful geniuses out there can figure out without changes to the compiler? :) Some reference material: https://github.com/D-Programming-Language/druntime/pull/553 https://github.com/D-Programming-Language/druntime/pull/747 https://github.com/D-Programming-Language/phobos/pull/2025 https://github.com/D-Programming-Language/phobos/pull/2046 -Steve
Mar 25 2014
On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer wrote:How can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call.Maybe instead of using null, we could use global junk? //Global junk struct WeakPure{int a;} __gshared WeakPure dummyWeakPure; //Signature void weakPureFun(WeakPure* p = &dummyWeakPure) {} //Useage weakPureFun(); AFAIK, the compiler should NOT be able to infer strong purity here. I don't know about performance effects though.
Mar 25 2014
On Tue, 25 Mar 2014 10:08:20 -0400, monarch_dodra <monarchdodra gmail.com> wrote:On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer wrote:I think again, a sufficiently intelligent compiler could see that dummyWeakPure would not be used in the template, we likely would have to pass it to the underlying opaque function. I'd really like to avoid this option, if it's not going to be optimized down to the minimal case. -SteveHow can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call.Maybe instead of using null, we could use global junk? //Global junk struct WeakPure{int a;} __gshared WeakPure dummyWeakPure; //Signature void weakPureFun(WeakPure* p = &dummyWeakPure) {} //Useage weakPureFun(); AFAIK, the compiler should NOT be able to infer strong purity here. I don't know about performance effects though.
Mar 25 2014
On 25/03/14 15:08, monarch_dodra wrote:Maybe instead of using null, we could use global junk?That really seems like a nasty workaround which could break at any moment depending on how the compiler is tweaked. :-( Better surely to have an explicit indication that a function may not be assumed to be strongly pure?
Mar 25 2014
On 03/25/14 14:30, Steven Schveighoffer wrote:[...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example: auto str = "hello".idup; str = str[0..1]; str.assumeSafeAppend(); str ~= "iya"; The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable. GC.setAttr has similar issues. How can we force these to be weak-pure?Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough... // assuming 'assumeSafeAppend()' is "weak-pure": string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; } string a = "abc".idup, b = f(a[0..2]); artur
Mar 25 2014
On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com> wrote:On 03/25/14 14:30, Steven Schveighoffer wrote:This could not be elided, because you are using the return value. It would have to be called at least once. However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues. But I get what you are saying -- just wrap it again, and we're back to the same issue. It is a systematic problem, because the concept is that you are not passing mutable references, even though you are then using those references to mark global data. There is no really good answer I suppose. The huge downside of not marking assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or set attributes inside pure functions that deal with arrays with immutable data. For example, the Appender type cannot be marked pure for immutable data. -Steve[...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example: auto str = "hello".idup; str = str[0..1]; str.assumeSafeAppend(); str ~= "iya"; The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable. GC.setAttr has similar issues. How can we force these to be weak-pure?Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough... // assuming 'assumeSafeAppend()' is "weak-pure": string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; } string a = "abc".idup, b = f(a[0..2]);
Mar 25 2014
On 03/25/14 21:51, Steven Schveighoffer wrote:On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com> wrote:It's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique. 'assumeSafeAppend()' does have side effects - it can lead to corruption of other, not directly related, data. The only thing that prevents this is a declaration by the programmer -- "Trust me, I own this data and there are no other live aliases". The problem is that this declaration would now happen /implicitly when calling a (strongly) pure function/. Note that the 'a' string in the above example could no longer be "abc" after 'f()' runs. The "pure" concept (re-)definition in D is bad enough even without having to worry about supposedly pure functions stomping on unrelated data.On 03/25/14 14:30, Steven Schveighoffer wrote:This could not be elided, because you are using the return value. It would have to be called at least once. However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues.[...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example: auto str = "hello".idup; str = str[0..1]; str.assumeSafeAppend(); str ~= "iya"; The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable. GC.setAttr has similar issues. How can we force these to be weak-pure?Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough... // assuming 'assumeSafeAppend()' is "weak-pure": string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; } string a = "abc".idup, b = f(a[0..2]);But I get what you are saying -- just wrap it again, and we're back to the same issue. It is a systematic problem, because the concept is that you are not passing mutable references, even though you are then using those references to mark global data. There is no really good answer I suppose. The huge downside of not marking assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or set attributes inside pure functions that deal with arrays with immutable data. For example, the Appender type cannot be marked pure for immutable data.If there was a version of assumeSafeAppend() marked as pure, then that one could be used for such cases, where no aliases are allowed to escape before an ownership transfer happens. It's much easier to audit a contained implementation (eg Appender) than to check the whole program and every caller of every function which relies on a potentially unsafe assumption. If 'assumeSafeAppend()' isn't pure /by default/ then D's purity guarantees are at least a little bit stronger. Or maybe such a change (marking unsafe code as pure and hoping that the programmer knows exactly what (s)he's doing) wouldn't really make the situation significantly worse than it already is - I'm not sure... artur
Mar 25 2014
On Tue, 25 Mar 2014 19:30:04 -0400, Artur Skawina <art.08.09 gmail.com> wrote:On 03/25/14 21:51, Steven Schveighoffer wrote:You what I wasn't thinking of? parallelization of pure function calls! Right there, that kills this whole idea. In fact, I think at this point, any operation to immutable arrays that affects the metadata for that block, including setting length and appending, must be by definition impure. This is not a good turn of events... -SteveOn Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09 gmail.com> wrote:It's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique. 'assumeSafeAppend()' does have side effects - it can lead to corruption of other, not directly related, data. The only thing that prevents this is a declaration by the programmer -- "Trust me, I own this data and there are no other live aliases". The problem is that this declaration would now happen /implicitly when calling a (strongly) pure function/. Note that the 'a' string in the above example could no longer be "abc" after 'f()' runs. The "pure" concept (re-)definition in D is bad enough even without having to worry about supposedly pure functions stomping on unrelated data.On 03/25/14 14:30, Steven Schveighoffer wrote:This could not be elided, because you are using the return value. It would have to be called at least once. However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues.[...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example: auto str = "hello".idup; str = str[0..1]; str.assumeSafeAppend(); str ~= "iya"; The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable. GC.setAttr has similar issues. How can we force these to be weak-pure?Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough... // assuming 'assumeSafeAppend()' is "weak-pure": string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; } string a = "abc".idup, b = f(a[0..2]);
Mar 25 2014
On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:It's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.Allocators don't return unique results, GC being the primary example.
Mar 26 2014
On 03/26/14 15:36, Kagamin wrote:On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:I'm not really sure what your point is. I was saying that "allocator functions", ie functions that return newly allocated objects, can be treated as pure as long they do not have any other visible side effects and the result is "unique", ie does not have any aliases. This will be true in many cases. The /implementation/ won't likely be pure, but from the caller's POV this is irrelevant. A way to make those functions appear pure would allow for more pure /callers/; that's why Steven was asking for a way to "force" purity. Even C compilers are treating 'malloc' etc specially, exactly because they can assume that nothing aliases the returned object. arturIt's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.Allocators don't return unique results, GC being the primary example.
Mar 26 2014
On Wed, 26 Mar 2014 10:36:45 -0400, Kagamin <spam here.lot> wrote:On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:I think you misunderstand. The result is uniquely referenced, meaning no other pointers to the data are present (except the all-seeing heap). Multiple calls to the allocator may return different results, and each result is uniquely referenced. -SteveIt's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.Allocators don't return unique results, GC being the primary example.
Mar 26 2014
On 03/25/14 14:30, Steven Schveighoffer wrote:Of course naturally, with all things this low-level, we have to make decisions as to what is *provably* safe/pure/nothrow, and what is *logically* safe/pure/nothrow. The most obvious example is memory allocation -- It is technically not pure as it accesses global state (the heap), but we can rationalize making it pure since we have more logical knowledge of the situation than the compiler (i.e. locking the global heap, and realizing that if a block is unallocated, it is uniquely accessed with the allocation of it).On 03/26/14 00:30, Artur Skawina wrote:It's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.That was wrong; treating these funcs as pure won't work. While allocator functions have some traits of pure ones -- no side-effects, their result only depends on the args, and the call can be optimized out when the result in unused -- there is one difference, which isn't expressible in D: the uniqueness of the result. Letting them be pure would mean that important purity based optimization could not be allowed (so that "S* s1 = makeS(32), s2 = makeS(32);" does not break). But this case is easy to deal with -- it's enough to (1) mark the alloc funcs with a 'malloc' attribute, (2) allow calling malloc functions inside pure ones. artur
Apr 05 2014
On Sat, 05 Apr 2014 08:26:01 -0400, Artur Skawina <art.08.09 gmail.com> wrote:On 03/26/14 00:30, Artur Skawina wrote:I think this still works, even if makeS is pure, since it would be weak-pure. -SteveIt's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.That was wrong; treating these funcs as pure won't work. While allocator functions have some traits of pure ones -- no side-effects, their result only depends on the args, and the call can be optimized out when the result in unused -- there is one difference, which isn't expressible in D: the uniqueness of the result. Letting them be pure would mean that important purity based optimization could not be allowed (so that "S* s1 = makeS(32), s2 = makeS(32);" does not break).
Apr 07 2014