digitalmars.D - Attribute promises vs inference rules
- Quirin Schroll (115/115) Apr 17
The spec is rather detailed on what operations are valid and invalid in functions that are annotated `@safe`, `@nogc`, `pure`, and/or `nothrow`. However, there is a difference between operations that make the compiler error when an attribute is specified and an invalid operation is used (or, equivalently, make the compiler not infer the attribute in a context where attributes are inferred) and operations that violate the promises the attributes make.

Example:

```d
int x;
bool f(int* p) pure @safe
{
    return p is &(x); // Error: `pure` function `f` cannot access mutable static data `x`
}
```

This is not a bug: Indeed, `f` accesses `x` and indeed `x` is mutable data. Only by pure happenstance, `f` only uses the address of `x`, which isn’t mutable, and never its value, which is mutable. If it stored `&x` in a local, it could write to `x`. The fact that `f` doesn’t do that means that `f` is “morally” pure, but it’s not recognized as `pure` by the attribute spec. Don’t get me wrong, the spec could be changed so that accesses like this would be allowed, but currently it doesn’t allow them, which makes this a great example.

So, what about this:

```d
int x;
bool g(int* p) pure @safe
{
    static impl(int* p) @safe
    {
        return p is &(x);
    }
    enum pure_impl = () @trusted {
        return cast(bool function(int* p) pure @safe)&(impl);
    }();
    return pure_impl(p);
}
```

A cast that adds function attributes isn’t allowed by `@safe`, but we have `@trusted` for that. The question now is: Is it defined behavior if I cast `&impl` to `pure` using an explicit cast? I don’t know and I also don’t know where to look. The second one is an issue for D.

Let’s look at each attribute individually, in the order of (what I presume) the easiest to the hardest to answer.

For `@nogc`, my sense is: a function is morally `@nogc` if it doesn’t allocate on the GC heap. Even if a function can allocate conditionally, if you can ensure it won’t, you’re good. Probably. The spec doesn’t say it, but anything else would be a big, big surprise.
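To make the `@nogc` case concrete, here is a minimal sketch. The names `maybeGrow` and `noGrow` are hypothetical and not from any library. Whether the cast is actually defined behavior is exactly the point the spec leaves open; the sketch only assumes the reading given above, namely that the cast is fine as long as the allocating branch is never taken.

```d
// Hypothetical helper: allocates on the GC heap only when `grow` is true,
// so it cannot be annotated @nogc.
int[] maybeGrow(int[] buf, bool grow) @safe
{
    if (grow)
        buf ~= 0; // GC allocation
    return buf;
}

// Hypothetical wrapper: promises @nogc by always passing `grow = false`.
int[] noGrow(int[] buf) @nogc @safe
{
    alias Fn = int[] function(int[], bool) @nogc @safe;
    // Adding @nogc by cast is rejected in @safe code, hence the @trusted lambda.
    // Assumption: this is well-defined because the allocating path is never taken.
    auto fp = () @trusted { return cast(Fn) &maybeGrow; }();
    return fp(buf, false);
}

void main()
{
    int[] a = [1, 2, 3];
    // No reallocation happened: same pointer, same length.
    assert(noGrow(a) is a);
}
```

The compiler accepts `noGrow` as `@nogc` purely on the strength of the cast; nothing checks that `grow` is always `false`, which is why the guarantee rests on the programmer.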
`@safe` has the best answer because the question is essentially: What can be annotated `@trusted`? It has no simple answer, but at least there are discussions around it. Also, because `@trusted` exists, such questions are easy to phrase.

What `nothrow` is about can be readily guessed. It’s not actually “cannot throw [anything]”, but rather “cannot throw `Exception`s”. Close enough. In all honesty, I don’t know what is “morally `nothrow`”, but if you asked me: “Function `foo` is not annotated `nothrow`, but it simply won’t throw exceptions, can I cast `&foo` to `nothrow`?” I’d answer: “Probably yes, but better use [`assumeWontThrow`](https://dlang.org/library/std/exception/assume_wont_throw.html).” There could be some messy details, though. A `throw` function can fail recoverably, so it must be called in a way that supports stack unwinding; a function that can’t fail recoverably needs no such support. It might be an issue, I don’t know.

It’s not clear at all what `pure` promises exactly and what it doesn’t. Contrast this to `nothrow` and especially `@nogc`, where it might just be a single spec paragraph that’s missing. It may seem as easy as: It doesn’t access mutable data. Remember the initial example? It’s not so easy. Even if it were, the guarantees that follow from “it doesn’t access mutable data” are manifold:

- Unique construction (by a `pure` function that meets some other criteria) allows implicit casts from mutable to `immutable`.
- Some `pure` functions may be cached without one being able to observe the difference.
- Some `pure` functions may be run in parallel without requiring synchronization and other fancy stuff.

Also consider GC allocation. A `pure` function is explicitly allowed to allocate on the GC heap (unless it’s also `@nogc` of course, but that’s orthogonal). How is that possible? The GC heap is definitely global state!
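As a small illustration of two of these points together, here is a sketch of a `pure` function that allocates on the GC heap and whose result converts implicitly to `immutable`. The name `firstThree` is hypothetical; the sketch assumes the usual unique-construction condition that no mutable indirections are passed in, which holds trivially here because the function takes no parameters.

```d
// Hypothetical example: `pure`, yet it allocates on the GC heap with `new`.
int[] firstThree() pure @safe
{
    auto a = new int[](3);
    foreach (i, ref e; a)
        e = cast(int) i;
    return a;
}

void main() @safe
{
    // The function returns int[], but nothing else can hold a reference
    // to the fresh array, so the compiler accepts the implicit conversion.
    immutable int[] xs = firstThree();
    assert(xs == [0, 1, 2]);
}
```

The compiler permits both the allocation and the conversion even though the heap the array lives on is shared by the whole program.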
Now, one could argue that there is only one GC, therefore every (`pure`) function morally has a hidden parameter that provides access to the GC, and a `pure` function may access a global variable through a parameter. (In a sense, what `@nogc` morally does to a `pure` function is remove this hidden parameter.) If we’re comfortable arguing like that in the general case, the rules of `pure` aren’t as trivial anymore. What about custom global-state APIs that could be modeled similarly to the GC? What conditions does a global-state API have to meet such that access to it is well-defined in a `pure` function? In my estimation, nobody knows.

For two of the four attributes, a spec paragraph is warranted. For `@safe`, it’s already an ongoing quest to extend as much UB-free code as possible into the domain of `@safe`. For `pure`, there’s a whole discussion pending of what should count as “morally pure”, i.e. which casts to `pure` are UB-free. This can be considered part of the `@safe` discussion.

As for positions, there’s one extreme point: _Morally `pure` is only what could have been annotated `pure` without change._ This is probably a good starting point from a theoretical standpoint, i.e. the spec could be explicit about it and say: “A pointer to a function that isn’t annotated `pure` can be cast to a function pointer type that’s additionally annotated `pure` if the pointee function could have been annotated `pure`, i.e. the programmer merely ‘forgot’ to annotate where it was possible.”

But what about `f` from the initial example? It cannot be annotated `pure`. Do we want to exclude it? That doesn’t seem very practical. It would mean that `g` introduces UB and poses the question: When exactly does `g` enter UB? Is the cast already UB or does the ill-cast function have to be called?