digitalmars.D - Attribute promises vs inference rules
- Quirin Schroll (115/115) Apr 17 2024
The spec is rather detailed on which operations are valid and invalid in functions annotated `@safe`, `@nogc`, `pure`, and/or `nothrow`. However, there is a difference between two kinds of operations: those that make the compiler issue an error when an attribute is specified (or, equivalently, make the compiler not infer the attribute in a context where attributes are inferred), and those that actually violate the promises the attributes make.

Example:
```d
int x;
bool f(int* p) pure @safe
{
    return p is &x; // Error: `pure` function `f` cannot access mutable static data `x`
}
```

This is not a bug: `f` does access `x`, and `x` is mutable data. Only by pure happenstance does `f` use just the address of `x`, which isn’t mutable, and never its value, which is. If it stored `&x` in a local, it could write to `x`. The fact that `f` doesn’t do that means that `f` is “morally” pure, but the attribute spec doesn’t recognize it as `pure`. Don’t get me wrong, the spec could be changed so that accesses like this are allowed, but currently it doesn’t allow them, which makes this a great example.

So, what about this:
```d
int x;
bool g(int* p) pure @safe
{
    // Same body as `f`, but without `pure`, so it compiles.
    static impl(int* p) @safe
    {
        return p is &x;
    }
    // Add `pure` to the function pointer type by an explicit cast.
    enum pure_impl = () @trusted {
        return cast(bool function(int* p) pure @safe) &impl;
    }();
    return pure_impl(p);
}
```

A cast that adds function attributes isn’t allowed in `@safe` code, but we have `@trusted` for that. The question now is: Is it defined behavior if I add `pure` to `&impl` using an explicit cast? I don’t know, and I also don’t know where to look. The latter is an issue for D.

Let’s look at each attribute individually, in order from (what I presume is) the easiest to the hardest to answer.

For `@nogc`, my sense is: a function is morally `@nogc` if it doesn’t allocate on the GC. Even if a function can allocate conditionally, if you can ensure it won’t, you’re good. Probably. The spec doesn’t say so, but anything else would be a big, big surprise.
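
To make the `@nogc` case concrete, here is a minimal sketch of a conditionally allocating function that is called through a function pointer cast to `@nogc`. The names `maybeGrow`, `keepAsIs`, and `NoGcFn` are hypothetical and only serve the illustration. The sketch shows that the cast compiles; whether calling through it is defined behavior is exactly what the spec leaves unstated.

```d
// Hypothetical helper: allocates on the GC only when `grow` is true,
// so the compiler cannot accept or infer `@nogc` for it.
int[] maybeGrow(int[] buf, bool grow) @safe
{
    if (grow)
        buf ~= 0; // may allocate on the GC heap
    return buf;
}

// Function pointer type with `@nogc` added (an assumption of this sketch).
alias NoGcFn = int[] function(int[], bool) @safe @nogc;

int[] keepAsIs(int[] buf) @safe @nogc
{
    // Adding attributes by cast is rejected in @safe code, hence @trusted.
    auto fp = () @trusted { return cast(NoGcFn) &maybeGrow; }();
    // The caller guarantees the allocating branch is never taken.
    return fp(buf, false);
}

unittest
{
    int[] a = [1, 2, 3];
    assert(keepAsIs(a) is a); // same slice, nothing was appended
}
```

The block has no `main`; it can be tried with `dmd -unittest -main`.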

`@safe` has the best answer because the question is essentially: What can be annotated `@trusted`? It has no simple answer, but at least there are discussions around it. Also, because `@trusted` exists, such questions are easy to phrase.

What `nothrow` is about can be readily guessed. It’s not actually “cannot throw [anything]”, but rather “cannot throw `Exception`s”. Close enough. In all honesty, I don’t know what is “morally `nothrow`”, but if you asked me: “Function `foo` is not annotated `nothrow`, but it simply won’t throw exceptions, can I cast `&foo` to `nothrow`?” I’d answer: “Probably yes, but better use [`assumeWontThrow`](https://dlang.org/library/std/exception/assume_wont_throw.html).” There could be some messy details, though. A function that isn’t `nothrow` can fail recoverably, so it must be called in a way that supports stack unwinding; a function that can’t fail recoverably doesn’t need that. It might be an issue, I don’t know.

It’s not clear at all what `pure` promises exactly and what it doesn’t. Contrast this with `nothrow` and especially `@nogc`, where it might just be a single spec paragraph that’s missing. It may seem as easy as: It doesn’t access mutable data. Remember the initial example? It’s not so easy. Even if it were, the guarantees that follow from “it doesn’t access mutable data” are manifold:
- Unique construction (by a `pure` function that meets some other criteria) allows implicit casts from mutable to `immutable`.
- Some `pure` functions may be cached without one being able to observe the difference.
- Some `pure` functions may be run in parallel without requiring synchronization and other fancy stuff.

Also consider GC allocation. A `pure` function is explicitly allowed to allocate on the GC heap (unless it’s also `@nogc` of course, but that’s orthogonal). How is that possible? The GC heap is definitely global state!
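
Two of the points above can be sketched in a few lines: using `assumeWontThrow` from `std.exception` instead of casting to `nothrow`, and a `pure` function that allocates on the GC heap and whose result converts implicitly to `immutable`. The function names `checkedHalf`, `halfOfNonNegative`, and `firstThree` are hypothetical, chosen for this illustration only.

```d
import std.exception : assumeWontThrow;

// Hypothetical function: throws only for negative input,
// so it cannot be annotated `nothrow`.
int checkedHalf(int v) @safe pure
{
    if (v < 0)
        throw new Exception("negative input");
    return v / 2;
}

// The caller promises `v >= 0`. If the promise is broken,
// assumeWontThrow raises an AssertError instead of silently misbehaving.
int halfOfNonNegative(int v) nothrow
{
    return assumeWontThrow(checkedHalf(v));
}

// A `pure` function that allocates on the GC heap. It takes no
// parameters, so nothing else can hold a reference to its result.
int[] firstThree() pure @safe nothrow
{
    auto a = new int[3];
    foreach (i, ref e; a)
        e = cast(int) i;
    return a;
}

unittest
{
    assert(halfOfNonNegative(10) == 5);

    // The result is unique, so it converts to `immutable` implicitly.
    immutable int[] frozen = firstThree();
    assert(frozen == [0, 1, 2]);
}
```

The block has no `main`; it can be tried with `dmd -unittest -main`.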

Now, one could argue that there is only one GC, therefore every (`pure`) function morally has a hidden parameter that provides access to the GC, and a `pure` function may access a global variable through a parameter. (In a sense, what `@nogc` morally does to a `pure` function is remove this hidden parameter.) If we’re comfortable arguing like that in the general case, the rules of `pure` aren’t as trivial anymore. What about custom global-state APIs that could be modeled similarly to the GC? What conditions does a global-state API have to meet such that access to it is well-defined in a `pure` function? In my estimation, nobody knows.

For two of the four attributes, a spec paragraph is warranted. For `@safe`, it’s already an ongoing quest to bring as much UB-free code as possible into the domain of `@safe`. For `pure`, there’s a whole discussion pending about what should count as “morally pure”, i.e. which casts to `pure` are UB-free. This can be considered part of the `@safe` discussion.

As for positions, there’s one extreme point: _Morally `pure` is only what could have been annotated `pure` without change._ This is probably a good starting point from a theoretical standpoint, i.e. the spec could be explicit about it and say: “A pointer to a function that isn’t annotated `pure` can be cast to a function pointer type that’s additionally annotated `pure` if the pointee function could have been annotated `pure`, i.e. the programmer merely ‘forgot’ to annotate where it was possible.”

But what about `f` from the initial example? It cannot be annotated `pure`. Do we want to exclude it? That doesn’t seem very practical. It would mean that `g` introduces UB, and it poses the question: When exactly does `g` enter UB? Is the cast already UB, or does the ill-cast function have to be called?
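
For contrast with `g`, the “merely forgot to annotate” case from the extreme position might look like the sketch below. The names `twice`, `PureFn`, and `callTwice` are hypothetical. Here the pointee could have been declared `pure` unchanged, so even the strictest reading would presumably accept the cast; this is an assumption based on the wording proposed above, not something the spec currently states.

```d
// Hypothetical function: it touches no global state, so it could have
// been declared `pure`; the annotation is simply missing.
int twice(int v) @safe
{
    return 2 * v;
}

// Function pointer type with `pure` added (an assumption of this sketch).
alias PureFn = int function(int) pure @safe;

int callTwice(int v) pure @safe
{
    // Taking the address of an impure function is allowed in `pure` code;
    // only calling it is not. The cast itself needs @trusted.
    auto fp = () @trusted { return cast(PureFn) &twice; }();
    return fp(v);
}

unittest
{
    assert(callTwice(21) == 42);
}
```

The block has no `main`; it can be tried with `dmd -unittest -main`.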