digitalmars.D - Attribute promises vs inference rules
- Quirin Schroll, Apr 17 2024
The spec is rather detailed on what operations are valid and
invalid in functions that are annotated `@safe`, `@nogc`, `pure`,
and/or `nothrow`. However, there is a difference between two
kinds of operations: those that make the compiler issue an error
when an attribute is specified (or, equivalently, make the
compiler not infer the attribute in a context where attributes
are inferred), and those that actually violate the promises the
attributes make.
Example:
```d
int x;
bool f(int* p) pure @safe
{
    // Error: `pure` function `f` cannot access mutable static data `x`
    return p is &x;
}
```
This is not a bug: Indeed, `f` accesses `x` and indeed `x` is
mutable data. By pure happenstance, `f` only uses the address of
`x`, which isn’t mutable, and never its value, which is mutable.
If it stored `&x` in a local, it could write to `x`. The fact
that `f` doesn’t do that means that `f` is “morally” pure, but
it’s not recognized as `pure` by the attribute spec. Don’t get me
wrong, the spec could be changed to allow accesses like this, but
currently it doesn’t allow them, which makes `f` a great example.
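For comparison, here is a minimal sketch of how the same address comparison can be written so that the current rules accept it. The names `sameAddress` and `pointsToX` are made up for illustration. The idea is that a `pure` function may receive a pointer to global data through a parameter, as long as it doesn't name the global itself:

```d
int x;

// Accepted as `pure`: the address of `x` arrives through a parameter,
// so the function body never names the mutable global.
bool sameAddress(const(int)* p, const(int)* q) pure @safe nothrow @nogc
{
    return p is q;
}

// The impure caller is the one that mentions `x`.
bool pointsToX(int* p) @safe nothrow @nogc
{
    return sameAddress(p, &x);
}

unittest
{
    int y;
    assert(pointsToX(&x));
    assert(!pointsToX(&y));
}
```

This only moves the access to the caller, of course; it doesn't answer whether `f` itself may be treated as `pure`.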
So, what about this:
```d
int x;
bool g(int* p) pure @safe
{
    static impl(int* p) @safe { return p is &x; }
    enum pure_impl = () @trusted {
        return cast(bool function(int* p) pure @safe) &impl;
    }();
    return pure_impl(p);
}
```
A cast that adds function attributes isn’t allowed by `@safe`,
but we have `@trusted` for that. The question now is: Is it
defined behavior if I cast `&impl` to `pure` using an explicit
cast? I don’t know, and I also don’t know where to look. The
latter is an issue for D.
Let’s look at each attribute individually, in the order of (what
I presume) the easiest to the hardest to answer.
For `@nogc`, my sense is: A function is morally `@nogc` if it
doesn’t allocate on the GC. Even if a function can allocate
conditionally, if you can ensure it won’t, you’re good. Probably.
The spec doesn’t say it, but anything else would be a big, big
surprise.
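To make the conditional-allocation case concrete, here is a hedged sketch. `firstN` and `firstNNoGC` are hypothetical names, and whether the cast is actually defined behavior is exactly what the spec leaves open. The wrapper only takes the branch that doesn't allocate, then calls through a function pointer that was cast to `@nogc`:

```d
// Not `@nogc`: allocates, but only when the buffer is too small.
int[] firstN(int[] buf, size_t n) @safe
{
    if (n > buf.length)
        buf = new int[n]; // GC allocation on this branch only
    return buf[0 .. n];
}

// Assumption: casting is fine as long as the allocating branch is never taken.
int[] firstNNoGC(int[] buf, size_t n) @safe @nogc
{
    assert(n <= buf.length, "would have to allocate");
    alias NoGC = int[] function(int[], size_t) @safe @nogc;
    auto fp = () @trusted { return cast(NoGC) &firstN; }();
    return fp(buf, n);
}

unittest
{
    int[] store = [1, 2, 3, 4];
    auto head = firstNNoGC(store, 2);
    assert(head == [1, 2]);
    assert(head.ptr is store.ptr); // no new allocation happened
}
```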
`@safe` has the best answer because the question is essentially:
What can be annotated `@trusted`? It has no simple answer, but at
least there are discussions around it. Also, because `@trusted`
exists, such questions are easy to phrase.
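As a small illustration of the kind of thing those discussions are about, here is a sketch of a `@trusted` function (the names `elementOrZero` and `sumFirstTwo` are invented for this example). It does unchecked pointer arithmetic internally, but its interface can't be misused from `@safe` code because every input is checked first:

```d
// `@trusted`: the body is not mechanically checkable, but the interface
// is memory-safe for every possible argument.
int elementOrZero(const(int)[] data, size_t i) @trusted pure nothrow @nogc
{
    if (i >= data.length)
        return 0;
    return *(data.ptr + i); // unchecked access, justified by the test above
}

int sumFirstTwo(const(int)[] data) @safe pure nothrow @nogc
{
    return elementOrZero(data, 0) + elementOrZero(data, 1);
}

unittest
{
    assert(sumFirstTwo([3, 4, 5]) == 7);
    assert(sumFirstTwo([3]) == 3);
}
```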
What `nothrow` is about can be readily guessed. It’s not actually
“cannot throw [anything]”, but rather “cannot throw
`Exception`s”. Close enough. In all honesty, I don’t know what is
“morally `nothrow`”, but if you asked me: “Function `foo` is not
annotated `nothrow`, but it simply won’t throw exceptions, can I
cast `&foo` to `nothrow`?” I’d answer: “Probably yes, but better
use
[`assumeWontThrow`](https://dlang.org/library/std/exception/assume_wont_throw.html).”
There could be some messy details, though. A `throw` function can
fail recoverably, so it must be called in a way that supports
stack unwinding; a function that can’t fail recoverably doesn’t.
It might be an issue, I don’t know.
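For reference, the `assumeWontThrow` route looks roughly like this. `checkedDiv` and `half` are hypothetical names. The library function evaluates the expression lazily and turns any `Exception` into an `AssertError`:

```d
import std.exception : assumeWontThrow;

// Not `nothrow`: throws on a zero divisor.
int checkedDiv(int a, int b)
{
    if (b == 0)
        throw new Exception("division by zero");
    return a / b;
}

// The divisor is a non-zero constant, so the throwing path is unreachable.
int half(int a) nothrow
{
    return assumeWontThrow(checkedDiv(a, 2));
}

unittest
{
    assert(half(10) == 5);
}
```

Unlike a raw cast of `&checkedDiv` to `nothrow`, a wrong assumption here fails loudly instead of leaving the behavior unclear.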
It’s not clear at all what `pure` promises exactly and what it
doesn’t. Contrast this with `nothrow` and especially `@nogc`, where
it might just be a single spec paragraph that’s missing. It may
seem as easy as: It doesn’t access mutable data. Remember the
initial example? It’s not so easy. Even if it were, the
guarantees that follow from “it doesn’t access mutable data” are
manifold: Unique construction (by a `pure` function that meets
some other criteria) allows implicit casts from mutable to
`immutable`. Some `pure` functions may be cached without one
being able to observe the difference. Some `pure` functions may
be run in parallel without requiring synchronization and other
fancy stuff.
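The unique-construction guarantee is the easiest of these to show in code. In this sketch (`squares` is a made-up name), the `pure` function has no parameter through which the result could alias existing mutable data, so the compiler accepts the implicit conversion to `immutable`:

```d
// `pure` factory: the returned array can't alias anything the caller
// already holds, because the only parameter is a plain `size_t`.
int[] squares(size_t n) pure @safe nothrow
{
    auto result = new int[n]; // GC allocation inside a `pure` function
    foreach (i, ref e; result)
        e = cast(int) (i * i);
    return result;
}

unittest
{
    // Implicit conversion from mutable int[] to immutable: no cast needed.
    immutable int[] frozen = squares(4);
    assert(frozen == [0, 1, 4, 9]);
}
```

If `squares` were cast to `pure` without actually meeting the rules, this conversion is one of the places where things could silently go wrong.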
Also consider GC allocation. A `pure` function is explicitly
allowed to allocate on the GC heap (unless it’s also `@nogc` of
course, but that’s orthogonal). How is that possible? The GC heap
is definitely global state!
Now, one could argue that there is only one GC, therefore every
(`pure`) function morally has a hidden parameter that provides
access to the GC, and a `pure` function may access a global
variable through a parameter. (In a sense, what `@nogc` morally
does (to a `pure` function) is remove this hidden parameter.) If
we’re comfortable arguing like that in the general case, the
rules of `pure` aren’t as trivial anymore. What about custom
global-state APIs that could be modeled similarly to the GC?
What conditions does a global-state API have to meet such that
access to it is well-defined in a `pure` function? In my
estimation, nobody knows.
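For contrast, this is what the explicit-parameter version of such an API looks like today. It's a sketch with invented names (`IdSource`, `freshId`, `freshGlobalId`). The state is global, but the `pure` function only touches it through a `ref` parameter, which the current rules accept:

```d
struct IdSource
{
    int next;
}

IdSource globalIds; // mutable (thread-local) global state

// `pure` in the weak sense: it mutates state, but only via its parameter.
int freshId(ref IdSource src) pure @safe nothrow @nogc
{
    return src.next++;
}

// The impure caller is what ties the API to the global.
int freshGlobalId() @safe nothrow @nogc
{
    return freshId(globalIds);
}

unittest
{
    IdSource local;
    assert(freshId(local) == 0);
    assert(freshId(local) == 1);
    assert(local.next == 2);
}
```

The open question is when it would be well-defined to drop the parameter and let `freshId` reach `globalIds` directly, the way allocation reaches the GC.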
For two of the four attributes, `@nogc` and `nothrow`, a spec
paragraph is warranted. For `@safe`, it’s already an ongoing
quest to bring as much UB-free code as possible into the domain
of `@safe`. For `pure`, there’s a whole discussion pending of
what should count as “morally pure”, i.e. which casts to `pure`
are UB-free. This can be considered part of the `@safe`
discussion.
As for positions, there’s one extreme point: _Morally `pure` is
only what could have been annotated `pure` without change._ This
is probably a good starting point from a theoretical standpoint,
i.e. the spec could be explicit about it and say: “A pointer to a
function that isn’t annotated `pure` can be cast to a function
pointer type that’s additionally annotated `pure` if the pointee
function could have been annotated `pure` i.e. the programmer
merely ‘forgot’ to annotate where it was possible.” But what
about `f` from the initial example? It cannot be annotated
`pure`. Do we want to exclude it? That doesn’t seem very
practical. It would mean that `g` introduces UB, and it poses the
question: When exactly does `g` enter UB? Is the cast already UB
or does the ill-cast function have to be called?
Quirin Schroll <qs.il.paperinik gmail.com>