digitalmars.dip.ideas - Universal Function Attribute Inference

Paul Backus <snarwin gmail.com> writes:


Currently, function attributes and function parameter attributes 
are only inferred by the D compiler for certain kinds of 
functions. This DIP idea proposes that such inference be extended 
to all non-overridable functions with bodies.
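
As a rough illustration of the status quo (a minimal sketch; the function names are hypothetical and not taken from the proposal), inference currently applies to templates and `auto` functions but not to ordinary functions:

```d
// Template function: attributes are inferred from the body.
T twice(T)(T x) { return x + x; }

// `auto` return type: attributes are inferred as well.
auto triple(int x) { return 3 * x; }

// Ordinary function with a body and no annotations: nothing is inferred
// today, so it is treated as @system, impure, throwing, and GC-using.
int quadruple(int x) { return 4 * x; }

int caller() @safe pure nothrow @nogc
{
    // Both calls compile because the callees' attributes were inferred.
    return twice(1) + triple(1);

    // Calling quadruple(1) here is rejected today; under universal
    // inference it would be accepted.
}
```

Under the proposal, `quadruple` would be treated like the other two.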

The primary goal of universal inference is to solve D's 
"attribute soup" problem without breaking compatibility with 
existing code. Compatibility with existing code makes universal 
inference a better solution to this problem than "@safe by 
default," "nothrow by default," and other similar proposals.

Overridable functions (that is, non-final virtual functions) are 
excluded from universal inference because their bodies may be 
replaced at runtime.
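
A small sketch of why this matters (hypothetical class names, assumed only for illustration): attributes inferred from a virtual function's own body would not necessarily hold for an override reached through dynamic dispatch.

```d
class Shape
{
    // Non-final virtual function: a subclass may supply a different body,
    // so attributes inferred from this body would not hold for overrides.
    double area() { return 0.0; }

    // final function: the body cannot be replaced, so it would be
    // eligible for inference under the proposal.
    final string name() { return "shape"; }
}

class Weird : Shape
{
    // This override throws, although Shape.area's body never does.
    override double area() { throw new Exception("no area"); }
}
```

A caller holding a `Shape` reference cannot know which body will run, so `nothrow` inferred from `Shape.area` alone would be unsound.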

For cases where attribute inference is not desired, an opt-out 
mechanism will be provided.

Currently, `.di` files generated by the compiler do not include 
inferred function attributes. This will have to change.
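
To make the `.di` point concrete, here is a sketch for a hypothetical function `add`; the "today" declaration in the comments is an approximation of generated output, not copied from a real compiler run:

```d
// Source (hypothetical mathutil.d):
//     int add(int a, int b) { return a + b; }

// Header roughly as generated today: body dropped, no inferred attributes.
//     int add(int a, int b);

// Header as universal inference would need it, so that importers of the
// .di file see the same attributes as importers of the .d file:
int add(int a, int b) pure nothrow @nogc @safe;
```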



* [Discussion of inference pros and cons][andrei-comment] by 
Andrei Alexandrescu
* [Thoughts on inferred attributes][adr-post] by Adam Ruppe
* [DIP70: @api/extern(noinfer) attribute][dip70]
* [Add `@default` attribute][at-default]

[andrei-comment]: 
https://github.com/dlang/dmd/pull/1877#issuecomment-16403663
[adr-post]: 
http://dpldocs.info/this-week-in-d/Blog.Posted_2022_07_11.html#inferred-attributes
[dip70]: https://wiki.dlang.org/DIP70
[at-default]: https://github.com/dlang/DIPs/pull/236
Feb 28
zjh <fqbqrr 163.com> writes:
On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus wrote:


 Currently, function attributes and function parameter 
 attributes are only inferred by the D compiler for certain 
 kinds of functions. This DIP idea proposes that such inference 
 be extended to all non-overridable functions with bodies.

Nice.
Feb 28
Dukc <ajieskola gmail.com> writes:
On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus wrote:
 [snip]
I do not personally care whether inference is on or off by default. It's easy to toggle when I want anyway. On balance maybe I prefer it being off by default, simply because of backwards compatibility.

Still, this is a worthwhile direction to pursue. I'm sometimes annoyed that I can't specify a return type for a function and still turn attribute inference on (without making it a template), or that I can't turn attribute inference off for templates or function literals. If the proposal addresses those, I might well support it.

I feel this proposal is orthogonal to the question of attribute defaults per se (we will still want to debate the defaults regardless of inference), but it will certainly take pressure off that question, especially if this DIP includes the ability to turn any function attribute on, off, or to be inferred. In that case it'll be trivial for anyone to add their favorite default attributes at the top of each module, and selectively override those as needed. Something like:

```D
module example;

@safe:
autoinpure:
autonothrow:
@gc:

pure int failsCompilationIfImpure(int) => // ..
@trusted int canBeUnsafeInternally(int) => // ...
@autonogc T sometimesNogcTemplate(T)(int) => // ...
```
Feb 29
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 01/03/2024 2:46 AM, Dukc wrote:
snip

Dukc makes a good point, being able to override inference is important 
when it fails to work with other functions or scenarios (such as 
globally set attributes).

We will need to consider adding the opposite of a given attribute, to 
maximize the possibility of inference not interfering with people's code 
negatively.
Feb 29
Paul Backus <snarwin gmail.com> writes:
On Thursday, 29 February 2024 at 14:03:30 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 We will need to consider adding the opposite of a given 
 attribute, to maximize the possibility of inference not 
 interfering with people's code negatively.

I agree that adding opposites for the remaining function attributes without them is a good idea, but it isn't strictly necessary. Even if you only have a single @noinfer attribute, you can simulate "impure" and "@gc" like this:

    @noinfer @safe nothrow @nogc void inferImpure() {}
    @noinfer @safe pure nothrow void inferGC() {}

    // Will never be pure or @nogc
    void example()
    {
        inferImpure();
        inferGC();
        // etc.
    }

The main benefit of @noinfer (compared to impure/@gc) is that it does not require adding a new reserved word. Since one of the main benefits of universal inference is that it works with existing code, I plan to propose it for inclusion in the base language edition, and that means avoiding breaking language changes.
Feb 29
Sebastiaan Koppe <mail skoppe.eu> writes:
On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus wrote:
 [...]
What about only doing inference when something fails to compile due to a missing attribute? Say a nothrow function calls a non-nothrow one: the compiler detects it, but before erroring out checks whether the callee is implicitly nothrow, and then continues accordingly. Opting out can be done by setting incompatible attributes. E.g. if I explicitly mark a function as @system, no amount of inference will enable it to be called from @safe code. That way you can have almost no attributes anywhere except for e.g. @safe on main, and get safety everywhere or an error.
Feb 29
Paul Backus <snarwin gmail.com> writes:
On Thursday, 29 February 2024 at 21:21:12 UTC, Sebastiaan Koppe 
wrote:
 On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus 
 wrote:
 [...]
 What about only doing inference when something fails to compile due to a missing attribute?

This would completely break separate compilation, but it would still be a bad idea even if it didn't. Having the attributes of a function depend on the code that calls it violates modularity and creates all kinds of opportunities for "spooky action at a distance." Also as far as I can tell there is literally zero upside to this approach compared to the original version. What is all of this additional complexity supposed to be buying us?
Feb 29
Dom DiSc <dominikus scherkl.de> writes:
On Thursday, 29 February 2024 at 22:39:43 UTC, Paul Backus wrote:
 On Thursday, 29 February 2024 at 21:21:12 UTC, Sebastiaan Koppe 
 wrote:
 What about only doing inference when something fails to 
 compile due to a missing attribute?
 This would completely break separate compilation,

https://forum.dlang.org/post/gguwkjyiduyxjilyluvf

It seems inference needs to spend more time finding out whether an attribute can really be added, rather than giving up early; otherwise there will always be cases where an attribute is expected but was not inferred. Maybe the compiler should even warn if it is not sure whether some attribute is fulfilled? Something like: "purity could not be inferred. Please explicitly state if this function is pure or not" (of course that would require the counterparts of pure, @nogc and nothrow).
Apr 05
Walter Bright <newshound2 digitalmars.com> writes:
The main difficulty with this is it requires the compiler to compile all the 
functions in the "header" files. This is a significant performance penalty. That 
is why the current scheme only infers for functions that the compiler must 
compile anyway, such as templates and auto functions.

I'm not necessarily saying "no", just that everyone should be aware of this.
Mar 09
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 10/03/2024 9:07 AM, Walter Bright wrote:
 The main difficulty with this is it requires the compiler to compile all 
 the functions in the "header" files. This is a significant performance 
 penalty. That is why the current scheme only infers for functions that 
 the compiler must compile anyway, such as templates and auto functions.
 
 I'm not necessarily saying "no", just that everyone should be aware of 
 this.

There is some concern with multi-step builds, yes. However, we do have a solution that, while it isn't ready for use today, could be made ready once a preview switch has been implemented: the .di generator. It could be a while until we could turn the preview switch on, perhaps two or three editions. A bit of an adjustment, yes, but the benefit is no more attribute soup to write for non-virtual code, so I expect it to be worth it!
Mar 09
ryuukk_ <ryuukk.dev gmail.com> writes:
On Saturday, 9 March 2024 at 20:33:35 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 10/03/2024 9:07 AM, Walter Bright wrote:
 The main difficulty with this is it requires the compiler to 
 compile all the functions in the "header" files. This is a 
 significant performance penalty. That is why the current 
 scheme only infers for functions that the compiler must 
 compile anyway, such as templates and auto functions.
 
 I'm not necessarily saying "no", just that everyone should be 
 aware of this.
 There is some concern with multi-step builds, yes. However, we do have a solution that, while it isn't ready for use today, could be made ready once a preview switch has been implemented: the .di generator. It could be a while until we could turn the preview switch on, perhaps two or three editions. A bit of an adjustment, yes, but the benefit is no more attribute soup to write for non-virtual code, so I expect it to be worth it!

If there is a significant build penalty, then I hope it's opt-in; I personally do not use any of the attributes.
Mar 09
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 10/03/2024 10:26 AM, ryuukk_ wrote:
 If there is a significant build penalty, then I hope it's opt-in; I 
 personally do not use any of the attributes.

It shouldn't be significant. It is a minor cost as things go; there is no reason to start thinking about opt-in at this stage.
Mar 09
Timon Gehr <timon.gehr gmx.ch> writes:
On 3/9/24 22:47, Richard (Rikki) Andrew Cattermole wrote:
 On 10/03/2024 10:26 AM, ryuukk_ wrote:
 If there is a significant build penalty, then I hope it's opt-in; I 
 personally do not use any of the attributes.
 It shouldn't be significant. It is a minor cost as things go; there is no reason to start thinking about opt-in at this stage.

Really? This proposal requires full semantic analysis of any imported function. It is only minor in certain build setups (e.g., when the project is compiled in a single invocation).
Mar 17
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 18/03/2024 12:12 PM, Timon Gehr wrote:
 On 3/9/24 22:47, Richard (Rikki) Andrew Cattermole wrote:
 On 10/03/2024 10:26 AM, ryuukk_ wrote:
 If there is a significant build penalty, then I hope it's opt-in; I 
 personally do not use any of the attributes.
 It shouldn't be significant. It is a minor cost as things go; there is no reason to start thinking about opt-in at this stage.
 Really? This proposal requires full semantic analysis of any imported function. It is only minor in certain build setups (e.g., when the project is compiled in a single invocation).

I am assuming that you generate .di files every time and use that for importing, not the .d files. These days I'm leaning quite heavily towards projects using .di files for intermediary steps, due to distribution reasons with shared libraries. It's already impossible to keep bindings to D code up to date manually. I tried; `pure` was a killer on that idea.
Mar 18
Paul Backus <snarwin gmail.com> writes:
On Monday, 18 March 2024 at 10:30:28 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 I am assuming that you generate .di files every time and use 
 that for importing, not the .d files.
Yes, this is one possible mitigation. The main downside of this is that .di files don't include function bodies, so you lose access to cross-module CTFE. Although I guess we could add a compiler switch to include function bodies in .di files.
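
A minimal sketch of the CTFE concern (the function `square` is a hypothetical stand-in for something imported from another module):

```d
// Stand-in for a function that lives in another module.
int square(int x) { return x * x; }

// Compile-time function evaluation (CTFE): the compiler interprets the
// body of `square`, so the body must be visible at the point of use.
enum nine = square(3);
static assert(nine == 9);
```

If `square` were instead imported from a `.di` file containing only the declaration `int square(int x);`, the `enum` initializer could not be evaluated, because no source for the body would be available to the compiler.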
Mar 20
Dom DiSc <dominikus scherkl.de> writes:
On Wednesday, 20 March 2024 at 23:59:09 UTC, Paul Backus wrote:
 On Monday, 18 March 2024 at 10:30:28 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 I am assuming that you generate .di files every time and use 
 that for importing, not the .d files.
 Yes, this is one possible mitigation. The main downside of this is that .di files don't include function bodies, so you lose access to cross-module CTFE. Although I guess we could add a compiler switch to include function bodies in .di files.

I would much prefer .di files that don't contain any code (not even for templates), with the template bodies stored in some intermediary format in another file (not an object file, of course, since it needs to be possible to create different instances from it, but also not source code, because it should be possible to deliver e.g. a library without sources).
Apr 05
Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 20 March 2024 at 23:59:09 UTC, Paul Backus wrote:
 On Monday, 18 March 2024 at 10:30:28 UTC, Richard (Rikki) 
 Andrew Cattermole wrote:
 I am assuming that you generate .di files every time and use 
 that for importing, not the .d files.
 Yes, this is one possible mitigation. The main downside of this is that .di files don't include function bodies, so you lose access to cross-module CTFE. Although I guess we could add a compiler switch to include function bodies in .di files.

The other problem is inlining. It's what's kept me from always using .di files aided by the build system. I've thought of only doing that for debug builds though, since, as I've stated multiple times, producing a final binary is boring and all I really care about is running my tests as fast as possible.
May 30
zopsicle <zopsicle use.startmail.com> writes:
On Monday, 18 March 2024 at 10:30:28 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
 On 18/03/2024 12:12 PM, Timon Gehr wrote:
 On 3/9/24 22:47, Richard (Rikki) Andrew Cattermole wrote:
 On 10/03/2024 10:26 AM, ryuukk_ wrote:
 If there is a significant build penalty, then I hope it's 
 opt-in; I personally do not use any of the attributes.
 It shouldn't be significant. It is a minor cost as things go; there is no reason to start thinking about opt-in at this stage.
 Really? This proposal requires full semantic analysis of any imported function. It is only minor in certain build setups (e.g., when the project is compiled in a single invocation).
 I am assuming that you generate .di files every time and use that for importing, not the .d files. These days I'm leaning quite heavily towards projects using .di files for intermediary steps, due to distribution reasons with shared libraries. It's already impossible to keep bindings to D code up to date manually. I tried; `pure` was a killer on that idea.

One of my favorite features of D is its compilation model and in particular the ability to import from source files directly. This simplifies fast incremental parallel builds by not requiring a separate step to extract the interface of a module, or worse, requiring that modules be compiled in topological order.

C++20, Fortran, and Haskell modules and Rust crates work the other way; they require interface files to be generated for their compilation units before they can be imported. Consequently they do not integrate well with traditional build systems.

I personally don't mind maintaining the attributes by hand, but I understand other people would gladly trade build simplicity and performance for convenience. Perhaps the compiler could be configured to infer attributes on functions from certain modules and not others, catering to both use cases.
May 22
Paul Backus <snarwin gmail.com> writes:
On Thursday, 23 May 2024 at 01:14:59 UTC, zopsicle wrote:
 One of my favorite features of D is its compilation model and 
 in particular the ability to import from source files directly. 
 This simplifies fast incremental parallel builds by not 
 requiring a separate step to extract the interface of a module, 
 or worse, requiring that modules are compiled in topological 
 order.

 C++20, Fortran, and Haskell modules and Rust crates work the 
 other way; they require interface files to be generated for 
 their compilation units before they can be imported. 
 Consequently they do not integrate well with traditional build 
 systems.
I should emphasize that the ability to import directly from .d files is not going anywhere, even if this proposal is accepted. The .di files would function purely as a cache to improve performance, and generating them would never be required.
May 24
Dukc <ajieskola gmail.com> writes:
Paul Backus wrote on 24.5.2024 at 18:04:
 
 I should emphasize that the ability to import directly from .d files is 
 not going anywhere, even if this proposal is accepted. The .di files 
 would function purely as a cache to improve performance, and generating 
 them would never be required.

True. Still, I feel it's important we retain an easy way to *not* infer the attributes, even when available: for performance reasons, as Walter wrote, but also to make sure client code will not depend on attributes the library may wish to remove in the future. I'm pretty neutral on whether inferring everything by default is a good choice, but regardless of which it is, the non-default option should be easy to pick. Only one keyword per symbol, or several for a group of them.
May 26
Zach Tollen <zach notmyrealaddress.org> writes:
On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus wrote:
 The primary goal of universal inference is to solve D's 
 "attribute soup" problem without breaking compatibility with 
 existing code. Compatibility with existing code makes universal 
 inference a better solution to this problem than "@safe by 
 default," "nothrow by default," and other similar proposals.

 Overridable functions (that is, non-final virtual functions) 
 are excluded from universal inference because their bodies may 
 be replaced at runtime.

 For cases where attribute inference is not desired, an opt-out 
 mechanism will be provided.

 Currently, `.di` files generated by the compiler do not include 
 inferred function attributes. This will have to change.



 * [Discussion of inference pros and cons][andrei-comment] by 
 Andrei Alexandrescu
 * [Thoughts on inferred attributes][adr-post] by Adam Ruppe
 * [DIP70: @api/extern(noinfer) attribute][dip70]
 * [Add `@default` attribute][at-default]

 [andrei-comment]: 
 https://github.com/dlang/dmd/pull/1877#issuecomment-16403663
 [adr-post]: 
 http://dpldocs.info/this-week-in-d/Blog.Posted_2022_07_11.html#inferred-attributes
 [dip70]: https://wiki.dlang.org/DIP70
 [at-default]: https://github.com/dlang/DIPs/pull/236

I have a few improvements to the suggestions linked to above. (I'm packing all three of these into one post. But each is probably substantial enough to have its own subthread.)

**[DIP][at-default]**

More specifically, for the alternative mechanism proposed by the DIP, which aimed to provide a generic way of deactivating any currently active attribute, but was rejected in the DIP because the syntax was "too complex and verbose." Here is that syntax:

```D
@nogc nothrow pure @safe:
// ...
void f(T)(T x) @nogc(default) nothrow(default) pure(default) {}
```

I've seen some other suggestions for the same feature. But I didn't find any particularly compelling. But just now I realized we could have: `@no(...)`, where `...` contains the list of one or more attributes to deactivate. So the above syntax transforms into:

```D
@nogc nothrow pure @safe:
// ...
void f(T)(T x) @no(@nogc nothrow pure) {}
void g(T)(T x) @no(@nogc) @no(nothrow pure) {} // alternative grouping

@no(pure): // works for the scope too
...
```

This feature does not mean that the code it applies to wouldn't pass the checks for pure, @nogc, etc. It simply deactivates the previous label. The compiler can still infer a given attribute. It just wouldn't be explicit.

This syntax requires adding a keyword `@no`, and the ability to group one or more attributes within parentheses. But that's all. It's very simple, and it makes generically deactivating attributes very easy.

[at-default]: https://github.com/dlang/DIPs/pull/236

**[Attributes DIP][ada-dip]**

This suggestion requires a little more elaboration to make clear. The [DIP][ada-dip] in question was linked to in [Adam's article][adr-post].

First let's address the question of default attributes for a function which has callable parameter(s). What should the default inference behavior be for `func()` in the following code? (Just focus on `@safe` and `throw`/`nothrow`)

```D
void func(void delegate(in char[]) sink) @safe
{
    sink("Hello World");
}

void g()
{
    func((in char[] arg) {
        throw new Exception("Length cannot be 0");
    });
}

void h()
{
    func((in char[] arg) {
        return;
    });
}
```

In my opinion, function `func()` should be inferred `@safe throw` when it is called in `g()`, and `@safe nothrow` when it is called in `h()`. In other words, its attributes should be combined at the call site with those of the delegate which is passed to it. These are Argument Dependent Attributes (ADAs). Moreover, this should be the *default behavior*. (Note: I'm not 100% certain about this, and would like to be shown otherwise. But I think it's true.)

In other words, any function which takes a delegate or a function as a parameter should have Argument Dependent Attributes (ADAs) by default.

In the existing situation, however, we have no such attributes at all, let alone by default, and the [DIP][ada-dip] above suggests the following syntax in order to add them. (Hint: The `*` means you don't have to specify the specific name of the argument for which the attribute status should propagate to the overall signature.):

```D
// Basic usage, with nothrow and @safe
void func0(void delegate(int) sink) @safe(sink) nothrow(sink);

// Empty argument list, equivalent to @safe
void func1() @safe();

// Equivalent to func0
void func2(void delegate(int)) @safe(*) nothrow(*);

// Equivalent to func0
void func3(void delegate(int) arg) @safe(arg,) nothrow(*);

// Equivalent to func1
void func4(int) @safe(*);

// Equivalent to func0
void func3(void delegate(int) arg) @safe(0) nothrow(0,);
```

Again, the major problem here is with the chosen syntax. If I were suggesting a solution to the same problem, I would go with the following simple syntax using a new keyword `@imply`:

```D
void func0(void delegate(int) @imply sink);
```

`@imply` simply means: Imply that (all) the attributes of `sink()` apply to `func0()` as well, and determine them each time `func0()` is called. (`@imply` as a keyword would only have any meaning as part of a callable parameter.) If you want to limit the implication to one or more particular attributes, list them in parentheses:

```D
void func0(void delegate(int) @imply(@system throw) sink);
```

However, as mentioned above, I don't see why adding `@imply` to every delegate/function passed as an argument shouldn't be the *default* behavior. After all, generally speaking, why have a callable as a function parameter if you're not going to call it in the body of the function?

Therefore, what we really need is to make `@imply` the default, and add a way to *opt out* of it, by indicating that a function call should NOT infer its attributes based on the callable passed. So we are now *defaulting* to ADAs, and in rare cases adding `@noimply()` to turn them *off*:

```D
// @noimply turns the new default off for the specified attribute
void func(void delegate(in char[]) @noimply(throw) sink) @safe
{
    try {
        sink("Hello World");
    } catch (Exception) {}
}

void g()
{
    func((in char[] arg) {
        throw new Exception(".");
    });
}
```

Since `func()` above catches the exception, it can be determined and inferred to be `nothrow` even if `sink()` throws. `@noimply(throw)` indicates this.

So, this is a syntax improvement suggestion for the [ADAs DIP][ada-dip]. But it's also a recognition that if they become the default, the primary need will be for an opt-out syntax rather than an opt-in one.
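
For comparison, here is a sketch of how current D can approximate argument-dependent attributes by making the callable's type a template parameter. This relies on existing template attribute inference rather than on the `@imply` syntax suggested above, and the parameter name `Sink` is an assumption made for the example:

```D
// Template version: attributes are inferred per instantiation,
// so they follow the callable that is passed in.
void func(Sink)(Sink sink)
{
    sink("Hello World");
}

// Compiles as nothrow because this instantiation of func is inferred nothrow.
void h() nothrow
{
    func((in char[] arg) { return; });
}

// This instantiation may throw, so g cannot be marked nothrow.
void g()
{
    func((in char[] arg) { throw new Exception("Length cannot be 0"); });
}
```

The trade-off is that `func` must be a template, which is the kind of workaround the suggestions in this thread aim to make unnecessary.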

[ada-dip]: https://github.com/dlang/DIPs/pull/198/files?short_path=d1fa190#diff-d1fa1908aafd30b6d5044235a23a348f294186638ec3af5dd4d71d455eab2302
[adr-post]: http://dpldocs.info/this-week-in-d/Blog.Posted_2022_07_11.html#inferred-attributes

**attributes**

 [From the OP:]"Currently, .di files generated by the compiler 
 do not include inferred function attributes. This will have to 
 change."

I assume that the problem is that the mangled names for a function should not include the inferred attributes even if the .di header files do. It may also help with documentation to be able to distinguish officially supported attributes from inferred ones. For this, I suggest another keyword with the same syntax as my previous proposals. Namely, I suggest putting all inferred attributes into an `@inferred()` grouping:

```D
void func() @safe @nogc @inferred(pure nothrow);
```

`@inferred()` has no effect other than to tell the compiler or the documentation generator that its contents, while accurate, are not part of the official [API/ABI][api-vs-abi] of the function. The compiler must naturally keep track of which attributes are explicit as opposed to inferred, in order to be able to generate headings like this.

[api-vs-abi]: https://stackoverflow.com/questions/3784389/difference-between-api-and-abi
May 01
Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus wrote:
 Currently, function attributes and function parameter 
 attributes are only inferred by the D compiler for certain 
 kinds of functions. This DIP idea proposes that such inference 
 be extended to all non-overridable functions with bodies.

Why not just provide a tool or a compiler switch that tells you where something should have been annotated, but isn’t? Something like `-helpAnnotate=safe,system,nogc` and the compiler will tell you which functions in your code base could be `@safe` and/or `@nogc`, but aren’t annotated `@safe` and/or `@nogc`, and which functions are `@system`, but lack the annotation.
May 16
Paul Backus <snarwin gmail.com> writes:
On Thursday, 16 May 2024 at 17:25:53 UTC, Quirin Schroll wrote:
 On Wednesday, 28 February 2024 at 17:18:04 UTC, Paul Backus 
 wrote:
 Currently, function attributes and function parameter 
 attributes are only inferred by the D compiler for certain 
 kinds of functions. This DIP idea proposes that such inference 
 be extended to all non-overridable functions with bodies.
 Why not just provide a tool or a compiler switch that tells you where something should have been annotated, but isn’t? Something like `-helpAnnotate=safe,system,nogc` and the compiler will tell you which functions in your code base could be `@safe` and/or `@nogc`, but aren’t annotated `@safe` and/or `@nogc`, and which functions are `@system`, but lack the annotation.

If programmers are required to take any kind of manual action to add attributes to their functions, no matter how easy or straightforward it is, a large portion of them simply will not do it. The [default effect][1] is extremely strong. The [Github comment from Andrei][2] that I linked in the first post tells the same story:
 [Jonathan] Aldrich has written a number of papers about 
 augmenting Java with annotations for various properties, 
 notably ArchJava and AliasJava. He has conducted thorough 
 experiments with real developers and projects in researching 
 for this work, and has noticed (and mentioned this in his 
 papers, talks, and our group discussions) that as a rule of 
 thumb developers never go back and annotate their code with the 
 most precise signatures. [...] Although Jonathan's early work 
 focused on offering programmers the ability to add annotations, 
 before long he realized he must extend his work to do 
 annotation inference in order to make it useful.

[1]: https://en.wikipedia.org/wiki/Default_effect
[2]: https://github.com/dlang/dmd/pull/1877#issuecomment-16403663
May 16