
digitalmars.D - Fear of Compiler Magic

IchorDev <zxinsworld gmail.com> writes:
I hear people complain about compiler magic a lot. Yes, being 
able to do everything in-language is nice, but compiler magic is 
inevitable and also can be very useful. `assert` is my favourite 
example. Things like Python’s `print` are more dubious. You can 
always make a better print function on your own, right? Whereas 
one assert is never going to transcend another assert, even if 
the way it prints its assertion failure is slow, who cares? The 
program is already functionally dead anyway. Do we really want 
C’s ‘everyone assert for themself’ problem?
And besides that, isn’t ‘compiler magic’ at its logical 
conclusion generally applicable to *any* task performed by the 
compiler? Exception handling, code optimisation, inlining, 
template expansion, `new`, adding two numbers? Compiler magic.
Aug 01
Timon Gehr <timon.gehr gmx.ch> writes:
On 8/2/24 05:22, IchorDev wrote:
 I hear people complain about compiler magic a lot.
Probably this is partially inspired from here: https://dconf.org/2018/talks/alexandrescu.html
 Yes, being able to do 
 everything in-language is nice, but compiler magic is inevitable and 
 also can be very useful. `assert` is my favourite example.
int foo(){ enforce(0); } // error
int foo(){ assert(0); } // ok

"No compiler magic" would e.g. mean: `enforce` can similarly influence definite return analysis. It's not inevitable that this is impossible.
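The contrast in the two one-liners above can be made concrete with a small compilable sketch. The function names `viaAssert` and `viaEnforce` are hypothetical and chosen only for illustration; the point is that the compiler accepts a missing `return` after `assert(0)`, while the Phobos function `enforce` gets no such treatment and needs an explicit `return`.

```D
import std.exception : enforce;

// `assert(0)` is recognised by the compiler as a point that never returns,
// so no `return` statement is needed after it.
int viaAssert(bool ok)
{
    if (ok)
        return 1;
    assert(0, "unreachable");
}

// `enforce` is an ordinary library function: the compiler assumes the call
// may return, so the explicit `return` is required here.
int viaEnforce(bool ok)
{
    enforce(ok, "not ok");
    return 1;
}
```

Removing the `return 1;` from `viaEnforce` is what produces the "error" case from the post.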
 Things like 
 Python’s `print` are more dubious. You can always make a better print 
 function on your own, right? Whereas one assert is never going to 
 transcend another assert, even if the way it prints its assertion 
 failure is slow, who cares? The program is already functionally dead 
 anyway. Do we really want C’s ‘everyone assert for themself’ problem?
Well, this is what assert does. The question is how it achieves it, and whether the same tools are accessible to user code that perhaps does _something different than assert_.
 And besides that, isn’t ‘compiler magic’ at its logical conclusion 
 generally applicable to *any* task performed by the compiler?
Not if the tasks are properly decomposed into orthogonal components that are also available to the user. Anyway, there is quite a bit of existing magic, and it does cause some issues.
Aug 02
Walter Bright <newshound2 digitalmars.com> writes:
On 8/2/2024 1:34 AM, Timon Gehr wrote:
 Well, this is what assert does. The question is how it achieves it, and whether
 the same tools are accessible to user code that perhaps does _something
 different than assert_.
One reason for some of the builtin stuff is to not tempt people to write their own. Having standardized ways to do common tasks is a big win for making code understandable by others, which is advantageous in a team environment.

For example, the `debug` conditionals came about from my discussions with a veteran Microsoft programming manager. He complained that every project invented their own scheme for doing debug conditionals, making it unnecessarily difficult to share code.

Unittests and Ddoc are other successful examples.

Lisp is a language that enables building one's own programming language on top of it. It more or less requires it.

The result is every Lisp user invents their own language, incompatible with any other Lisp user, and so successful Lisp programs don't survive their creators. It's why Lisp has never really caught on, much to the bafflement of Lisp advocates.
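As a minimal sketch of two of the built-in facilities mentioned above, the following shows a `debug` conditional and a `unittest` block side by side. The function `twice` is a hypothetical example, not taken from the thread.

```D
int twice(int x)
{
    // Compiled in only when the compiler is given the -debug switch.
    debug
    {
        import std.stdio : writeln;
        writeln("twice(", x, ")");
    }
    return 2 * x;
}

// Compiled in and run only when building with -unittest.
unittest
{
    assert(twice(21) == 42);
}
```

Because both constructs are part of the language, any D reader recognises them without first learning a project-specific macro scheme.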
Aug 03
Timon Gehr <timon.gehr gmx.ch> writes:
On 8/3/24 18:54, Walter Bright wrote:
 On 8/2/2024 1:34 AM, Timon Gehr wrote:
 Well, this is what assert does. The question is how it achieves it, 
 and whether the same tools are accessible to user code that perhaps 
 does _something different than assert_.
One reason for some of the builtin stuff is to not tempt people to write their own. Having standardized ways to do common tasks is a big win for making code understandable by others, which is advantageous in a team environment. ...
Sure, I am in favor of built-in assert. This thread is however about magic.
 For example, the `debug` conditionals came about from my discussions 
 with a veteran Microsoft programming manager. He complained that every 
 project invented their own scheme for doing debug conditionals, making 
 it unnecessarily difficult to share code.
 
 Unittests and Ddoc are other successful examples.
 ...
Off-topic, but yes.
 Lisp is a language that enables building one's own programming language 
 on top of it. It more or less requires it.
 
 The result is every Lisp user invents their own language, incompatible 
 with any other Lisp user, and so successful Lisp programs don't survive 
 their creators. It's why Lisp has never really caught on, much to the 
 bafflement of Lisp advocates.
Sure.
Aug 04
Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Friday, 2 August 2024 at 08:34:39 UTC, Timon Gehr wrote:
 On 8/2/24 05:22, IchorDev wrote:
 I hear people complain about compiler magic a lot.
Probably this is partially inspired from here: https://dconf.org/2018/talks/alexandrescu.html
 Yes, being able to do everything in-language is nice, but 
 compiler magic is inevitable and also can be very useful. 
 `assert` is my favourite example.
int foo(){ enforce(0); } // error
int foo(){ assert(0); } // ok

"No compiler magic" would e.g. mean: `enforce` can similarly influence definite return analysis. It's not inevitable that this is impossible.
This specific case could be solved with Enum Parameters: `enforce` could detect `0` (or `false`) specifically and return type `noreturn`.
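Enum Parameters are a proposal and not available in the language, so they cannot be shown here. What can be sketched is the `noreturn` half of the idea, assuming a compiler recent enough to support the `noreturn` type: a user-defined function declared to return `noreturn` takes part in definite return analysis much like `assert(0)` does. The names `fail` and `parseDigit` are hypothetical.

```D
// `noreturn` is declared in druntime's `object` module, so no import is needed.
// `fail` is a hypothetical helper that always throws.
noreturn fail(string msg)
{
    throw new Exception(msg);
}

int parseDigit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    fail("not a digit"); // control never continues past this call
}
```

The missing piece the post points at is letting a single `enforce` choose between a normal return type and `noreturn` depending on whether its argument is a compile-time `false`.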
Aug 05
Dennis <dkorpel gmail.com> writes:
On Friday, 2 August 2024 at 03:22:28 UTC, IchorDev wrote:
 And besides that, isn’t ‘compiler magic’ at its logical 
 conclusion generally applicable to *any* task performed by the 
 compiler? Exception handling, code optimisation, inlining, 
 template expansion, `new`, adding two numbers? Compiler magic.
A good programming language consists of a small set of orthogonal features that combine into a powerful language. Magic features are situational. They add the maintenance burden of an orthogonal feature, but instead of multiplying the language's expressiveness, they only add a constant to it.

So what happens in practice when you pile on magic features? To keep technical debt of large code bases under control, refactoring needs to happen. Refactoring relies on performing code transformations that result in equivalent semantics. As a programmer, you want to have actions you can always do safely, based on general facts:

- Comments don't affect code
- Unreachable code can be removed
- Function definitions can be moved around

However, proposals for magic features tend to add more and more exceptions. I have seen proposals that would make `version(all) assert(0);` not always equivalent to `assert(0);`, or `x + 1` different from `x + (1)`.

And most recently: [Make printf safe](https://forum.dlang.org/thread/v6uolo$22e4$1@digitalmars.com). You would think it's safe to transform this:

```D
printf("x = %s\n", x);
printf("x = %s\n", x);
```

Into this:

```D
const(char)* fmt = "x = %s\n";
printf(fmt, x);
printf(fmt, x);
```

But with magic printf format string rewrites, that transformation turns correct code into memory corrupting code when x is an int.

This doesn't mean magic features are always a bad idea. The `__FILE__` and `__LINE__` tokens are somewhat magical, and they break all the aforementioned refactoring equivalences, since any code movement (including comments!) can alter line numbers, which potentially alters the meaning of the program. In practice of course, `__FILE__` and `__LINE__` are not used for control flow, only for logging, so it's not a big problem.

But hopefully you see why many people are very wary of magic features, lest the language becomes a minefield of gotchas like the printf example.

p.s. It even used to be the case that `__LINE__ + 0` was not the same as `__LINE__`! (https://issues.dlang.org/show_bug.cgi?id=18919)
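The point about `__LINE__` breaking the "comments don't affect code" rule can be shown with a short sketch. The helpers `callerLine` and `gap` are hypothetical; the only language feature relied on is that a `__LINE__` default argument is evaluated at the call site.

```D
// `callerLine` is a hypothetical helper; the default argument is evaluated
// at the call site, so it yields the caller's line number.
size_t callerLine(size_t line = __LINE__)
{
    return line;
}

size_t gap()
{
    auto first = callerLine();
    // Deleting this comment line changes the value returned below.
    auto second = callerLine();
    return second - first;
}
```

As written, `gap()` returns 2. Removing the comment makes it return 1, so a change that "cannot affect code" has altered the program's observable behaviour.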
Aug 02
Walter Bright <newshound2 digitalmars.com> writes:
On 8/2/2024 2:29 AM, Dennis wrote:
 You would think it's safe to transform this:
 ```D
 printf("x = %s\n", x);
 printf("x = %s\n", x);
 ```
 
 Into this:
 ```D
 const(char)* fmt = "x = %s\n";
 printf(fmt, x);
 printf(fmt, x);
 ```
 
 But with magic printf format string rewrites, that transformation turns correct
 code into memory corrupting code when x is an int.
The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness.

It is in the same box as:

```
int[] array;
x = array[5];
```

and rewriting as:

```
int[] array;
x = *(array.ptr + 5);
```
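A compilable version of that comparison, as a sketch: the function names `checked` and `unchecked` are hypothetical, and the `static assert` encodes the claim that the pointer form is rejected once the code is marked `@safe`.

```D
// Bounds-checked indexing: allowed in @safe code.
int checked(int[] a) @safe
{
    return a[2];
}

// Raw pointer arithmetic skips the bounds check, so it has to live in
// @system (or @trusted) code.
int unchecked(int[] a) @system
{
    return *(a.ptr + 2);
}

// The same pointer expression does not compile inside a @safe function literal.
static assert(!__traits(compiles, (int[] a) @safe => *(a.ptr + 2)));
```

Both functions return the same value for an array of at least three elements; they differ only in what happens when the array is shorter.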
Aug 03
Dennis <dkorpel gmail.com> writes:
On Saturday, 3 August 2024 at 17:02:55 UTC, Walter Bright wrote:
 The transformation won't compile if the call is marked @safe, 
 and won't compile with the various proposals to increase the 
 default safety-ness.
This is an interesting aspect that I forgot to mention: Whenever I bring up such examples of problematic cases resulting from 'magic', the defense is often akin to: "Sure, it fails in that theoretical / pathological case, but when are you going to find that in REAL code?"

Which can be fair. Like I said, a bit of magic doesn't always have to be problematic. And while I can't tell you when or how these problem cases are going to crop up, do beware that there might just be plenty of opportunity, just from a statistical standpoint.

It's like Intel saying, in response to the [Pentium FDIV bug](https://en.wikipedia.org/wiki/Pentium_FDIV_bug): "Well, when are you going to divide 4195835 by 3145727 needing all the precision? Give me an example of when you would divide those specific numbers in a REAL application."

That's hard to answer upfront, but with enough users doing enough floating point math, eventually you get [some great stories](https://www.youtube.com/watch?v=22FU31ZUgNA&t=1519): When working on Quake, Michael Abrash spent hours tracking down a graphical glitch, until finding out with the help of a friend from Intel that it was the infamous hardware bug.

Going back to printf, it's possible hardly anyone will ever hit this problem. But when eventually there's thousands of calls to magic printf out there (or snprintf which still can't be called in `@safe` code!), each one is a contender to be part of that one spectacular failure.

That's why I used the word "minefield": There's no guarantee of things blowing up, it depends on the density of mines and the amount of people crossing. So you could try some sort of risk assessment for magic features, but I prefer to just avoid them as much as possible and look for alternatives.
 It is in the same box as:
That example is different, because the first program wasn't correct to begin with. Or if it were, then the refactoring would result in the second program also being correct. In my example, only the first program was correct.
Aug 03
Timon Gehr <timon.gehr gmx.ch> writes:
On 8/3/24 19:02, Walter Bright wrote:
 On 8/2/2024 2:29 AM, Dennis wrote:
 You would think it's safe to transform this:
 ```D
 printf("x = %s\n", x);
 printf("x = %s\n", x);
 ```

 Into this:
 ```D
 const(char)* fmt = "x = %s\n";
 printf(fmt, x);
 printf(fmt, x);
 ```

 But with magic printf format string rewrites, that transformation 
 turns correct code into memory corrupting code when x is an int.
The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness. ...
The simple fact is that the magic treatment of the string literal leads to some trouble. I.e., this is a good illustration of how magic instills fear.
 It is in the same box as:
 
 ```
 int[] array;
 x = array[5];
 ```
 
 and rewriting as:
 
 ```
 int[] array;
 x = *(array.ptr + 5);
 ```
Not at all. You just orthogonally removed the range check. This is a completely unrelated case. Nothing surprising happens here.
Aug 04
Quirin Schroll <qs.il.paperinik gmail.com> writes:
On Sunday, 4 August 2024 at 17:41:52 UTC, Timon Gehr wrote:
 On 8/3/24 19:02, Walter Bright wrote:
 On 8/2/2024 2:29 AM, Dennis wrote:
 You would think it's safe to transform this:
 ```D
 printf("x = %s\n", x);
 printf("x = %s\n", x);
 ```

 Into this:
 ```D
 const(char)* fmt = "x = %s\n";
 printf(fmt, x);
 printf(fmt, x);
 ```

 But with magic printf format string rewrites, that 
 transformation turns correct code into memory corrupting code 
 when x is an int.
The transformation won't compile if the call is marked @safe, and won't compile with the various proposals to increase the default safety-ness. ...
The simple fact is that the magic treatment of the string literal leads to some trouble. I.e., this is a good illustration of how magic instills fear.
And it’s why I suggested using `__printf` instead. It can be an intrinsic (a keyword even), and be specified to require a compile-time constant string as its first argument, i.e. a string literal or something synthesized by CTFE, but nothing run-time.
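The suggested `__printf` intrinsic does not exist, so it cannot be shown directly. A rough analogue of "the format string must be a compile-time constant" is the template overload of Phobos's `std.format.format`, sketched below. The names `showLiteral`, `showConstant`, and `fmt` are hypothetical.

```D
import std.format : format;

// A literal passed as a template argument is checked against the
// argument types at compile time.
string showLiteral(int x)
{
    return format!"x = %s"(x);
}

// A constant assembled at compile time is accepted as well.
enum fmt = "x = " ~ "%s";

string showConstant(int x)
{
    return format!fmt(x);
}
```

Passing a run-time `string` variable as the template argument is rejected by the compiler, which is the property the post asks for: moving the format string into a run-time variable becomes a compile error instead of a silent change in behaviour.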
Aug 05
Nick Treleaven <nick geany.org> writes:
On Monday, 5 August 2024 at 11:14:44 UTC, Quirin Schroll wrote:
 And it’s why I suggested using `__printf` instead. It can be an 
 intrinsic (a keyword even), and be specified to require a 
 compile-time constant string as its first argument, i.e. a 
 string literal or something synthesized by CTFE, but nothing 
 run-time.
It's not just about C's `printf` - see: https://forum.dlang.org/post/igrcbbodmkilderxxjwq@forum.dlang.org
Aug 05