digitalmars.D - Template lowering of druntime hooks that CTFE cannot interpret
- Teodor Dutu (84/84) Jan 17 2022 Hi,
- max haughton (5/22) Jan 17 2022 Moving rewriting steps further up the compiler is a good thing.
- Timon Gehr (6/8) Jan 17 2022 Detect `if(__ctfe)` and don't do the rewrites in its body. Then use
Hi,

The current workflow of the compiler is that semantic analysis deduces types for all expressions in the AST. Then, if CTFE is required, the compiler performs the interpretation in `dinterpret.d`.

```d
bool f()
{
    // ...
}
static assert(f());
```

Before the backend can generate code, the intermediate code generator lowers expressions such as `a ~= b` (when `b` is an array) to `_d_arrayappendT(a, b)` [here](https://github.com/dlang/dmd/blob/25bf00749406171f4e7b52dbf0b6df9cb1181854/src/dmd/e2ir.d#L2715-L2734). The intermediate code generator receives a fully decorated AST, so it does not run any semantic analysis. As a consequence, it is impossible to instantiate templates at this level without introducing calls to semantic analysis routines (there is currently no precedent for this in the intermediate code generator). In addition, this layer differs between the various compilers, because each uses a different intermediate representation. One advantage of this approach, however, is that the CTFE interpreter does not need to be aware of any hooks, since the lowering takes place at a lower level.

Issues arise when the lowering is moved up from the intermediate code generator to the frontend, because CTFE must now recognize the hooks and handle them, either by interpreting the runtime hook itself or by generating interpretable code. The first option is not viable, since most hooks call C stdlib functions, such as `memcpy` or `malloc`, which cannot be interpreted. Therefore, the alternative is to lower the calls to templates during semantic analysis, intercept such lowerings at CTFE, and bypass interpreting the runtime hooks.
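To make the "cannot be interpreted" point concrete, here is a minimal sketch. The functions `copyWithMemcpy` and `copyWithLoop` are hypothetical stand-ins, not druntime hooks; the only assumption is that CTFE has no source for `memcpy` and therefore refuses to evaluate a call to it.

```d
import core.stdc.string : memcpy;

// Hypothetical stand-in for a runtime hook that relies on the C stdlib.
int[] copyWithMemcpy(const int[] src)
{
    auto dst = new int[](src.length);
    memcpy(dst.ptr, src.ptr, src.length * int.sizeof);
    return dst;
}

// The same operation in plain D, which the interpreter can execute.
int[] copyWithLoop(const int[] src)
{
    auto dst = new int[](src.length);
    foreach (i, x; src)
        dst[i] = x;
    return dst;
}

// CTFE evaluates the plain-D version...
static assert(copyWithLoop([1, 2, 3]) == [1, 2, 3]);

// ...but forcing the memcpy version through CTFE does not compile.
static assert(!__traits(compiles, { enum r = copyWithMemcpy([1, 2, 3]); }));
```

Both functions behave identically at run time; only the second one can be evaluated inside a `static assert` or an `enum` initializer.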
As an example, when lowering the expression `S[n] a = b` to `_d_arrayctor(a, b)`, the approach we chose in [this PR](https://github.com/dlang/dmd/pull/13116) was to have CTFE rewrite `_d_arrayctor(a, b)` back to `S[n] a = b` [here](https://github.com/teodutu/dmd/blob/eeb7f7fad360a5955d3db90fc1b98be535d790f6/src/dmd/dinterpret.d#L4816-L4838) and then interpret it as a `ConstructExp`.

This solution doesn't work when dealing with a lowering to `_d_arrayappendcTX`, because there is no single `CallExp` to rewrite to a corresponding `CatAssign` expression. This mismatch existed prior to our work and was solved by lowering `a ~= b` to `_d_arrayappendcTX(a, 1), a[$ - 1] = b, a` in `e2ir.d`. If we kept the same lowering when using the new templated hook, then in order to reconstruct the original expression, CTFE would have to search through the lowered `CommaExp` and look for `_d_arrayappendcTX`. This approach is both inelegant and impractical. Thus, the approach we chose was to lower `a ~= b` to:

```d
__ctfe ? (a ~= b) : (_d_arrayappendcTX(a, 1), a[$ - 1] = b, a);
```

With this lowering, CTFE picks the `true` branch of the `__ctfe` condition and never looks at the `false` branch. But while this solves the problem of interpreting the expression correctly during CTFE, it passes the entire `CondExp` to `e2ir.d`, which then has to [ignore](https://github.com/dlang/dmd/blob/92d463064b567dd2e0a88aba2d32117a65be47d6/src/dmd/e2ir.d#L2911-L2922) the `CondExp` and its `true` branch. Moreover, `s2ir.d` has to do [something similar](https://github.com/dlang/dmd/blob/92d463064b567dd2e0a88aba2d32117a65be47d6/src/dmd/s2ir.d#L188-L210) for certain `IfStatement`s.

The solution above can be improved so as to not require code changes to `e2ir.d` and `s2ir.d`. We aim to do this by breaking away from the old hooks when necessary and implementing new templated ones that correspond to the expressions from which they will be lowered.
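The shape of the `__ctfe` lowering can be imitated in user code. The sketch below is an illustration only: `growBy` is a hypothetical stand-in for `_d_arrayappendcTX`, and an `if (__ctfe)` statement replaces the `CondExp`, because user code is not allowed to use the value of a comma expression (only the compiler builds such a `CommaExp` internally).

```d
// Hypothetical stand-in for `_d_arrayappendcTX(a, n)`: grows `a` by `n` slots.
void growBy(T)(ref T[] a, size_t n)
{
    a.length += n;
}

// User-level imitation of the lowering of `a ~= b` for a single element.
int[] appendLowered(ref int[] a, int b)
{
    if (__ctfe)
        return a ~= b;   // branch taken by the interpreter

    // Branch taken in generated code: grow, store, yield the array.
    growBy(a, 1);
    a[$ - 1] = b;
    return a;
}

// CTFE only ever executes the `__ctfe` branch.
static assert({
    int[] a = [1, 2];
    appendLowered(a, 3);
    return a;
}() == [1, 2, 3]);
```

At run time `__ctfe` is `false`, so the same call goes through `growBy`. This mirrors why the IR generator has to discard the condition and the `true` branch when it receives the lowered expression.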
In the case of `_d_arrayappendcTX`, for example, we plan to modify the existing template `_d_arrayappendT` to perform `~=` regardless of whether the rhs is an array or a single element. This way, CTFE will be able to identify calls to `_d_arrayappendT`, convert them to `a ~= b` and then interpret the latter expression.

We have also considered an alternative solution, whereby we introduce a new visitor between CTFE and the intermediate code generator. This visitor would eliminate all `__ctfe` `CondExp`s and `IfStatement`s, as well as their `true` branches, before passing the AST to the IR generator. This solution is, however, inefficient, as it adds another pass through the AST in order to remove code that we ourselves insert. The real problem is that the hooks do not perform exactly the same actions as the expressions from which they’re lowered, and this approach doesn’t solve that problem. The first approach, however, does.

Do you suggest any other solutions than those we propose?

Thanks,
Teodor
Jan 17 2022
On Monday, 17 January 2022 at 15:25:57 UTC, Teodor Dutu wrote:
> Hi,
>
> The current workflow of the compiler is that semantic analysis deduces types for all expressions in the AST. Then, if CTFE is required, the compiler performs the interpretation in `dinterpret.d`.
>
> ```d
> bool f()
> {
>     // ...
> }
> static assert(f());
> ```
>
> Before the backend can generate the code, the intermediate code generator performs lowerings from expressions such as `a ~= b` (when `b` is an array) to `_d_arrayappendT(a, b)` [here](https://github.com/dlang/dmd/blob/25bf00749406171f4e7b52dbf0b6df9cb1181854/src/dmd/e2ir.d#L2715-L2734). [...]

Moving rewriting steps further up the compiler is a good thing. In this case it seems like a practical necessity anyway, because LDC and GDC will have to do this eventually and they do not use e2ir et al. at all.
Jan 17 2022
On 17.01.22 16:25, Teodor Dutu wrote:
> Do you suggest any other solutions than those we propose?

Detect `if(__ctfe)` and don't do the rewrites in its body. Then use `if(__ctfe)` in the runtime hooks to provide an implementation that's compatible with CTFE. Of course, this won't work with the hooks that you have designed in a way that invokes UB, but I think that's a feature, as those should be redesigned.
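A rough sketch of what such a hook might look like follows. The name `appendHookSketch` is hypothetical and this is not druntime code; it also assumes `T` is plain data that can be copied bytewise. Under the suggestion above, the `~=` inside the `if (__ctfe)` body would be left un-lowered by the compiler, so the interpreter handles it natively.

```d
import core.stdc.string : memcpy;

// Hypothetical templated hook for `a ~= b` where `b` is an array.
// Assumes T has no postblit or copy constructor (bytewise copy is valid).
T[] appendHookSketch(T)(ref T[] a, T[] b)
{
    if (__ctfe)
    {
        // CTFE-compatible implementation: plain D, no C stdlib calls.
        a ~= b;
        return a;
    }

    // Runtime implementation: grow the array, then copy the new elements.
    const oldLen = a.length;
    a.length = oldLen + b.length;
    memcpy(a.ptr + oldLen, b.ptr, b.length * T.sizeof);
    return a;
}

// The interpreter takes the `__ctfe` branch and never reaches `memcpy`.
static assert({
    int[] a = [1, 2];
    return appendHookSketch(a, [3, 4]);
}() == [1, 2, 3, 4]);
```

With this shape, CTFE needs no special knowledge of the hook: it interprets the hook body like any other function and simply follows the `__ctfe` branch.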
Jan 17 2022