digitalmars.D.ldc - ROR is now optimized as SHR ?
- user1234 (5/5) Sep 14 2024 I'm confused, not sure if it's a codegen bug but as you can
- kinke (6/10) Sep 14 2024 What can be seen is that optimized core.bitop.ror does use the
- user1234 (15/25) Sep 14 2024 I understand that it's not LDC fault, sorry but that was so weird
- Johan (3/29) Sep 14 2024 Looks like a bad bug.
- David Nadlinger (3/5) Sep 14 2024 Ah, just saw this now; I filed an upstream bug too: https://github.com/l...
I'm confused. I'm not sure if it's a codegen bug, but as you can observe here https://godbolt.org/z/PKn4Tnzff, it seems that since LDC 1.38 a SHR is generated, whereas with 1.37 it was a ROL (and not a ROR either). Have a nice weekend ;)
Sep 14 2024
On Saturday, 14 September 2024 at 11:20:26 UTC, user1234 wrote:
> I'm confused. I'm not sure if it's a codegen bug, but as you can observe here https://godbolt.org/z/PKn4Tnzff, it seems that since LDC 1.38 a SHR is generated, whereas with 1.37 it was a ROL (and not a ROR either).

What can be seen is that optimized core.bitop.ror does use the ror instruction, but the comparison with inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation with movzx+shr instead of LLVM 17's rol.
Sep 14 2024
On Saturday, 14 September 2024 at 11:53:14 UTC, kinke wrote:
> What can be seen is that optimized core.bitop.ror does use the ror instruction, but the comparison with inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation with movzx+shr instead of LLVM 17's rol.

I understand that it's not LDC's fault. Sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate

```asm
mov rax, rdi
rol rdi, 8
cmp rax, rdi
sete al
ret
```

Actually, if you use the LLVM asm to write a naked asm function, you just get wrong results. Thanks for your time.
Sep 14 2024
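For readers wondering why a `rol` can legitimately stand in for a `ror` here: rotating both sides of `x == ror(x, n)` left by `n` gives `rol(x, n) == x`, so the two comparisons agree for every input. A plain right shift discards the low bits instead of wrapping them around, so it is not a drop-in replacement for a full-width value. The sketch below illustrates this in D. The helper names, the `ulong` operand and the rotate count of 8 are assumptions taken from the expected assembly quoted in the thread, not from the Godbolt source, which is not reproduced here.

```d
import core.bitop : rol, ror;

// Hypothetical helper: the comparison in its source form.
bool eqRor(ulong x) { return x == ror(x, 8); }

// Same predicate after rotating both sides left by 8.
bool eqRol(ulong x) { return rol(x, 8) == x; }

// A plain shift drops the low byte instead of wrapping it around,
// so this is a different predicate in general.
bool eqShr(ulong x) { return x == (x >> 8); }

void main()
{
    // Sample values only; this is a spot check, not a proof.
    foreach (x; [0UL, 1UL, 0xFFUL, 0x0101_0101_0101_0101UL,
                 0x1234_5678_9ABC_DEF0UL, ulong.max])
    {
        assert(eqRor(x) == eqRol(x));
    }

    // A value made of identical bytes is unchanged by an 8-bit rotate,
    // but not by an 8-bit shift.
    ulong v = 0x0101_0101_0101_0101;
    assert(eqRor(v) && eqRol(v));
    assert(!eqShr(v));
}
```

Whether the `movzx`+`shr` sequence emitted by LLVM 18 is actually wrong depends on the operand type and surrounding code in the original example, so this only shows that the shift form is not equivalent in the general 64-bit case.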
On Saturday, 14 September 2024 at 15:18:43 UTC, user1234 wrote:
> I understand that it's not LDC's fault. Sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate
>
> ```asm
> mov rax, rdi
> rol rdi, 8
> cmp rax, rdi
> sete al
> ret
> ```
>
> Actually, if you use the LLVM asm to write a naked asm function, you just get wrong results. Thanks for your time.

Looks like a bad bug. https://github.com/ldc-developers/ldc/issues/4753
Sep 14 2024
On 14 Sep 2024, at 23:16, Johan via digitalmars-d-ldc wrote:
> Looks like a bad bug. https://github.com/ldc-developers/ldc/issues/4753

Ah, just saw this now; I filed an upstream bug too: https://github.com/llvm/llvm-project/issues/108722

David
Sep 14 2024