digitalmars.D.ldc - ROR is now optimized as SHR ?

reply user1234 <user1234 12.de> writes:
I'm confused and not sure whether it's a codegen bug, but as you can
observe here https://godbolt.org/z/PKn4Tnzff, it seems that since
LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL
(and not a ROR either).

Have a nice weekend ;)
Sep 14 2024
parent reply kinke <noone nowhere.com> writes:
On Saturday, 14 September 2024 at 11:20:26 UTC, user1234 wrote:
 I'm confused and not sure whether it's a codegen bug, but as you can
 observe here https://godbolt.org/z/PKn4Tnzff, it seems that since
 LDC 1.38 a SHR is generated, whereas 1.37 generated a ROL
 (and not a ROR either).
What can be seen is that optimized core.bitop.ror does use the ror instruction, but the comparison with inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation with movzx+shr instead of LLVM 17's rol.
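On the "ROL and not ROR" part: a right rotation by n is the same as a left rotation by 64 - n, and comparing x with its rotation by 8 gives the same answer in either direction, so an optimizer that emits rol for such a comparison is not suspicious by itself. A minimal Python sketch of that identity, assuming the Godbolt snippet compares a 64-bit value with its rotation by 8 (the D source is not reproduced in this thread). `rol64` and `ror64` are hypothetical helpers that model the instructions; they are not LDC or druntime functions.

```python
import random

MASK64 = (1 << 64) - 1  # model a 64-bit register


def rol64(x: int, n: int) -> int:
    """Rotate the low 64 bits of x left by n bits."""
    x &= MASK64
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def ror64(x: int, n: int) -> int:
    """Rotate the low 64 bits of x right by n bits."""
    x &= MASK64
    n %= 64
    return ((x >> n) | (x << (64 - n))) & MASK64


rng = random.Random(0)
samples = [0, 1, MASK64, 0x4242424242424242, 0x0123456789ABCDEF]
samples += [rng.getrandbits(64) for _ in range(1000)]

for x in samples:
    # Rotating right by 8 equals rotating left by 64 - 8.
    assert ror64(x, 8) == rol64(x, 56)
    # The equality test does not depend on the rotation direction.
    assert (x == ror64(x, 8)) == (x == rol64(x, 8))
```

This only shows that swapping ror for rol is a valid rewrite of the comparison; it says nothing about whether the movzx+shr sequence chosen by LLVM 18 is valid.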
Sep 14 2024
parent reply user1234 <user1234 12.de> writes:
On Saturday, 14 September 2024 at 11:53:14 UTC, kinke wrote:
What can be seen is that optimized core.bitop.ror does use the ror instruction, but the comparison with inlined ror can apparently be transformed by the optimizer, and LLVM 18 (LDC v1.38+) chooses a different transformation with movzx+shr instead of LLVM 17's rol.
I understand that it's not LDC's fault, sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate

```asm
mov rax, rdi
rol rdi, 8
cmp rax, rdi
sete al
ret
```

Actually, if you use the asm emitted by LLVM to write a naked asm function, you just get wrong results. Thanks for your time.
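To spot-check any candidate instruction sequence against the expected one, a small reference model can help. The sketch below assumes the function is meant to return x == rol(x, 8) for a 64-bit x, which is what the five expected instructions compute. `rol64`, `expected_result` and `all_bytes_equal` are hypothetical names introduced for illustration only.

```python
import random

MASK64 = (1 << 64) - 1  # model a 64-bit register


def rol64(x: int, n: int) -> int:
    """Rotate the low 64 bits of x left by n bits."""
    x &= MASK64
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def expected_result(x: int) -> bool:
    """Model of: mov rax, rdi / rol rdi, 8 / cmp rax, rdi / sete al."""
    x &= MASK64
    return x == rol64(x, 8)


def all_bytes_equal(x: int) -> bool:
    """True when x is one byte value repeated eight times."""
    x &= MASK64
    return x == (x & 0xFF) * 0x0101010101010101


# x equals its rotation by one byte exactly when all eight bytes match,
# so both predicates must agree on every input.
rng = random.Random(1)
inputs = [0, 1, MASK64, 0x4242424242424242, 0x0123456789ABCDEF]
inputs += [rng.getrandbits(64) for _ in range(1000)]
inputs += [b * 0x0101010101010101 for b in range(256)]

for x in inputs:
    assert expected_result(x) == all_bytes_equal(x)
```

Feeding the same inputs to a compiled function and comparing against `expected_result` is one way to confirm a miscompilation; inputs whose low byte is repeated in only some positions are the interesting cases.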
Sep 14 2024
parent reply Johan <j j.nl> writes:
On Saturday, 14 September 2024 at 15:18:43 UTC, user1234 wrote:
I understand that it's not LDC's fault, sorry, but that was so weird that I needed to know. The transformation performed by LLVM is just wrong. I guess it's caused by a combination of aggressive optimization and folding. It should just generate

```asm
mov rax, rdi
rol rdi, 8
cmp rax, rdi
sete al
ret
```

Actually, if you use the asm emitted by LLVM to write a naked asm function, you just get wrong results. Thanks for your time.
Looks like a bad bug. https://github.com/ldc-developers/ldc/issues/4753
Sep 14 2024
parent David Nadlinger <code klickverbot.at> writes:
On 14 Sep 2024, at 23:16, Johan via digitalmars-d-ldc wrote:
 Looks like a bad bug.
 https://github.com/ldc-developers/ldc/issues/4753
Ah, just saw this now; I filed an upstream bug too: https://github.com/llvm/llvm-project/issues/108722 —David
Sep 14 2024