digitalmars.D.ldc - Simd instructions
- bearophile (17/17) Jul 18 2013 As a final step to compute the product of two complex numbers I
- jerro (1/1) Jul 18 2013 Try adding flag -mattr=sse3.
- bearophile (5/6) Jul 18 2013 Now it's accepted, thank you. So is LDC2 assuming a very old CPU?
- Kai Nacke (10/16) Jul 22 2013 Hi,
As a final step to compute the product of two complex numbers I perform a simd operation on double2:

    x3 = [x3.array[1] - x2.array[1], x3.array[0] + x2.array[0]];

But ldc2 compiles that quite badly (I don't know who's to blame; if necessary I will open an LLVM bug report), so I have tried to use the addsubpd instruction. To do it I have imported ldc.gccbuiltins_x86 and then I use:

    x3 = __builtin_ia32_addsubpd(x3, x2);

but ldc2 gives me:

    LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse3.addsub.pd
    Stack dump:
    0. Running pass 'X86 DAG->DAG Instruction Selection' on function ' "\01__D12complex_mul217__T8compMul6Vk12Z8compMul6FNaNbNfKG12NhG2dKG12NhG2dKG12NhG2dZv"'

Can you help me?

Bye,
bearophile
Jul 18 2013
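For readers following along, here is a minimal sketch of what the addsubpd step computes, written with plain static arrays so it does not depend on LDC or on SSE3 being enabled. The SSE3 instruction subtracts in the low lane and adds in the high lane; note that it does not swap lanes, unlike the element-wise expression in the post above. The function names `addsub` and `complexMul` and the [re, im] lane layout are assumptions made for this illustration, not taken from the original program.

```d
import std.stdio : writeln;

// Scalar model of SSE3 ADDSUBPD on two double lanes:
// the low lane is subtracted, the high lane is added.
// (Hypothetical helper; with LDC the builtin from the post could take
// its place, operating on double2 instead of double[2].)
double[2] addsub(double[2] a, double[2] b)
{
    return [a[0] - b[0], a[1] + b[1]];
}

// Complex product, assuming each value is stored as [re, im].
double[2] complexMul(double[2] x, double[2] y)
{
    // t1 = x.re * [y.re, y.im]
    double[2] t1 = [x[0] * y[0], x[0] * y[1]];
    // t2 = x.im * [y.im, y.re]
    double[2] t2 = [x[1] * y[1], x[1] * y[0]];
    // [re, im] = [t1[0] - t2[0], t1[1] + t2[1]]
    return addsub(t1, t2);
}

void main()
{
    // (1 + 2i) * (3 + 4i) = -5 + 10i
    double[2] p = complexMul([1.0, 2.0], [3.0, 4.0]);
    assert(p == [-5.0, 10.0]);
    writeln(p);
}
```

The scalar version is only meant to pin down the lane semantics; whether a given compiler turns it into a single addsubpd depends on the target features and optimization settings.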
jerro:
> Try adding flag -mattr=sse3.

Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)

Bye,
bearophile
Jul 18 2013
On Thursday, 18 July 2013 at 21:30:12 UTC, bearophile wrote:
> jerro:
>> Try adding flag -mattr=sse3.
>
> Now it's accepted, thank you. So is LDC2 assuming a very old CPU? :-)
>
> Bye,
> bearophile

Hi,

the behaviour was changed because you can't create a generic package if you optimize for your CPU (https://github.com/ldc-developers/ldc/issues/414).

With LLVM 3.3, the auto vectorizer is not enabled by default. You have to specify -vectorize on the command line. Maybe you want to try that with your original code.

Kai
Jul 22 2013