digitalmars.D.learn - __simd_sto confusion

Nachtraaf <nachtraaf80 gmail.com> writes:
I'm trying to create some linear algebra functions using SIMD 
intrinsics. I watched the DConf 2013 presentation by Manu Evans, 
but I'm still confused about some aspects, and the following piece 
of code doesn't compile. I'm trying to copy the result of a dot 
product from the register to memory, but dmd fails with an 
overload resolution error, which I guess is due to some implicit 
conversion?

dmd error:

simd1.d(34): Error: core.simd.__simd_sto called with argument 
types (XMM, float, __vector(float[4])) matches both:
/usr/include/dlang/dmd/core/simd.d(434):     
core.simd.__simd_sto(XMM opcode, double op1, __vector(void[16]) 
op2)
and:
/usr/include/dlang/dmd/core/simd.d(435):     
core.simd.__simd_sto(XMM opcode, float op1, __vector(void[16]) 
op2)

from the following piece of code:

float dot_simd1(float4  a, float4 b)
{
     float4 result = __simd(XMM.DPPS, a, b, 0xFF);
     float value;
     __simd_sto(XMM.STOSS, value, result);
     return value;
}

What am I doing wrong here?
Oct 03 2015
Marco Leise <Marco.Leise gmx.de> writes:
This is a bug in overload resolution when __vector(void[16])
is involved. You can work around it by changing float4 to void16,
only to run into an internal compiler error:
  backend/gother.c 988
So file a bug for both at issues.dlang.org.
Also, it looks like DMD wants you to use the return value of
the intrinsic. Is that expected?

-- 
Marco
Oct 03 2015
Nachtraaf <nachtraaf80 gmail.com> writes:
I assumed I wouldn't need the return value, as the Intel C
intrinsic for this opcode has a void return type. I did try
supplying a return type, but I couldn't get around the overload
error, so I had no clue whether it would make any difference.

I changed the type of result to void16 like this:

float dot_simd1(float4  a, float4 b)
{
     void16 result = __simd(XMM.DPPS, a, b, 0xFF);
     float value;
     __simd_sto(XMM.STOSS, value, result);
     return value;
}

and for me this code compiles and runs without any errors now.
I'm using DMD64 D Compiler v2.068 on Linux. If you got an
internal compiler error, that means it's a compiler bug,
though I have no clue which one. Did you try the same thing I
did, or did you cast the variable?

I guess for now I should file a bug report for the overload
resolution issue, if it's not a duplicate?
Oct 03 2015
Marco Leise <Marco.Leise gmx.de> writes:
Yes. At some point the intrinsics will need a more thorough
rework. Currently none of those that return void, int or set
flags work as they should.

-- 
Marco
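
Since the void-returning store intrinsics are the problematic ones, one way to sidestep __simd_sto is to use only the value-returning __simd form and read the first lane of the result. What follows is a minimal sketch, not a confirmed fix: it assumes DMD on x86_64, a CPU with SSE4.1 (which DPPS requires), and that the .array property is usable on the result. The name dot_simd2 is hypothetical and does not come from the thread.

```d
import core.simd;

// Hypothetical variant of dot_simd1 that avoids __simd_sto entirely.
float dot_simd2(float4 a, float4 b)
{
    // With mask 0xFF, DPPS multiplies all four lanes and writes the
    // sum into every lane of the result.
    float4 result = cast(float4) __simd(XMM.DPPS, a, b, 0xFF);
    // Read lane 0 directly instead of storing it with __simd_sto.
    return result.array[0];
}

void main()
{
    float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 b = [5.0f, 6.0f, 7.0f, 8.0f];
    assert(dot_simd2(a, b) == 70.0f);
}
```

Whether the compiler turns the lane read into a single scalar move is not guaranteed; the point is only that no store intrinsic is involved.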
Oct 03 2015
Benjamin Thaut <code benjamin-thaut.de> writes:
core.simd is horribly broken. I recommend that you avoid it for
any serious work. If you want to do SIMD programming with D, get
LDC or GDC and use their SIMD intrinsics instead of core.simd. If
you have to do SIMD with dmd, write inline assembly.
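
For a case as simple as a dot product, the intrinsic calls can sometimes be avoided altogether by using the vector types with ordinary operators. This is a minimal sketch, assuming an x86_64 target where core.simd defines float4 and a compiler that accepts element-wise multiplication and the .array property on vector types. The name dotVec is hypothetical.

```d
import core.simd;

// Hypothetical helper: dot product using only vector operators.
float dotVec(float4 a, float4 b)
{
    // Element-wise multiply; no __simd or __simd_sto call is needed.
    float4 p = a * b;
    // Horizontal sum done in scalar code via the .array property.
    return p.array[0] + p.array[1] + p.array[2] + p.array[3];
}

void main()
{
    float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 b = [5.0f, 6.0f, 7.0f, 8.0f];
    assert(dotVec(a, b) == 70.0f);
}
```

The horizontal sum here is scalar, so this is unlikely to match a hand-tuned DPPS version; it trades some speed for not depending on compiler-specific intrinsics.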
Oct 04 2015
Nachtraaf <nachtraaf80 gmail.com> writes:
That's a shame. I've read that each compiler has its own quirks 
and does not support everything dmd supports. I do want to keep the 
code as portable as possible. Guess I'll try using inline 
assembler and runtime checks for the right CPU architecture.

Thanks for the help people.
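
The runtime check mentioned above could be sketched roughly as follows, assuming druntime's core.cpuid module and its sse41 property (DPPS is an SSE4.1 instruction). The names dotScalar, dotSse41 and selectDot are hypothetical, and dotSse41 is only a placeholder where an inline-assembly version would go.

```d
import core.cpuid : sse41;   // true when the CPU reports SSE4.1

// Plain D fallback that works on any target.
float dotScalar(const float[4] a, const float[4] b)
{
    float sum = 0;
    foreach (i; 0 .. 4)
        sum += a[i] * b[i];
    return sum;
}

// Placeholder for a hand-written DPPS version (inline assembly or
// compiler-specific intrinsics); here it just forwards to the fallback.
float dotSse41(const float[4] a, const float[4] b)
{
    return dotScalar(a, b);
}

// Pick an implementation once, based on what the CPU reports at run time.
float function(const float[4], const float[4]) selectDot()
{
    return sse41 ? &dotSse41 : &dotScalar;
}

void main()
{
    auto dot = selectDot();
    const float[4] a = [1, 2, 3, 4];
    const float[4] b = [5, 6, 7, 8];
    assert(dot(a, b) == 70);
}
```

Selecting the function pointer once keeps the CPU check out of the hot path, and the scalar fallback keeps the code usable on targets without SSE4.1.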
Oct 04 2015