digitalmars.D - Auto-Vectorization and array/vector operations
- Steven (33/33) Jul 15 2015 I was trying to show someone how awesome Dlang was earlier, and
- jmh530 (6/12) Jul 15 2015 I'm not sure how the compilers handle auto-vectorization, but I
- John Colvin (5/13) Jul 16 2015 Not sure why DMD isn't using SIMD on the first one, haven't
- Iain Buclaw via Digitalmars-d (11/25) Jul 16 2015 DMD makes leverage of vector operations in the library, rather than in
- Casper Færgemand <shorttail hotmail.com> (15/20) Jul 18 2015 If you want to use vector operations, YOU have to write the code
I was trying to show someone how awesome Dlang was earlier, and how the vector operations are expected to take advantage of the CPU vector instructions, and was dumbstruck when dmd and gdc both failed to auto-vectorize a simple case. I've stripped it down to the bare minimum and loaded the example on the interactive compiler:

The reference documentation for arrays says:

    Implementation note: many of the more common vector operations are expected to take advantage of any vector math instructions available on the target computer.

Does this mean that while compilers are expected to take advantage of them, they currently do not, even when they have proper alignment? I haven't tried LDC yet, so maybe LDC does perform auto-vectorization and I should attempt to use LDC if I plan on using vector ops a lot?

    import core.simd;

    float[256] exampleA(float[256] a, float[256] b)
    {
        float[256] c;
        // results in subss (scalar instruction)
        c[] = a[] - b[];
        return c;
    }

    float[256] exampleB(float[256] a, float[256] b)
    {
        float8[32] va = cast(float8[32])a;
        float8[32] vb = cast(float8[32])b;
        float8[32] vc;
        // results in subps (vector instruction)
        vc[] = va[] - vb[];
        return cast(float[256])vc;
    }
Jul 15 2015
On Wednesday, 15 July 2015 at 22:42:05 UTC, Steven wrote:
> I was trying to show someone how awesome Dlang was earlier, and how the vector operations are expected to take advantage of the CPU vector instructions, and was dumbstruck when dmd and gdc both failed to auto-vectorize a simple case. I've stripped it down to the bare minimum and loaded the example on the interactive compiler:

I'm not sure how the compilers handle auto-vectorization, but I found http://dconf.org/2013/talks/evans_2.html informative. It recommends not casting between float and SIMD types.
Jul 15 2015
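The recommendation above (avoid casting between float arrays and SIMD types) can be illustrated with a minimal sketch in which the data is stored as `float4` from the start, so no cast is needed at the point of use. This assumes a target where `core.simd` defines `float4` (for example x86_64); the helper name `subtract` and the buffer sizes are hypothetical and are not taken from the linked talk.

```d
import core.simd;

// Hypothetical helper: subtracts two buffers that already hold float4 values.
// No cast between float[] and float4[] is involved.
void subtract(float4[] a, float4[] b, float4[] result)
{
    assert(a.length == b.length && a.length == result.length);
    foreach (i; 0 .. a.length)
        result[i] = a[i] - b[i]; // one 4-wide subtraction per element
}

void main()
{
    float4 three = [3.0f, 3.0f, 3.0f, 3.0f];
    float4 one = [1.0f, 1.0f, 1.0f, 1.0f];

    float4[64] a, b, c; // 64 * 4 = 256 floats, as in the original example
    foreach (ref v; a) v = three;
    foreach (ref v; b) v = one;

    subtract(a[], b[], c[]);

    assert(c[0].array == [2.0f, 2.0f, 2.0f, 2.0f]);
    assert(c[63].array == [2.0f, 2.0f, 2.0f, 2.0f]);
}
```

Which instructions are actually emitted still depends on the compiler and its flags; the sketch only shows the data layout choice.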
On Wednesday, 15 July 2015 at 22:42:05 UTC, Steven wrote:
> I was trying to show someone how awesome Dlang was earlier, and how the vector operations are expected to take advantage of the CPU vector instructions, and was dumbstruck when dmd and gdc both failed to auto-vectorize a simple case. I've stripped it down to the bare minimum and loaded the example on the interactive compiler: [...]

Not sure why DMD isn't using SIMD on the first one; I haven't looked at that code in a while. Anyway, gdc vectorises both: http://goo.gl/CzD15s and that's with the gcc 4.9 backend. It can probably do better if built against something more recent.
Jul 16 2015
On 16 July 2015 at 00:42, Steven via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> I was trying to show someone how awesome Dlang was earlier, and how the vector operations are expected to take advantage of the CPU vector instructions, and was dumbstruck when dmd and gdc both failed to auto-vectorize a simple case. I've stripped it down to the bare minimum and loaded the example on the interactive compiler:
>
> The reference documentation for arrays says:
>
> Implementation note: many of the more common vector operations are expected to take advantage of any vector math instructions available on the target computer.

DMD leverages vector operations in the library, rather than in the generated code. So as long as you are doing array operations using any of the supported types...

> Does this mean that while compilers are expected to take advantage of them, they currently do not, even when they have proper alignment? I haven't tried LDC yet, so maybe LDC does perform auto-vectorization and I should attempt to use LDC if I plan on using vector ops a lot?

Auto-vectorization is deliberately strict about what triggers it. It is possible to give the compiler hints; however, I'm not sure that this should be done by the code generator. See, for example: http://goo.gl/iMBbRs

Regards
Iain
Jul 16 2015
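For readers unfamiliar with the array operations discussed above, here is a minimal sketch of the slice syntax on plain `float` arrays. The function name `subtractInto` is hypothetical. Whether a given compiler, version, or flag set turns this into SIMD instructions is exactly what the thread is debating, so the sketch makes no claim about the generated code; it only shows the language feature.

```d
// Hypothetical helper: element-wise subtraction using a D array operation.
void subtractInto(float[] dst, const(float)[] a, const(float)[] b)
{
    assert(dst.length == a.length && a.length == b.length);
    dst[] = a[] - b[]; // one array operation instead of an explicit loop
}

void main()
{
    float[8] a = [1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f];
    float[8] b = [8.0f, 7.0f, 6.0f, 5.0f, 4.0f, 3.0f, 2.0f, 1.0f];
    float[8] c;

    subtractInto(c[], a[], b[]);

    assert(c == [-7.0f, -5.0f, -3.0f, -1.0f, 1.0f, 3.0f, 5.0f, 7.0f]);
}
```

Comparing the assembly for this function across dmd, gdc, and ldc (with optimizations enabled) is one way to check what each compiler does with it.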
On Wednesday, 15 July 2015 at 22:42:05 UTC, Steven wrote:
> Does this mean that while compilers are expected to take advantage of them, they currently do not, even when they have proper alignment? I haven't tried LDC yet, so maybe LDC does perform auto-vectorization and I should attempt to use LDC if I plan on using vector ops a lot?

If you want to use vector operations, YOU have to write the code for it. Addition and multiplication seem like easy things to have vectorized automatically, but it's complicated to do (I don't know of any compiler that does a convincing and reliable job of auto-vectorization), and it likely won't give you many of the other useful vector operations.

Someone from a game engine company (Unreal?) gave a nice talk about how SIMD pervades their code base, including memory layout, in order to get decent performance on the less trivial operations, like dot product. I don't have a link though. =/

The main reason to write the SIMD code yourself is that you know it's going to work the way you want. There won't ever be a case where you add one more variable to some structure and the compiler decides it can no longer auto-vectorize a loop.
Jul 18 2015
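The point about memory layout and dot products can be sketched as follows. This is not the code from the talk mentioned above (no link was given); it is a hypothetical structure-of-arrays layout in which four 3D vectors are packed by component, so four dot products need only vertical multiplies and adds. The names `Vec3x4` and `dot` are assumptions for illustration, and the sketch assumes a target where `core.simd` defines `float4`.

```d
import core.simd;

// Hypothetical layout: four 3D vectors stored component by component.
struct Vec3x4
{
    float4 x; // x components of vectors 0..3
    float4 y; // y components of vectors 0..3
    float4 z; // z components of vectors 0..3
}

// Computes four dot products at once; lane i holds dot(a[i], b[i]).
float4 dot(Vec3x4 a, Vec3x4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

void main()
{
    // Vectors (1,0,0), (0,1,0), (0,0,1), (1,1,1), packed by component.
    float4 ax = [1.0f, 0.0f, 0.0f, 1.0f];
    float4 ay = [0.0f, 1.0f, 0.0f, 1.0f];
    float4 az = [0.0f, 0.0f, 1.0f, 1.0f];

    // Four copies of (2,3,4).
    float4 bx = [2.0f, 2.0f, 2.0f, 2.0f];
    float4 by = [3.0f, 3.0f, 3.0f, 3.0f];
    float4 bz = [4.0f, 4.0f, 4.0f, 4.0f];

    float4 d = dot(Vec3x4(ax, ay, az), Vec3x4(bx, by, bz));

    assert(d.array == [2.0f, 3.0f, 4.0f, 9.0f]);
}
```

With the more common array-of-structures layout (one `x, y, z` triple per vector), the same computation would need a horizontal sum inside each register, which is part of why the layout matters.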