digitalmars.D - SIMD-specialized overloads of Phobos algorithms

Per Nordlöw <per.nordlow gmail.com> writes:
According to

http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html

SIMD-tuning a Phobos function, such as,

     std.algorithm.searching.minIndex

for an `int[]`-haystack on AVX512f leads to a speedup of 15x.

Should such specializations be added to Phobos or is such an 
optimization only possible for LDC or GCC but not for DMD?

Further, when will compilers such as LDC and GDC be able to 
perform these vectorizations automatically? Will the GCC and Clang 
compiler setting `-march=native` also play a role for LDC in 
the future? Currently (in LDC 1.16.0) the setting `-march=native` 
is not allowed.
Jun 28
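
For reference, the scalar baseline that such a SIMD specialization would have to beat can be sketched as follows. This is only an illustration: `minIndexScalar` is a hypothetical helper, not a Phobos function and not the code from the linked article, and it assumes the same conventions as `std.algorithm.searching.minIndex` (first minimum wins, -1 for an empty range).

```d
import std.algorithm.searching : minIndex;

/// Hypothetical helper (not part of Phobos): index of the first minimum
/// element of `haystack`, or -1 if `haystack` is empty.
ptrdiff_t minIndexScalar(const(int)[] haystack) pure nothrow @nogc @safe
{
    if (haystack.length == 0)
        return -1;

    size_t best = 0;
    foreach (i; 1 .. haystack.length)
    {
        // Strict `<` keeps the first occurrence of the minimum.
        if (haystack[i] < haystack[best])
            best = i;
    }
    return cast(ptrdiff_t) best;
}

void main()
{
    int[] a = [7, 3, 9, 3, 8];
    assert(minIndexScalar(a) == 1);
    // Agrees with the Phobos implementation on the same input.
    assert(minIndexScalar(a) == a.minIndex);

    const(int)[] empty;
    assert(minIndexScalar(empty) == -1);
}
```

The data-dependent update of `best` inside the loop is the part that a hand-written SIMD version would need to restructure.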
kinke <noone nowhere.com> writes:
On Friday, 28 June 2019 at 15:49:34 UTC, Per Nordlöw wrote:
 Will the GCC and Clang compiler setting of `-march=native` play 
 a role also for LDC in the future. Currently (in LDC 1.16.0) 
 the setting `-march=native` is not allowed.
It's `-mcpu=native`.
Jun 28
Per Nordlöw <per.nordlow gmail.com> writes:
On Friday, 28 June 2019 at 16:30:31 UTC, kinke wrote:
 It's `-mcpu=native`.
Can/Could the value of this switch be detected at compile-time?
Jun 28
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Friday, 28 June 2019 at 19:53:10 UTC, Per Nordlöw wrote:
 On Friday, 28 June 2019 at 16:30:31 UTC, kinke wrote:
 It's `-mcpu=native`.
Can/Could the value of this switch be detected at compile-time?
Well, from the point of view of the code you usually don't really care, because it's just like passing a higher value of n to `-On`. It's probably already implemented as an LDC-specific __trait, but if it isn't, it shouldn't be too hard to add to __traits(getTargetInfo). (The optimiser values that -mcpu=native sets will probably be made available through getTargetInfo at some point anyway.)
Jun 28
Johan Engelen <j j.nl> writes:
On Friday, 28 June 2019 at 22:53:45 UTC, Nicholas Wilson wrote:
 On Friday, 28 June 2019 at 19:53:10 UTC, Per Nordlöw wrote:
 On Friday, 28 June 2019 at 16:30:31 UTC, kinke wrote:
 It's `-mcpu=native`.
Can/Could the value of this switch be detected at compile-time?
It's probably already implemented as an LDC specific __trait,
Indeed:
https://wiki.dlang.org/LDC-specific_language_changes#targetCPU

-Johan
Jun 30
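
A minimal sketch of how the LDC-specific traits mentioned in this thread might be queried at compile time. The names `targetCpu` and `hasAvx2` and the feature string "avx2" are choices made for this example, and the fallback values for non-LDC compilers are an assumption of the sketch, not something any compiler reports.

```d
import std.stdio : writeln;

version (LDC)
{
    // LDC-specific traits, evaluated at compile time.
    enum string targetCpu = __traits(targetCPU);
    enum bool hasAvx2 = __traits(targetHasFeature, "avx2");
}
else
{
    // Assumption for this sketch: without LDC there is nothing to query,
    // so report placeholder values.
    enum string targetCpu = "unknown";
    enum bool hasAvx2 = false;
}

void main()
{
    // With LDC, building with -mcpu=native makes these reflect the build host.
    writeln("target CPU:   ", targetCpu);
    writeln("AVX2 enabled: ", hasAvx2);
}
```

Because both values are `enum` constants, they can be used in `static if` to select code paths without any runtime cost.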
Per Nordlöw <per.nordlow gmail.com> writes:
On Sunday, 30 June 2019 at 15:40:07 UTC, Johan Engelen wrote:
 Indeed:
 https://wiki.dlang.org/LDC-specific_language_changes#targetCPU

 -Johan
Cool. Even better,

https://wiki.dlang.org/LDC-specific_language_changes#targetHasFeature

is _exactly_ what I want! :)

Still, the question remains to be answered: where should these CPU-specialized overloads of Phobos algorithms be placed?
Jul 01
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/1/19 11:52 AM, Per Nordlöw wrote:
 On Sunday, 30 June 2019 at 15:40:07 UTC, Johan Engelen wrote:
 Indeed:
 https://wiki.dlang.org/LDC-specific_language_changes#targetCPU

 -Johan
Cool. Even better, https://wiki.dlang.org/LDC-specific_language_changes#targetHasFeature is _exactly_ what I want! :) Still, the question remains to be answered:     Where should these CPU-specialized overloads of Phobos algorithms be placed?
There are several schools of thought. A simple way to ease into it is to place specializations with the algorithms. That's transparent to coders and backward compatible. The decision to create visible, user-selectable versions can be thus postponed.
Jul 04
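
One way to read the suggestion of placing specializations with the algorithms is a single entry point that picks its implementation with `static if`, so callers never see the difference. The sketch below is hedged accordingly: `minIndexOf` and `hasAvx2` are hypothetical names, and the "fast" branch is a plain two-pass loop standing in for a real SIMD kernel, not the AVX-512 code from the linked article.

```d
version (LDC)
{
    enum bool hasAvx2 = __traits(targetHasFeature, "avx2");
}
else
{
    // Assumption for this sketch: no feature query outside LDC.
    enum bool hasAvx2 = false;
}

/// Hypothetical wrapper, not a Phobos API: index of the first minimum
/// element, or -1 for an empty slice.
ptrdiff_t minIndexOf(const(int)[] haystack) pure nothrow @nogc @safe
{
    if (haystack.length == 0)
        return -1;

    static if (hasAvx2)
    {
        // Stand-in for a SIMD kernel: a branch-free min reduction, which an
        // optimizer may be able to vectorize, then a search for the first match.
        int lowest = haystack[0];
        foreach (x; haystack[1 .. $])
            lowest = x < lowest ? x : lowest;
        foreach (i, x; haystack)
        {
            if (x == lowest)
                return cast(ptrdiff_t) i;
        }
        assert(0, "unreachable: the minimum is always present");
    }
    else
    {
        // Generic scalar path, used when the feature is not enabled.
        size_t best = 0;
        foreach (i; 1 .. haystack.length)
        {
            if (haystack[i] < haystack[best])
                best = i;
        }
        return cast(ptrdiff_t) best;
    }
}

void main()
{
    // Both branches must return the same answers.
    assert(minIndexOf([4, 1, 6, 1]) == 1);
    assert(minIndexOf([9]) == 0);
}
```

Both branches share one signature and one result convention, which is what keeps the change transparent and backward compatible for existing callers.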
a11e99z <black80 bk.ru> writes:
On Friday, 28 June 2019 at 19:53:10 UTC, Per Nordlöw wrote:
 On Friday, 28 June 2019 at 16:30:31 UTC, kinke wrote:
 It's `-mcpu=native`.
Can/Could the value of this switch be detected at compile-time?
Dynamic compilation may be useful too: https://forum.dlang.org/post/bskpxhrqyfkvaqzoospx forum.dlang.org
Jun 30
9il <ilyayaroshenko gmail.com> writes:
On Friday, 28 June 2019 at 15:49:34 UTC, Per Nordlöw wrote:
 According to

 http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html

 SIMD-tuning a Phobos function, such as,

     std.algorithm.searching.minIndex

 for an `int[]`-haystack on AVX512f leads to a speedup of 15x.

 Should such specializations be added to Phobos or is such an 
 optimization only possible for LDC or GCC but not for DMD?
Specializations are welcome in mir-algorithm: http://mir-algorithm.libmir.org/mir_algorithm_iteration.html
Jul 01