digitalmars.D.announce - ndBenchmarks #1: ndslice.algorithm vs std.numeric vs std.algorithm

reply Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
Hi all,

Here are the first two benchmarks [1] for the upcoming 
ndslice.algorithm [2].
A recent LDC alpha based on LLVM 3.8 and a recent Mir 
(v0.16.0-alpha3) are required. The fastmath syntax may change 
slightly and will be simplified in any case.

Dot Product:

        ndReduce vectorized = 3 ms, 314 μs
                   ndReduce = 14 ms, 767 μs
numeric.dotProduct, arrays = 7 ms, 260 μs
numeric.dotProduct, slices = 14 ms, 782 μs
               zip & reduce = 74 ms, 280 μs
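
For readers unfamiliar with the Phobos baselines, here is a minimal sketch of what the "numeric.dotProduct" and "zip & reduce" rows might compute. The actual benchmark code lives in [1] and may differ; the helper name dotViaZip and the input values are made up for illustration.

```d
import std.algorithm.iteration : reduce;
import std.numeric : dotProduct;
import std.range : zip;
import std.stdio : writeln;

// Hypothetical helper (not from the benchmark): "zip & reduce" dot product.
double dotViaZip(const(double)[] a, const(double)[] b)
{
    // Pair the elements up, then fold the products into a running sum.
    return reduce!((acc, pair) => acc + pair[0] * pair[1])(0.0, zip(a, b));
}

void main()
{
    // Illustrative inputs only; the benchmark uses much larger data.
    double[] a = [1.0, 2.0, 3.0];
    double[] b = [4.0, 5.0, 6.0];

    // std.numeric baseline on plain arrays.
    immutable viaNumeric = dotProduct(a, b);
    immutable viaZip = dotViaZip(a, b);

    // Both compute 1*4 + 2*5 + 3*6.
    assert(viaNumeric == 32.0 && viaZip == 32.0);
    writeln(viaNumeric, " ", viaZip);
}
```

Both versions give the same result here; the timings above differ because of how well each form optimizes, not because of what is computed.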

Euclidean Distance:

                 ndReduce vectorized = 3 ms, 668 μs
                            ndReduce = 14 ms, 595 μs
   numeric.euclideanDistance, arrays = 14 ms, 463 μs
   numeric.euclideanDistance, slices = 14 ms, 465 μs
                        zip & reduce = 73 ms, 678 μs
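
The Euclidean distance baselines can be sketched the same way. This is an illustration under assumptions, not the benchmark source from [1]: distViaZip is a hypothetical helper name and the inputs are chosen so the answer is easy to check by hand.

```d
import std.algorithm.iteration : reduce;
import std.math : abs, sqrt;
import std.numeric : euclideanDistance;
import std.range : zip;
import std.stdio : writeln;

// Hypothetical helper (not from the benchmark): "zip & reduce" distance.
double distViaZip(const(double)[] a, const(double)[] b)
{
    // Accumulate squared differences, then take the square root once.
    immutable sumSq = reduce!((acc, pair) {
        immutable d = pair[0] - pair[1];
        return acc + d * d;
    })(0.0, zip(a, b));
    return sqrt(sumSq);
}

void main()
{
    // A 3-4-5 triangle: the distance between these points is 5.
    double[] a = [0.0, 3.0];
    double[] b = [4.0, 0.0];

    immutable viaNumeric = euclideanDistance(a, b);
    immutable viaZip = distViaZip(a, b);

    assert(abs(viaNumeric - 5.0) < 1e-12);
    assert(abs(viaZip - 5.0) < 1e-12);
    writeln(viaNumeric, " ", viaZip);
}
```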

[1] https://github.com/libmir/mir/tree/master/benchmarks/ndslice
[2] https://github.com/dlang/phobos/pull/4652

Best regards,
Ilya
Aug 03 2016
next sibling parent Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko 
wrote:

Update:

Dot Product:
               zip & reduce = 44 ms, 57 μs   (was 74 ms, 280 μs)

Euclidean Distance:
               zip & reduce = 44 ms, 646 μs  (was 73 ms, 678 μs)
Aug 03 2016
prev sibling next sibling parent Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
The tests above are for double precision floating point numbers. 
The results for single precision are below.

Dot Product (single precision):

        ndReduce vectorized = 2 ms, 200 μs
                   ndReduce = 14 ms, 543 μs
numeric.dotProduct, arrays = 7 ms, 208 μs
numeric.dotProduct, slices = 14 ms, 414 μs
               zip & reduce = 43 ms, 657 μs

Euclidean Distance (single precision):

                 ndReduce vectorized = 2 ms, 226 μs
                            ndReduce = 14 ms, 661 μs
   numeric.euclideanDistance, arrays = 14 ms, 597 μs
   numeric.euclideanDistance, slices = 14 ms, 581 μs
                        zip & reduce = 46 ms, 759 μs
Aug 03 2016
prev sibling next sibling parent Seb <seb wilzba.ch> writes:
On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko 
wrote:
 Hi all,

 There are two first [1] benchmarks for upcoming 
 ndslice.algorithm [2].
 Recent LDC alpha based on LLVM 3.8 and recent Mir 
 v0.16.0-alpha3 are required.  fastmath syntax may be changed a 
 little bit and will be simplified anyway.

 [...]
Ilya: The results are awesome! Let's make some noise: https://www.reddit.com/r/programming/comments/4w16i5/ndslicealgorithm_speed_up_your_matrix/
Aug 03 2016
prev sibling next sibling parent reply Johan Engelen <j j.nl> writes:
On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko 
wrote:
 Hi all,

 There are two first [1] benchmarks for upcoming 
 ndslice.algorithm [2].
 Recent LDC alpha based on LLVM 3.8 and recent Mir 
 v0.16.0-alpha3 are required.  fastmath syntax may be changed a 
 little bit and will be simplified anyway.

 Dot Product:

        ndReduce vectorized = 3 ms, 314 μs
                   ndReduce = 14 ms, 767 μs
**That's** the difference with or without fastmath?? (awesome work of course!) -Johan
Aug 03 2016
parent Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Wednesday, 3 August 2016 at 22:22:19 UTC, Johan Engelen wrote:
 On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko
 Dot Product:

        ndReduce vectorized = 3 ms, 314 μs
                   ndReduce = 14 ms, 767 μs
**That's** the difference with or without fastmath??
The first one is with fastmath and an additional execution branch for iteration when the stride is equal to 1.
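
To illustrate the idea of a separate branch for unit stride, here is a rough sketch. It is not Mir's implementation: stridedDot and its parameters are hypothetical, and the fastmath attribute is left out so the code compiles with any D compiler. The point is only that a dedicated contiguous loop gives the optimizer a simple pattern to vectorize, and vectorizing a floating point sum generally also needs the reordering freedom that fastmath grants.

```d
import std.stdio : writeln;

// Hypothetical helper, not Mir's API: dot product over two strided 1-D views.
// `length` is the number of logical elements; strides are in elements.
double stridedDot(const(double)[] a, size_t strideA,
                  const(double)[] b, size_t strideB, size_t length)
{
    double sum = 0;
    if (strideA == 1 && strideB == 1)
    {
        // Contiguous fast path: a plain unit-stride loop.
        foreach (i; 0 .. length)
            sum += a[i] * b[i];
    }
    else
    {
        // General path: each access is scaled by the stride.
        foreach (i; 0 .. length)
            sum += a[i * strideA] * b[i * strideB];
    }
    return sum;
}

void main()
{
    double[] contiguous = [1.0, 2.0, 3.0];
    double[] interleaved = [1.0, 9.0, 2.0, 9.0, 3.0, 9.0]; // every 2nd element
    double[] b = [4.0, 5.0, 6.0];

    // Both calls see the logical vector [1, 2, 3] and give the same result.
    assert(stridedDot(contiguous, 1, b, 1, 3) == 32.0);
    assert(stridedDot(interleaved, 2, b, 1, 3) == 32.0);
    writeln("ok");
}
```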
Aug 03 2016
prev sibling next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko 
wrote:
 Hi all,

 There are two first [1] benchmarks for upcoming 
 ndslice.algorithm [2].
 Recent LDC alpha based on LLVM 3.8 and recent Mir 
 v0.16.0-alpha3 are required.  fastmath syntax may be changed a 
 little bit and will be simplified anyway.
Keep up the good work!
Aug 03 2016
prev sibling parent Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Wednesday, 3 August 2016 at 20:53:59 UTC, Ilya Yaroshenko 
wrote:
 Hi all,

 There are two first [1] benchmarks for upcoming 
 ndslice.algorithm [2].
 Recent LDC alpha based on LLVM 3.8 and recent Mir 
 v0.16.0-alpha3 are required.  fastmath syntax may be changed a 
 little bit and will be simplified anyway.

 [...]
The PR and Mir v0.16.0-alpha7 have half and triangular selections. They are very helpful for working with matrices.
Aug 09 2016