digitalmars.D.learn - how to benchmark pure functions?

ab (13/13) Oct 27 2022 Hi,

Imperatorn (4/17) Oct 27 2022 Sorry, I don't understand what you're saying.

H. S. Teoh (22/36) Oct 27 2022 [...]

Dennis (32/34) Oct 27 2022 With many C compilers, you can use volatile assembly blocks for

max haughton (5/39) Oct 29 2022 I recommend a volatile data dependency rather than injecting

ab (9/22) Oct 28 2022 Thanks to H.S. Teoh and Dennis for the suggestions, they both

Imperatorn (2/12) Oct 28 2022 Yeah I didn't read carefully enough sorry 🌷
Siarhei Siamashka (10/13) Oct 28 2022 I used the volatileLoad/volatileStore functions to ensure that

ab <not_a_real_address nowhere.ab> writes:

Hi,

when trying to compare different implementations of the optimized 
builds of a pure function using benchmark from 
std.datetime.stopwatch, I get times equal to zero, I suppose 
because the functions are not executed as they do not have side 
effects.

The same happens with the example from the documentation:
https://dlang.org/library/std/datetime/stopwatch/benchmark.html

How can I prevent the compiler from removing the code I want to 
measure? Is there some utility in the standard library or pragma 
that I should use?

Thanks

AB

Oct 27 2022

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 Hi,

 when trying to compare different implementations of the 
 optimized builds of a pure function using benchmark from 
 std.datetime.stopwatch, I get times equal to zero, I suppose 
 because the functions are not executed as they do not have side 
 effects.

 The same happens with the example from the documentation:
 https://dlang.org/library/std/datetime/stopwatch/benchmark.html

 How can I prevent the compiler from removing the code I want to 
 measure? Is there some utility in the standard library or 
 pragma that I should use?

 Thanks

 AB

Sorry, I don't understand what you're saying.

The examples work for me. Can you provide an exact code example 
which does not work as expected for you?

Oct 27 2022

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Thu, Oct 27, 2022 at 06:20:10PM +0000, Imperatorn via Digitalmars-d-learn
wrote:
 On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 Hi,
 
 when trying to compare different implementations of the optimized
 builds of a pure function using benchmark from
 std.datetime.stopwatch, I get times equal to zero, I suppose because
 the functions are not executed as they do not have side effects.
 
 The same happens with the example from the documentation:
 https://dlang.org/library/std/datetime/stopwatch/benchmark.html
 
 How can I prevent the compiler from removing the code I want to
 measure?  Is there some utility in the standard library or pragma
 that I should use?


[...]

To prevent the optimizer from eliding the function completely, you need
to do something with the return value.  Usually, this means you combine
the return value into some accumulating variable, e.g., if it's an int
function, have a running int accumulator that you add to:

	int funcToBeMeasured(...) pure { ... }

	int accum;
	auto results = benchmark!({
		// Don't just call funcToBeMeasured and ignore the value
		// here, otherwise the optimizer may delete the call
		// completely.
		accum += funcToBeMeasured(...);
	})(1_000); // benchmark takes the number of calls; 1_000 is arbitrary

Then at the end of the benchmark, do something with the accumulated
value, like print out its value to stdout, so that the optimizer doesn't
notice that the value is unused, and decide to kill all previous
assignments to it. Something like `writeln(accum);` at the end should do
the trick.
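
A minimal, self-contained sketch of the accumulator approach described above. The function `sumOfSquares`, its argument, and the call count are hypothetical stand-ins, not code from the thread. Feeding the low bit of `accum` back into the argument is an extra precaution assumed here, so that the optimizer cannot treat the pure call as loop-invariant and compute it only once.

```D
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Hypothetical pure function standing in for the code under test.
int sumOfSquares(int n) pure nothrow @nogc @safe
{
    int total = 0;
    foreach (i; 0 .. n)
        total += i * i;
    return total;
}

void main()
{
    int accum; // running accumulator that consumes every result

    auto results = benchmark!({
        // The argument depends on the previous result, so each call
        // has to be evaluated in turn (assumption: this is enough to
        // stop hoisting; check the generated code if in doubt).
        accum += sumOfSquares(1_000 + (accum & 1));
    })(10_000); // arbitrary number of calls

    writeln(results[0]); // elapsed time for all calls
    writeln(accum);      // use the value so the work is not dead code
}
```

Building with something like `ldc2 -O3` and comparing against a version that discards the return value should show whether the call survives optimization.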


T

-- 
Indifference will certainly be the downfall of mankind, but who cares? --
Miquel van Smoorenburg

Oct 27 2022

Dennis <dkorpel gmail.com> writes:

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 How can I prevent the compiler from removing the code I want to 
 measure?

With many C compilers, you can use volatile assembly blocks for 
that. With LDC -O3, a regular assembly block also does the trick 
currently:

```D
void main()
{
     import std.datetime.stopwatch;
     import std.stdio: write, writeln, writef, writefln;
     import std.conv : to;

     void f0() {}
     void f1()
     {
         foreach(i; 0..4_000_000)
         {
             // nothing, loop gets optimized out
         }
     }
     void f2()
     {
         foreach(i; 0..4_000_000)
         {
             // defeat optimizations
             asm @safe pure nothrow @nogc {}
         }
     }
     auto r = benchmark!(f0, f1, f2)(1);
     writeln(r[0]); // 4 μs
     writeln(r[1]); // 4 μs
     writeln(r[2]); // 1 ms
}
```

Oct 27 2022

max haughton <maxhaton gmail.com> writes:

On Thursday, 27 October 2022 at 18:41:36 UTC, Dennis wrote:
 On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 How can I prevent the compiler from removing the code I want 
 to measure?

 With many C compilers, you can use volatile assembly blocks for 
 that. With LDC -O3, a regular assembly block also does the 
 trick currently:

 ```D
 void main()
 {
     import std.datetime.stopwatch;
     import std.stdio: write, writeln, writef, writefln;
     import std.conv : to;

     void f0() {}
     void f1()
     {
         foreach(i; 0..4_000_000)
         {
             // nothing, loop gets optimized out
         }
     }
     void f2()
     {
         foreach(i; 0..4_000_000)
         {
             // defeat optimizations
             asm @safe pure nothrow @nogc {}
         }
     }
     auto r = benchmark!(f0, f1, f2)(1);
     writeln(r[0]); // 4 μs
     writeln(r[1]); // 4 μs
     writeln(r[2]); // 1 ms
 }
 ```

FYI, I recommend a volatile data dependency rather than injecting 
volatile asm into the code, i.e. don't modify the pure function, 
but rather make sure the result is actually used in the eyes of 
the compiler.
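
One way this could look, as a sketch only: the result of the unmodified pure function is written to a global through `volatileStore` from `core.volatile`, so the compiler has to treat it as used. The function `mix`, the `sink` variable, and the call count are hypothetical names chosen for illustration. The input is incremented on every call, an assumption made here so the call is not loop-invariant.

```D
import core.volatile : volatileStore;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

// Hypothetical pure function under test; it is left unmodified.
uint mix(uint x) pure nothrow @nogc @safe
{
    x ^= x >> 16;
    x *= 0x45d9f3b;
    x ^= x >> 16;
    return x;
}

__gshared uint sink; // target of the volatile store

void main()
{
    uint input = 12_345;

    auto r = benchmark!({
        // The volatile store cannot be elided, so the value passed
        // to it (the function result) must actually be computed.
        volatileStore(&sink, mix(input++));
    })(1_000_000); // arbitrary number of calls

    writeln(r[0]);
}
```

The benchmarked function itself contains no asm or other benchmark-specific code; only the harness around it changes.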

Oct 29 2022

ab <not_a_real_address nowhere.ab> writes:

On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 Hi,

 when trying to compare different implementations of the 
 optimized builds of a pure function using benchmark from 
 std.datetime.stopwatch, I get times equal to zero, I suppose 
 because the functions are not executed as they do not have side 
 effects.

 The same happens with the example from the documentation:
 https://dlang.org/library/std/datetime/stopwatch/benchmark.html

 How can I prevent the compiler from removing the code I want to 
 measure? Is there some utility in the standard library or 
 pragma that I should use?

 Thanks

 AB

Thanks to H.S. Teoh and Dennis for the suggestions, they both 
work. I like the empty asm block a bit more because it is less 
invasive, but it only works with ldc.

@Imperatorn: see Dennis's code for an example. 
std.datetime.stopwatch.benchmark works, but at high optimization 
levels (-O2, -O3) the loop can be removed and the time brought 
down to 0 hnsecs. E.g. try "ldc2 -O3 -run dennis.d".

AB

Oct 28 2022

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Friday, 28 October 2022 at 09:48:14 UTC, ab wrote:
 On Thursday, 27 October 2022 at 17:17:01 UTC, ab wrote:
 [...]

 Thanks to H.S. Teoh and Dennis for the suggestions, they both 
 work. I like the empty asm block a bit more because it is less 
 invasive, but it only works with ldc.

 @Imperatorn: see Dennis's code for an example. 
 std.datetime.stopwatch.benchmark works, but at high optimization 
 levels (-O2, -O3) the loop can be removed and the time brought 
 down to 0 hnsecs. E.g. try "ldc2 -O3 -run dennis.d".

 AB

Yeah I didn't read carefully enough sorry 🌷

Oct 28 2022

Siarhei Siamashka <siarhei.siamashka gmail.com> writes:

On Friday, 28 October 2022 at 09:48:14 UTC, ab wrote:
 Thanks to H.S. Teoh and Dennis for the suggestions, they both 
 work. I like the empty asm block a bit more because it is less 
 invasive, but it only works with ldc.

I used the volatileLoad/volatileStore functions to ensure that 
the compiler doesn't find a way to optimize out the code (for 
example, move repetitive calculations out of the loop or even do 
them at compile time) and the RDTSC/RDTSCP instruction via inline 
assembly for measurements: 
https://gist.github.com/ssvb/5c926ed9bc755900fdaac3b71a0f7cfd

The goal was to have a very fast way to check (with no measurable 
overhead) whether reasonable optimization options had been 
supplied to the compiler.
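
A rough sketch of the volatileLoad/volatileStore idea, not the code from the linked gist. It swaps the RDTSC/RDTSCP inline assembly for `StopWatch` from `std.datetime.stopwatch` to stay portable, and `triangular`, `input`, `output`, and the iteration count are hypothetical names and values chosen for illustration.

```D
import core.volatile : volatileLoad, volatileStore;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical pure function under test.
ulong triangular(ulong n) pure nothrow @nogc @safe
{
    ulong total = 0;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

__gshared ulong input = 10_000; // read through volatileLoad
__gshared ulong output;         // written through volatileStore

void main()
{
    enum iterations = 100_000; // arbitrary

    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. iterations)
    {
        // Volatile load: the input has to be re-read on every pass, so
        // the call cannot be moved out of the loop or done at compile time.
        immutable n = volatileLoad(&input);
        // Volatile store: the result counts as used.
        volatileStore(&output, triangular(n));
    }
    sw.stop();

    writeln(sw.peek); // total elapsed time
    writeln(output);
}
```

The optimizer is still free to simplify the body of the function itself; the volatile accesses only guarantee that one call per iteration takes place.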

Oct 28 2022
