digitalmars.D.learn - Profiling calls to small functions

albert-j <djftgls ifdflv.com> writes:

Let's say I want to create an array of random numbers and do some 
operations on them:

void main() {
    import std.random;

    // Generate an array of random numbers
    int arrSize = 100000000;
    double[] arr = new double[](arrSize);
    foreach (i; 0 .. arrSize)
        arr[i] = uniform01();

    // Call funcA on the interior elements of the array
    foreach (i; 1 .. arr.length - 1)
        funcA(arr, i);
}

void funcA(double[] arr, size_t i) {
    arr[i+1] = arr[i-1] + arr[i];
    funcB(arr, i);
}

void funcB(double[] arr, size_t i) {
    arr[i-1] = arr[i] + arr[i+1];
    arr[i]   = arr[i-1] + arr[i+1];
    arr[i+1] = arr[i-1] + arr[i];
}

Now I compile it with dmd -profile and look at the performance of 
funcA with d-profile-viewer. Inside funcA, only 20% of the time is 
spent in funcB; the remaining 80% is self-time of funcA. How is 
that possible, when funcB does three times the calculations of 
funcA? It appears that the call to funcB itself is very expensive.

Jan 21 2017

pineapple <meapineapple gmail.com> writes:

On Saturday, 21 January 2017 at 12:33:57 UTC, albert-j wrote:
 Now I compile it with dmd -profile and look at the performance of 
 funcA with d-profile-viewer. Inside funcA, only 20% of the time is 
 spent in funcB; the remaining 80% is self-time of funcA. How is 
 that possible, when funcB does three times the calculations of 
 funcA? It appears that the call to funcB itself is very expensive.

I'm not sure if that's what is happening in this case but, in code 
as simple as this, function calls can sometimes be the bottleneck. 
You should see how compiling with and without -O affects 
performance, and try adding `pragma(inline)` to funcB.
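
A minimal sketch of that suggestion, assuming the form with an explicit argument, `pragma(inline, true)`, is what is meant (plain `pragma(inline)` with no argument falls back to the compiler's default behaviour). The array size is reduced here purely for illustration:

```d
import std.random : uniform01;

// Assumption: request inlining explicitly with pragma(inline, true).
pragma(inline, true)
void funcB(double[] arr, size_t i) {
    arr[i-1] = arr[i] + arr[i+1];
    arr[i]   = arr[i-1] + arr[i+1];
    arr[i+1] = arr[i-1] + arr[i];
}

void funcA(double[] arr, size_t i) {
    arr[i+1] = arr[i-1] + arr[i];
    funcB(arr, i);
}

void main() {
    // Smaller array than in the original post, for illustration only.
    auto arr = new double[](1_000_000);
    foreach (ref x; arr)
        x = uniform01();

    // Same loop as in the original post: interior elements only.
    foreach (i; 1 .. arr.length - 1)
        funcA(arr, i);
}
```

Once funcB is inlined it no longer exists as a separate call, so its cost would be expected to show up as part of funcA rather than as its own entry in the profile.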

Jan 21 2017

albert-j <djftgls ifdflv.com> writes:

 I'm not sure if that's what is happening in this case but, in code 
 as simple as this, function calls can sometimes be the bottleneck. 
 You should see how compiling with and without -O affects 
 performance, and try adding `pragma(inline)` to funcB.

When I compile with -inline, the profiler does not report the 
performance of funcA and funcB individually, which is what I 
want to measure.

Jan 21 2017

albert-j <djftgls ifdflv.com> writes:

 I'm not sure if that's what is happening in this case but, in code 
 as simple as this, function calls can sometimes be the bottleneck. 
 You should see how compiling with and without -O affects 
 performance, and try adding `pragma(inline)` to funcB.

I guess my question is whether it is possible to get meaningful 
profiling results for this case, given the large cost of calling 
funcB. In release builds funcA and funcB are inlined, so the 
profiler cannot report on them individually (is that correct, or am 
I misusing the profiler?). Profiling without inlining shows a 
large cost for calling funcB, but that cost will not be there in a 
release build, so the profiling results are irrelevant.
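
One possible way around this, sketched here as an assumption and not as something proposed in the thread, is to time the pieces by hand in a build compiled with the release flags. The sketch uses `StopWatch` from `std.datetime.stopwatch` (available in current Phobos releases); `funcAOnly` and `timePass` are hypothetical helpers introduced only for this illustration:

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : uniform01;
import std.stdio : writefln;

void funcB(double[] arr, size_t i) {
    arr[i-1] = arr[i] + arr[i+1];
    arr[i]   = arr[i-1] + arr[i+1];
    arr[i+1] = arr[i-1] + arr[i];
}

// Hypothetical helper: funcA's own arithmetic, without the call to funcB.
void funcAOnly(double[] arr, size_t i) {
    arr[i+1] = arr[i-1] + arr[i];
}

void funcA(double[] arr, size_t i) {
    funcAOnly(arr, i);
    funcB(arr, i);
}

// Hypothetical helper: time one pass of fn over the interior elements.
// Assumes arr.length >= 2.
Duration timePass(alias fn)(double[] arr) {
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 1 .. arr.length - 1)
        fn(arr, i);
    return sw.peek();
}

void main() {
    auto arr = new double[](10_000_000);
    foreach (ref x; arr)
        x = uniform01();

    // Work on copies so both passes start from the same data.
    auto a = arr.dup;
    auto b = arr.dup;
    auto tOnly = timePass!funcAOnly(a);
    auto tFull = timePass!funcA(b);

    // Printing an element keeps the results observably used.
    writefln("funcA arithmetic only: %s (last = %s)", tOnly, a[$-1]);
    writefln("funcA including funcB: %s (last = %s)", tFull, b[$-1]);
}
```

The difference between the two timings gives a rough estimate of what funcB adds in the optimized build. It is only an estimate: the optimizer may treat the two loops differently, so the numbers are best read as relative indications.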

Jan 22 2017
