www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Strange measurements when reproducing issue 5650

reply "SomeDude" <lovelydear mailmetrash.com> writes:
Discussion here: 
http://d.puremagic.com/issues/show_bug.cgi?id=5650

On my Windows box, the following program

import std.stdio, std.container, std.range;

void main() {
     enum int range = 100;
     enum int n = 1_000_000;

     auto t = redBlackTree!int(0);

     for (int i = 0; i < n; i++) {
         if (i > range)
             t.removeFront();
         t.insert(i);
     }

     writeln(walkLength(t[]));
     //writeln(t[]);
}

runs in about 1793 ms.
The strange thing is, if I comment out the writeln line, runtimes 
are in average *slower* by about 20 ms, with timings varying a 
little bit more than when the writeln is included.

How can this be ?
Apr 25 2012
next sibling parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

Noone reproduces this ?
Apr 25 2012
next sibling parent =?UTF-8?B?TWFydGluIERyYcWhYXI=?= <drasar ics.muni.cz> writes:
Dne 25.4.2012 13:27, SomeDude napsal(a):
 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
GNU/Linux DMD64 D Compiler v2.058 Commented writeln on average (10 runs) 40 ms slower. So yes, it happens to me as well. Martin
Apr 25 2012
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude <lovelydear mailmetrash.com>  
wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Apr 25 2012
parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer 
wrote:
 On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude 
 <lovelydear mailmetrash.com> wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Did you compile with dmd -O -inline -release ? I'll do some profiling later.
Apr 25 2012
next sibling parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 25 April 2012 at 17:06:00 UTC, SomeDude wrote:
 On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven 
 Schveighoffer wrote:
 On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude 
 <lovelydear mailmetrash.com> wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Did you compile with dmd -O -inline -release ? I'll do some profiling later.
OK, I'm having a hard time producing data that support what I see without profiling on, but it's still quite consistent accross runs. I dunno what's wrong here.
Apr 25 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 25 Apr 2012 13:05:59 -0400, SomeDude <lovelydear mailmetrash.com>  
wrote:

 On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer wrote:
 On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude  
 <lovelydear mailmetrash.com> wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Did you compile with dmd -O -inline -release ?
Yes. -Steve
Apr 25 2012
prev sibling parent reply David Brown <davidb davidb.org> writes:
On 2012-04-25, SomeDude <lovelydear mailmetrash.com> wrote:
 On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer 
 wrote:
 On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude 
 <lovelydear mailmetrash.com> wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Did you compile with dmd -O -inline -release ?
I get fairly consistent result on Linux with dmd, but one oddity with gdc. With the first writeln in and the second commented out: gdmd -O slow.d time ./slow ... 0.342 total gdmd -O -release slow.d time ./slow ... 2.866 total If I put both writelns in, I don't see the fast version. % gdc --version gdc (Debian 4.6.2-4) 4.6.2 David
Apr 26 2012
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Fri, 27 Apr 2012 04:18:47 +0000 (UTC)
schrieb David Brown <davidb davidb.org>:

 On 2012-04-25, SomeDude <lovelydear mailmetrash.com> wrote:
 On Wednesday, 25 April 2012 at 15:35:44 UTC, Steven Schveighoffer 
 wrote:
 On Wed, 25 Apr 2012 07:27:29 -0400, SomeDude 
 <lovelydear mailmetrash.com> wrote:

 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:

 Noone reproduces this ?
On my linux box, it runs in about 580ms, with or without the writeln. This is what I would expect. But things can be strange when dealing with GC timings. Have you tried profiling the code to see where the time is being added? -Steve
Did you compile with dmd -O -inline -release ?
I get fairly consistent result on Linux with dmd, but one oddity with gdc. With the first writeln in and the second commented out: gdmd -O slow.d time ./slow ... 0.342 total gdmd -O -release slow.d time ./slow ... 2.866 total If I put both writelns in, I don't see the fast version. % gdc --version gdc (Debian 4.6.2-4) 4.6.2 David
I cannot reproduce this with the same GCC version and the latest GitHub sources for GDC. Is that behavior consistent after clean recompiles? I get the ~3 seconds result no matter how I compile. -- Marco
Apr 27 2012
prev sibling parent Jordi Sayol <g.sayol yahoo.es> writes:
Al 25/04/12 13:27, En/na SomeDude ha escrit:
 On Wednesday, 25 April 2012 at 08:34:40 UTC, SomeDude wrote:
 
 Noone reproduces this ?
 
My results in Linux: 32-bit executable: About 574ms, with or without writeln. 64-bit executable: About 798ms, with or without writeln. -- Jordi Sayol
Apr 25 2012
prev sibling next sibling parent Don Clugston <dac nospam.com> writes:
On 25/04/12 10:34, SomeDude wrote:
 Discussion here: http://d.puremagic.com/issues/show_bug.cgi?id=5650

 On my Windows box, the following program

 import std.stdio, std.container, std.range;

 void main() {
 enum int range = 100;
 enum int n = 1_000_000;

 auto t = redBlackTree!int(0);

 for (int i = 0; i < n; i++) {
 if (i > range)
 t.removeFront();
 t.insert(i);
 }

 writeln(walkLength(t[]));
 //writeln(t[]);
 }

 runs in about 1793 ms.
 The strange thing is, if I comment out the writeln line, runtimes are in
 average *slower* by about 20 ms, with timings varying a little bit more
 than when the writeln is included.

 How can this be ?
Very strange. Maybe there is some std library cleanup which is slower if nothing got written?
Apr 25 2012
prev sibling next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 25 Apr 2012 10:34:38 +0200
schrieb "SomeDude" <lovelydear mailmetrash.com>:

 Discussion here: 
 http://d.puremagic.com/issues/show_bug.cgi?id=5650
 
 On my Windows box, the following program
 
 import std.stdio, std.container, std.range;
 
 void main() {
      enum int range = 100;
      enum int n = 1_000_000;
 
      auto t = redBlackTree!int(0);
 
      for (int i = 0; i < n; i++) {
          if (i > range)
              t.removeFront();
          t.insert(i);
      }
 
      writeln(walkLength(t[]));
      //writeln(t[]);
 }
 
 runs in about 1793 ms.
 The strange thing is, if I comment out the writeln line, runtimes 
 are in average *slower* by about 20 ms, with timings varying a 
 little bit more than when the writeln is included.
 
 How can this be ?
GDC (using DMD 2.057) Linux 64-bit, Core 2 Duo 2 Ghz ----------------+------------------- without writeln | with writeln 3.150 s | 3.146 s .. 3,195 s Besides the fact that my rig seems to be a bit dated, I can't reproduce what you see. Only that the writeln makes the timing less accurate. -- Marco
Apr 25 2012
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Apr 25, 2012 at 10:34:38AM +0200, SomeDude wrote:
[...]
 import std.stdio, std.container, std.range;
 
 void main() {
     enum int range = 100;
     enum int n = 1_000_000;
 
     auto t = redBlackTree!int(0);
 
     for (int i = 0; i < n; i++) {
         if (i > range)
             t.removeFront();
         t.insert(i);
     }
 
     writeln(walkLength(t[]));
     //writeln(t[]);
 }
 
 runs in about 1793 ms.
 The strange thing is, if I comment out the writeln line, runtimes
 are in average *slower* by about 20 ms, with timings varying a
 little bit more than when the writeln is included.
[...] First of all, differences as small as 20ms really should be considered as background noise. The exact measurements depend on a lot of system-specific and environment-specific factors, such as OS memory usage, CPU cache behaviour, disk activity & speed, the exact points of context switches, etc.. If you really want to check for substantial performance differences, you need to magnify your test case so that differences are measured >5 seconds. Second, on my AMD hexacore 64-bit Linux system, the running time consistently measures between 0.57 or 0.58 seconds for both cases. The exact figure changes between runs, and as far as I can tell, there's no discernible difference between the two. Third, to make any timing differences stand out from the background noise, I increased n to 20_000_000, and both versions of the program consistently runs in about 11 seconds each time. There's no discernible difference between the two. What all this means is that a single call to writeln does not make enough difference to be measurable compared to the rest of the program. It doesn't mean that the version with writeln is "faster", just that the difference is too small and you're probably just seeing background noise. If you put the writeln inside a loop, on the other hand, you'll see a big difference, because now its cost is magnified by the number of times the loop runs. (Say if you put it inside a foreach(i;0..1000) at the end of the program, you'll see the difference when you comment it out.) So I'd chalk it up to inherent measurement inaccuracies. T -- He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
Apr 25 2012
parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 25 April 2012 at 17:37:33 UTC, H. S. Teoh wrote:
 First of all, differences as small as 20ms really should be 
 considered
 as background noise. The exact measurements depend on a lot of
 system-specific and environment-specific factors, such as OS 
 memory
 usage, CPU cache behaviour, disk activity & speed, the exact 
 points of
 context switches, etc.. If you really want to check for 
 substantial
 performance differences, you need to magnify your test case so 
 that
 differences are measured >5 seconds.

 Second, on my AMD hexacore 64-bit Linux system, the running time
 consistently measures between 0.57 or 0.58 seconds for both 
 cases. The
 exact figure changes between runs, and as far as I can tell, 
 there's no
 discernible difference between the two.

 Third, to make any timing differences stand out from the 
 background
 noise, I increased n to 20_000_000, and both versions of the 
 program
 consistently runs in about 11 seconds each time. There's no 
 discernible
 difference between the two.

 What all this means is that a single call to writeln does not 
 make
 enough difference to be measurable compared to the rest of the 
 program.
 It doesn't mean that the version with writeln is "faster", just 
 that the
 difference is too small and you're probably just seeing 
 background
 noise. If you put the writeln inside a loop, on the other hand, 
 you'll
 see a big difference, because now its cost is magnified by the 
 number of
 times the loop runs. (Say if you put it inside a 
 foreach(i;0..1000) at
 the end of the program, you'll see the difference when you 
 comment it
 out.)

 So I'd chalk it up to inherent measurement inaccuracies.


 T
Thanks, indeed, for n = 5_000_000, I observe the expected result, i.e the writeln version being almost 1 second slower than without. Below 2_000_000, though, I consistently see the opposite on my machine, i.e the version with writeln being slightly faster. If it was background noise, it would be about equal, no ?
Apr 27 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 27 Apr 2012 10:09:18 -0400, SomeDude <lovelydear mailmetrash.com>  
wrote:

 On Wednesday, 25 April 2012 at 17:37:33 UTC, H. S. Teoh wrote:
 First of all, differences as small as 20ms really should be considered
 as background noise. The exact measurements depend on a lot of
 system-specific and environment-specific factors, such as OS memory
 usage, CPU cache behaviour, disk activity & speed, the exact points of
 context switches, etc.. If you really want to check for substantial
 performance differences, you need to magnify your test case so that
 differences are measured >5 seconds.

 Second, on my AMD hexacore 64-bit Linux system, the running time
 consistently measures between 0.57 or 0.58 seconds for both cases. The
 exact figure changes between runs, and as far as I can tell, there's no
 discernible difference between the two.

 Third, to make any timing differences stand out from the background
 noise, I increased n to 20_000_000, and both versions of the program
 consistently runs in about 11 seconds each time. There's no discernible
 difference between the two.

 What all this means is that a single call to writeln does not make
 enough difference to be measurable compared to the rest of the program.
 It doesn't mean that the version with writeln is "faster", just that the
 difference is too small and you're probably just seeing background
 noise. If you put the writeln inside a loop, on the other hand, you'll
 see a big difference, because now its cost is magnified by the number of
 times the loop runs. (Say if you put it inside a foreach(i;0..1000) at
 the end of the program, you'll see the difference when you comment it
 out.)

 So I'd chalk it up to inherent measurement inaccuracies.


 T
Thanks, indeed, for n = 5_000_000, I observe the expected result, i.e the writeln version being almost 1 second slower than without. Below 2_000_000, though, I consistently see the opposite on my machine, i.e the version with writeln being slightly faster. If it was background noise, it would be about equal, no ?
writeln should not make a difference. Even walkLength should not, Especially over 100 nodes. What you are likely seeing is some weird issue with GC cleanup. Have you tried measuring the code timings just inside main instead of the full execution of the program including runtime startup and shutdown? -Steve
Apr 27 2012
parent "SomeDude" <lovelydear mailmetrash.com> writes:
On Friday, 27 April 2012 at 14:14:37 UTC, Steven Schveighoffer 
wrote:
 Have you tried measuring the code timings just inside main 
 instead of the full execution of the program including runtime 
 startup and shutdown?

 -Steve
OK, it seems you are right. It turns out using Measure-Command{...} was a rather poor way of measuring runtimes because of startup/shutdown. Using StopWatch, I now get consistent results, i.e nearly the same times with or without the writeln, but always slightly faster without. So all is sorted now. Thx again.
Apr 27 2012