digitalmars.D - Why is D significantly slower than C# in this instance?
- Artjom (66/66) Apr 10 2023 I have written this simple bruteforce algorithm that finds max
- Artjom (59/130) Apr 10 2023 ```
- Steven Schveighoffer (7/42) Apr 10 2023 You realize these are 2 different loop setups?
- Steven Schveighoffer (7/9) Apr 10 2023 oof, forgot the formatting ticks:
- Artjom (4/13) Apr 11 2023 silly me, I'll fix that and try compile with ldc, see whatll
- WebFreak001 (6/20) Apr 12 2023 the logic here is wrong, `.total!"..."` already sums the
- =?UTF-8?Q?Ali_=c3=87ehreli?= (4/6) Apr 12 2023 I am beginning to think performance comparisons are trickier than they
- bachmeier (3/10) Apr 12 2023 Ironic that it takes so long to do a proper performance
- =?UTF-8?Q?Ali_=c3=87ehreli?= (5/7) Apr 12 2023 I sometimes have the same concern for optimized builds: Will the number
- H. S. Teoh (37/45) Apr 12 2023 [...]
- Salih Dincer (7/22) Apr 12 2023 I agree with all your views. But you are missing something! So is
- Jacob Shtokolov (10/17) Apr 13 2023 Or, he can just use
- H. S. Teoh (8/40) Apr 10 2023 This doesn't call the above function solveBruteForce. What's the
- Artjom (5/46) Apr 10 2023 Sorry, accidentaly pasted solveDynamic() insted of
- H. S. Teoh (9/12) Apr 10 2023 1) You're compiling without optimization (-O), so performance is not
- max haughton (3/16) Apr 10 2023 Unless the result is actually used the optimizer may also strip
- Artjom (3/16) Apr 10 2023 well, i tried running it with this flag and its still 8 seconds
- ryuukk_ (10/10) Apr 10 2023 Please format your code using markdown syntax
- Walter Bright (2/20) Apr 10 2023 The outer loop does nothing, perhaps C# optimizes it away.
- WollyTheWombat (14/20) Apr 12 2023 By C#, I presume you mean dotnet.exe (and not csc.exe).
I have written this simple bruteforce algorithm that finds max the size of array is 1000000 elements - D is 20 seconds lated. Why? D code (algorithm): ' public int solveBruteForce() { int currMax = arr[0], currSum; foreach (_; arr) { currSum = 0; foreach (int el; arr) { currSum += el; currMax = max(currSum, currMax); } } return currMax; } ' D code (measuring exec time): ' double measureTime() { StopWatch sw; sw.start(); solveDynamic(); double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; } ' ' int EachOther(int[] array) { int current_max = array[0]; for (int i = 0; i < array.Length; i++) { int subset_sum = 0; for (int j = i; j < array.Length; j++) { subset_sum += array[j]; current_max = Math.Max(subset_sum, current_max); } } return current_max; } ' ' Stopwatch stopwatch = new Stopwatch(); for (int cur_size = 1000; cur_size <= size_max; cur_size *= 10) { int[] array = GenArray(cur_size); stopwatch.Start(); EachOther(array); stopwatch.Stop(); double time = (double)stopwatch.ElapsedTicks / Stopwatch.Frequency; Console.Write("{0, -10}|", string.Format("{0:f7}", time)); } Console.WriteLine(); '
Apr 10 2023
On Monday, 10 April 2023 at 21:18:59 UTC, Artjom wrote:I have written this simple bruteforce algorithm that finds max the size of array is 1000000 elements - D is 20 seconds lated. Why? D code (algorithm): ' public int solveBruteForce() { int currMax = arr[0], currSum; foreach (_; arr) { currSum = 0; foreach (int el; arr) { currSum += el; currMax = max(currSum, currMax); } } return currMax; } ' D code (measuring exec time): ' double measureTime() { StopWatch sw; sw.start(); solveDynamic(); double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; } ' ' int EachOther(int[] array) { int current_max = array[0]; for (int i = 0; i < array.Length; i++) { int subset_sum = 0; for (int j = i; j < array.Length; j++) { subset_sum += array[j]; current_max = Math.Max(subset_sum, current_max); } } return current_max; } ' ' Stopwatch stopwatch = new Stopwatch(); for (int cur_size = 1000; cur_size <= size_max; cur_size *= 10) { int[] array = GenArray(cur_size); stopwatch.Start(); EachOther(array); stopwatch.Stop(); double time = (double)stopwatch.ElapsedTicks / Stopwatch.Frequency; Console.Write("{0, -10}|", string.Format("{0:f7}", time)); } Console.WriteLine(); 'Sorry, wrong code formattingD code (algorithm):``` public int solveBruteForce() { int currMax = arr[0], currSum; foreach (_; arr) { currSum = 0; foreach (int el; arr) { currSum += el; currMax = max(currSum, currMax); } } return currMax; } ```D code (measuring exec time):``` double measureTime() { StopWatch sw; sw.start(); solveDynamic(); double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; } `````` int EachOther(int[] array) { int current_max = array[0]; for (int i = 0; i < array.Length; i++) { int subset_sum = 0; for (int j = i; j < array.Length; j++) { subset_sum += array[j]; current_max = Math.Max(subset_sum, current_max); } } return current_max; } `````` Stopwatch stopwatch = new Stopwatch(); for (int cur_size = 1000; cur_size <= size_max; cur_size *= 10) { int[] array = GenArray(cur_size); stopwatch.Start(); EachOther(array); stopwatch.Stop(); double time = (double)stopwatch.ElapsedTicks / Stopwatch.Frequency; Console.Write("{0, -10}|", string.Format("{0:f7}", time)); } Console.WriteLine(); ```
Apr 10 2023
On 4/10/23 5:29 PM, Artjom wrote:...D code (algorithm):``` Â Â Â public int solveBruteForce() Â Â Â { Â Â Â Â Â Â Â int currMax = arr[0], currSum; Â Â Â Â Â Â Â foreach (_; arr) Â Â Â Â Â Â Â { Â Â Â Â Â Â Â Â Â Â Â currSum = 0; Â Â Â Â Â Â Â Â Â Â Â foreach (int el; arr) Â Â Â Â Â Â Â Â Â Â Â { Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â currSum += el; Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â currMax = max(currSum, currMax); Â Â Â Â Â Â Â Â Â Â Â } Â Â Â Â Â Â Â } Â Â Â Â Â Â Â return currMax; Â Â Â } ```You realize these are 2 different loop setups? Lining up pseudocode so you can see: D: foreach(i; 0 .. len) foreach(j; 0 .. len) -Steve``` int EachOther(int[] array) { Â int current_max = array[0]; Â for (int i = 0; i < array.Length; i++) Â { Â Â Â int subset_sum = 0; Â Â Â for (int j = i; j < array.Length; j++) Â Â Â { Â Â Â Â Â Â Â subset_sum += array[j]; Â Â Â Â Â Â Â current_max = Math.Max(subset_sum, current_max); Â Â Â Â } Â } Â return current_max; } ```
Apr 10 2023
On 4/10/23 9:49 PM, Steven Schveighoffer wrote:Lining up pseudocode so you can see:oof, forgot the formatting ticks: ``` D:Â foreach(i; 0 .. len) foreach(j; 0 .. len) ``` -Steve
Apr 10 2023
On Tuesday, 11 April 2023 at 01:50:58 UTC, Steven Schveighoffer wrote:On 4/10/23 9:49 PM, Steven Schveighoffer wrote:silly me, I'll fix that and try compile with ldc, see whatll happenLining up pseudocode so you can see:oof, forgot the formatting ticks: ``` D:Â foreach(i; 0 .. len) foreach(j; 0 .. len) ``` -Steve
Apr 11 2023
On Monday, 10 April 2023 at 21:29:11 UTC, Artjom wrote:[...]the logic here is wrong, `.total!"..."` already sums the different units, you can just do `sw.peek.total!"hnsecs" / 10_000_000.0` to get the total seconds as double. Otherwise this currently always doubled the time, since you added the total (converted) seconds twice.D code (measuring exec time):``` double measureTime() { StopWatch sw; sw.start(); solveDynamic(); double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; } ``` [...]
Apr 12 2023
On 4/12/23 05:19, WebFreak001 wrote:Otherwise this currently always doubled the time, since you added the total (converted) seconds twice.I am beginning to think performance comparisons are trickier than they look. ;) Ali
Apr 12 2023
On Wednesday, 12 April 2023 at 13:13:42 UTC, Ali Çehreli wrote:On 4/12/23 05:19, WebFreak001 wrote:Ironic that it takes so long to do a proper performance comparison when the only reason to do it is to save time.Otherwise this currently always doubled the time, since youadded thetotal (converted) seconds twice.I am beginning to think performance comparisons are trickier than they look. ;) Ali
Apr 12 2023
On 4/12/23 06:24, bachmeier wrote:Ironic that it takes so long to do a proper performance comparison when the only reason to do it is to save time.I sometimes have the same concern for optimized builds: Will the number of times a program will ever be executed warrant the time lost on optimized builds? :) Ali
Apr 12 2023
On Wed, Apr 12, 2023 at 07:35:29AM -0700, Ali Çehreli via Digitalmars-d wrote:On 4/12/23 06:24, bachmeier wrote:[...] Depends on the program and how it will be used. For one-off shell-script replacements, the answer is no, it's not worth the bother. For this kind of task, what you want is fast turnaround time: to fix bugs in the code, then once it does what it needs to, you just get a result and that's good enough. No need to sweat it. If the difference in performance is 10ms per run vs. 100ms per run, who cares, you won't even notice the difference. Waiting for 5s compiles as LDC optimizes it to the end of the world and back to get it down to 10ms is simply not worth it, when using dmd gives you a 2s turnaround time for 100ms runtime that you won't even notice. For long-running compute-intensive tasks, though, the tables begin shifting the other way. If using dmd gives you 2s turnaround times but it takes the program 10 hours to come up with the answer vs. using ldc with 8s compile times but the program comes up with the answer in 4 hours, then yeah, the time spent on optimization is totally worth it. Or if you're running a high-traffic website and using dmd gives you fast turnaround during development, but caps website performance at say 1000 requests per second, then you'd want to consider using ldc for the production build, it will take a little bit longer but may get you to 5000 requests per second. If your business depends on the volume of processed requests, this would totally be a worthwhile investment. But if it's a low traffic website and the difference is between 1000 requests per *day* vs. 5000 requests per day, then I wouldn't bother, dmd gives me faster turnaround and therefore faster development, and I won't even notice the difference anyway. It's not just about compilers, of course. The same reasoning goes for time spent hand-optimizing your code. If you spend 10 hours optimizing a shell script from 100ms runs to 10ms runs, you can feel great about yourself but in the grand scheme of things it's a waste of time. But if 10 hours of hand-optimization gets you from 1000 requests per second to 10000 requests per second on a high-traffic website that will be continuously running for a long time, then yeah it's totally worth it. T -- Insanity is doing the same thing over and over again and expecting different results.Ironic that it takes so long to do a proper performance comparison when the only reason to do it is to save time.I sometimes have the same concern for optimized builds: Will the number of times a program will ever be executed warrant the time lost on optimized builds? :)
Apr 12 2023
On Wednesday, 12 April 2023 at 17:16:55 UTC, H. S. Teoh wrote:On Wed, Apr 12, 2023 at 07:35:29AM -0700, Ali Çehreli via Digitalmars-d wrote:I agree with all your views. But you are missing something! So is the questioner: C sharp is optimized to be compatible with W$. So if you are using windowze, the comparison you make is similar to comparing jet-ski and motorcycle. SDB 79On 4/12/23 06:24, bachmeier wrote:[...] Depends on the program and how it will be used. For one-off shell-script replacements, the answer is no, it's not worth the bother. For this kind of task, what you want is fast turnaround time:Ironic that it takes so long to do a proper performance comparison when the only reason to do it is to save time.I sometimes have the same concern for optimized builds: Will the number of times a program will ever be executed warrant the time lost on optimized builds? :)
Apr 12 2023
On Wednesday, 12 April 2023 at 12:19:12 UTC, WebFreak001 wrote:On Monday, 10 April 2023 at 21:29:11 UTC, Artjom wrote:Or, he can just use ```d void main() { import std.datetime.stopwatch : benchmark; auto results = benchmark!(solveBruteforce)(1); results[0].writeln; } ```[...]the logic here is wrong, `.total!"..."` already sums the different units, you can just do `sw.peek.total!"hnsecs" / 10_000_000.0` to get the total seconds as double. Otherwise this currently always doubled the time, since you added the total (converted) seconds twice.
Apr 13 2023
On Mon, Apr 10, 2023 at 09:18:59PM +0000, Artjom via Digitalmars-d wrote:I have written this simple bruteforce algorithm that finds max sum of array is 1000000 elements - D is 20 seconds lated. Why?Which compiler are you using and what are the compiler options?D code (algorithm): ' public int solveBruteForce() { int currMax = arr[0], currSum; foreach (_; arr) { currSum = 0; foreach (int el; arr) { currSum += el; currMax = max(currSum, currMax); } } return currMax; } ' D code (measuring exec time): ' double measureTime() { StopWatch sw; sw.start(); solveDynamic();This doesn't call the above function solveBruteForce. What's the definition of solveDynamic?double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; }[...] T -- Two wrongs don't make a right; but three rights do make a left...
Apr 10 2023
On Monday, 10 April 2023 at 21:40:40 UTC, H. S. Teoh wrote:On Mon, Apr 10, 2023 at 09:18:59PM +0000, Artjom via Digitalmars-d wrote:Sorry, accidentaly pasted solveDynamic() insted of solveBruteForce(), it's solveBruteForce() in the code though. Im using DMD and compile like "dmd .\source\app.d .\classes\solver.d"I have written this simple bruteforce algorithm that finds max when the size of array is 1000000 elements - D is 20 seconds lated. Why?Which compiler are you using and what are the compiler options?D code (algorithm): ' public int solveBruteForce() { int currMax = arr[0], currSum; foreach (_; arr) { currSum = 0; foreach (int el; arr) { currSum += el; currMax = max(currSum, currMax); } } return currMax; } ' D code (measuring exec time): ' double measureTime() { StopWatch sw; sw.start(); solveDynamic();This doesn't call the above function solveBruteForce. What's the definition of solveDynamic?double sec = (to!double(sw.peek.total!"msecs") / 1000) + (to!double(sw.peek.total!"usecs") / 1000000); return sec; }[...] T
Apr 10 2023
On Mon, Apr 10, 2023 at 09:46:25PM +0000, Artjom via Digitalmars-d wrote: [...]Sorry, accidentaly pasted solveDynamic() insted of solveBruteForce(), it's solveBruteForce() in the code though. Im using DMD and compile like "dmd .\source\app.d .\classes\solver.d"1) You're compiling without optimization (-O), so performance is not guaranteed to be good. 2) For anything where performance is important, don't use DMD. Instead, use `gdc -O3` or `ldc2 -O2`. T -- What's the difference between a 4D tube and an overweight Dutchman? One is a hollow spherinder, and the other is a spherical Hollander.
Apr 10 2023
On Monday, 10 April 2023 at 22:00:40 UTC, H. S. Teoh wrote:On Mon, Apr 10, 2023 at 09:46:25PM +0000, Artjom via Digitalmars-d wrote: [...]Unless the result is actually used the optimizer may also strip out the code entirely.Sorry, accidentaly pasted solveDynamic() insted of solveBruteForce(), it's solveBruteForce() in the code though. Im using DMD and compile like "dmd .\source\app.d .\classes\solver.d"1) You're compiling without optimization (-O), so performance is not guaranteed to be good. 2) For anything where performance is important, don't use DMD. Instead, use `gdc -O3` or `ldc2 -O2`. T
Apr 10 2023
On Monday, 10 April 2023 at 22:00:40 UTC, H. S. Teoh wrote:On Mon, Apr 10, 2023 at 09:46:25PM +0000, Artjom via Digitalmars-d wrote: [...]well, i tried running it with this flag and its still 8 secondsSorry, accidentaly pasted solveDynamic() insted of solveBruteForce(), it's solveBruteForce() in the code though. Im using DMD and compile like "dmd .\source\app.d .\classes\solver.d"1) You're compiling without optimization (-O), so performance is not guaranteed to be good. 2) For anything where performance is important, don't use DMD. Instead, use `gdc -O3` or `ldc2 -O2`. T
Apr 10 2023
Please format your code using markdown syntax ```D // your code here ``` Also please share the exact code you benchmark, the one you shared is incomplete As other mentioned you are building in debug mode, dmd is not suited for performance, it's favors compile speed instead As other mentioned please use LDC on windows/unix or GDC on unix with proper release flag: -O2 for LDC for example
Apr 10 2023
On 4/10/2023 2:18 PM, Artjom wrote:I have written this simple bruteforce algorithm that finds max sum of 1000000 elements - D is 20 seconds lated. Why? Â Â Â public int solveBruteForce() Â Â Â { Â Â Â Â Â Â Â int currMax = arr[0], currSum; Â Â Â Â Â Â Â foreach (_; arr) Â Â Â Â Â Â Â { Â Â Â Â Â Â Â Â Â Â Â currSum = 0; Â Â Â Â Â Â Â Â Â Â Â foreach (int el; arr) Â Â Â Â Â Â Â Â Â Â Â { Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â currSum += el; Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â currMax = max(currSum, currMax); Â Â Â Â Â Â Â Â Â Â Â } Â Â Â Â Â Â Â } Â Â Â Â Â Â Â return currMax; Â Â Â }
Apr 10 2023
On Monday, 10 April 2023 at 21:18:59 UTC, Artjom wrote:I have written this simple bruteforce algorithm that finds max the size of array is 1000000 elements - D is 20 seconds lated. Why? D code (algorithm): ..dotnet has many, many optimizations with regards to looping, and many more to come. btw: 'quick JIT is not used for methods that contain loops' https://learn.microsoft.com/en-us/dotnet/core/runtime-config/compilation also: 'Loop optimizations can be surprising' https://leveluppp.ghost.io/loop-optimizations-in-various-compilers/ As for D, well you have 3 different compilers to choose from, and various defaults, and various flags you can pass to them....so good luck with that ;-) - "The 80-20 rule is the dominating force driving the argument that premature tuning is a sin."
Apr 12 2023