digitalmars.D - cost of calling class function
- Dušan Pavkov (84/84) Feb 22 2017 Hello,
- Seb (10/18) Feb 22 2017 I think I can provide a couple of pointers for one reason. The
- Jeremy DeHaan (3/5) Feb 22 2017 I believe Andrei made an executive decision to shut down final by
- Jonathan M Davis via Digitalmars-d (7/12) Feb 22 2017 Yeah, the change that introduced virtual to start the change to making c...
- Chris Wright (5/20) Feb 23 2017 It's an interesting debate, but there's not a ton of reason to prefer on...
- Johan Engelen (8/12) Feb 23 2017 Interesting test case, thanks :-)
- Johan Engelen (11/23) Feb 23 2017 We're in good company: both clang and gcc also do not
- Patrick Schluter (3/15) Feb 23 2017 Marking the method as @pure changes anything?
- Johan Engelen (4/5) Feb 23 2017 Here is the link to play with it yourself :-)
Hello,

I have tried to measure how much faster some simple task would be [...] eliminating causes one by one, I have an example which shows where the problem is. If the function is outside of the class, the code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

Here is the code:

    import std.stdio;
    import std.conv;
    import std.datetime;

    public float getTotal(string s, int add)
    {
        float result = add;
        for (int j = 0; j < s.length; j++)
        {
            result += s[j];
        }
        return result;
    }

    class A
    {
        public float getTotal(string s, int add)
        {
            float result = add;
            for (int j = 0; j < s.length; j++)
            {
                result += s[j];
            }
            return result;
        }
    }

    void main(string[] args)
    {
        StopWatch sw;
        sw.start();
        int n = args.length == 2 ? to!int(args[1]) : 100000;
        string inputA = "qwertyuiopasdfghjklzxcvbnm0123456789";
        double total = 0;
        for (int i = 0; i < n; i++)
        {
            for (int ii = 0; ii < inputA.length; ii++)
            {
                total += getTotal(inputA, i);
            }
        }
        sw.stop();
        writeln("direct call: ");
        writeln("total: ", total);
        writeln("elapsed: ", sw.peek().msecs, " [ms]");
        writeln();

        total = 0;
        auto a = new A();
        sw.reset();
        sw.start();
        for (int i = 0; i < n; i++)
        {
            for (int ii = 0; ii < inputA.length; ii++)
            {
                total += a.getTotal(inputA, i);
            }
        }
        sw.stop();
        writeln("func in class call: ", total);
        writeln("total: ", total);
        writeln("elapsed: ", sw.peek().msecs, " [ms]");
    }

Here are the build configuration and execution times:

    C:\projects\D\benchmarks\reduced problem>dub run --config=application --arch=x86_64 --build=release-nobounds --compiler=ldc2
    Performing "release-nobounds" build using ldc2 for x86_64.
    benchmark1 ~master: target for configuration "application" is up to date.
    To force a rebuild of up-to-date targets, run again with --force.
    Running .\benchmark1.exe
    direct call:
    total: 1.92137e+11
    elapsed: 4 [ms]

    func in class call: 1.92137e+11
    total: 1.92137e+11
    elapsed: 138 [ms]

Thanks in advance.
Feb 22 2017
On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
> Hello, I have tried to measure how much would some simple task be [...] eliminating causes one by one I have an example which shows where the problem is. If the function is outside of class code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

I think I can provide a couple of pointers for one reason. The function isn't final and virtual calls are inefficient:

https://dlang.org/spec/function.html#virtual-functions
http://forum.dlang.org/post/mailman.840.1332033836.4860.digitalmars-d@puremagic.com
https://issues.dlang.org/show_bug.cgi?id=11616
http://wiki.dlang.org/DIP51

AFAICT though it was approved, the switch to final by default has never happened.
Feb 22 2017
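Seb's point that the method "isn't final" can be illustrated with a minimal sketch. The classes `Base` and `Derived` below are hypothetical and not part of the original benchmark; the sketch only assumes the documented rule that ordinary class member functions in D are virtual unless marked `final`.

```d
import std.stdio : writeln;

class Base
{
    // Not marked final, so this member function is virtual: calls through
    // a Base reference are normally dispatched through the vtable.
    float getTotal(string s, int add)
    {
        float result = add;
        foreach (c; s)
            result += c;
        return result;
    }
}

// Hypothetical subclass, present only to show that overriding is allowed.
class Derived : Base
{
    override float getTotal(string s, int add)
    {
        return -1;
    }
}

void main()
{
    Base obj = new Derived();
    // The static type is Base, but Derived.getTotal runs: the target is
    // chosen at run time unless the optimizer can prove the dynamic type.
    writeln(obj.getTotal("ab", 0));
}
```

Because any subclass may override the method, the compiler has to prove the dynamic type of the object before it can turn such a call into a direct one or inline it.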
On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
> AFAICT though it was approved, the switch to final by default has never happened.

I believe Andrei made an executive decision to shut down final by default.
Feb 22 2017
On Thursday, February 23, 2017 02:17:02 Jeremy DeHaan via Digitalmars-d wrote:
> On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
>> AFAICT though it was approved, the switch to final by default has never happened.
>
> I believe Andrei made an executive decision to shut down final by default.

Yeah, the change that introduced virtual to start the change to making class member functions non-virtual by default was actually committed, and then Andrei found out about it and insisted that it be reverted. So, it was reverted, and we're never going to get non-virtual by default.

- Jonathan M Davis
Feb 22 2017
On Wed, 22 Feb 2017 18:31:37 -0800, Jonathan M Davis via Digitalmars-d wrote:
> On Thursday, February 23, 2017 02:17:02 Jeremy DeHaan via Digitalmars-d wrote:
>> On Thursday, 23 February 2017 at 01:48:40 UTC, Seb wrote:
>>> AFAICT though it was approved, the switch to final by default has never happened.
>>
>> I believe Andrei made an executive decision to shut down final by default.
>
> Yeah, the change that introduced virtual to start the change to making class member functions non-virtual by default was actually committed, and then Andrei found out about it and insisted that it be reverted. So, it was reverted, and we're never going to get non-virtual by default.
>
> - Jonathan M Davis

It's an interesting debate, but there's not a ton of reason to prefer one over the other design-wise. It can be considered for D3, but for D2, the ship has sailed.
Feb 23 2017
On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
> If the function is outside of class code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.

Interesting test case, thanks :-)

Adding "final" to the class method nullifies the speed difference. Somehow, LDC does not devirtualize the call in your test case. Without the for-loops the call is nicely devirtualized, so no performance difference.

-Johan
Feb 23 2017
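The fix Johan describes, adding `final` to the class method, might look like the sketch below. It is a reduced variant of the original benchmark, not the poster's exact code: it assumes a compiler recent enough to ship `std.datetime.stopwatch`, fixes the iteration count instead of reading it from the command line, and claims no particular timing.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

class A
{
    // `final` forbids overriding, so the call target is known at compile
    // time and the optimizer can call it directly or inline it.
    final float getTotal(string s, int add)
    {
        float result = add;
        foreach (c; s)
            result += c;
        return result;
    }
}

void main()
{
    enum n = 100_000; // fixed here; the original read it from args
    string inputA = "qwertyuiopasdfghjklzxcvbnm0123456789";
    auto a = new A();
    double total = 0;

    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. n)
        foreach (_; 0 .. inputA.length)
            total += a.getTotal(inputA, i);
    sw.stop();

    writeln("final method call: ");
    writeln("total: ", total);
    writeln("elapsed: ", sw.peek.total!"msecs", " [ms]");
}
```

Marking the whole class `final` would have the same effect on this method, at the cost of ruling out subclassing altogether.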
On Thursday, 23 February 2017 at 16:25:34 UTC, Johan Engelen wrote:
> On Wednesday, 22 February 2017 at 23:49:43 UTC, Dušan Pavkov wrote:
>> If the function is outside of class code runs much faster. I'm obviously doing something wrong and would appreciate any help with this.
>
> Interesting test case, thanks :-)
>
> Adding "final" to the class method nullifies the speed difference. Somehow, LDC does not devirtualize the call in your testcase. Without the for-loops the call is nicely devirtualized, so no performance difference.

We're in good company: both clang and gcc also do not devirtualize the call when the loop count is too large (when the loop count is 4, the indirect calls are gone, when it is 160, they are back).

Btw, with PGO, the performance is 4 ms (direct call) vs 6 ms (virtual call). Pathological, but still.

I am submitting a DConf talk on optimization and the cost of D idioms. This gave me some new ideas to present, thanks :)

-Johan
Feb 23 2017
On Thursday, 23 February 2017 at 17:02:55 UTC, Johan Engelen wrote:
> On Thursday, 23 February 2017 at 16:25:34 UTC, Johan Engelen wrote:
>> [...]
>
> We're in good company: both clang and gcc also do not devirtualize the call when the loopcount is too large (when the loop count is 4, the indirect calls are gone, when it is 160, they are back).
>
> Btw, with PGO, the performance is 4 ms(direct call) vs 6 ms (virtual call). Pathological, but still.
>
> I am submitting a DConf talk on optimization and the cost of D idioms. This gave me some new ideas to present, thanks :)
>
> -Johan

Does marking the method as pure change anything?
Feb 23 2017
On Thursday, 23 February 2017 at 19:57:18 UTC, Patrick Schluter wrote:
> Marking the method as pure changes anything?

Here is the link to play with it yourself :-)

https://godbolt.org/g/se4dCZ
Feb 23 2017