digitalmars.D - [Performance] Why D's std.string.atoi is 4x slower than std.c.stdlib.atoi?
- Andrey Khropov (67/67) Nov 11 2006 Consider the following code:
- Burton Radons (7/14) Nov 11 2006 It's because std.string.atoi is implemented simply with:
- Andrey Khropov (31/33) Nov 11 2006 Thanks for the advice. As a matter of fact it is faster than C's atoi!
- Bill Baxter (4/45) Nov 11 2006 Holy moly! Now that's dedication! You must really have a lot of
- pragma (7/11) Nov 11 2006 Whoa.
- Sean Kelly (15/26) Nov 11 2006 Not if you look at what lexical_cast does:
- Walter Bright (7/8) Nov 11 2006 The implementation of std.string.atoi is:
- Chris Miller (2/2) Nov 11 2006 I think std.string.atoi should be deprecated and removed. It's a stupid ...
- Andrey Khropov (7/9) Nov 11 2006 Absolutely agree.
Consider the following code:
-------------------------------------------------
import std.stdio, std.string, std.c.stdlib, std.perf;

int DStdLib(char[] cs)
{
    int res = 0;
    for(int i = 0; i < 10000000; i++)
        res += std.string.atoi(cs);
    return res;
}

int CStdLib(char* cs)
{
    int res = 0;
    for(int i = 0; i < 10000000; i++)
        res += std.c.stdlib.atoi(cs);
    return res;
}

void main()
{
    auto t = new HighPerformanceCounter();
    int nIter = 5;
    int meanTime = 0;

    for(int it = 0; it <= nIter; ++it)
    {
        t.start();
        int res = DStdLib("123");
        t.stop();
        writefln("D-StdLib: res = ", res, ", ", t.milliseconds() ," ms elapsed ");
        if( it )
            meanTime += t.milliseconds();
    }
    writefln("D-StdLib:" , meanTime/nIter ," ms elapsed (mean).");

    meanTime = 0;
    for(int it = 0; it <= nIter; ++it)
    {
        t.start();
        int res = CStdLib("123");
        t.stop();
        writefln("C-StdLib: res = ", res, ", ", t.milliseconds() ," ms elapsed ");
        if( it )
            meanTime += t.milliseconds();
    }
    writefln("C-StdLib:" , meanTime/nIter ," ms elapsed (mean).");
}
-------------------------------------------------
On my machine (P-M 1.7 Dothan) the mean times are:

D-StdLib:1695 ms elapsed (mean).
C-StdLib:374 ms elapsed (mean).

Why is it so? What could be done?

-- AKhropov
Nov 11 2006
Andrey Khropov wrote:
> On my machine (P-M 1.7 Dothan) the mean times are:
> D-StdLib:1695 ms elapsed (mean).
> C-StdLib:374 ms elapsed (mean).
> Why is it so? What could be done?

It's because std.string.atoi is implemented simply with:

    return std.c.stdlib.atoi(toStringz(s));

The fault is toStringz - originally that operation tried to tell whether the string was NUL-terminated, but now it just allocates a copy.

Use std.conv.toInt instead, although from the looks of the implementation, that will be slightly slower as well.
Nov 11 2006
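To make the point about toStringz concrete, here is a minimal sketch of the two routes. It is written against present-day Phobos, which is an assumption: the thread itself uses std.c.stdlib and std.conv.toInt, while this sketch uses core.stdc.stdlib and std.conv.to. The function names parseViaCopy and parseDirect are made up for illustration.

```d
import core.stdc.stdlib : atoi;
import std.conv : to;
import std.string : toStringz;

// Same shape as the wrapper described above: build a NUL-terminated
// C string (which may allocate and copy), then hand it to C's atoi.
int parseViaCopy(const(char)[] s)
{
    return atoi(toStringz(s));
}

// Parses the D slice directly using its length; no C string is needed.
int parseDirect(const(char)[] s)
{
    return to!int(s);
}

void main()
{
    // A slice of a larger buffer has no NUL right after it, which is
    // why the C route has to produce a terminated copy first.
    const(char)[] buffer = "123456";
    const(char)[] slice = buffer[0 .. 3];

    assert(parseViaCopy(slice) == 123);
    assert(parseDirect(slice) == 123);
}
```

Calling the copying route ten million times in a loop, as the benchmark does, pays for that copy on every iteration.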
Burton Radons wrote:
> Use std.conv.toInt instead, although from the looks of the implementation, that will be slightly slower as well.

Thanks for the advice. As a matter of fact, it is faster than C's atoi! And it handles errors better (through exceptions).

D-std.conv.toInt: 277 ms elapsed (mean).
vs
D-std.c.stdlib.atoi: 348 ms elapsed (mean).

And it's really the fastest implementation among the standard libraries of the different languages. Here is the list of results for different languages and implementations (all optimization options were turned to the maximum):

1) DMD 0.173 (toInt) - 0.277 sec
2) MinGW GCC 3.4.2 (atoi) - 0.345 sec
3) MS VC++ 8.0 (atoi) - 0.645 sec
5) Java on HotSpot 1.5.0_08 (Integer.decode) - 1.796 sec (-server)
6) Java on JRockit 26.4.0 (Integer.decode) - 1.969 sec (-server, that's the mean for 5 runs, first run (when Jitting is performed) is 2.905 sec)
8) CPython 2.4.2 + Psyco 1.5 (int()) - 5.406 sec
9) IronPython 1.0 on Mono 1.1.18 (int()) - 10.625 sec
10) IronPython 1.0 on .NET 2.0 (int()) - 10.685 sec
11) CPython 2.4.2 (int()) - 11.218 sec
12) MinGW GCC 3.4.2 (boost 1.33.1::lexical_cast<int>) - 21.305 sec
13) MS VC++ 8.0 (boost 1.33.1::lexical_cast<int>) - 51.700 sec (Yes, it's hard to believe but check yourself!)

I actually cannot believe it, but D's std.conv.toInt is almost 100x faster than the boost version!

-- AKhropov
Nov 11 2006
Andrey Khropov wrote:
> Thanks for the advice. As a matter of fact it is faster than C's atoi!
> [...]
> I actually cannot believe it, but D's std.conv.toInt is almost 100x faster than boost version!

Holy moly! Now that's dedication! You must really have a lot of strings you need to convert to integers!

--bb
Nov 11 2006
Andrey Khropov wrote:
> 13) MS VC++ 8.0 (boost 1.33.1::lexical_cast<int>) - 51.700 sec (Yes, it's hard to believe but check yourself!)

<neo>Whoa.</neo>

You'd think that with all that templating going on it would simply unroll into a *faster* routine, not something that runs *two whole orders of magnitude* slower.

BTW, you may have just sounded the call to compile some performance comparisons of D Templates and portions of Boost.
Nov 11 2006
pragma wrote:
> <neo>Whoa.</neo> You'd think that with all that templating going on it would simply unroll into a *faster* routine, not something that runs *two whole orders of magnitude* slower.

Not if you look at what lexical_cast does:

* Creates a new std::stringstream object.
* Sets a bunch of properties on the stringstream to ensure data is processed correctly.
* Passes the data to the stringstream via operator<<, which will involve dynamic memory allocation (DMA) if the data as a string occupies more than 16 chars (the small string optimization catches smaller cases).
* Pulls the data back out again via operator>>.
* Checks the stringstream to ensure that no errors occurred and that no data remains in the stream, which includes processing any trailing whitespace, etc.
* Returns the new value.

It's clean and works well, but is hardly fast :-)

Sean
Nov 11 2006
Andrey Khropov wrote:
> Why is it so? What could be done?

The implementation of std.string.atoi is:

    long atoi(char[] s)
    {
        return std.c.stdlib.atoi(toStringz(s));
    }

In other words, it's the allocation/copy done by toStringz.
Nov 11 2006
I think std.string.atoi should be deprecated and removed. It's a stupid name and is already supported by std.conv.
Nov 11 2006
Chris Miller wrote:
> I think std.string.atoi should be deprecated and removed. It's a stupid name and is already supported by std.conv.

Absolutely agree. When I looked at the std.string docs I found atoi, but I saw nothing there indicating that toInt exists. Besides, atoi doesn't handle errors very well: it simply returns 0.

-- AKhropov
Nov 11 2006
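The error-handling difference raised in the last post can be illustrated with a small sketch. As an assumption, it targets present-day Phobos (core.stdc.stdlib.atoi and std.conv.to) rather than the 2006 modules named in the thread, and isValidInt is a hypothetical helper, not a library function.

```d
import core.stdc.stdlib : atoi;
import std.conv : to, ConvException;
import std.exception : assertThrown;

// Hypothetical helper: true if the whole text converts to an int.
bool isValidInt(const(char)[] s)
{
    try
    {
        cast(void) to!int(s);
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

void main()
{
    // C's atoi has no error channel: bad input and a real zero look the same.
    // (D string literals are NUL-terminated, so they can be passed directly.)
    assert(atoi("abc") == 0);
    assert(atoi("0") == 0);

    // The std.conv route reports bad input by throwing instead.
    assert(to!int("0") == 0);
    assertThrown!ConvException(to!int("abc"));

    assert(isValidInt("123"));
    assert(!isValidInt("abc"));
}
```

With the exception-based route, a caller can tell "the input was 0" apart from "the input was not a number", which atoi's return value alone cannot express.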