digitalmars.D - [Performance] Why D's std.string.atoi is 4x slower than std.c.stdlib.atoi?

reply "Andrey Khropov" <andrey.khropov gmail.com> writes:
Consider the following code:
-------------------------------------------------
import std.stdio, std.string, std.c.stdlib, std.perf;

int DStdLib(char[] cs)
{
    int res = 0;    
    for(int i = 0; i < 10000000; i++)
      res += std.string.atoi(cs);
         
    return res;
}

int CStdLib(char* cs)
{
    int res = 0;    
   
    for(int i = 0; i < 10000000; i++)
      res += std.c.stdlib.atoi(cs);    
      
    return res;
}

void main()
{
  auto t = new HighPerformanceCounter();
  
  int nIter = 5;
  
  int meanTime = 0;
  
  // nIter+1 runs: the first run is a warm-up and is excluded from the mean below
  for(int it = 0; it <= nIter; ++it) {
    t.start();
    
    int res = DStdLib("123");
    
    t.stop();
    
    writefln("D-StdLib: res = ", res, ", ", t.milliseconds() ," ms elapsed ");
    
    if( it )   // skip the warm-up run (it == 0) when accumulating the mean
      meanTime += t.milliseconds();
  }
    
  writefln("D-StdLib:" , meanTime/nIter ," ms elapsed (mean).");
  
  meanTime = 0;
  
  for(int it = 0; it <= nIter; ++it) {
    t.start();
    
    int res = CStdLib("123");
    
    t.stop();
    
    writefln("C-StdLib: res = ", res, ", ", t.milliseconds() ," ms elapsed ");
    
    if( it )
      meanTime += t.milliseconds();
  }
    
  writefln("C-StdLib:" , meanTime/nIter ," ms elapsed (mean).");
}

-------------------------------------------------
On my machine (P-M 1.7 Dothan) the mean times are:

D-StdLib:1695 ms elapsed (mean).
C-StdLib:374 ms elapsed (mean).

Why is it so? What could be done?

-- 
AKhropov
Nov 11 2006
next sibling parent reply Burton Radons <burton-radons smocky.com> writes:
Andrey Khropov wrote:
 -------------------------------------------------
 On my machine (P-M 1.7 Dothan) the mean times are:
 
 D-StdLib:1695 ms elapsed (mean).
 C-StdLib:374 ms elapsed (mean).
 
 Why is it so? What could be done?
It's because std.string.atoi is implemented simply with:

    return std.c.stdlib.atoi(toStringz(s));

The fault is toStringz - originally that operation tried to tell whether the 
string was NUL-terminated, but now it just allocates a copy.

Use std.conv.toInt instead, although from the looks of the implementation, that 
will be slightly slower as well.
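
For illustration, the copy that toStringz has to make on every call amounts to 
something like this (a simplified sketch of the idea, not the actual Phobos 
source):

-------------------------------------------------
// sketch: turn a D string into a NUL-terminated C string
// by allocating a fresh buffer and appending '\0'
char* toStringzSketch(char[] s)
{
    char[] copy = new char[s.length + 1];
    copy[0 .. s.length] = s;     // copy the payload
    copy[s.length] = '\0';       // C-style terminator
    return copy.ptr;
}
-------------------------------------------------

So every std.string.atoi call in a loop like yours pays for a small GC 
allocation before the C atoi even runs.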
Nov 11 2006
parent reply "Andrey Khropov" <andkhropov_nosp m_mtu-net.ru> writes:
Burton Radons wrote:

 Use std.conv.toInt instead, although from the looks of the implementation,
 that will be slightly slower as well.
Thanks for the advice. As a matter of fact it is faster than C's atoi, and it 
handles errors better (through exceptions):

D-std.conv.toInt:      277 ms elapsed (mean).
vs
D-std.c.stdlib.atoi:   348 ms elapsed (mean).

And it really is the fastest implementation among the standard libraries of the 
different languages. Here is the list of results for different languages and 
implementations (all optimization options were turned up to the maximum):

 1) DMD 0.173 (toInt)                                  - 0.277 sec
 2) MinGW GCC 3.4.2 (atoi)                             - 0.345 sec
 3) MS VC++ 8.0 (atoi)                                 - 0.645 sec
 5) Java on HotSpot 1.5.0_08 (Integer.decode)          - 1.796 sec (-server)
 6) Java on JRockit 26.4.0 (Integer.decode)            - 1.969 sec (-server;
    that's the mean for 5 runs, the first run (when JITting is performed)
    is 2.905 sec)
 8) CPython 2.4.2 + Psyco 1.5 (int())                  - 5.406 sec
 9) IronPython 1.0 on Mono 1.1.18 (int())              - 10.625 sec
10) IronPython 1.0 on .NET 2.0 (int())                 - 10.685 sec
11) CPython 2.4.2 (int())                              - 11.218 sec
12) MinGW GCC 3.4.2 (boost 1.33.1::lexical_cast<int>)  - 21.305 sec
13) MS VC++ 8.0 (boost 1.33.1::lexical_cast<int>)      - 51.700 sec
    (Yes, it's hard to believe, but check yourself!)

I actually cannot believe it, but D's std.conv.toInt is almost 100x faster than 
the boost version!

-- 
AKhropov
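
For anyone repeating the comparison, the error-handling difference looks roughly 
like this (a small sketch; the bad input is caught via the base Exception class 
rather than naming std.conv's specific exception type):

-------------------------------------------------
import std.conv, std.string, std.stdio;

void main()
{
    // std.string.atoi silently yields 0 on bad input...
    writefln(std.string.atoi("oops"));       // prints 0

    // ...while std.conv.toInt reports the problem by throwing
    try
    {
        writefln(std.conv.toInt("oops"));
    }
    catch (Exception e)
    {
        writefln("not a number: ", e.toString());
    }
}
-------------------------------------------------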
Nov 11 2006
next sibling parent Bill Baxter <wbaxter gmail.com> writes:
Andrey Khropov wrote:
 Burton Radons wrote:
 
 
Use std.conv.toInt instead, although from the looks of the implementation,
that will be slightly slower as well.
 Thanks for the advice. As a matter of fact it is faster than C's atoi, and it 
 handles errors better (through exceptions):
 
 D-std.conv.toInt:      277 ms elapsed (mean).
 vs
 D-std.c.stdlib.atoi:   348 ms elapsed (mean).
 
 [...benchmark results for D, C++, Java and Python snipped...]
 
 I actually cannot believe it, but D's std.conv.toInt is almost 100x faster than 
 the boost version!
Holy moly!  Now that's dedication!  You must really have a lot of strings you 
need to convert to integers!

--bb
Nov 11 2006
prev sibling parent reply pragma <ericanderton yahoo.com> writes:
Andrey Khropov wrote:
 Burton Radons wrote:
 13) MS VC++ 8.0 (boost 1.33.1::lexical_cast<int>)   - 51.700 sec
     (Yes, it's hard to believe but check yourself!)
<neo>Whoa.</neo>

You'd think that with all that templating going on, it would simply unroll into 
a *faster* routine, not something that runs *two whole orders of magnitude* 
slower.

BTW, you may have just sounded the call to compile some performance comparisons 
of D Templates and portions of Boost.
Nov 11 2006
parent Sean Kelly <sean f4.ca> writes:
pragma wrote:
 Andrey Khropov wrote:
 Burton Radons wrote:
 13) MS VC++ 8.0 (boost 1.33.1::lexical_cast<int>)   - 51.700 sec
     (Yes, it's hard to believe but check yourself!)
 <neo>Whoa.</neo>
 
 You'd think that with all that templating going on, it would simply unroll into 
 a *faster* routine, not something that runs *two whole orders of magnitude* 
 slower.
Not if you look at what lexical_cast does:

* Creates a new std::stringstream object.
* Sets a bunch of properties on the stringstream to ensure data is processed 
  correctly.
* Passes the data to the stringstream via operator<<, which will involve a 
  dynamic memory allocation if the data as a string occupies more than 16 chars 
  (the small string optimization catches smaller cases).
* Pulls the data back out again via operator>>.
* Checks the stringstream to ensure that no errors occurred and that no data 
  remains in the stream, which includes processing any trailing whitespace, etc.
* Returns the new value.

It's clean and works well, but is hardly fast :-)

Sean
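
By contrast, a direct converter in the spirit of std.conv.toInt only has to walk 
the digits once; the core of such a routine is just a few operations per 
character. A simplified sketch (not the actual Phobos source, and without the 
overflow checking the real toInt performs):

-------------------------------------------------
// sketch: direct string-to-int conversion - no streams, no allocation
int parseIntSketch(char[] s)
{
    int sign = 1;
    int i = 0;
    int value = 0;

    // optional leading sign
    if (s.length && (s[0] == '-' || s[0] == '+'))
    {
        if (s[0] == '-')
            sign = -1;
        i = 1;
    }

    for (; i < s.length; i++)
    {
        if (s[i] < '0' || s[i] > '9')
            throw new Exception("not a number: " ~ s);
        value = value * 10 + (s[i] - '0');   // accumulate one decimal digit
    }

    return sign * value;
}
-------------------------------------------------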
Nov 11 2006
prev sibling next sibling parent Walter Bright <newshound digitalmars.com> writes:
Andrey Khropov wrote:
 Why is it so? What could be done?
The implementation of std.string.atoi is:

    long atoi(char[] s)
    {
        return std.c.stdlib.atoi(toStringz(s));
    }

In other words, it's the allocation/copy done by toStringz.
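
If the same D string needs to be converted many times, the toStringz copy can 
at least be hoisted out of the hot loop. A sketch along the lines of the 
benchmark above (the helper name is made up for illustration):

-------------------------------------------------
import std.string, std.c.stdlib;

// sketch: pay for the toStringz copy once, then reuse the C string
int sumManyTimes(char[] cs, int iterations)
{
    char* zs = toStringz(cs);           // one allocation up front

    int res = 0;
    for (int i = 0; i < iterations; i++)
        res += std.c.stdlib.atoi(zs);   // no per-call copy

    return res;
}
-------------------------------------------------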
Nov 11 2006
prev sibling parent reply "Chris Miller" <chris dprogramming.com> writes:
I think std.string.atoi should be deprecated and removed. It's a stupid  
name and is already supported by std.conv.
Nov 11 2006
parent "Andrey Khropov" <andrey.khropov gmail.com> writes:
Chris Miller wrote:

 I think std.string.atoi should be deprecated and removed. It's a stupid  name
 and is already supported by std.conv.
Absolutely agree. When I looked at the std.string docs I found atoi there, but 
saw no mention that toInt exists. And besides that, it doesn't handle errors 
very well: it simply returns 0.

-- 
AKhropov
Nov 11 2006