digitalmars.D - Expected performance
- Martin Hess (58/58) Nov 07 2007 I'm considering D for a project that is very CPU intensive so I'm evalua...
- Sascha Katzner (6/8) Nov 07 2007 Perhaps they use different hash functions? If you really want a fast as
- Aarti_pl (8/22) Nov 07 2007 Did you turn on optimizations when compiling D code?
- Michel Fortin (12/17) Nov 07 2007 He's using GDC, so that would rather be:
- Martin Hess (2/26) Nov 07 2007
- bearophile (8/11) Nov 07 2007 From my tests I think D AAs aren't much optimized yet, in the past I hav...
- sambeau (6/10) Nov 07 2007 You might also want to take a look at this:
- Lionello Lunesu (14/86) Nov 08 2007 A toHash for a GUID could basically just return the first 4 bytes of the...
I'm considering D for a project that is very CPU intensive, so I'm evaluating D's performance. As a first test I started with associative arrays and ended up with a result that surprised me a bit, so I would like to get some input on the test's construction.

The test builds a dictionary of 50k byte arrays that map to integers:

int[byte[]] dictionary;

Next it looks up elements half a billion times. Assuming 1.5 instructions per cycle, I'm seeing about 387 instructions per lookup. I tried it in Java and was seeing about 309 instructions per lookup.

Naively I assumed I would get better performance than Java. Does anyone have any thoughts on this?

Test details:
Mac OS X 10.5
GDC 4.0.1
Java 1.5.0_13

================

import std.stdio;
import std.random;

char[] letters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

byte[] makeGUID(int size)
{
    byte[] a = new byte[size];
    for (int i = 0; i < size; i++)
    {
        a[i] = letters[rand() % letters.length];
    }
    return a;
}

const int tableSize = 50000;

version(Unix)
{
    private import std.c.unix.unix;

    int main (char[][] args)
    {
        byte[][tableSize] guids;
        int[byte[]] dictionary;

        for (int i = 0; i < tableSize; i++)
        {
            byte[] guid = makeGUID(26);
            guids[i] = guid;
            dictionary[guid] = i;
        }
        dictionary.rehash;

        ulong start = time(null);
        for (int i = 0; i < (500 * 1000 * 1000); i++)
        {
            byte[] guid = guids[i % guids.length];
            int j = dictionary[guid];
        }
        ulong stop = time(null);

        writefln(stop - start);
        return 0;
    }
}
Nov 07 2007
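For reference, a figure like the 387 instructions per lookup above comes out of a calculation along these lines; the clock frequency and elapsed time are placeholders, since the post only states the assumed 1.5 instructions per cycle and the 500 million lookups:

// Rough sketch (values hypothetical): elapsed time times clock rate gives cycles,
// times the assumed IPC gives instructions, divided over all lookups.
double instructionsPerLookup(double elapsedSeconds, double clockHz,
                             double instructionsPerCycle, double lookups)
{
    return elapsedSeconds * clockHz * instructionsPerCycle / lookups;
}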
Martin Hess wrote:
> Naively I assumed I would get better performance than Java. Does anyone
> have any thoughts on this?

Perhaps they use different hash functions? If you really want an as-fast-as-possible implementation, I would suggest you implement your own hash table with a customized hash function.

LLAP,
Sascha Katzner
Nov 07 2007
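A minimal sketch of the kind of customized hash function suggested above, using the standard 32-bit FNV-1a over the byte[] key (untested; a full custom hash table would also need its own bucket array and collision handling):

// FNV-1a (32-bit) over a byte[] key; the constants are the standard
// FNV offset basis and prime.
uint fnv1a(byte[] data)
{
    uint h = 2166136261u;      // offset basis
    foreach (b; data)
    {
        h ^= cast(ubyte) b;    // mix in one byte
        h *= 16777619u;        // FNV prime
    }
    return h;
}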
Martin Hess wrote:
> I'm considering D for a project that is very CPU intensive, so I'm
> evaluating D's performance. As a first test I started with associative
> arrays and ended up with a result that surprised me a bit, so I would
> like to get some input on the test's construction.
>
> The test builds a dictionary of 50k byte arrays that map to integers:
>
> int[byte[]] dictionary;
>
> Next it looks up elements half a billion times. Assuming 1.5 instructions
> per cycle, I'm seeing about 387 instructions per lookup. I tried it in
> Java and was seeing about 309 instructions per lookup.
>
> Naively I assumed I would get better performance than Java. Does anyone
> have any thoughts on this?

Did you turn on optimizations when compiling D code?

-inline    do function inlining
-O         optimize
-release   compile release version

BR
Marcin Kuszczak
(aarti_pl - www.zapytajmnie.com)
Nov 07 2007
On 2007-11-07 05:06:55 -0500, Aarti_pl <aarti interia.pl> said:

> Did you turn on optimizations when compiling D code?
>
> -inline    do function inlining
> -O         optimize
> -release   compile release version

He's using GDC, so that would rather be:

-finline-functions
one of -O3 (fastest) or -Os (fastest, smaller)
-frelease

You could also optimize instruction scheduling for a specific processor; for me that would be:

-mtune=G4

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 07 2007
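Putting this subthread's flags together, the build command would look roughly like this (the source file name is a hypothetical placeholder):

gdc -O3 -finline-functions -frelease -mtune=G4 aatest.d -o aatest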
I had -O3 already, but I added the other two optimizations and didn't see any time difference.

Michel Fortin Wrote:
> On 2007-11-07 05:06:55 -0500, Aarti_pl <aarti interia.pl> said:
>
>> Did you turn on optimizations when compiling D code?
>>
>> -inline    do function inlining
>> -O         optimize
>> -release   compile release version
>
> He's using GDC, so that would rather be:
>
> -finline-functions
> one of -O3 (fastest) or -Os (fastest, smaller)
> -frelease
>
> You could also optimize instruction scheduling for a specific processor;
> for me that would be:
>
> -mtune=G4
>
> --
> Michel Fortin
> michel.fortin michelf.com
> http://michelf.com/
Nov 07 2007
Martin Hess:
> The test builds a dictionary of 50k byte arrays that map to integers:
>
> int[byte[]] dictionary;
>
> Next it looks up elements half a billion times.

From my tests I think D AAs aren't much optimized yet. In the past I have shown simple Python code that uses Python dicts that runs faster than D code (you can find it in the group, or I can try to find it again), and I have done similar comparisons before; Python dicts have a good hash function and good hash collision management. And if you put tons of keys like that, you also need a good GC (or memory management).

With time D will probably improve; D developers can take a look at how Python implements dicts, to copy them all (Perl dicts may be a fit too).

In your situation you have both a lot of memory to manage and a large data block to compute a hash on, so you need both quick memory management and a fast hash function. For the memory management I can't help you much, but for the hash function you may take a look at the assembly version of the SuperFastHash:

http://www.azillionmonkeys.com/qed/hash.html
http://www.team5150.com/~andrew/blog/2007/03/breaking_superfasthash.html
http://www.team5150.com/~andrew/blog/2007/03/when_bad_hashing_means_good_caching.html

Bye,
bearophile
Nov 07 2007
bearophile Wrote:
> For the memory management I can't help you much, but for the hash function
> you may take a look at the assembly version of the SuperFastHash:
>
> http://www.azillionmonkeys.com/qed/hash.html
> http://www.team5150.com/~andrew/blog/2007/03/breaking_superfasthash.html
> http://www.team5150.com/~andrew/blog/2007/03/when_bad_hashing_means_good_caching.html

You might also want to take a look at this:

http://judy.sourceforge.net/

Cheers,
s
Nov 07 2007
Martin Hess wrote:
> I'm considering D for a project that is very CPU intensive, so I'm
> evaluating D's performance. As a first test I started with associative
> arrays and ended up with a result that surprised me a bit, so I would
> like to get some input on the test's construction.
>
> The test builds a dictionary of 50k byte arrays that map to integers:
>
> int[byte[]] dictionary;
>
> Next it looks up elements half a billion times. Assuming 1.5 instructions
> per cycle, I'm seeing about 387 instructions per lookup. I tried it in
> Java and was seeing about 309 instructions per lookup.
>
> Naively I assumed I would get better performance than Java. Does anyone
> have any thoughts on this?
>
> Test details:
> Mac OS X 10.5
> GDC 4.0.1
> Java 1.5.0_13
>
> ================
>
> import std.stdio;
> import std.random;
>
> char[] letters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
>
> byte[] makeGUID(int size)
> {
>     byte[] a = new byte[size];
>     for (int i = 0; i < size; i++)
>     {
>         a[i] = letters[rand() % letters.length];
>     }
>     return a;
> }
>
> const int tableSize = 50000;
>
> version(Unix)
> {
>     private import std.c.unix.unix;
>
>     int main (char[][] args)
>     {
>         byte[][tableSize] guids;
>         int[byte[]] dictionary;
>
>         for (int i = 0; i < tableSize; i++)
>         {
>             byte[] guid = makeGUID(26);
>             guids[i] = guid;
>             dictionary[guid] = i;
>         }
>         dictionary.rehash;
>
>         ulong start = time(null);
>         for (int i = 0; i < (500 * 1000 * 1000); i++)
>         {
>             byte[] guid = guids[i % guids.length];
>             int j = dictionary[guid];
>         }
>         ulong stop = time(null);
>
>         writefln(stop - start);
>         return 0;
>     }
> }

A toHash for a GUID could basically just return the first 4 bytes of the GUID as the hash:

struct Guid
{
    byte[] guid;  // or fixed size?

    uint toHash()
    {
        return *cast(uint*) guid.ptr;
    }
}

int[Guid] dictionary;

I've boosted AA filling/reading in the past by temporarily disabling the garbage collector (std.gc.disable()/enable()).

Anyway, if performance doesn't turn you to D, surely this will:

L.
Nov 08 2007
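A minimal sketch of the GC trick mentioned above, applied to the fill loop from the original post (untested; it relies only on the std.gc.disable()/enable() calls named in the reply):

import std.gc;

// Build the dictionary with collection temporarily switched off, then rehash.
int[byte[]] fillDictionary(byte[][] guids)
{
    int[byte[]] dictionary;
    std.gc.disable();    // no collections while many AA nodes are being allocated
    foreach (i, guid; guids)
        dictionary[guid] = cast(int) i;
    std.gc.enable();     // re-enable the collector once the table is built
    dictionary.rehash;
    return dictionary;
}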