digitalmars.D.announce - AA dsource project
- dsimcha (10/10) Nov 04 2009 A few weeks ago I mentioned that I was going to create some kind of foru...
- Moritz Warning (5/16) Nov 04 2009 I've committed a port of Pythons dictionary implementation.
- dsimcha (21/37) Nov 04 2009 Nice job. I ported your code to D2, (quick and dirty, doesn't work righ...
- Walter Bright (5/24) Nov 04 2009 Perhaps some combination would work. I'd be reluctant to go with a 3x
- dsimcha (6/25) Nov 04 2009 I've also just added my RandAA candidate, which uses parallel arrays and...
- Moritz Warning (16/64) Nov 05 2009 I've committed an initial test bench (test.d).
- Moritz Warning (9/63) Nov 05 2009 [..]
- Saaa (26/39) Nov 05 2009 Slightly related, I recently looked at the typecons code and was a bit
- bearophile (6/8) Nov 05 2009 You may try this too (C code, the "khash.h" file, plus few demos), the o...
- dsimcha (6/14) Nov 05 2009 For the purposes of this project, I only accept D code. If you want to ...
A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.
Nov 04 2009
On Wed, 04 Nov 2009 20:53:20 +0000, dsimcha wrote:A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.I've committed a port of Pythons dictionary implementation. Here is my announcement along with some benchmark figures: http://digitalmars.com/d/archives/digitalmars/D/ Python_dictionary_inspired_hash_table_implementation_73176.html
Nov 04 2009
== Quote from Moritz Warning (moritzwarning web.de)'s articleOn Wed, 04 Nov 2009 20:53:20 +0000, dsimcha wrote:Nice job. I ported your code to D2, (quick and dirty, doesn't work right with const and all, just enough to be able to test it), and uploaded that version for D2 users. I had in the back of my mind that this was to be D2-centric, but I guess D1 could easily get better AAs, too, since it wouldn't affect the spec. I couldn't find your benchmarks, but here are my results: Builtin: Start Benchmark. 1 x 10000000 iterations inserts: 394601/s (25.342000s) lookups: 9813542/s (1.019000s) PyDict: Start Benchmark. 1 x 10000000 iterations inserts: 2044153/s (4.892000s) lookups: 3401360/s (2.940000s) This pretty much confirms my suspicion that the builtin AAs are optimized for lookup speed at the expense of insertion speed/GC overhead to an absurd degree. Also, regarding the license for Python's dictionary, apparently the Python License doesn't allow removing of copyright notices from stuff distributed in binary form. This might prove problematic if this were to be integrated into druntime.A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.I've committed a port of Pythons dictionary implementation. Here is my announcement along with some benchmark figures: http://digitalmars.com/d/archives/digitalmars/D/ Python_dictionary_inspired_hash_table_implementation_73176.html
Nov 04 2009
dsimcha wrote:I couldn't find your benchmarks, but here are my results: Builtin: Start Benchmark. 1 x 10000000 iterations inserts: 394601/s (25.342000s) lookups: 9813542/s (1.019000s) PyDict: Start Benchmark. 1 x 10000000 iterations inserts: 2044153/s (4.892000s) lookups: 3401360/s (2.940000s) This pretty much confirms my suspicion that the builtin AAs are optimized for lookup speed at the expense of insertion speed/GC overhead to an absurd degree.Perhaps some combination would work. I'd be reluctant to go with a 3x slower lookup speed.Also, regarding the license for Python's dictionary, apparently the Python License doesn't allow removing of copyright notices from stuff distributed in binary form. This might prove problematic if this were to be integrated into druntime.That's kind of an insurmountable problem. The core stuff needs to be free of licensing problems.
Nov 04 2009
== Quote from Walter Bright (newshound1 digitalmars.com)'s articledsimcha wrote:I've also just added my RandAA candidate, which uses parallel arrays and linear congruential randomized probing. It gives insertion speed comparable to the PyDict, with lookup speed only about 30-40% slower instead of 150-200%. For more benchmark details, see the wiki http://dsource.org/projects/aa/wiki/WikiStart, which I've just updated.I couldn't find your benchmarks, but here are my results: Builtin: Start Benchmark. 1 x 10000000 iterations inserts: 394601/s (25.342000s) lookups: 9813542/s (1.019000s) PyDict: Start Benchmark. 1 x 10000000 iterations inserts: 2044153/s (4.892000s) lookups: 3401360/s (2.940000s) This pretty much confirms my suspicion that the builtin AAs are optimized for lookup speed at the expense of insertion speed/GC overhead to an absurd degree.Perhaps some combination would work. I'd be reluctant to go with a 3x slower lookup speed.
Nov 04 2009
On Thu, 05 Nov 2009 03:52:22 +0000, dsimcha wrote:== Quote from Moritz Warning (moritzwarning web.de)'s articleI've committed an initial test bench (test.d). dmd2 test.d pydict/pyDictD2.d randAA/RandAA.d -O -release Linear test, GC enabled, 10 x 5_000_000 inserts/lookups: RandAA(K,V,bool storeHash = shouldStoreHash!(K)): 10 x 1000000 iterations inserts: 677460/s (1.476100s) lookups: 3549875/s (0.281700s) BuildIn: 10 x 1000000 iterations inserts: 218627/s (4.574000s) lookups: 4766444/s (0.209800s) PyDict(K,V): 10 x 1000000 iterations inserts: 1621271/s (0.616800s) lookups: 3834355/s (0.260800s)On Wed, 04 Nov 2009 20:53:20 +0000, dsimcha wrote:Nice job. I ported your code to D2, (quick and dirty, doesn't work right with const and all, just enough to be able to test it), and uploaded that version for D2 users. I had in the back of my mind that this was to be D2-centric, but I guess D1 could easily get better AAs, too, since it wouldn't affect the spec. I couldn't find your benchmarks, but here are my results: Builtin: Start Benchmark. 1 x 10000000 iterations inserts: 394601/s (25.342000s) lookups: 9813542/s (1.019000s) PyDict: Start Benchmark. 1 x 10000000 iterations inserts: 2044153/s (4.892000s) lookups: 3401360/s (2.940000s) This pretty much confirms my suspicion that the builtin AAs are optimized for lookup speed at the expense of insertion speed/GC overhead to an absurd degree. Also, regarding the license for Python's dictionary, apparently the Python License doesn't allow removing of copyright notices from stuff distributed in binary form. This might prove problematic if this were to be integrated into druntime.A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.I've committed a port of Pythons dictionary implementation. Here is my announcement along with some benchmark figures: http://digitalmars.com/d/archives/digitalmars/D/ Python_dictionary_inspired_hash_table_implementation_73176.html
Nov 05 2009
On Thu, 05 Nov 2009 15:44:49 +0000, Moritz Warning wrote:On Thu, 05 Nov 2009 03:52:22 +0000, dsimcha wrote:[..] Could you check the test bench and see how to integrate the buildin for nicely? (isn't there some alias this that can be used?) I wasn't able to reproduce your results. Could you retry with the test.d please? About the license issue. I have spoken to some python folks about it, but I need to search my logs first.== Quote from Moritz Warning (moritzwarning web.de)'s articleI've committed an initial test bench (test.d). dmd2 test.d pydict/pyDictD2.d randAA/RandAA.d -O -releaseOn Wed, 04 Nov 2009 20:53:20 +0000, dsimcha wrote:Nice job. I ported your code to D2, (quick and dirty, doesn't work right with const and all, just enough to be able to test it), and uploaded that version for D2 users. I had in the back of my mind that this was to be D2-centric, but I guess D1 could easily get better AAs, too, since it wouldn't affect the spec. I couldn't find your benchmarks, but here are my results: Builtin: Start Benchmark. 1 x 10000000 iterations inserts: 394601/s (25.342000s) lookups: 9813542/s (1.019000s) PyDict: Start Benchmark. 1 x 10000000 iterations inserts: 2044153/s (4.892000s) lookups: 3401360/s (2.940000s) This pretty much confirms my suspicion that the builtin AAs are optimized for lookup speed at the expense of insertion speed/GC overhead to an absurd degree. Also, regarding the license for Python's dictionary, apparently the Python License doesn't allow removing of copyright notices from stuff distributed in binary form. This might prove problematic if this were to be integrated into druntime.A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.I've committed a port of Pythons dictionary implementation. Here is my announcement along with some benchmark figures: http://digitalmars.com/d/archives/digitalmars/D/ Python_dictionary_inspired_hash_table_implementation_73176.html
Nov 05 2009
"dsimcha" <dsimcha yahoo.com> wrote in message news:hcspk0$pk5$1 digitalmars.com...A few weeks ago I mentioned that I was going to create some kind of forum for people to post candidate associative array implementations to replace the current, much-despised implementation as the new "builtin" AA for D. It's now up at http://dsource.org/projects/aa/wiki/WikiStart . SVN write access should work for all registered dsource users. If you have an associative array implementation that you feel is a significant improvement over the current builtin and are able/willing to license it under the Boost license, please post it for comment. Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.Slightly related, I recently looked at the typecons code and was a bit surprised that the string to enum works by checking the string against all possible enum elements. Couldn't a compile time perfect hashed aa fix this? :) private template enumParserImpl(string name, bool first, T...) { static if (first) { enum string enumParserImpl = "bool enumFromString(string s, ref " ~name~" v) {\n" ~enumParserImpl!(name, false, T) ~"return false;\n}\n"; } else { static if (T.length) enum string enumParserImpl = "if (s == `"~T[0]~"`) return (v = "~name~"."~T[0]~"), true;\n" ~enumParserImpl!(name, false, T[1 .. $]); else enum string enumParserImpl = ""; } }
Nov 05 2009
dsimcha:Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.You may try this too (C code, the "khash.h" file, plus few demos), the original code is not mine, with MIT License: http://www.fantascienza.net/leonardo/js/khash.zip I have found it to be quick for some of my usages. Bye, bearophile
Nov 05 2009
== Quote from bearophile (bearophileHUGS lycos.com)'s articledsimcha:code is not mine, with MIT License:Also, if you have a good benchmark for AAs, please create a benchmarks/ directory and submit it.You may try this too (C code, the "khash.h" file, plus few demos), the originalhttp://www.fantascienza.net/leonardo/js/khash.zip I have found it to be quick for some of my usages. Bye, bearophileFor the purposes of this project, I only accept D code. If you want to submit it, port it to D. Also, if you want to submit it, please submit it so we can browse the source code online instead of linking to a zip file that we then have to unzip, etc.
Nov 05 2009