D - assoc array performance
- Ben Hinkle (21/21) Apr 17 2004 well, based on Serge's eariler post I've been playing around with
- Kris (9/30) Apr 17 2004 Thanks Ben!
- Ben Hinkle (13/14) Apr 17 2004 either
- Kris (4/18) Apr 17 2004 Cheers! Really appreciate that.
- Scott Egan (8/29) Apr 18 2004 Isn't this dangerous? You're relying on the semantics of how an Assoc a...
- Kris (5/7) Apr 18 2004 array
well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: int[char[]] X; // an assoc array of ints indexed by strings void*[] Y; Y.length = 1000000; // allocate array of void* X = cast(int[char[]])Y; // voila, a pre-allocated assoc array Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) -Ben
Apr 17 2004
Thanks Ben! I've been tearing my hair out trying to figure a way to preallocate AA's. Was just about to post to the "What's needed for 1.0" thread regarding this. Would be nice/appropriate if there was a cleaner 'syntax' that the example below <g> BTW, did you find a quick way to delete all entries? - Kris "Ben Hinkle" <bhinkle4 juno.com> wrote in message news:c5sgbp$2p8j$1 digitaldaemon.com...well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: int[char[]] X; // an assoc array of ints indexed by strings void*[] Y; Y.length = 1000000; // allocate array of void* X = cast(int[char[]])Y; // voila, a pre-allocated assoc array Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) -Ben
Apr 17 2004
BTW, did you find a quick way to delete all entries?either void*[] Y; Y.length = 0; // not really needed since Y has 0 length by default X = cast(int[char[]])Y; to dump the node pointer array entirely. If you want to preserve the array (for instance you had preallocated the pointer array) and just clear out all the pointers try void*[] Y = cast(void*[])X; Y[] = null; // set all the node pointers to null X = cast(int[char[]])Y; Isolating code like that would be nice since it depends on implementation details of assoc arrays. -Ben
Apr 17 2004
Cheers! Really appreciate that. Let's hope Walter will add some properties for doing this ... "Ben Hinkle" <bhinkle4 juno.com> wrote in message news:c5sr12$7q4$1 digitaldaemon.com...BTW, did you find a quick way to delete all entries?either void*[] Y; Y.length = 0; // not really needed since Y has 0 length by default X = cast(int[char[]])Y; to dump the node pointer array entirely. If you want to preserve the array (for instance you had preallocated the pointer array) and just clear out all the pointers try void*[] Y = cast(void*[])X; Y[] = null; // set all the node pointers to null X = cast(int[char[]])Y; Isolating code like that would be nice since it depends on implementation details of assoc arrays. -Ben
Apr 17 2004
Isn't this dangerous? You're relying on the semantics of how an Assoc array is implemented by the compiler. I would have thought the X.length = 100000 would work, but I'm obviously missing a subtle point. Regards, Scott "Ben Hinkle" <bhinkle4 juno.com> wrote in message news:c5sgbp$2p8j$1 digitaldaemon.com...well, based on Serge's eariler post I've been playing around with different assoc array implementations and believe it or not the current one is the only one that behaves well on both Windows and Linux under various tests that vary the table size and GC interactions. The only trick that significantly improved performance was preallocating the node pointer array. I hadn't realized this before but you can already do this as follows: int[char[]] X; // an assoc array of ints indexed by strings void*[] Y; Y.length = 1000000; // allocate array of void* X = cast(int[char[]])Y; // voila, a pre-allocated assoc array Another performance boost came by using stdup with malloc to duplicate strings instead of .dup but if you do that you have to manage the dup'ed strings so they don't leak. I tried all kinds of combinations of pre-allocating the data nodes as well as the node pointers and messing with the hashing function but they all had drawbacks when mixed with either the GC or the default pointer array (the default assoc array allocates 10 pointers no matter how many entries you eventually put in). Go figure. Either Walter got lucky or he actually tested this stuff out ;-) -Ben
Apr 18 2004
"Scott Egan" <scotte tpg.com.aux> wrote in message news:c5tels$175v$1 digitaldaemon.com...Isn't this dangerous? You're relying on the semantics of how an Assocarrayis implemented by the compiler.Sure it is, Scott. But at least this is a workaround, until Walter is "encouraged" by enough of us to add such properties to AAs <g>
Apr 18 2004