digitalmars.D.learn - Cost of assoc array?
- Chris (8/8) May 14 2014 I have code that uses the following:
- bearophile (8/10) May 14 2014 An associative array is a rather more complex data structure, so
- Chris (4/16) May 14 2014 Thanks. Do you mean the difference is negligible in many cases?
- dennis luehring (6/25) May 14 2014 a simple array would be faster because no access is generate for your
- bearophile (6/7) May 14 2014 Yes, but you have to profile the code (or reason about it well,
- John Colvin (7/15) May 14 2014 If you don't need the features of associative arrays, don't use
- Chris (16/37) May 14 2014 The question is, if they are _much_ faster. With this type
- dennis luehring (7/11) May 14 2014 why not using an alias for easier switch between the versions?
- Chris (3/14) May 14 2014 Talking about not seeing the forest for the trees ... I'll try
- John Colvin (14/53) May 14 2014 Yes, they are much faster. Normal array indexing is equivalent to
- Chris (7/21) May 14 2014 It is very old code (dmd 2.052 times). When I set out to write
- bearophile (7/10) May 14 2014 There are various better ways to use a foreach on an array:
- John Colvin (2/12) May 14 2014 His code was for an associative array.
I have code that uses the following: string[][size_t] myArray; 1. myArray = [0:["t", "o", "m"], 1:["s", "m", "i", "th"]]; However, I've found out that I never need an assoc array and a "linear" array would be just fine, as in 2. myArray = [["t", "o", "m"], ["s", "m", "i", "th"]]; Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?
May 14 2014
Chris:Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?An associative array is a rather more complex data structure, so if you don't need it, use something simpler. There is difference in both the amount of memory used and performance (in many cases such difference doesn't matter). In D there are also differences in the way they are initialized from a null or fat null. Bye, bearophile
May 14 2014
On Wednesday, 14 May 2014 at 10:20:51 UTC, bearophile wrote:Chris:Thanks. Do you mean the difference is negligible in many cases? I'm not sure, because changing it would be a breaking change in the old code, meaning I would have to change various methods.Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?An associative array is a rather more complex data structure, so if you don't need it, use something simpler. There is difference in both the amount of memory used and performance (in many cases such difference doesn't matter). In D there are also differences in the way they are initialized from a null or fat null. Bye, bearophile
May 14 2014
Am 14.05.2014 12:33, schrieb Chris:On Wednesday, 14 May 2014 at 10:20:51 UTC, bearophile wrote:a simple array would be faster because no access is generate for your key - just plain access, just don't use assoc arrays if you don't need key based access read more manuals about hashmaps and stuff and how to do benchmarking - helps alot in the futureChris:Thanks. Do you mean the difference is negligible in many cases? I'm not sure, because changing it would be a breaking change in the old code, meaning I would have to change various methods.Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?An associative array is a rather more complex data structure, so if you don't need it, use something simpler. There is difference in both the amount of memory used and performance (in many cases such difference doesn't matter). In D there are also differences in the way they are initialized from a null or fat null. Bye, bearophile
May 14 2014
Chris:Do you mean the difference is negligible in many cases?Yes, but you have to profile the code (or reason about it well, with knowledge of the data structures and their usage patterns) if you want to know what your case is. Bye, bearophile
May 14 2014
On Wednesday, 14 May 2014 at 09:38:10 UTC, Chris wrote:I have code that uses the following: string[][size_t] myArray; 1. myArray = [0:["t", "o", "m"], 1:["s", "m", "i", "th"]]; However, I've found out that I never need an assoc array and a "linear" array would be just fine, as in 2. myArray = [["t", "o", "m"], ["s", "m", "i", "th"]]; Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?If you don't need the features of associative arrays, don't use them. Normal arrays are much simpler, faster and (due to some outstanding problems with associative arrays in D) less bug-prone. Associative arrays, by definition, require a lot more work behind the scenes for both reading and writing.
May 14 2014
On Wednesday, 14 May 2014 at 11:13:10 UTC, John Colvin wrote:On Wednesday, 14 May 2014 at 09:38:10 UTC, Chris wrote:The question is, if they are _much_ faster. With this type (string[][size_t]) I haven't encountered any bugs yet. On the other hand, it introduces some rather stilted logic sometimes, as in foreach (size_t i; 0..myArray.length) { // do something with myArray[i]; } because it's not sorted. This, or I sort it first. Anyway, there's always an overhead associated with associative arrays. I'll have to see how big this breaking change would be, and decide, if it's worth it. Profiling is not really feasible, because for this to work properly, I would have to introduce the change first to be able to compare both. Nothing worse than carefully changing things only to find out, it doesn't really speed up things.I have code that uses the following: string[][size_t] myArray; 1. myArray = [0:["t", "o", "m"], 1:["s", "m", "i", "th"]]; However, I've found out that I never need an assoc array and a "linear" array would be just fine, as in 2. myArray = [["t", "o", "m"], ["s", "m", "i", "th"]]; Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?If you don't need the features of associative arrays, don't use them. Normal arrays are much simpler, faster and (due to some outstanding problems with associative arrays in D) less bug-prone. Associative arrays, by definition, require a lot more work behind the scenes for both reading and writing.
May 14 2014
Am 14.05.2014 15:20, schrieb Chris:Profiling is not really feasible, because for this to work properly, I would have to introduce the change first to be able to compare both. Nothing worse than carefully changing things only to find out, it doesn't really speed up things.why not using an alias for easier switch between the versions? alias string[][size_t] my_array_type or alias string[][] my_array_type an do an search&replace of "string[][size_t]" with "my_array_type" thats it - still too hard :)
May 14 2014
On Wednesday, 14 May 2014 at 13:31:53 UTC, dennis luehring wrote:Am 14.05.2014 15:20, schrieb Chris:Talking about not seeing the forest for the trees ... I'll try that. :)Profiling is not really feasible, because for this to work properly, I would have to introduce the change first to be able to compare both. Nothing worse than carefully changing things only to find out, it doesn't really speed up things.why not using an alias for easier switch between the versions? alias string[][size_t] my_array_type or alias string[][] my_array_type an do an search&replace of "string[][size_t]" with "my_array_type" thats it - still too hard :)
May 14 2014
On Wednesday, 14 May 2014 at 13:20:40 UTC, Chris wrote:On Wednesday, 14 May 2014 at 11:13:10 UTC, John Colvin wrote:Yes, they are much faster. Normal array indexing is equivalent to *(myArray.ptr + index) plus an optional bounds check, whereas associative array indexing is a much, much larger job. Why were you using associative arrays in the first place? Unless your keys are somehow sparse* or of a non-integer type there isn't any reason to. * How I see that constraint in that context: (maxKey - minKey) / nElements > 1 + epsilon where epsilon is the maximum proportional wasted space you could afford in a normal array (emptyElements / usedElements). Bear in mind the memory overhead of associative arrays is itself non-zero. Also, while normal arrays tend to be more cache friendly than associative arrays, this isn't true for very sparse arrays.On Wednesday, 14 May 2014 at 09:38:10 UTC, Chris wrote:The question is, if they are _much_ faster. With this type (string[][size_t]) I haven't encountered any bugs yet. On the other hand, it introduces some rather stilted logic sometimes, as in foreach (size_t i; 0..myArray.length) { // do something with myArray[i]; } because it's not sorted. This, or I sort it first. Anyway, there's always an overhead associated with associative arrays. I'll have to see how big this breaking change would be, and decide, if it's worth it. Profiling is not really feasible, because for this to work properly, I would have to introduce the change first to be able to compare both. Nothing worse than carefully changing things only to find out, it doesn't really speed up things.I have code that uses the following: string[][size_t] myArray; 1. myArray = [0:["t", "o", "m"], 1:["s", "m", "i", "th"]]; However, I've found out that I never need an assoc array and a "linear" array would be just fine, as in 2. myArray = [["t", "o", "m"], ["s", "m", "i", "th"]]; Is there any huge difference as regards performance and memory footprint between the two? Or is 2. basically 1. under the hood?If you don't need the features of associative arrays, don't use them. Normal arrays are much simpler, faster and (due to some outstanding problems with associative arrays in D) less bug-prone. Associative arrays, by definition, require a lot more work behind the scenes for both reading and writing.
May 14 2014
On Wednesday, 14 May 2014 at 13:44:40 UTC, John Colvin wrote:Yes, they are much faster. Normal array indexing is equivalent to *(myArray.ptr + index) plus an optional bounds check, whereas associative array indexing is a much, much larger job. Why were you using associative arrays in the first place? Unless your keys are somehow sparse* or of a non-integer type there isn't any reason to.It is very old code (dmd 2.052 times). When I set out to write it, I was a) new to the language and b) I could not yet tell whether or not I would need it (I think it also had to do with the way D was back then, but I don't remember my exact reasoning). It has always been in the back of my mind to change it into a normal array. It's basically a code corpse.* How I see that constraint in that context: (maxKey - minKey) / nElements > 1 + epsilon where epsilon is the maximum proportional wasted space you could afford in a normal array (emptyElements / usedElements). Bear in mind the memory overhead of associative arrays is itself non-zero. Also, while normal arrays tend to be more cache friendly than associative arrays, this isn't true for very sparse arrays.
May 14 2014
Chris:foreach (size_t i; 0..myArray.length) { // do something with myArray[i]; }There are various better ways to use a foreach on an array: foreach (immutable x; myArray) { foreach (ref const x; myArray) { foreach (ref x; myArray) { Bye, bearophile
May 14 2014
On Wednesday, 14 May 2014 at 13:49:22 UTC, bearophile wrote:Chris:His code was for an associative array.foreach (size_t i; 0..myArray.length) { // do something with myArray[i]; }There are various better ways to use a foreach on an array: foreach (immutable x; myArray) { foreach (ref const x; myArray) { foreach (ref x; myArray) { Bye, bearophile
May 14 2014