digitalmars.D.learn - associative array: unexpected results after static initialization
- kdevel (35/35) Nov 30 2017 This program
- H. S. Teoh (6/20) Nov 30 2017 [...]
- H. S. Teoh (4/4) Nov 30 2017 Hmm, which compiler are you using, and which version? I cannot
- H. S. Teoh (8/11) Nov 30 2017 [...]
- H. S. Teoh (5/5) Nov 30 2017 Seems like this problem is already known:
- kdevel (3/5) Dec 01 2017 Great. By the way: It it true, that there cannot be more than
- Steven Schveighoffer (11/18) Dec 01 2017 Yes, and technically even less because there is a requirement to grow
- kdevel (4/6) Dec 01 2017 And wouldn't it be reasonable to add
- H. S. Teoh (8/18) Dec 01 2017 I did think about adding that, but then I didn't want the unittest to
- kdevel (6/23) Dec 01 2017 I personally would have preferred a run time or even a
- Steven Schveighoffer (13/29) Dec 01 2017 1. If there is going to be an "implementation defined" ambiguity, then
- H. S. Teoh (24/55) Dec 01 2017 [...]
- Steven Schveighoffer (9/15) Dec 01 2017 I'm not sure it even needs a spec "update", as there is no spec for it :...
- H. S. Teoh (11/29) Dec 01 2017 What I meant is, the AA spec needs to be updated to specify this
- kdevel (41/59) Dec 01 2017 Sure. this looks "idiomatic" but the keys are not constants. If
This program ``` void aa_stat(T, U) (T[U] aa) { import std.stdio; writefln ("aa : %s", aa); writefln ("aa.length : %d", aa.length); writefln ("aa.keys : %s", aa.keys); writefln ("aa.values : %s", aa.values); foreach (k, v; aa) writeln (k, ": ", v); } void main () { string[int] aa = [ 0: "null", 0: "eins" ]; aa.aa_stat; } ``` produces this output: aa : [0:"eins", ] aa.length : 2 aa.keys : [0, 32767] aa.values : ["eins", ""] 0: eins I would have expected either a compile time or a runtime error or this result: aa : [0:"eins"] aa.length : 1 aa.keys : [0] aa.values : ["eins"] 0: eins What's happening here?
Nov 30 2017
On Thu, Nov 30, 2017 at 11:50:13PM +0000, kdevel via Digitalmars-d-learn wrote:This program ``` void aa_stat(T, U) (T[U] aa) { import std.stdio; writefln ("aa : %s", aa); writefln ("aa.length : %d", aa.length); writefln ("aa.keys : %s", aa.keys); writefln ("aa.values : %s", aa.values); foreach (k, v; aa) writeln (k, ": ", v); } void main () { string[int] aa = [ 0: "null", 0: "eins" ]; aa.aa_stat; } ``` produces this output: aa : [0:"eins", ] aa.length : 2 aa.keys : [0, 32767] aa.values : ["eins", ""] 0: eins[...] This looks like a bug in the implementation of AA literals. Please file a bug at http://issues.dlang.org/. --T
Nov 30 2017
Hmm, which compiler are you using, and which version? I cannot reproduce this bug with DMD git head. It may be have fixed since the release you're using? --T
Nov 30 2017
On Thu, Nov 30, 2017 at 03:57:37PM -0800, H. S. Teoh via Digitalmars-d-learn wrote:Hmm, which compiler are you using, and which version? I cannot reproduce this bug with DMD git head. It may be have fixed since the release you're using?[...] Sorry, I was testing the wrong code. This problem does still happen on DMD git head. Looking into the code right now... Please still file the bug if you haven't already. T -- "How are you doing?" "Doing what?"
Nov 30 2017
Seems like this problem is already known: https://issues.dlang.org/show_bug.cgi?id=15290 Here's the fix: https://github.com/dlang/druntime/pull/1980 --T
Nov 30 2017
On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:Here's the fix: https://github.com/dlang/druntime/pull/1980Great. By the way: It it true, that there cannot be more than 2^32 keys in an associative array (due to the use of uint)?
Dec 01 2017
On 12/1/17 11:01 AM, kdevel wrote:On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:Yes, and technically even less because there is a requirement to grow the array when it's 4/5 full. But such a thing is easily fixed if we need to. Note that due to the way it's currently implemented (and you aren't going to get much smaller than this), each key takes 16 bytes (bucket ptr + hash) + at least 16 bytes per key/value pair (depending on what's stored there). So you are talking minimum 4GB * 32 = 128GB RAM required if you wanted to have such an AA! -SteveHere's the fix: https://github.com/dlang/druntime/pull/1980Great. By the way: It it true, that there cannot be more than 2^32 keys in an associative array (due to the use of uint)?
Dec 01 2017
On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:Here's the fix: https://github.com/dlang/druntime/pull/1980And wouldn't it be reasonable to add assert(aa.values == [ "b" ]); to the unittest?
Dec 01 2017
On Fri, Dec 01, 2017 at 04:06:50PM +0000, kdevel via Digitalmars-d-learn wrote:On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:I did think about adding that, but then I didn't want the unittest to dictate which value should take precedence when there are duplicate keys, since that is currently implementation-dependent. At the very least, it merits further discussion. T -- Never trust an operating system you don't have source for! -- Martin SchulzeHere's the fix: https://github.com/dlang/druntime/pull/1980And wouldn't it be reasonable to add assert(aa.values == [ "b" ]); to the unittest?
Dec 01 2017
On Friday, 1 December 2017 at 16:23:33 UTC, H. S. Teoh wrote:On Fri, Dec 01, 2017 at 04:06:50PM +0000, kdevel via Digitalmars-d-learn wrote:I personally would have preferred a run time or even a compile-time error in the case of my orginal example. That would support the "fail fast" philosohphy. IMHO it makes no sense at all to initialize an AA in such a way. The way the initialization works hides copy-and-paste errors like the one of my example code.On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:I did think about adding that, but then I didn't want the unittest to dictate which value should take precedence when there are duplicate keys, since that is currently implementation-dependent. At the very least, it merits further discussion.Here's the fix: https://github.com/dlang/druntime/pull/1980And wouldn't it be reasonable to add assert(aa.values == [ "b" ]); to the unittest?
Dec 01 2017
On 12/1/17 11:23 AM, H. S. Teoh wrote:On Fri, Dec 01, 2017 at 04:06:50PM +0000, kdevel via Digitalmars-d-learn wrote:1. If there is going to be an "implementation defined" ambiguity, then nobody should ever use it, and it should be an error. This would mean a runtime error, since the keys may be runtime values. 2. I can think of possible (and questionable) reasons one may put in duplicate keys in an AA, like if they are defined separately (i.e. you use 2 symbols as keys that are sometimes the same, sometimes different, and the result is some defined order) 3. I can't think of a reason why anyone would implement the AA filling algorithm other than foreach-ing over the keys and values. So I would say we should define that the last key/value combo takes precedence, and it would be good to ensure that's the case with a test. -SteveOn Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:I did think about adding that, but then I didn't want the unittest to dictate which value should take precedence when there are duplicate keys, since that is currently implementation-dependent. At the very least, it merits further discussion.Here's the fix: https://github.com/dlang/druntime/pull/1980And wouldn't it be reasonable to add assert(aa.values == [ "b" ]); to the unittest?
Dec 01 2017
On Fri, Dec 01, 2017 at 11:59:08AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:On 12/1/17 11:23 AM, H. S. Teoh wrote:[...] There is at least 1 use case in the bugzilla issue that justifies AA literals with (possibly) duplicated keys: https://issues.dlang.org/show_bug.cgi?id=15290#c1 Code snippet: ------- foreach (i; 0..10) foreach (j; 0..10) if (pred2 (i, j)) foreach (k; 0..10) if (pred3 (i, j, k)) ... and maybe more ... { auto set = [i: true, j: true; k: true]; if (set.length == 3) { ... we found distinct i, j, k, ... } } ------- In this case, the order doesn't matter.On Fri, Dec 01, 2017 at 04:06:50PM +0000, kdevel via Digitalmars-d-learn wrote:1. If there is going to be an "implementation defined" ambiguity, then nobody should ever use it, and it should be an error. This would mean a runtime error, since the keys may be runtime values. 2. I can think of possible (and questionable) reasons one may put in duplicate keys in an AA, like if they are defined separately (i.e. you use 2 symbols as keys that are sometimes the same, sometimes different, and the result is some defined order)On Friday, 1 December 2017 at 00:42:00 UTC, H. S. Teoh wrote:I did think about adding that, but then I didn't want the unittest to dictate which value should take precedence when there are duplicate keys, since that is currently implementation-dependent. At the very least, it merits further discussion.Here's the fix: https://github.com/dlang/druntime/pull/1980And wouldn't it be reasonable to add assert(aa.values == [ "b" ]); to the unittest?3. I can't think of a reason why anyone would implement the AA filling algorithm other than foreach-ing over the keys and values. So I would say we should define that the last key/value combo takes precedence, and it would be good to ensure that's the case with a test.Sure, but that means a spec update besides the unittest addition. T -- "The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."
Dec 01 2017
On 12/1/17 12:02 PM, H. S. Teoh wrote:On Fri, Dec 01, 2017 at 11:59:08AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:I'm not sure it even needs a spec "update", as there is no spec for it :) But having the unit test at least ensures that someone implementing something different (though I don't ever see it happening) will notice the test. As of now, someone can be depending on this behavior, and then such an implementation rewrite could get through and break their code. I'll write a unit test, shouldn't be too hard. -SteveSo I would say we should define that the last key/value combo takes precedence, and it would be good to ensure that's the case with a test.Sure, but that means a spec update besides the unittest addition.
Dec 01 2017
On Fri, Dec 01, 2017 at 01:14:14PM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:On 12/1/17 12:02 PM, H. S. Teoh wrote:What I meant is, the AA spec needs to be updated to specify this behaviour. :-) Otherwise, it opens up the chance of somebody implementing an alternative D compiler (let's say, SDC, just for an actual example) that exhibits a different behaviour and breaking user code.On Fri, Dec 01, 2017 at 11:59:08AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:I'm not sure it even needs a spec "update", as there is no spec for it :)So I would say we should define that the last key/value combo takes precedence, and it would be good to ensure that's the case with a test.Sure, but that means a spec update besides the unittest addition.But having the unit test at least ensures that someone implementing something different (though I don't ever see it happening) will notice the test. As of now, someone can be depending on this behavior, and then such an implementation rewrite could get through and break their code. I'll write a unit test, shouldn't be too hard.[...] Maybe I'll take a shot at updating the AA spec. T -- Never step over a puddle, always step around it. Chances are that whatever made it is still dripping.
Dec 01 2017
On Friday, 1 December 2017 at 17:02:48 UTC, H. S. Teoh wrote:[...] There is at least 1 use case in the bugzilla issue that justifies AA literals with (possibly) duplicated keys: https://issues.dlang.org/show_bug.cgi?id=15290#c1 Code snippet: ------- foreach (i; 0..10) foreach (j; 0..10) if (pred2 (i, j)) foreach (k; 0..10) if (pred3 (i, j, k)) ... and maybe more ... { auto set = [i: true, j: true; k: true]; if (set.length == 3) { ... we found distinct i, j, k, ... } } -------Sure. this looks "idiomatic" but the keys are not constants. If the compiler only complained on compile-time duplicate keys it would not do so in this case. BTW: I was under the impression (never checked or profiled such code) that in the following code trans_AA and trans_switch are technically equivalent: ``` string trans_AA (string s) { immutable string[string] aa = [ "src1" : "repl1", "src2" : "repl2", "src3" : "repl3", ]; return aa[s]; } string trans_switch (string s) { switch (s) { case "src1": return "repl1"; case "src2": return "repl2"; case "src3": return "repl3"; // case "src3": return "repl3"; default: throw new Exception ("no key " ~ s); } } void main () { import std.stdio; import std.string; auto a = stdin.readln.chomp; a.trans_switch.writeln; a.trans_AA.writeln; } ``` If one uncomments the "src3" line in trans_switch the compiler complains with aa.d(17): Error: duplicate case "src3" in switch statement while a duplication in trans_AA goes unnoticed.
Dec 01 2017