
digitalmars.D.learn - Associative array literal: length wrong when duplicate keys found

Ivan Kazmenko <gassa mail.ru> writes:
Hi.

I wanted to check whether a few variables of the same type are 
all distinct, in a quick and dirty way.  I tried to do it similar 
to Python's "len(set(value_list)) == len(value_list)" idiom by 
using an associative array (AA).  At this point, I found out that 
when initializing the AA with a literal, the length is the number 
of keys given, regardless of whether some of them were the same.

A minimized example:

-----
import std.stdio;
void main () {
	auto aa = [1 : 2, 1 : 3];
	writeln (aa.length, " ", aa); // 2 [1:3, ]
}
-----

See, the length is 2, but iteration over aa yields only one 
key:value pair.  Also, note the comma which is a sign of internal 
confusion as well.

My question is, what's the state of this?  Is this a bug?  Or 
should it be forbidden to have such an initializer?  Or maybe it 
is a feature with some actual merit?

Ivan Kazmenko.
Jan 31 2017
John Colvin <john.loughran.colvin gmail.com> writes:
It's a bug, please report it. The initializer should be 
statically disallowed.

Adding a .dup works around the problem.

By the way, you can do sets like this, avoiding storing any 
dummy values, only keys:

-----
struct Set(T)
{
	void[0][T] data;
	
	void insert(T x)
	{
		data[x] = (void[0]).init;
	}
	
	void remove(T x)
	{
		data.remove(x);
	}
	
	bool opBinaryRight(string op : "in")(T e)
	{
		return !!(e in data);
	}
	
	// other things like length, etc.
}

unittest
{
	Set!int s;
	s.insert(4);
	assert(4 in s);
	s.remove(4);
	assert(4 !in s);
}
-----
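For illustration, here is a minimal sketch of the .dup workaround mentioned above, next to an alternative that avoids the literal entirely by inserting keys one at a time. The variable names are made up for this example, and no output is claimed for the copied AA, since that depends on the compiler version.

```d
import std.stdio;

void main()
{
	// Literal with a duplicate key, as in the original report.
	auto literal = [1 : 2, 1 : 3];

	// Workaround: copy the AA before asking for its length.
	auto copy = literal.dup;
	writeln(copy.length, " ", copy);

	// Alternative that never uses a literal: insert keys one by one.
	int[int] built;
	built[1] = 2;
	built[1] = 3; // overwrites the value stored for key 1
	writeln(built.length, " ", built); // 1 [1:3]
}
```

Building the AA by insertion does not go through the literal at all, so it may be the safer choice where the keys are only known at run time.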
Jan 31 2017
Ivan Kazmenko <gassa mail.ru> writes:
On Tuesday, 31 January 2017 at 17:20:00 UTC, John Colvin wrote:
 It's a bug, please report it. The initializer should be 
 statically disallowed.

 Adding a .dup works around the problem.
OK.  Hmm, but the real use case was a bit more complicated, more 
like:

-----
int n = 10;
foreach (i; 0..n)
	foreach (j; 0..n)
		foreach (k; 0..n)
			... and maybe a couple more ...
				if ([i: true, j: true, k: true].length == 3)
					{...} // i, j, k is a set of distinct values
-----

Here, we don't know i, j and k statically, yet the problem is the 
same.

Anyway, I'll file a bug report.
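One possible way to do the run-time check without an AA literal is sketched below. The helper name allDistinct is hypothetical, not something from Phobos; it inserts the values into a bool[T] one at a time and compares the resulting length with the number of values.

```d
import std.stdio;

// Hypothetical helper: true when no value occurs twice.
bool allDistinct(T)(T[] values)
{
	bool[T] seen;
	foreach (v; values)
		seen[v] = true; // a repeated key just overwrites its entry
	return seen.length == values.length;
}

void main()
{
	writeln(allDistinct([1, 2, 3])); // true
	writeln(allDistinct([1, 2, 1])); // false
}
```

Inside the nested loops, the condition would then read allDistinct([i, j, k]) instead of comparing the length of a literal with 3.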
 By the way, you can do sets like this, avoiding storing any 
 dummy values, only keys:

 struct Set(T)
 {
 	void[0][T] data;
 	
 	void insert(T x)
 	{
 		data[x] = (void[0]).init;
 	}
 	
 	void remove(T x)
 	{
 		data.remove(x);
 	}
 	
 	bool opBinaryRight(string op : "in")(T e)
 	{
 		return !!(e in data);
 	}
 	
 	// other things like length, etc.
 }

 unittest
 {
 	Set!int s;
 	s.insert(4);
 	assert(4 in s);
 	s.remove(4);
 	assert(4 !in s);
 }
Yeah, thanks for the recipe!  I usually do bool [key] since it 
does not add much overhead, but would definitely like the real 
set (void[0] or otherwise) when performance matters.

Ivan Kazmenko.
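As a rough sketch of how the void[0] recipe could serve the distinctness check, the variant below adds a length property to the set. The names KeySet and allDistinctViaSet are made up for this example, and the struct is a trimmed-down assumption based on the recipe quoted above, not a library type.

```d
// Hypothetical variant of the Set recipe with a length property added.
struct KeySet(T)
{
	void[0][T] data;

	void insert(T x)
	{
		data[x] = (void[0]).init;
	}

	// Number of distinct keys inserted so far.
	size_t length() const
	{
		return data.length;
	}
}

// Hypothetical helper: true when no value occurs twice.
bool allDistinctViaSet(T)(T[] values)
{
	KeySet!T s;
	foreach (v; values)
		s.insert(v);
	return s.length == values.length;
}

unittest
{
	assert(allDistinctViaSet([1, 2, 3]));
	assert(!allDistinctViaSet([1, 2, 1]));
}
```

The only difference from the bool [key] approach is the value type, so the two should be interchangeable for this check.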
Jan 31 2017
Ivan Kazmenko <gassa mail.ru> writes:
On Tuesday, 31 January 2017 at 19:45:33 UTC, Ivan Kazmenko wrote:
 On Tuesday, 31 January 2017 at 17:20:00 UTC, John Colvin wrote:
 It's a bug, please report it. The initializer should be 
 statically disallowed.
 Anyway, I'll file a bug report.
Hmm, found it:
https://issues.dlang.org/show_bug.cgi?id=15290

I'll add details about my use case to the report, for what it's 
worth.
Feb 02 2017