digitalmars.D.bugs - [Issue 10876] New: foreach and remove over associative array causes invalid data to appear
- d-bugmail puremagic.com (64/64) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (11/11) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (24/24) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (10/15) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (12/12) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (11/11) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
- d-bugmail puremagic.com (7/10) Aug 23 2013 http://d.puremagic.com/issues/show_bug.cgi?id=10876
http://d.puremagic.com/issues/show_bug.cgi?id=10876 Summary: foreach and remove over associative array causes invalid data to appear Product: D Version: D2 Platform: x86_64 OS/Version: Linux Status: NEW Severity: major Priority: P2 Component: DMD AssignedTo: nobody puremagic.com ReportedBy: maximechevalierb gmail.com 2013-08-23 09:35:40 PDT --- This issue shows up using DMD64 D Compiler v2.063 on Linux. I have a foreach loop in which I remove values from an associative array. The behavior is completely broken. Items which are not in the associative array (including the null value!) end up appearing in the iteration. The following code snippet from my compiler produces the issue: writeln("associative array length: ", allocState.length); writeln("keys length: ", allocState.keys.length); // Remove dead values from the alloc state foreach (value, allocSt; allocState) { writeln("key: ", value); writeln((value in allocState)? true:false); if (value is null) writeln(allocState[null]); if (liveInfo.liveAtEntry(value, block) is false) allocState.remove(value); } The output is as follows: associative array length: 4 keys length: 4 key: $22 = arg 6 "v" true key: $20 = arg 4 "o" true key: $8 = add_i32 $1, $21 true key: $21 = arg 5 "i" true key: $7 = phi [call_reg(5662):0 => $14, call_cont(565C):0 => $12] false key: $12 = load_u32 $20, $11 false key: $12 = load_u32 $20, $11 false key: $1 = add_i32 32, $0 false key: $1 = add_i32 32, $0 false key: $0 = mul_i32 8, $7 false key: $0 = mul_i32 8, $7 false key: null false core.exception.RangeError jit.jit(1410): Range violation As you can see, most of the keys listed are not in the associative array, and the null value appears as a key, out of nowhere, which causes a crash. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 Andrej Mitrovic <andrej.mitrovich gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |andrej.mitrovich gmail.com 09:36:54 PDT --- This will be marked as invalid, you can't safely iterate over an associative array and remove its keys while iterating. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 hsteoh quickfur.ath.cx changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |hsteoh quickfur.ath.cx Modifying a container that you're iterating over has undefined results, because you're iterating over a set of keys that changes from under you as you go. In the case of AAs, you may end up dereferencing invalid pointers. The safe way to do it is to get an array of keys and iterate over that: foreach (k; aa.keys) { auto value = aa[k]; ... if (...) aa.remove(k); } N.B. you can only use .keys, not .byKey, because the latter also amounts to walking the data structure while it is being modified, which will cause problems. Using .keys is OK because it creates an array of keys (a snapshot of the set of keys at that point in time) before starting to modify the data structure. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 14:16:26 PDT ---N.B. you can only use .keys, not .byKey, because the latter also amounts to walking the data structure while it is being modified, which will cause problems. Using .keys is OK because it creates an array of keys (a snapshot of the set of keys at that point in time) before starting to modify the data structure.Yeah, as I learned in Issue 10821. Perhaps one could do an "!is null" check against the key if iterating via .byKey and removing at the same time? Well anyway, the safest bet is to use .keys. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 The problem with trying to iterate over a container that's being modified at the same time is that you can't guarantee much of anything. For example, you could insert !is null checks, but what happens if removing the key causes the AA to rehash itself (e.g., to optimize for space as the number of keys shrinks)? Then your subsequent iteration will be completely screwed up. Basically, such hacks can only work by relying on implementation details, which are not guaranteed by the spec. IOW, it's undefined behaviour. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 Some relevant references: http://stackoverflow.com/questions/3346696/python-why-is-it-not-safe-to-modify-sequence-being-iterated-on http://stackoverflow.com/questions/13981886/simultaneously-iterating-over-and-modifying-an-unordered-set The bottomline is that modifying the container while you're iterating over it is a bad idea (unless the container was specifically designed to support it). It almost always doesn't do what you think it should do. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10876 14:34:18 PDT ---The bottomline is that modifying the container while you're iterating over it is a bad idea (unless the container was specifically designed to support it). It almost always doesn't do what you think it should do.Btw, we should put this info up on dlang.org if it's not already there. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 23 2013