digitalmars.D.bugs - [Issue 2418] New: Same-value string (char[]) literals get overwritten (unlike array literals)
http://d.puremagic.com/issues/show_bug.cgi?id=2418

Summary: Same-value string (char[]) literals get overwritten (unlike array literals)
Product: D
Version: 1.035
Platform: PC
OS/Version: Windows
Status: NEW
Severity: normal
Priority: P2
Component: DMD
AssignedTo: bugzilla digitalmars.com
ReportedBy: business3 twistedpairgaming.com

When a string is initialized with a string literal, and part of the string is then changed, all other string literals with the same value reflect the same change.

For the following code:
------------------
module test;

import tango.io.Stdout;

class FooChar
{
    char[] str;
    this(char[] _str) { this.str = _str; }
}

class FooInt
{
    int[] ints;
    this(int[] _ints) { this.ints = _ints; }
}

void main()
{
    FooChar a;
    FooChar b;
    FooChar c;
    FooChar d;

    a = new FooChar("AAA");
    Stdout.formatln("a.str: {}", a.str);
    a.str[0] = '!';
    Stdout.formatln("a.str: {}", a.str);
    a = new FooChar("AAA");
    Stdout.formatln("a.str: {}", a.str);
    Stdout.formatln("");

    b = new FooChar("AAA");
    Stdout.formatln("b.str: {}", b.str);
    b.str[0] = '!';
    Stdout.formatln("b.str: {}", b.str);
    b = new FooChar("AAA");
    Stdout.formatln("b.str: {}", b.str);
    Stdout.formatln("");

    c = new FooChar("AA1");
    Stdout.formatln("c.str: {}", c.str);
    c.str[0] = '!';
    Stdout.formatln("c.str: {}", c.str);
    c = new FooChar("AA1");
    Stdout.formatln("c.str: {}", c.str);
    Stdout.formatln("");

    const char[] dInit = "AA2";
    d = new FooChar(dInit);
    Stdout.formatln("d.str: {}", d.str);
    d.str[0] = '!';
    Stdout.formatln("d.str: {}", d.str);
    d = new FooChar(dInit);
    Stdout.formatln("d.str: {}", d.str);
    Stdout.formatln("");

    FooInt i;
    i = new FooInt([1, 2]);
    Stdout.formatln("i.ints: {}", i.ints);
    i.ints[0] = 77;
    Stdout.formatln("i.ints: {}", i.ints);
    i = new FooInt([1, 2]);
    Stdout.formatln("i.ints: {}", i.ints);
}
------------------

Expected output:
a.str: AAA
a.str: !AA
a.str: AAA

b.str: AAA
b.str: !AA
b.str: AAA

c.str: AA1
c.str: !A1
c.str: AA1

d.str: AA2
d.str: !A2
d.str: AA2

i.ints: [1, 2]
i.ints: [77, 2]
i.ints: [1, 2]

Actual output:
a.str: AAA
a.str: !AA
a.str: !AA

b.str: !AA
b.str: !AA
b.str: !AA

c.str: AA1
c.str: !A1
c.str: !A1

d.str: AA2
d.str: !A2
d.str: !A2

i.ints: [1, 2]
i.ints: [77, 2]
i.ints: [1, 2]

http://d.puremagic.com/issues/show_bug.cgi?id=817

--
Oct 14 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2418

Sorry, I guess those classes are not needed:
---------------
module test;

import tango.io.Stdout;

void main()
{
    char[] a;
    char[] b;
    char[] c;
    char[] d;

    a = "AAA";
    Stdout.formatln("a: {}", a);
    a[0] = '!';
    Stdout.formatln("a: {}", a);
    a = "AAA";
    Stdout.formatln("a: {}", a);
    Stdout.formatln("");

    b = "AAA";
    Stdout.formatln("b: {}", b);
    b[0] = '!';
    Stdout.formatln("b: {}", b);
    b = "AAA";
    Stdout.formatln("b: {}", b);
    Stdout.formatln("");

    c = "AA1";
    Stdout.formatln("c: {}", c);
    c[0] = '!';
    Stdout.formatln("c: {}", c);
    c = "AA1";
    Stdout.formatln("c: {}", c);
    Stdout.formatln("");

    const char[] dInit = "AA2";
    d = dInit;
    Stdout.formatln("d: {}", d);
    d[0] = '!';
    Stdout.formatln("d: {}", d);
    d = dInit;
    Stdout.formatln("d: {}", d);
    Stdout.formatln("");

    int[] i;
    i = [1, 2];
    Stdout.formatln("i: {}", i);
    i[0] = 77;
    Stdout.formatln("i: {}", i);
    i = [1, 2];
    Stdout.formatln("i: {}", i);
}
---------------

Expected output:
a: AAA
a: !AA
a: AAA

b: AAA
b: !AA
b: AAA

c: AA1
c: !A1
c: AA1

d: AA2
d: !A2
d: AA2

i: [1, 2]
i: [77, 2]
i: [1, 2]

Actual output:
a: AAA
a: !AA
a: !AA

b: !AA
b: !AA
b: !AA

c: AA1
c: !A1
c: !A1

d: AA2
d: !A2
d: !A2

i: [1, 2]
i: [77, 2]
i: [1, 2]

--
Oct 14 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2418

2korden gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

The bug is on your side. You don't make copies of the string, so you are working on a single string.

char[] a = "AAA";
char[] b = a; // both point to the same location
a[0] = '!';
assert(b[0] == '!'); // should be true since they both point to the same location

Same example in C (this may help you understand better):

char* str = "AAA";
char* ptr = str;
str[0] = '!';
assert(ptr[0] == '!');

The actual bug is that the following line should not compile:

char[] a = "AAA";

The problem is that D1 doesn't have the invariant type modifier and the spec is frozen, so this will never be fixed. D2, however, doesn't allow you to do this.

The solution is to make copies explicitly:

a = new FooChar("AAA".dup);

--
Oct 14 2008
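A minimal sketch of the alias-versus-copy distinction described in the reply above. It assumes D2 with Phobos (`std.stdio`) rather than the D1/Tango setup used in the report, and the helper name `freshCopy` is hypothetical, introduced only for illustration.

```d
import std.stdio;

/// Hypothetical helper: returns a fresh, mutable copy of a string.
char[] freshCopy(string s)
{
    return s.dup; // .dup allocates a new array, independent of the literal
}

void main()
{
    char[] a = freshCopy("AAA");
    char[] b = a;                // alias: b refers to the same memory as a
    char[] c = freshCopy("AAA"); // independent copy

    a[0] = '!';

    assert(b == "!AA"); // b sees the write because it aliases a
    assert(c == "AAA"); // c is unaffected because it was copied

    writeln(a, " ", b, " ", c); // prints "!AA !AA AAA"
}
```

Only the explicit `.dup` creates new storage; plain assignment of one array variable to another never does.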
http://d.puremagic.com/issues/show_bug.cgi?id=2418

business3 twistedpairgaming.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |

> The bug is on your side. You don't make copies of the string, so you are working on a single string.
>
> char[] a = "AAA";
> char[] b = a; // both point to the same location
> a[0] = '!';
> assert(b[0] == '!'); // should be true since they both point to the same location
>
> Same example in C (this may help you understand better):
>
> char* str = "AAA";
> char* ptr = str;
> str[0] = '!';
> assert(ptr[0] == '!');

I don't think that's applicable (or at least, it shouldn't be). Here's a shortened example:

char[] a = "AAA";
a[0] = '!';
char[] b = "AAA";

It's absurd that 'b' should become "!AA". It's clear that the compiler is "optimizing" both "AAA" string literals into the same reference:

char[] _tmp1 = "AAA";
char[] a = _tmp1;
a[0] = '!';
char[] b = _tmp1;

But I'm saying that transformation should not be occurring. The two "AAA" literals should not share the same reference. In D2, array literals are immutable, so yes, in D2 it works out fine. But in D1, array literals are *not* immutable, so the compiler shouldn't assume that array literals with an equal *value* can safely share the same *reference*.

> The actual bug is that the following line should not compile:
>
> char[] a = "AAA";

Again, this is for D1, so the idea of strings being immutable doesn't apply. Besides, at the very least there's still an inconsistency:

char[] a = "AAA";
a[0] = '!';
char[] b = "AAA"; // b is "!AA"

char[] c = ['A', 'A', 'A'];
c[0] = '!';
char[] d = ['A', 'A', 'A']; // d is "AAA"

The above shows that two occurrences of "AAA" share the same reference, but two occurrences of ['A', 'A', 'A'] have separate references. The former leads to odd side effects; the latter does not.

--
Oct 14 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2418

2korden gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |INVALID

> I don't think that's applicable (or at least, it shouldn't be). Here's a shortened example:
>
> char[] a = "AAA";
> a[0] = '!';
> char[] b = "AAA";
>
> It's absurd that 'b' should become "!AA".

Well, that's what I expect, at least :)

> It's clear that the compiler is "optimizing" both "AAA" string literals into the same reference:
>
> char[] _tmp1 = "AAA";
> char[] a = _tmp1;
> a[0] = '!';
> char[] b = _tmp1;

Of course it does! This string literal is put into the executable. "AAA" is just a pointer to it. Please understand that char[] is nothing more than a pointer to a string and an associated length. When you change the string via the pointer, everyone who holds that pointer sees the change. Compare this to an object:

class A { int i = 15; }

A aaa = new A(); // aaa is an analog of "AAA"
A a = aaa;
writefln(a.i); // prints 15
a.i = 42;
A b = aaa;
writefln(b.i); // why the hell it prints 42????

This is because you don't make a copy of aaa. Here is another example:

while (true) {
    char[] a = "AAA";
    a[0] = '!'; // in the next iteration, should a be "AAA" again?
}

If the answer is yes, then there would have to be a memory allocation in line 2 under the hood. You can't get a new string every time without a memory allocation. This, however, definitely should not occur. If you want to make an allocation, do it yourself explicitly. Code with the behaviour you expect should be as follows:

while (true) {
    char[] a = "AAA".dup;
    writefln(a);
    a[0] = '!';
    writefln(a);
}

This is by design; it works as it should. It can't be 'fixed' because it is not broken.

> Besides, at the very least there's still an inconsistency:
>
> char[] a = "AAA";
> a[0] = '!';
> char[] b = "AAA"; // b is "!AA"
>
> char[] c = ['A', 'A', 'A'];
> c[0] = '!';
> char[] d = ['A', 'A', 'A']; // d is "AAA"

I agree, this might be a bug, d really ought to be "!AA"! Do you want to create a new bug report so that the latter case would be 'fixed'? :)

--
Oct 14 2008
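The "pointer plus length" description above can be checked directly. The following is a sketch only, assuming D2 with Phobos; `sameMemory` is a hypothetical helper name, not something from the thread or the standard library.

```d
import std.stdio;

/// Hypothetical helper: true when both slices view exactly the same memory.
bool sameMemory(const(char)[] x, const(char)[] y)
{
    return x.ptr is y.ptr && x.length == y.length;
}

void main()
{
    char[] original = "AAA".dup; // explicit allocation
    char[] view = original;      // copies pointer and length, not the characters
    char[] copy = original.dup;  // second allocation with equal contents

    assert(sameMemory(original, view));
    assert(!sameMemory(original, copy));
    assert(original == copy);    // == compares contents, not addresses

    view[0] = '!';
    assert(original == "!AA");   // the write through view is visible via original
    assert(copy == "AAA");

    writeln(original, " ", copy); // prints "!AA AAA"
}
```

Assigning a slice is cheap precisely because no characters are copied, which is also why a write through one variable shows up through every other variable holding the same pointer.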
http://d.puremagic.com/issues/show_bug.cgi?id=2418

The D spec says "String literals are immutable (read only)." even in D1. So code like this

char[] a = "ABC";
a[0] = '!';

is illegal (and will segfault on Linux!).

Array literals are allocated on the heap (the D spec explicitly says so) and will thus behave fundamentally differently from string literals.

--
Oct 15 2008
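A short sketch of the difference between the two kinds of literal, under the assumption of D2 with Phobos (the thread itself concerns D1). The helper name `makePair` is hypothetical.

```d
import std.stdio;

/// Hypothetical helper: evaluates the array literal once per call.
int[] makePair()
{
    return [1, 2]; // array literal: a new array is allocated on each evaluation
}

void main()
{
    // In D2 a string literal is immutable, so binding it to char[] is rejected
    // at compile time instead of failing (or silently aliasing) at run time.
    static assert(!__traits(compiles, { char[] s = "ABC"; }));

    int[] first = makePair();
    first[0] = 77;
    int[] second = makePair();

    assert(first == [77, 2]);
    assert(second == [1, 2]);         // unaffected by the write to first
    assert(first.ptr !is second.ptr); // two separate allocations

    writeln(first, " ", second); // prints "[77, 2] [1, 2]"
}
```

This matches the `i: [1, 2]` lines in the report's output: each evaluation of `[1, 2]` yields new storage, so the earlier write to index 0 cannot leak into the later array.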
http://d.puremagic.com/issues/show_bug.cgi?id=2418

> The D spec says "String literals are immutable (read only)." even in D1. So code like this
>
> char[] a = "ABC";
> a[0] = '!';
>
> is illegal (and will segfault on Linux!).
>
> Array literals are allocated on the heap (the D spec explicitly says so) and will thus behave fundamentally differently from string literals.

Alright, I see now. It does seem odd, though, that array literals would carry a type that lacks any sort of const/read-only/immutable/etc. qualifier, but is still immutable anyway. Just another of D2's improvements, I guess.

Out of curiosity, any idea whether the lack of a run-time error on Windows is due to Windows itself or just something DMD/OPTLINK does differently between the platforms?

--
Oct 15 2008