digitalmars.D.bugs - [Issue 2093] New: string concatenation modifies original
- d-bugmail puremagic.com (34/34) May 10 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (6/6) May 10 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- Jarrett Billingsley (5/9) May 10 2008 ~ always creates a copy, but ~= will attempt to expand the array in-plac...
- d-bugmail puremagic.com (17/17) Nov 21 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (30/30) Nov 21 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (32/32) Nov 21 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (6/6) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- Denis Koroskin (3/8) Nov 22 2008 No, string is a mutable array of immutable chars:
- d-bugmail puremagic.com (5/5) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (14/20) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (33/33) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (9/9) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (12/12) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (13/13) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (13/13) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (10/20) Nov 22 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (4/4) Feb 19 2009 http://d.puremagic.com/issues/show_bug.cgi?id=2093
- d-bugmail puremagic.com (12/12) Mar 16 2010 http://d.puremagic.com/issues/show_bug.cgi?id=2093
http://d.puremagic.com/issues/show_bug.cgi?id=2093

           Summary: string concatenation modifies original
           Product: D
           Version: 2.014
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: bartosz relisoft.com

I will attach source code for this example. It's an XML parser. It should produce the following output:

c:\D\Work>xml root child color=red Text=foo bar baz

Instead it produces this:

c:\D\Work>xml root rootd rootd=red Text=rootdar baz

The problem is that strings are modified after being copied, when the original is concatenated upon. The problem goes away if I idup strings:

_name = name.idup;
_value = value.idup;

or when I replace

a ~= b;

with

a = a ~ b;
--
May 10 2008
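A minimal sketch of the `idup` workaround described in the report, assuming a current D compiler rather than DMD 2.014. The `Node` struct and the `parseTwo` helper are hypothetical stand-ins; the attached parser's actual `_name` and `_value` members are not reproduced here.

```d
import std.stdio;

// Hypothetical stand-in for a parser node that stores a name.
struct Node
{
    string name;

    this(const(char)[] name)
    {
        // idup allocates an immutable copy, so later changes to the
        // caller's buffer cannot show through this field.
        this.name = name.idup;
    }
}

// Reuses one buffer for two names, the way a tokenizer might.
string[2] parseTwo()
{
    char[] buf = "root".dup;
    auto first = Node(buf);
    buf[] = "leaf";            // overwrite the shared buffer in place
    auto second = Node(buf);
    return [first.name, second.name];
}

void main()
{
    auto names = parseTwo();
    writeln(names[0], " ", names[1]);
}
```

Because each node owns its own copy, the first name is unaffected when the buffer is reused for the second.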
http://d.puremagic.com/issues/show_bug.cgi?id=2093 Created an attachment (id=256) --> (http://d.puremagic.com/issues/attachment.cgi?id=256&action=view) Test case --
May 10 2008
<d-bugmail puremagic.com> wrote in message news:bug-2093-3 http.d.puremagic.com/issues/...
> or when I replace a ~= b; with a = a ~ b;

~ always creates a copy, but ~= will attempt to expand the array in place. Now, if this is D2, and ~= is expanding an invariant(char)[] in place, then _that_ is definitely an issue.
May 10 2008
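The difference described above can be sketched as follows, assuming a current D compiler. The helper names `concatIsFresh` and `appendByCopy` are made up for illustration. Only the behavior of binary `~` is checked, since whether `~=` extends in place is left to the runtime.

```d
import std.stdio;

// Returns true when `a ~ c` produced storage distinct from `a`.
bool concatIsFresh()
{
    char[] a = "hello".dup;
    char[] b = a ~ '!';        // binary ~ allocates a new array
    b[0] = 'J';                // writing to the copy leaves `a` alone
    return a == "hello" && b == "Jello!" && a.ptr !is b.ptr;
}

// The rewrite from the report: a = a ~ b instead of a ~= b.
string appendByCopy(string a, string b)
{
    a = a ~ b;
    return a;
}

void main()
{
    writeln(concatIsFresh());
    string original = "foo";
    string other = original;   // second reference to the same data
    original = appendByCopy(original, "bar");
    writeln(original, " ", other);
}
```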
http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com

Welcome to the world of bug reporting.

The way to report a bug isn't to attach a 695-line program that contains some functionality somewhere that exhibits the problem. The correct manner is to post a small example that illustrates the problem, typically either by writing a test program from scratch or by simplifying little by little the program in which you found it. If done well, the result will be small enough to post straight into the bug report rather than attaching it.

DMD's code coverage analysis is a useful tool for identifying unused parts of a program in order to cut them out, among other things.
--
Nov 21 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code

I think I've finally managed to figure out what was going on.
----------
import std.stdio;

void main() {
    string s1, s2;
    s1 ~= "hello";
    s2 = s1;
    writefln(s1);
    writefln(s2);
    s1.length = 0;
    s1 ~= "Hi";
    writefln(s1);
    writefln(s2);
}
----------
hello
hello
Hi
Hillo
----------

This is the kind of testcase we like here. Walter is more likely to fix a bug if you make life easier for him by supplying something on which the cause can easily be seen.
--
Nov 21 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

This is a known bug and is a major array design flaw. Arrays have no determined owner (the only one who can grow without a reallocation if capacity permits):

import std.stdio;

void main()
{
    char[] s1, s2;
    s1.length = 100; // reserve the capacity
    s1.length = 0;

    s2 = s1; // both are pointing to an empty string with the capacity of 100

    s1 ~= "Hello"; // array is not reallocated, it is grown in-place
    writefln(s1);
    writefln(s2); // prints empty string. s2 still points to the same string (which is now "Hello") and carries length of 0

    s2 ~= "Hi"; // overwrites s1
    writefln(s2); // "Hi"
    writefln(s1); // "Hillo"
}

s1 is the array owner and s2 is a slice (even though it really points to the entire array), i.e. it should reallocate and take the ownership of the reallocated array on append, but it doesn't happen.

Currently an 'owner' is anyone who has a pointer to array's beginning:

char[] s = "hello".dup;
char[] s1 = s[0..4];
s1 ~= "!";
assert(s != s1); // fails, both are "hell!", s is overwritten

s = "_hello".dup;
char[] s2 = s[1..5];
s2 ~= "!";
assert(s != s1); // succeeds, s1 is not changed
--
Nov 21 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

I thought 'string' types were immutable and thus ...

  s1.length = 0;

should fail as it updates the string (truncates it to zero characters).
--
Nov 22 2008
On 22.11.08 at 12:58, wrote in their message:
> http://d.puremagic.com/issues/show_bug.cgi?id=2093
>
> I thought 'string' types were immutable and thus ...
>
>   s1.length = 0;
>
> should fail as it updates the string (truncates it to zero characters).

No, string is a mutable array of immutable chars:

string == const(char)[]
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093 No, string is aliased to invariant(char)[], i.e. an array of invariant characters. You can change its length (usually, decreasing) but not contents. --
Nov 22 2008
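The point that a string's length may change while its contents may not can be checked at compile time. This is a sketch for a current D compiler, where the `invariant` qualifier used in this thread is spelled `immutable`. The function name `firstTwo` is hypothetical.

```d
import std.stdio;

// In current D, string is an alias for immutable(char)[],
// not const(char)[].
static assert(is(string == immutable(char)[]));
static assert(!is(string == const(char)[]));

// Shrinks a string. This only changes the slice held by `s`,
// not the characters it refers to. Assumes s.length >= 2;
// a shorter input would be grown instead.
string firstTwo(string s)
{
    s.length = 2;
    return s;
}

void main()
{
    // Writing an element through a string is rejected by the compiler.
    static assert(!__traits(compiles, { string t = "abc"; t[0] = 'x'; }));

    writeln(firstTwo("hello"));
}
```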
http://d.puremagic.com/issues/show_bug.cgi?id=2093

> Currently an 'owner' is anyone who has a pointer to array's beginning:
>
> char[] s = "hello".dup;
> char[] s1 = s[0..4];
> s1 ~= "!";
> assert(s != s1); // fails, both are "hell!", s is overwritten

A simple char[] is fully mutable, so that doesn't violate any established rule, but whether it's desirable is another matter.

With const(char)[] or invariant(char)[], obviously this isn't going to work, so ~= should always reallocate (unless the optimiser can be sure that no other reference to the data can possibly exist).

Alternatively, the GC could maintain a note of the actual length of every heap-allocated array. Ownership would be determined by matching in both start pointer and length. When the length is increased, whether by .length or ~=, either update this actual length (if it's the owner that we're extending, in which case all other references to the same data lose ownership) or reallocate the array.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

schveiguy yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

Note that this behavior is defined in the spec.

See http://www.digitalmars.com/d/2.0/arrays.html#resize

"To maximize efficiency, the runtime always tries to resize the array in place to avoid extra copying. It will always do a copy if the new size is larger and the array was not allocated via the new operator or a previous resize operation. This means that if there is an array slice immediately following the array being resized, the resized array could overlap the slice"

The fact that it violates invariantness is a side effect that Walter has not yet dealt with. There have been proposals to fix this, two of which I have proposed:

1. As you said, store the requested length along with the block length in the GC. Only appending to an array that ends at the end of the allocated memory will realloc in place.
Original proposal: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=63146
Note nobody responded to this one

2. Store the length of the allocated array in the first element of the array. Then modify the meaning of the length member of the array struct to flag whether it is pointing to the beginning of the array or not.
Original proposal: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=77437
Some people had questions, but nobody proved the proposal wouldn't work.

I don't think Walter is interested in fixing this issue, as it has been a 'feature' for a while, and he never has responded positively to any decent proposals to fix this.
--
Nov 22 2008
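The first proposal above (track the requested length per block and only extend a slice that ends where the used data ends) can be modelled in a few lines. This is an illustrative sketch, not the runtime's implementation: the `Block` struct, its field names, and `canAppendInPlace` are all hypothetical.

```d
import std.stdio;

// Hypothetical bookkeeping the GC would keep per allocated block.
struct Block
{
    size_t used;      // number of elements the program has claimed so far
    size_t capacity;  // total number of elements the block can hold
}

// A slice is described by its offset into the block and its length.
// It may grow in place only if it ends exactly where the used data ends
// and the block still has room for the extra elements.
bool canAppendInPlace(Block b, size_t offset, size_t length, size_t extra)
{
    return offset + length == b.used && b.used + extra <= b.capacity;
}

void main()
{
    auto b = Block(5, 16);                  // holds "hello", room for 16
    writeln(canAppendInPlace(b, 0, 5, 1));  // the whole array
    writeln(canAppendInPlace(b, 0, 4, 1));  // a slice like s[0..4]
    writeln(canAppendInPlace(b, 0, 0, 2));  // length reset to 0, then append
}
```

Under this model, the reduced test case from earlier in the thread (length set to 0, then `~= "Hi"`) would be forced to reallocate, because the empty slice no longer ends at the block's used length.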
http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |

Steven, you have no authority to mark this WONTFIX.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

schveiguy yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com
           Severity|normal                      |enhancement
           Keywords|wrong-code                  |

Sorry, I was thinking wontfix because the compiler functions as designed. I marked it as an enhancement instead. Removing wrong-code keyword also, as this is intended behavior.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|enhancement                 |normal
           Keywords|                            |accepts-invalid, spec

So the problem is that it _always_ leaves the decision to resize in place or reallocate to the runtime. The only way in which this can coexist with the principle of invariant is that it is illegal to increase the length of a const/invariant array. Therefore, going by the current spec, the bug is that DMD accepts the code.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

It seems to me then that this is a design choice - does the string length belong to the string or to the reference? For slices it must be the reference but for arrays? hmmm...

Currently in D, a dynamic array and a slice are indistinguishable and I'm not so sure that should be the case. There are good arguments for the current design and also for the separation of slices and dynamic arrays.

Common sense seems to say that if I change the length of a string that therefore every other reference to the same string should also honour the new length, and that this should also have no effect on previously captured slices of the string.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093

> It seems to me then that this is a design choice - does the string length belong to the string or to the reference? For slices it must be the reference but for arrays? hmmm...
>
> Currently in D, a dynamic array and a slice are indistinguishable and I'm not so sure that should be the case. There are good arguments for the current design and also for the separation of slices and dynamic arrays.
>
> Common sense seems to say that if I change the length of a string that therefore every other reference to the same string should also honour the new length, and that this should also have no effect on previously captured slices of the string.

Arrays should not be typed differently than slices IMO, they should be able to be passed to the same functions.

I think one of the two solutions I proposed would place the 'allocated length' of an array on the heap with the array data, thereby having the length stored in a shared location. Slices should respect this length, and if they cannot see the length, they should be reallocated as a full-blown array.
--
Nov 22 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2093 see also bug 2095 comment 6 --
Feb 19 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2093

Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

This is fixed by the patch in bug 3637. It is in dmd 2.041. Compiling the attached file results in the desired output.
--
Mar 16 2010
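For reference, here is a sketch of how the two snippets from this thread are expected to behave on a compiler that includes the fix (dmd 2.041 or later). It was written against current D and has not been checked against 2.041 itself. The function names `reducedCase` and `sliceAppendLeavesOriginal` are hypothetical wrappers around the code posted earlier.

```d
import std.stdio;

// Reduced test case from the thread; returns the final [s1, s2].
string[2] reducedCase()
{
    string s1, s2;
    s1 ~= "hello";
    s2 = s1;
    s1.length = 0;
    s1 ~= "Hi";     // s1 no longer ends at the used data, so it reallocates
    return [s1, s2];
}

// Appending to a slice that stops short of the array's end.
bool sliceAppendLeavesOriginal()
{
    char[] s = "hello".dup;
    char[] t = s[0 .. 4];
    t ~= '!';       // must not overwrite the 'o' that s still refers to
    return s == "hello" && t == "hell!" && t.ptr !is s.ptr;
}

void main()
{
    auto r = reducedCase();
    writeln(r[0]);
    writeln(r[1]);
    writeln(sliceAppendLeavesOriginal());
}
```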