digitalmars.D.bugs - Moving structs containing strings ...
- Bob W (55/55) Apr 23 2006 After moving structs containing char[n] arrays
-
Derek Parnell
(12/12)
Apr 23 2006
On Mon, 24 Apr 2006 08:44:57 +1000, Bob W
wrote: - Bob W (10/22) Apr 23 2006 Thanks for your suggestion. But I am more keen to know
- Derek Parnell (12/15) Apr 23 2006 Another solution is to recalculate the addresses of the id strings after
- Bob W (9/14) Apr 24 2006 I'll gladly accept this as a solution, because it will
- Derek Parnell (44/61) Apr 24 2006 However, there's a typo in it. Here is a fuller test of it ...
- Bob W (50/53) Apr 24 2006 While I like your source in principal, I am
- Frits van Bommel (10/27) Apr 24 2006 How about defining the struct like this:
- Bob W (18/26) Apr 24 2006 No idea what slightly slower means to you, but I can
- Bob W (14/14) Apr 24 2006 Thanks guys for your replies!
After moving structs containing char[n] arrays and char[] strings, the latter can point to the wrong string. This is a sample program producing the 'unexpected': ------ /////////////////////////////// // test - array struct quirks /////////////////////////////// // This program demonstrates an unexpected // D 'feature': // // When members of an array of structs are copied // from higher to lower array indices, some char[] // vars of the resulting structs are pointing to // wrong string locations (dmd 0.154). import std.stdio, std.string; alias writefln wrl; alias writef wr; alias toString tS; struct Sc { uint val; char[3] idz; char[] id; } Sc[] arr; int main() { // allocate array space and init array arr.length=3; with (arr[0]) { idz[]="ab\0"; id=idz[0..2]; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2]; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2]; val=103; } // display original wrl(); wrl("The original array:"); foreach (uint x, Sc v; arr) wrl("arr[%d] - val:%d, id:'%s'", x,v.val,v.id); // delete arr[0] by moving arr[1] and arr[2] arr[0]=arr[1]; arr[1]=arr[2]; arr=arr[0..2]; // adjust length // diplay 'expected' and 'actual' wrl(); wrl("Reduced array should look like this:"); foreach (uint x, Sc v; arr) wrl("arr[%d] - val:%d, id:'%s'", x,v.val,tS(v.idz)); wrl(); wrl("But it actually looks like that:"); foreach (uint x, Sc v; arr) wrl("arr[%d] - val:%d, id:'%s'", x,v.val,v.id); return 0; } -------------- Output of the above example: The original array: arr[0] - val:101, id:'ab' arr[1] - val:102, id:'cd' arr[2] - val:103, id:'ef' Reduced array should look like this: arr[0] - val:102, id:'cd' arr[1] - val:103, id:'ef' But it actually looks like that: arr[0] - val:102, id:'ef' <<<< arr[1] - val:103, id:'ef'
Apr 23 2006
On Mon, 24 Apr 2006 08:44:57 +1000, Bob W <nospam aol.com> wrote: For a workaround, with (arr[0]) { idz[]="ab\0"; id=idz[0..2]; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2]; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2]; val=103; } replace those lines with with (arr[0]) { idz[]="ab\0"; id=idz[0..2].dup; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2].dup; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2].dup; val=103; } -- Derek Parnell Melbourne, Australia
Apr 23 2006
"Derek Parnell" <derek psych.ward> wrote in message news:op.s8hhktza6b8z09 ginger.vic.bigpond.net.au...On Mon, 24 Apr 2006 08:44:57 +1000, Bob W <nospam aol.com> wrote: For a workaround, with (arr[0]) { idz[]="ab\0"; id=idz[0..2]; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2]; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2]; val=103; } replace those lines with with (arr[0]) { idz[]="ab\0"; id=idz[0..2].dup; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2].dup; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2].dup; val=103; } -- Derek Parnell Melbourne, AustraliaThanks for your suggestion. But I am more keen to know why the example mentioned in my post is not producing the desired results than in applying a workaround. If this question remains unanswered I'll have to revert to using pointers for the application which initially has produced these erroneus results. Using '.dup' several million times would probably bring my system to a standstill. I might be wrong with this guess, however ...
Apr 23 2006
Bob W wrote:"Derek Parnell" <derek psych.ward> wrote in message news:op.s8hhktza6b8z09 ginger.vic.bigpond.net.au...eeeewe :) The 'id' slices are mapped to absolute addresses, and thus are mapped to specific addresses within the array. When you remove array element 0 and compress the array, those addresses are not updated, and you end up with the result shown. After compression, the 'id' slice for val:102 is still pointing at where it used to live, although the original "cd" text has now been overwritten with the val:103 "ef" text instead :) The 'id' slice for val:103 is actually still pointing to its original location, which is now /past/ the end of the array (hehe). One solution is to adjust each 'id' slice when you compress or otherwise adjust the array content. Wouldn't you have to do this with pointers also? - KrisOn Mon, 24 Apr 2006 08:44:57 +1000, Bob W <nospam aol.com> wrote: For a workaround, with (arr[0]) { idz[]="ab\0"; id=idz[0..2]; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2]; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2]; val=103; } replace those lines with with (arr[0]) { idz[]="ab\0"; id=idz[0..2].dup; val=101; } with (arr[1]) { idz[]="cd\0"; id=idz[0..2].dup; val=102; } with (arr[2]) { idz[]="ef\0"; id=idz[0..2].dup; val=103; } -- Derek Parnell Melbourne, AustraliaThanks for your suggestion. But I am more keen to know why the example mentioned in my post is not producing the desired results than in applying a workaround. If this question remains unanswered I'll have to revert to using pointers for the application which initially has produced these erroneus results. Using '.dup' several million times would probably bring my system to a standstill. I might be wrong with this guess, however ...
Apr 23 2006
"kris" <foo bar.com> wrote in message news:e2hbp1$pnk$1 digitaldaemon.com...eeeewe :) The 'id' slices are mapped to absolute addresses, and thus are mapped to specific addresses within the array.Absolutely right! Now I am asking myself the usual question: Why didn't I take a break over the weekend and recognize this first thing on Monday morning?When you remove array element 0 and compress the array, those addresses are not updated, and you end up with the result shown. After compression, the 'id' slice for val:102 is still pointing at where it used to live, although the original "cd" text has now been overwritten with the val:103 "ef" text instead :) The 'id' slice for val:103 is actually still pointing to its original location, which is now /past/ the end of the array (hehe). One solution is to adjust each 'id' slice when you compress or otherwise adjust the array content. Wouldn't you have to do this with pointers also?I'll adjust. Thank you very much!
Apr 23 2006
On Mon, 24 Apr 2006 00:44:57 +0200, Bob W wrote:After moving structs containing char[n] arrays and char[] strings, the latter can point to the wrong string.Another solution is to recalculate the addresses of the id strings after changing the length ... arr=arr[0..2]; // adjust length foreach( inout Sc s; arr) s.id = s.idz[0..s.id.length]; -- Derek (skype: derek.j.parnell) Melbourne, Australia "Down with mediocracy!" 24/04/2006 12:19:51 PM
Apr 23 2006
"Derek Parnell" <derek psych.ward> wrote in message news:c7u4nwvom7ca$.o0vru71wqxwe.dlg 40tude.net...Another solution is to recalculate the addresses of the id strings after changing the length ... arr=arr[0..2]; // adjust length foreach( inout Sc s; arr) s.id = s.idz[0..s.id.length];I'll gladly accept this as a solution, because it will utilize existing strings within the structs-array. The workaround you have posted earlier allocates memory and duplicates strings one by one. That is something I would not want to try in real-world applications due to obvious reasons. Thanks for your input.
Apr 24 2006
On Mon, 24 Apr 2006 20:19:43 +1000, Bob W <nospam aol.com> wrote:"Derek Parnell" <derek psych.ward> wrote in message news:c7u4nwvom7ca$.o0vru71wqxwe.dlg 40tude.net...However, there's a typo in it. Here is a fuller test of it ... //------------------------ import std.stdio; alias std.stdio.writefln wrl; void AssignIdAddr(Sc[] pArr) { foreach( inout Sc s; pArr) s.id = s.idz[0 .. $]; } void DisplayArray(Sc[] pArr, char[] pMsg) { wrl("\n%s...", pMsg); foreach (uint i, Sc s; pArr) wrl("arr[%d] - val:%d, id:'%s'", i, s.val, s.id); } struct Sc { uint val; char[2] idz; char[] id; void opCall(char[] z, uint v) { idz[]=z; val = v; } } Sc[] arr; void main() { // init array arr.length = arr.length + 1; arr[$-1]("ab", 101); arr.length = arr.length + 1; arr[$-1]("cd", 102); arr.length = arr.length + 1; arr[$-1]("ef", 103); arr.length = arr.length + 1; arr[$-1]("gh", 104); AssignIdAddr(arr); DisplayArray(arr, "Original"); // delete first item for( uint i = 0; (i+1) < arr.length; i++) arr[i]=arr[i+1]; arr.length = arr.length - 1; // adjust length AssignIdAddr(arr); DisplayArray(arr, "Updated"); } //------------------------ -- Derek Parnell Melbourne, AustraliaAnother solution is to recalculate the addresses of the id strings after changing the length ... arr=arr[0..2]; // adjust length foreach( inout Sc s; arr) s.id = s.idz[0..s.id.length];I'll gladly accept this as a solution, because it will utilize existing strings within the structs-array. The workaround you have posted earlier allocates memory and duplicates strings one by one. That is something I would not want to try in real-world applications due to obvious reasons. Thanks for your input.
Apr 24 2006
"Derek Parnell" <derek psych.ward> wrote in message news:op.s8if7ipr6b8z09 ginger.vic.bigpond.net.au............ However, there's a typo in it. Here is a fuller test of it ... .........While I like your source in principal, I am somewhat allergic to sequential memory allocations. Therefore I have made some modifications (see sample below). std.file.read() is probably still not working properly with large files due to a similar reason. Please note that this is my last post about the subject and I definitely do not intend to start anything like a programming competition (I wish I had the time for it though). My own program, which looks completely different than the simplified examples, was fixed in a couple of minutes after reading kris' post. So I won't concentrate on this issue any further. ------------------ import std.stdio; alias std.stdio.writefln wrl; struct Sc { uint val; char[2] idz; char[] id; void opCall() { id=idz; } void opCall(char[] z, uint v) { idz[]=z; val=v; id=idz; } } Sc[] arr; void ShowMe(char[] msg) { wrl("\n%s ...", msg); foreach (uint i, Sc s; arr) wrl("arr[%d] - val:%d, id:'%s'", i,s.val,s.id); } void main() { // DmD documentation - quote: " ... Resizing a dynamic // array is a relatively expensive operation ..." arr.length=99; // leave enough space for ev. additions uint n=0; arr[n++]("ab",101); arr[n++]("cd",102); arr[n++]("ef",103); arr[n++]("gh",104); arr.length=n; // now cut it's size back ShowMe("Original"); // delete 1st item for (uint i=1; i<arr.length; i++) { arr[i-1]=arr[i]; arr[i-1](); } arr.length=arr.length-1; // adjust length ShowMe("Updated"); }
Apr 24 2006
Bob W wrote:"Derek Parnell" <derek psych.ward> wrote in message news:c7u4nwvom7ca$.o0vru71wqxwe.dlg 40tude.net...How about defining the struct like this: struct Sc { uint val; char[3] idz; char[] id() { return idz[0..3]; } } This turns 'id' into a property instead of a field. This might be slightly slower*, but has several advantages: - It will always return the current string without needing manual updates. - It doesn't allocate. - It removes the need to initialize 'id' separately. - Added bonus: Sc.sizeof is halved. (goes from 16 bytes to 8 bytes) *) Not necessarily though, it may well be optimized out.Another solution is to recalculate the addresses of the id strings after changing the length ... arr=arr[0..2]; // adjust length foreach( inout Sc s; arr) s.id = s.idz[0..s.id.length];I'll gladly accept this as a solution, because it will utilize existing strings within the structs-array. The workaround you have posted earlier allocates memory and duplicates strings one by one. That is something I would not want to try in real-world applications due to obvious reasons.
Apr 24 2006
"Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message news:e2iqkp$2fvl$1 digitaldaemon.com...How about defining the struct like this: struct Sc { uint val; char[3] idz; char[] id() { return idz[0..3]; } } This turns 'id' into a property instead of a field. This might be slightly slower*, but has several advantages:No idea what slightly slower means to you, but I can guarantee you that with indirect string access you won't even get to 50% of the original speed.- It will always return the current string without needing manual updates.I need direct access to the strings, due to performance reasons. Therefore I do not mind the updates whenerver they are required.- It doesn't allocate.This would be unacceptable anyway.- It removes the need to initialize 'id' separately.You are referring to the sample which has those convenient constant length strings. My application does not work this way, however. I'd have to have some basic initialisation first in order to get quicker access later.- Added bonus: Sc.sizeof is halved. (goes from 16 bytes to 8 bytes)I guess most of us have 1 or 2GB installed these days, true? Don't get me wrong. Your idea would be definitely worthwhile considering for other applications. But it just does not fit my current requirements.
Apr 24 2006
Thanks guys for your replies! For anyone who had no time to check upon my suspicions that D was producing something 'unexpected', here is a short explanation: My sample program can be imagined as the attempt to mark a freeway exit with an arrow inside a car in order to find it later. Unfortunately I was moving the car afterwards so the arrow was pointing to ... (you guess). As a result I have to admit that D did not deserve any of my secret suspicions about its stability in this respect. At least I was careful enough not to use the word 'bug' in any of my posts. ;-)
Apr 24 2006