digitalmars.D - Resizable Arrays?
- Jason House (5/5) Feb 27 2009 Are there any good reasons to allow built in arrays to be resizable?
- Tim M (5/15) Feb 27 2009 C++ has stl vectors to make it up for lack of resizeable arrays. Theres ...
- BCS (11/14) Feb 27 2009 I think the thought is that arrays can have dynamic length but can't be ...
- Jason House (5/26) Feb 27 2009 Right. Length is fixed until (re)allication.
- Denis Koroskin (2/30) Feb 27 2009 I might be wrong, but it seems that arrays are always reallocating memor...
-
Stewart Gordon
(6/9)
Feb 27 2009
- dsimcha (49/54) Feb 27 2009 to another violates type safety. Those bugs can be solved by always real...
- dsimcha (7/61) Feb 27 2009 One minor detail that I thought of: When returning a T[new] from a func...
- Sean Kelly (2/11) Feb 28 2009 C has dynamic arrays now. If D got rid of them I might just cry.
- Tim M (7/19) Feb 28 2009 How do you mean? Are you talking about something like this:
- Jason House (2/14) Feb 28 2009 By dynamic, do you mean no length as part of the type or resizable in pl...
- dsimcha (17/31) Mar 01 2009 array to another violates type safety. Those bugs can be solved by alway...
- Bill Baxter (21/52) Mar 01 2009 ..
- bearophile (4/4) Feb 28 2009 Jason House>By dynamic, do you mean no length as part of the type or res...
- Lutger (8/14) Mar 01 2009 generally takes place. If you don't want reallocations, then never grow ...
Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appending is incredibly slow. There are also bugs in bugzilla where resizing arrays after assigning one array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse... Is it time to rely on standard libraries for data structures and let built in arrays be fixed length? It's a breaking change...
Feb 27 2009
On Sat, 28 Feb 2009 12:55:44 +1300, Jason House <jason.james.house gmail.com> wrote:Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appending is incredibly slow. There are also bugs in bugzilla where resizing arrays after assigning one array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse... Is it time to rely on standard libraries for data structures and let built in arrays be fixed length? It's a breaking change...C++ has stl vectors to make it up for lack of resizeable arrays. Theres nothing stopping you from having all your arrays static in D. I think your crazy :)
Feb 27 2009
Reply to tim,Theres nothing stopping you from having all your arrays static in D. I think your crazy :)I think the thought is that arrays can have dynamic length but can't be changed after allocation. starting with: int[] arr = new int[15]; assigning to length arr.length = 30; would always become auto tmp = new int[30]; tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)]; arr = tmp;
Feb 27 2009
BCS Wrote:Reply to tim,Right. Length is fixed until (re)allication.Theres nothing stopping you from having all your arrays static in D. I think your crazy :)I think the thought is that arrays can have dynamic length but can't be changed after allocation.starting with: int[] arr = new int[15]; assigning to length arr.length = 30; would always become auto tmp = new int[30]; tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)]; arr = tmp;Either that or a simple compile error... It's also possible a middle ground where only const(T)[] is fixed length until (re)allocation. I'm hoping there will be good discussion on the pros and cons of alternatives.
Feb 27 2009
On Sat, 28 Feb 2009 04:27:52 +0300, Jason House <jason.james.house gmail.com> wrote:BCS Wrote:I might be wrong, but it seems that arrays are always reallocating memory when resize. I might be wrong but that's the behaviour I discovered in a recent test.Reply to tim,Right. Length is fixed until (re)allication.Theres nothing stopping you from having all your arrays static in D. I think your crazy :)I think the thought is that arrays can have dynamic length but can't be changed after allocation.starting with: int[] arr = new int[15]; assigning to length arr.length = 30; would always become auto tmp = new int[30]; tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)]; arr = tmp;Either that or a simple compile error... It's also possible a middle ground where only const(T)[] is fixed length until (re)allocation. I'm hoping there will be good discussion on the pros and cons of alternatives.
Feb 27 2009
Jason House wrote:Are there any good reasons to allow built in arrays to be resizable?http://www.digitalmars.com/d/2.0/builtin.htmlThere's already plenty of empirical evidence that building arrays by appending is incredibly slow.<snip> You need to really get into programming in D and explore the many possible ways of using arrays. Stewart.
Feb 27 2009
== Quote from Jason House (jason.james.house gmail.com)'s articleAre there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appendingis incredibly slow.There are also bugs in bugzilla where resizing arrays after assigning one arrayto another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse...Is it time to rely on standard libraries for data structures and let built inarrays be fixed length? It's a breaking change... I think that builtin arrays in D are a godsend. Resizable arrays are such an important, heavily used feature that they should get special treatment from the language, to ensure that they are syntactically elegant, efficient both at runtime and in terms of compilation time, easy to use, free of odd corner cases, and in general are true first class citizens. Nonetheless, D's arrays do have some rough edges. My proposal would be as follows: There should be two array types: T[new] and T[]. This has been proposed before, though not in as much detail as what I'm suggesting. The semantics should work as follows: T[new] is considered a subtype of T[]. This works mostly like you would expect. T[new] can be implicitly converted to T[]. The twist is that T[] can also be assigned to T[new] without any explicit conversion, but a copy of the underlying data is implicitly made for safety. Assigning either a T[] or a T[new] to a T[] will result in aliasing: T[] foo; T[new] bar = new T[N]; bar[0] = 1; foo = bar; foo[0] = 2; writeln(bar[0]); // Prints 2. T[new]s are always assigned by value to make concatenation and resizing safe. The underlying data is copied implicitly when assigning from a T[] to a T[new] or from a T[new] to another T[new]. T[new] foo; T[] bar = new T[N]; // new returns T[new], using implicit conversion. bar[0] = 1; foo = bar; foo[0] = 2; writeln(bar[0]); // Prints 1. T[]s can be sliced, but cannot be enlarged or appended to. They should be laid out the same way they are now, as a struct of a pointer and a length. T[new]s can be appended to and enlarged in addition to being sliced. Their layout should be a struct of pointer, length, capacity to make appending fast. This will also make implicit conversion to T[] essentially free, since all that is necessary is slicing off the capacity field. I also wonder if this proposal could be extended to fix some of the covariance issues with arrays (see Bugzilla 2095). This might work by only allowing covariant types to be assigned to T[new], not T[]. For example: class A{} class B:A{} B[new] ba = [new B]; A[] aa = ba; // Error: Cannot assign covariant types to T[]. A[new] aa = ba; // Works, but copy is implicitly made.
Feb 27 2009
== Quote from dsimcha (dsimcha yahoo.com)'s article== Quote from Jason House (jason.james.house gmail.com)'s articleOne minor detail that I thought of: When returning a T[new] from a function, the copying could be optimized away. If any escaping happened during the function call, this would be either by value or as a T[] instead of a T[new]. Therefore, at the end of a function, the only T[new] reference to the heap space in question is guaranteed to be the T[new] that is about to be returned. Since the reference in the callee is about to be destroyed, it can safely be returned without any copying.Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appendingis incredibly slow.There are also bugs in bugzilla where resizing arrays after assigning one arrayto another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse...Is it time to rely on standard libraries for data structures and let built inarrays be fixed length? It's a breaking change... I think that builtin arrays in D are a godsend. Resizable arrays are such an important, heavily used feature that they should get special treatment from the language, to ensure that they are syntactically elegant, efficient both at runtime and in terms of compilation time, easy to use, free of odd corner cases, and in general are true first class citizens. Nonetheless, D's arrays do have some rough edges. My proposal would be as follows: There should be two array types: T[new] and T[]. This has been proposed before, though not in as much detail as what I'm suggesting. The semantics should work as follows: T[new] is considered a subtype of T[]. This works mostly like you would expect. T[new] can be implicitly converted to T[]. The twist is that T[] can also be assigned to T[new] without any explicit conversion, but a copy of the underlying data is implicitly made for safety. Assigning either a T[] or a T[new] to a T[] will result in aliasing: T[] foo; T[new] bar = new T[N]; bar[0] = 1; foo = bar; foo[0] = 2; writeln(bar[0]); // Prints 2. T[new]s are always assigned by value to make concatenation and resizing safe. The underlying data is copied implicitly when assigning from a T[] to a T[new] or from a T[new] to another T[new]. T[new] foo; T[] bar = new T[N]; // new returns T[new], using implicit conversion. bar[0] = 1; foo = bar; foo[0] = 2; writeln(bar[0]); // Prints 1. T[]s can be sliced, but cannot be enlarged or appended to. They should be laid out the same way they are now, as a struct of a pointer and a length. T[new]s can be appended to and enlarged in addition to being sliced. Their layout should be a struct of pointer, length, capacity to make appending fast. This will also make implicit conversion to T[] essentially free, since all that is necessary is slicing off the capacity field. I also wonder if this proposal could be extended to fix some of the covariance issues with arrays (see Bugzilla 2095). This might work by only allowing covariant types to be assigned to T[new], not T[]. For example: class A{} class B:A{} B[new] ba = [new B]; A[] aa = ba; // Error: Cannot assign covariant types to T[]. A[new] aa = ba; // Works, but copy is implicitly made.
Feb 27 2009
Jason House wrote:Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appending is incredibly slow. There are also bugs in bugzilla where resizing arrays after assigning one array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse... Is it time to rely on standard libraries for data structures and let built in arrays be fixed length? It's a breaking change...C has dynamic arrays now. If D got rid of them I might just cry.
Feb 28 2009
On Sun, 01 Mar 2009 05:30:28 +1300, Sean Kelly <sean invisibleduck.org> wrote:Jason House wrote:How do you mean? Are you talking about something like this: void someFunc(int s) { int ar[s]; }Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appending is incredibly slow. There are also bugs in bugzilla where resizing arrays after assigning one array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse... Is it time to rely on standard libraries for data structures and let built in arrays be fixed length? It's a breaking change...C has dynamic arrays now. If D got rid of them I might just cry.
Feb 28 2009
Sean Kelly Wrote:Jason House wrote:By dynamic, do you mean no length as part of the type or resizable in place? Array resizing can lead to hidden allocations, which is against Tango design (my closest proxy for your style preferences). My past use of arrays in C did not include calls to reallocate them in place.Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays by appending is incredibly slow. There are also bugs in bugzilla where resizing arrays after assigning one array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse... Is it time to rely on standard libraries for data structures and let built in arrays be fixed length? It's a breaking change...C has dynamic arrays now. If D got rid of them I might just cry.
Feb 28 2009
== Quote from Jason House (jason.james.house gmail.com)'s articleSean Kelly Wrote:appending is incredibly slow.Jason House wrote:Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays byarray to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse...There are also bugs in bugzilla where resizing arrays after assigning onein arrays be fixed length? It's a breaking change...Is it time to rely on standard libraries for data structures and let builtArray resizing can lead to hidden allocations, which is against Tango design (my closest proxy for your style preferences). My past use of arrays in C did not include calls to reallocate them in place. IMHO, you can't have a high level language while at the same time completely avoiding hidden allocations. The whole point of D is that it's supposed to be a higher level language than the others in its niche, namely C and C++. The cliche is that we should forget about small efficiencies 97% of the time. If you're writing something that falls in that 97%, in a high level language, stuff should just work and you shouldn't care about a few allocations. If you're writing stuff in the other 3%, the onus should be on you to avoid features with hidden allocations. Writing performance critical or real-time code is hard, and cutting back on (slightly) hidden allocations isn't going to help much.C has dynamic arrays now. If D got rid of them I might just cry.By dynamic, do you mean no length as part of the type or resizable in place?
Mar 01 2009
On Sun, Mar 1, 2009 at 11:17 PM, dsimcha <dsimcha yahoo.com> wrote:=3D=3D Quote from Jason House (jason.james.house gmail.com)'s articleg oneSean Kelly Wrote:appending is incredibly slow.Jason House wrote:Are there any good reasons to allow built in arrays to be resizable? There's already plenty of empirical evidence that building arrays byThere are also bugs in bugzilla where resizing arrays after assignin=array to another violates type safety. Those bugs can be solved by always reallocatimg arrays when resizing, but that makes performance even worse.=..builtIs it time to rely on standard libraries for data structures and let=in arrays be fixed length? It's a breaking change...ace?C has dynamic arrays now. =A0If D got rid of them I might just cry.By dynamic, do you mean no length as part of the type or resizable in pl=Array resizing can lead to hidden allocations, which is against Tango des=ign (myclosest proxy for your style preferences). My past use of arrays in C did=notinclude calls to reallocate them in place. IMHO, you can't have a high level language while at the same time complet=elyavoiding hidden allocations. =A0The whole point of D is that it's suppose=d to be ahigher level language than the others in its niche, namely C and C++. =A0=The clicheis that we should forget about small efficiencies 97% of the time. =A0If =you'rewriting something that falls in that 97%, in a high level language, stuff=shouldjust work and you shouldn't care about a few allocations. =A0If you're wr=iting stuffin the other 3%, the onus should be on you to avoid features with hidden allocations. =A0Writing performance critical or real-time code is hard, a=nd cuttingback on (slightly) hidden allocations isn't going to help much.I pretty much agree, but it would be nice if there were some debugging flag you could toggle to cause the gc to assert() if some allocation is attempted. Like std.gc.prohibitAllocations() or somesuch. That way in the block of code where you think you are avoiding allocation it would be easy to find a place where you slipped up. --bb --bb
Mar 01 2009
Jason House>By dynamic, do you mean no length as part of the type or resizable in place? Array resizing can lead to hidden allocations,< Thy aren't hidden: when you grow a dynamic array you know an allocation generally takes place. If you don't want reallocations, then never grow or join them, and the reallocations will not happen. There's nothing hidden. Bye, bearophile
Feb 28 2009
bearophile wrote:Jason House>By dynamic, do you mean no length as part of the type orresizable in place? Array resizing can lead to hidden allocations,<Thy aren't hidden: when you grow a dynamic array you know an allocationgenerally takes place. If you don't want reallocations, then never grow or join them, and the reallocations will not happen. There's nothing hidden.Bye, bearophileI agree, it's something that I think is quickly understood by newcomers to where the common idiom is to use StringBuilder. I have seen such builders floating around in the ng, but is it phobos or tango already?
Mar 01 2009