digitalmars.D - Resizable Arrays?

Jason House (5/5) Feb 27 2009 Are there any good reasons to allow built in arrays to be resizable?

Tim M (5/15) Feb 27 2009 C++ has stl vectors to make it up for lack of resizeable arrays. Theres ...

BCS (11/14) Feb 27 2009 I think the thought is that arrays can have dynamic length but can't be ...

Jason House (5/26) Feb 27 2009 Right. Length is fixed until (re)allication.

Denis Koroskin (2/30) Feb 27 2009 I might be wrong, but it seems that arrays are always reallocating memor...

Stewart Gordon (6/9) Feb 27 2009
dsimcha (49/54) Feb 27 2009 to another violates type safety. Those bugs can be solved by always real...

dsimcha (7/61) Feb 27 2009 One minor detail that I thought of: When returning a T[new] from a func...

Sean Kelly (2/11) Feb 28 2009 C has dynamic arrays now. If D got rid of them I might just cry.

Tim M (7/19) Feb 28 2009 How do you mean? Are you talking about something like this:

Jason House (2/14) Feb 28 2009 By dynamic, do you mean no length as part of the type or resizable in pl...

dsimcha (17/31) Mar 01 2009 array to another violates type safety. Those bugs can be solved by alway...

Bill Baxter (21/52) Mar 01 2009 ..

bearophile (4/4) Feb 28 2009 Jason House>By dynamic, do you mean no length as part of the type or res...

Lutger (8/14) Mar 01 2009 generally takes place. If you don't want reallocations, then never grow ...

Jason House <jason.james.house gmail.com> writes:

Are there any good reasons to allow built in arrays to be resizable?

There's already plenty of empirical evidence that building arrays by appending
is incredibly slow.

There are also bugs in bugzilla where resizing arrays after assigning one array
to another violates type safety. Those bugs can be solved by always
reallocatimg arrays when resizing, but that makes performance even worse...



Is it time to rely on standard libraries for data structures and let built in
arrays be fixed length? It's a breaking change...

Feb 27 2009

"Tim M" <a b.com> writes:

On Sat, 28 Feb 2009 12:55:44 +1300, Jason House  
<jason.james.house gmail.com> wrote:

 Are there any good reasons to allow built in arrays to be resizable?

 There's already plenty of empirical evidence that building arrays by  
 appending is incredibly slow.

 There are also bugs in bugzilla where resizing arrays after assigning  
 one array to another violates type safety. Those bugs can be solved by  
 always reallocatimg arrays when resizing, but that makes performance  
 even worse...



 Is it time to rely on standard libraries for data structures and let  
 built in arrays be fixed length? It's a breaking change...


C++ has stl vectors to make it up for lack of resizeable arrays. Theres  
nothing stopping you from having all your arrays static in D. I think your  
crazy :)

Feb 27 2009

BCS <ao pathlink.com> writes:

Reply to tim,

 Theres nothing stopping you from having all your arrays static in D. I
 think your crazy :)
 

I think the thought is that arrays can have dynamic length but can't be changed 
after allocation.

starting with:

int[] arr = new int[15];

assigning to length

arr.length = 30;

would always become

auto tmp = new int[30];
tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)];
arr = tmp;

Feb 27 2009

Jason House <jason.james.house gmail.com> writes:

BCS Wrote:

 Reply to tim,
 
 Theres nothing stopping you from having all your arrays static in D. I
 think your crazy :)
 

 
 I think the thought is that arrays can have dynamic length but can't be
changed 
 after allocation.

Right. Length is fixed until (re)allication.

 
 starting with:
 
 int[] arr = new int[15];
 
 assigning to length
 
 arr.length = 30;
 
 would always become
 
 auto tmp = new int[30];
 tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)];
 arr = tmp;

Either that or a simple compile error...

It's also possible a middle ground where only const(T)[] is fixed length until
(re)allocation. I'm hoping there will be good discussion on the pros and cons
of alternatives.

Feb 27 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 28 Feb 2009 04:27:52 +0300, Jason House <jason.james.house gmail.com>
wrote:

 BCS Wrote:

 Reply to tim,

 Theres nothing stopping you from having all your arrays static in D. I
 think your crazy :)

 I think the thought is that arrays can have dynamic length but can't be  
 changed
 after allocation.

 Right. Length is fixed until (re)allication.


 starting with:

 int[] arr = new int[15];

 assigning to length

 arr.length = 30;

 would always become

 auto tmp = new int[30];
 tmp[0..min(arr.length,$)] = arr[0..min(tmp.length,$)];
 arr = tmp;

 Either that or a simple compile error...

 It's also possible a middle ground where only const(T)[] is fixed length  
 until (re)allocation. I'm hoping there will be good discussion on the  
 pros and cons of alternatives.

I might be wrong, but it seems that arrays are always reallocating memory when
resize. I might be wrong but that's the behaviour I discovered in a recent test.

Feb 27 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?

http://www.digitalmars.com/d/2.0/builtin.html

 There's already plenty of empirical evidence that building arrays by 
 appending is incredibly slow.

<snip>

You need to really get into programming in D and explore the many 
possible ways of using arrays.

Stewart.

Feb 27 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Are there any good reasons to allow built in arrays to be resizable?
 There's already plenty of empirical evidence that building arrays by appending

is incredibly slow.
 There are also bugs in bugzilla where resizing arrays after assigning one array

to another violates type safety. Those bugs can be solved by always reallocatimg
arrays when resizing, but that makes performance even worse...

 Is it time to rely on standard libraries for data structures and let built in

arrays be fixed length? It's a breaking change...

I think that builtin arrays in D are a godsend.  Resizable arrays are such an
important, heavily used feature that they should get special treatment from the
language, to ensure that they are syntactically elegant, efficient both at
runtime
and in terms of compilation time, easy to use, free of odd corner cases, and in
general are true first class citizens.  Nonetheless, D's arrays do have some
rough
edges.  My proposal would be as follows:

There should be two array types:
T[new] and T[].
This has been proposed before, though not in as much detail as what I'm
suggesting.  The semantics should work as follows:

T[new] is considered a subtype of T[].  This works mostly like you would expect.
T[new] can be implicitly converted to T[].  The twist is that T[] can also be
assigned to T[new] without any explicit conversion, but a copy of the underlying
data is implicitly made for safety.

Assigning either a T[] or a T[new] to a T[] will result in aliasing:

T[] foo;
T[new] bar = new T[N];
bar[0] = 1;
foo = bar;
foo[0] = 2;
writeln(bar[0]);  // Prints 2.

T[new]s are always assigned by value to make concatenation and resizing safe. 
The
underlying data is copied implicitly when assigning from a T[] to a T[new] or
from
a T[new] to another T[new].

T[new] foo;
T[] bar = new T[N];  // new returns T[new], using implicit conversion.
bar[0] = 1;
foo = bar;
foo[0] = 2;
writeln(bar[0]);  // Prints 1.

T[]s can be sliced, but cannot be enlarged or appended to.  They should be laid
out the same way they are now, as a struct of a pointer and a length.

T[new]s can be appended to and enlarged in addition to being sliced.  Their
layout
should be a struct of pointer, length, capacity to make appending fast.  This
will
also make implicit conversion to T[] essentially free, since all that is
necessary
is slicing off the capacity field.

I also wonder if this proposal could be extended to fix some of the covariance
issues with arrays (see Bugzilla 2095).  This might work by only allowing
covariant types to be assigned to T[new], not T[].  For example:

class A{}
class B:A{}

B[new] ba = [new B];
A[] aa = ba;  // Error:  Cannot assign covariant types to T[].
A[new] aa = ba;  // Works, but copy is implicitly made.

Feb 27 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from dsimcha (dsimcha yahoo.com)'s article
 == Quote from Jason House (jason.james.house gmail.com)'s article
 Are there any good reasons to allow built in arrays to be resizable?
 There's already plenty of empirical evidence that building arrays by appending

 is incredibly slow.
 There are also bugs in bugzilla where resizing arrays after assigning one array

 to another violates type safety. Those bugs can be solved by always
reallocatimg
 arrays when resizing, but that makes performance even worse...

 Is it time to rely on standard libraries for data structures and let built in

 arrays be fixed length? It's a breaking change...
 I think that builtin arrays in D are a godsend.  Resizable arrays are such an
 important, heavily used feature that they should get special treatment from the
 language, to ensure that they are syntactically elegant, efficient both at
runtime
 and in terms of compilation time, easy to use, free of odd corner cases, and in
 general are true first class citizens.  Nonetheless, D's arrays do have some
rough
 edges.  My proposal would be as follows:
 There should be two array types:
 T[new] and T[].
 This has been proposed before, though not in as much detail as what I'm
 suggesting.  The semantics should work as follows:
 T[new] is considered a subtype of T[].  This works mostly like you would
expect.
 T[new] can be implicitly converted to T[].  The twist is that T[] can also be
 assigned to T[new] without any explicit conversion, but a copy of the
underlying
 data is implicitly made for safety.
 Assigning either a T[] or a T[new] to a T[] will result in aliasing:
 T[] foo;
 T[new] bar = new T[N];
 bar[0] = 1;
 foo = bar;
 foo[0] = 2;
 writeln(bar[0]);  // Prints 2.
 T[new]s are always assigned by value to make concatenation and resizing safe. 
The
 underlying data is copied implicitly when assigning from a T[] to a T[new] or
from
 a T[new] to another T[new].
 T[new] foo;
 T[] bar = new T[N];  // new returns T[new], using implicit conversion.
 bar[0] = 1;
 foo = bar;
 foo[0] = 2;
 writeln(bar[0]);  // Prints 1.
 T[]s can be sliced, but cannot be enlarged or appended to.  They should be laid
 out the same way they are now, as a struct of a pointer and a length.
 T[new]s can be appended to and enlarged in addition to being sliced.  Their
layout
 should be a struct of pointer, length, capacity to make appending fast.  This
will
 also make implicit conversion to T[] essentially free, since all that is
necessary
 is slicing off the capacity field.
 I also wonder if this proposal could be extended to fix some of the covariance
 issues with arrays (see Bugzilla 2095).  This might work by only allowing
 covariant types to be assigned to T[new], not T[].  For example:
 class A{}
 class B:A{}
 B[new] ba = [new B];
 A[] aa = ba;  // Error:  Cannot assign covariant types to T[].
 A[new] aa = ba;  // Works, but copy is implicitly made.

One minor detail that I thought of:  When returning a T[new] from a function,
the
copying could be optimized away.  If any escaping happened during the function
call, this would be either by value or as a T[] instead of a T[new].  Therefore,
at the end of a function, the only T[new] reference to the heap space in
question
is guaranteed to be the T[new] that is about to be returned.  Since the
reference
in the callee is about to be destroyed, it can safely be returned without any
copying.

Feb 27 2009

Sean Kelly <sean invisibleduck.org> writes:

Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?
 
 There's already plenty of empirical evidence that building arrays by appending
is incredibly slow.
 
 There are also bugs in bugzilla where resizing arrays after assigning one
array to another violates type safety. Those bugs can be solved by always
reallocatimg arrays when resizing, but that makes performance even worse...
 

 
 Is it time to rely on standard libraries for data structures and let built in
arrays be fixed length? It's a breaking change... 

C has dynamic arrays now.  If D got rid of them I might just cry.

Feb 28 2009

"Tim M" <a b.com> writes:

On Sun, 01 Mar 2009 05:30:28 +1300, Sean Kelly <sean invisibleduck.org>  
wrote:

 Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?
  There's already plenty of empirical evidence that building arrays by  
 appending is incredibly slow.
  There are also bugs in bugzilla where resizing arrays after assigning  
 one array to another violates type safety. Those bugs can be solved by  
 always reallocatimg arrays when resizing, but that makes performance  
 even worse...

  Is it time to rely on standard libraries for data structures and let  
 built in arrays be fixed length? It's a breaking change...

 C has dynamic arrays now.  If D got rid of them I might just cry.


How do you mean? Are you talking about something like this:

void someFunc(int s)
{
    int ar[s];
}

Feb 28 2009

Jason House <jason.james.house gmail.com> writes:

Sean Kelly Wrote:

 Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?
 
 There's already plenty of empirical evidence that building arrays by appending
is incredibly slow.
 
 There are also bugs in bugzilla where resizing arrays after assigning one
array to another violates type safety. Those bugs can be solved by always
reallocatimg arrays when resizing, but that makes performance even worse...
 

 
 Is it time to rely on standard libraries for data structures and let built in
arrays be fixed length? It's a breaking change... 

 
 C has dynamic arrays now.  If D got rid of them I might just cry.

By dynamic, do you mean no length as part of the type or resizable in place?
Array resizing can lead to hidden allocations, which is against Tango design
(my closest proxy for your style preferences). My past use of arrays in C did
not include calls to reallocate them in place.

Feb 28 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?

 There's already plenty of empirical evidence that building arrays by



appending is incredibly slow.
 There are also bugs in bugzilla where resizing arrays after assigning one



array to another violates type safety. Those bugs can be solved by always
reallocatimg arrays when resizing, but that makes performance even worse...


 Is it time to rely on standard libraries for data structures and let built



in arrays be fixed length? It's a breaking change...
 C has dynamic arrays now.  If D got rid of them I might just cry.

 By dynamic, do you mean no length as part of the type or resizable in place?

Array resizing can lead to hidden allocations, which is against Tango design (my
closest proxy for your style preferences). My past use of arrays in C did not
include calls to reallocate them in place.

IMHO, you can't have a high level language while at the same time completely
avoiding hidden allocations.  The whole point of D is that it's supposed to be a
higher level language than the others in its niche, namely C and C++.  The
cliche
is that we should forget about small efficiencies 97% of the time.  If you're
writing something that falls in that 97%, in a high level language, stuff should
just work and you shouldn't care about a few allocations.  If you're writing
stuff
in the other 3%, the onus should be on you to avoid features with hidden
allocations.  Writing performance critical or real-time code is hard, and
cutting
back on (slightly) hidden allocations isn't going to help much.

Mar 01 2009

Bill Baxter <wbaxter gmail.com> writes:

On Sun, Mar 1, 2009 at 11:17 PM, dsimcha <dsimcha yahoo.com> wrote:
 =3D=3D Quote from Jason House (jason.james.house gmail.com)'s article
 Sean Kelly Wrote:
 Jason House wrote:
 Are there any good reasons to allow built in arrays to be resizable?

 There's already plenty of empirical evidence that building arrays by



 appending is incredibly slow.
 There are also bugs in bugzilla where resizing arrays after assignin=




g one
 array to another violates type safety. Those bugs can be solved by always
 reallocatimg arrays when resizing, but that makes performance even worse.=

..


 Is it time to rely on standard libraries for data structures and let=




 built
 in arrays be fixed length? It's a breaking change...
 C has dynamic arrays now. =A0If D got rid of them I might just cry.

 By dynamic, do you mean no length as part of the type or resizable in pl=


ace?
 Array resizing can lead to hidden allocations, which is against Tango des=

ign (my
 closest proxy for your style preferences). My past use of arrays in C did=

 not
 include calls to reallocate them in place.

 IMHO, you can't have a high level language while at the same time complet=

ely
 avoiding hidden allocations. =A0The whole point of D is that it's suppose=

d to be a
 higher level language than the others in its niche, namely C and C++. =A0=

The cliche
 is that we should forget about small efficiencies 97% of the time. =A0If =

you're
 writing something that falls in that 97%, in a high level language, stuff=

 should
 just work and you shouldn't care about a few allocations. =A0If you're wr=

iting stuff
 in the other 3%, the onus should be on you to avoid features with hidden
 allocations. =A0Writing performance critical or real-time code is hard, a=

nd cutting
 back on (slightly) hidden allocations isn't going to help much.

I pretty much agree, but it would be nice if there were some debugging
flag you could toggle to cause the gc to assert() if some allocation
is attempted.   Like   std.gc.prohibitAllocations()  or somesuch.
That way in the block of code where you think you are avoiding
allocation it would be easy to find a place where you slipped up.

--bb

--bb

Mar 01 2009

bearophile <bearophileHUGS lycos.com> writes:

Jason House>By dynamic, do you mean no length as part of the type or resizable
in place? Array resizing can lead to hidden allocations,<

Thy aren't hidden: when you grow a dynamic array you know an allocation
generally takes place. If you don't want reallocations, then never grow or join
them, and the reallocations will not happen. There's nothing hidden.

Bye,
bearophile

Feb 28 2009

Lutger <lutger.blijdestijn gmail.com> writes:

bearophile wrote:

 Jason House>By dynamic, do you mean no length as part of the type or 

resizable in place? Array resizing can lead to hidden allocations,<
 
 Thy aren't hidden: when you grow a dynamic array you know an allocation 

generally takes place. If you don't want reallocations, then never grow or 
join them, and the reallocations will not happen. There's nothing hidden.
 
 Bye,
 bearophile

I agree, it's something that I think is quickly understood by newcomers to 

where the common idiom is to use StringBuilder. I have seen such builders 
floating around in the ng, but is it phobos or tango already?

Mar 01 2009

D Programming

C/C++ Programming

Other

digitalmars.D - Resizable Arrays?