digitalmars.D - T[new]

Walter Bright (22/22) Aug 09 2009 D has a number of subtle problems (performance and semantic) that arise

Brad Roberts (4/41) Aug 09 2009 Yay. What will happen with slices over a T[new] when the underlying T[n...

Walter Bright (17/20) Aug 09 2009 Great question! It's no different from what happens with any container

Brad Roberts (12/39) Aug 09 2009 As expected, but like I said.. this needs to be clear in the spec. Beca...

Andrei Alexandrescu (10/52) Aug 09 2009 Well yah but in all languages this is the case. C++'s iterators not only...

Graham St Jack (1/1) Aug 09 2009 This sounds excellent.

grauzone (9/9) Aug 09 2009 That's great. I think this would finally fix some of the more pressing

Walter Bright (8/18) Aug 09 2009 Yes.

bearophile (11/16) Aug 09 2009 Now there are four ways to create/use "arrays" in D2:

Walter Bright (2/2) Aug 09 2009 By the way, T[new] is Andrei's idea, as well as it being his idea to
bearophile (9/20) Aug 09 2009 Such syntaxes have to be chosen wisely. I don't fully understand that sy...

Andrei Alexandrescu (4/15) Aug 09 2009 Slices will remain ubiquitous, arrays probably not so much.
BCS (6/13) Aug 09 2009 I beg to differ. D is much better than c# in many ways (D's abstraction ...
grauzone (4/7) Aug 09 2009 I see two things a dotnet implementation of D could have over native D:

bearophile (5/7) Aug 09 2009 The dotnet GC is probably better than the current D GC, but I think it's...

Jeremie Pelletier (3/13) Aug 09 2009 I just finished reading about Bartosz's early entry about the shared qua...

Walter Bright (5/12) Aug 09 2009 Even if you're correct that D.net is pointless, and I don't agree with

Robert Jacques (6/17) Aug 09 2009 But (IIRC) it's not actually a canary. .NET has slices and there's no

div0 (16/28) Aug 10 2009 -----BEGIN PGP SIGNED MESSAGE-----

bearophile (4/6) Aug 09 2009 I have another question: are there some performance penalities in using ...

Andrei Alexandrescu (3/8) Aug 09 2009 One extra indirection, usually nearby.

Lionello Lunesu (8/17) Aug 09 2009 If the size and capacity get added to the begining of the memory block

Andrei Alexandrescu (4/23) Aug 09 2009 I wish too. The problem is that when you resize the array, the control

dsimcha (34/40) Aug 09 2009 import std.stdio, std.perf;

Stewart Gordon (19/27) Aug 09 2009

grauzone (5/7) Aug 09 2009 Just what happens to the ~= operator anyway? Right now, it appends data
Walter Bright (9/44) Aug 09 2009 No, that would defeat the whole purpose of making T[new] a reference

Leandro Lucarella (9/46) Aug 09 2009 What about writing a DIP? ;)
Michel Fortin (14/39) Aug 09 2009 I have nothing against the concept of container, but I dislike like the

Walter Bright (3/5) Aug 09 2009 We explored unique at length, and trying to make it work would render

Michiel Helvensteijn (5/10) Aug 09 2009 I've heard 'unique' mentioned on this group now and then. What does it m...

Sergey Gromov (8/18) Aug 09 2009 Unique reference is a reference which is statically known to be the only

Kagamin (2/15) Aug 10 2009 Of what type will strings be? Of what type will be the result of concate...

Walter Bright (3/5) Aug 10 2009 T[new]

Lars T. Kyllingstad (6/10) Aug 10 2009 I've always wondered: Why are strings of type immutable(char)[], and not...

Walter Bright (5/7) Aug 10 2009 So:

Lars T. Kyllingstad (3/13) Aug 10 2009 Ah, of course. :) Thanks.

Jeremie Pelletier (4/20) Aug 10 2009 The problem with immutable(char)[] was that any string can be resized, e...

Leandro Lucarella (14/38) Aug 10 2009 From: Walter Bright
Stewart Gordon (9/16) Aug 10 2009 It would be more efficient to cut out this middleman for things that
Walter Bright (4/11) Aug 10 2009 The vast majority of uses will not need to be resizeable.

Andrei Alexandrescu (8/16) Aug 10 2009 Hmmm, I see a problem.

Steven Schveighoffer (8/21) Aug 10 2009 auto c1 = 'a';

Sergey Gromov (3/12) Aug 10 2009 Good news. I'm all for it. The T[new] syntax is a bit... weird... bit
Steven Schveighoffer (30/52) Aug 10 2009 Yay!
Jarrett Billingsley (14/36) Aug 10 2009 hen
Stewart Gordon (5/5) Aug 10 2009 Since nobody's yet asked....

Steven Schveighoffer (32/36) Aug 10 2009 This brings up another question. Should something like immutable(T)[new...

Andrei Alexandrescu (6/55) Aug 10 2009 I think 1 is a good option. Essentially you can't have a resizeable

Steven Schveighoffer (10/54) Aug 10 2009 Consider this then:

bearophile (193/197) Aug 10 2009 Thank you. Following your example I have tried to do a bit more realisti...

Jarrett Billingsley (6/7) Aug 10 2009 I don't see why that's a big problem. If you're using a resizable

bearophile (4/5) Aug 11 2009 Like St Thomas I touch first and believe (maybe) later :-)

Walter Bright <newshound1 digitalmars.com> writes:

D has a number of subtle problems (performance and semantic) that arise 
when arrays are resized. The solution is to separate resizeable array 
types from slices. Slices will retain the old:

    T[] slice;

syntax. Resizeable arrays will be declared as:

    T[new] array;

The new expression:

    new T[10]

will return a T[new].

T[new] will implicitly convert to T[], but not the other way.

slice.length will become read-only.

Under the hood, a T[new] will be a single pointer to a library defined 
type. This library defined type will likely contain three properties:

     size_t length;
     T* ptr;
     size_t capacity;

The usual array operations will work on T[new] as well as T[].

Doing this change will:

1. fix many nasties at the edges of array semantics

2. make arrays implementable on .net

3. make clear in function signatures if the function can resize the 
array or not

Aug 09 2009

Brad Roberts <braddr puremagic.com> writes:

Yay.  What will happen with slices over a T[new] when the underlying T[new] must
be moved due to resizing?  The behavior needs to be at least well specified, if
not well defined.

Walter Bright wrote:
 D has a number of subtle problems (performance and semantic) that arise
 when arrays are resized. The solution is to separate resizeable array
 types from slices. Slices will retain the old:
 
    T[] slice;
 
 syntax. Resizeable arrays will be declared as:
 
    T[new] array;
 
 The new expression:
 
    new T[10]
 
 will return a T[new].
 
 T[new] will implicitly convert to T[], but not the other way.
 
 slice.length will become read-only.
 
 Under the hood, a T[new] will be a single pointer to a library defined
 type. This library defined type will likely contain three properties:
 
     size_t length;
     T* ptr;
     size_t capacity;
 
 The usual array operations will work on T[new] as well as T[].
 
 Doing this change will:
 
 1. fix many nasties at the edges of array semantics
 
 2. make arrays implementable on .net
 
 3. make clear in function signatures if the function can resize the
 array or not

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

Brad Roberts wrote:
 Yay.  What will happen with slices over a T[new] when the underlying T[new]
must
 be moved due to resizing?  The behavior needs to be at least well specified, if
 not well defined.

Great question! It's no different from what happens with any container 
when its contents change and there are references to the old content.

What won't happen is the slice won't be pointing to unallocated memory, 
thanks to the gc. Resizing the T[new] will not cause the old contents to 
be deleted.

The slice will either point to the old content, if the resize caused a 
move, or the new content, if it was resized in place. So it will still 
be implementation-defined behavior.

But the likelihood of getting caught by such behavior is much less 
likely with T[new], as one can clearly see in the code where the 
resizeables are, rather than the current situation where any slice could 
be resized by anyone.

I think that taking a slice of a T[new], then resizing the T[new], will 
be rare (or at least should be). Normally, a T[new] will be built, and 
when it is all done, it is converted to a T[] and there is no longer any 
way to resize it.

Aug 09 2009

Brad Roberts <braddr puremagic.com> writes:

Walter Bright wrote:
 Brad Roberts wrote:
 Yay.  What will happen with slices over a T[new] when the underlying
 T[new] must
 be moved due to resizing?  The behavior needs to be at least well
 specified, if
 not well defined.

 
 Great question! It's no different from what happens with any container
 when its contents change and there are references to the old content.
 
 What won't happen is the slice won't be pointing to unallocated memory,
 thanks to the gc. Resizing the T[new] will not cause the old contents to
 be deleted.
 
 The slice will either point to the old content, if the resize caused a
 move, or the new content, if it was resized in place. So it will still
 be implementation-defined behavior.
 
 But the likelihood of getting caught by such behavior is much less
 likely with T[new], as one can clearly see in the code where the
 resizeables are, rather than the current situation where any slice could
 be resized by anyone.
 
 I think that taking a slice of a T[new], then resizing the T[new], will
 be rare (or at least should be). Normally, a T[new] will be built, and
 when it is all done, it is converted to a T[] and there is no longer any
 way to resize it.

As expected, but like I said.. this needs to be clear in the spec.  Because
something like this is just confusing (and yes, I know it's not new behavior):

    auto a = new int[10];
    a[0] = 1;
    auto s1 = a[0..1];
    a.length = <some value large enough to force a move>;
    auto s2 = a[0..1];
    s2[0] = 2;
    assert(s1[0] == s2[1]); // fail

Later,
Brad

Aug 09 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Brad Roberts wrote:
 Walter Bright wrote:
 Brad Roberts wrote:
 Yay.  What will happen with slices over a T[new] when the underlying
 T[new] must
 be moved due to resizing?  The behavior needs to be at least well
 specified, if
 not well defined.

 Great question! It's no different from what happens with any container
 when its contents change and there are references to the old content.

 What won't happen is the slice won't be pointing to unallocated memory,
 thanks to the gc. Resizing the T[new] will not cause the old contents to
 be deleted.

 The slice will either point to the old content, if the resize caused a
 move, or the new content, if it was resized in place. So it will still
 be implementation-defined behavior.

 But the likelihood of getting caught by such behavior is much less
 likely with T[new], as one can clearly see in the code where the
 resizeables are, rather than the current situation where any slice could
 be resized by anyone.

 I think that taking a slice of a T[new], then resizing the T[new], will
 be rare (or at least should be). Normally, a T[new] will be built, and
 when it is all done, it is converted to a T[] and there is no longer any
 way to resize it.

 
 As expected, but like I said.. this needs to be clear in the spec.  Because
 something like this is just confusing (and yes, I know it's not new behavior):
 
     auto a = new int[10];
     a[0] = 1;
     auto s1 = a[0..1];
     a.length = <some value large enough to force a move>;
     auto s2 = a[0..1];
     s2[0] = 2;
     assert(s1[0] == s2[1]); // fail
 
 Later,
 Brad
 

Well yah but in all languages this is the case. C++'s iterators not only 
invalidate at the drop of a hat, but you can't really tell whether 
they're invalid. Java iterators can also be invalidated and may throw an 
exception.

When resizing a D array, its underlying slices may be *orphaned*. That 
means they are still valid, they just don't refer to the same data as 
the array. I think that's a reasonable tradeoff between efficiency and 
safety.


Andrei

Aug 09 2009

Graham St Jack <Graham.StJack internode.on.net> writes:

This sounds excellent.

Aug 09 2009

grauzone <none example.net> writes:

That's great. I think this would finally fix some of the more pressing 
issues of D.

I have a question:

will

T[new] a;

allocate something? Or will the allocation of the hidden library array 
object delayed until the length is set to non null? If the library array 
object is not allocated, will a.ptr, a.capacity and a.length simply 
return null?

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

grauzone wrote:
 will
 
 T[new] a;
 
 allocate something?

Nope.

 Or will the allocation of the hidden library array 
 object delayed until the length is set to non null?

Yes.

 If the library array 
 object is not allocated, will a.ptr, a.capacity and a.length simply 
 return null?

Yes. Clearly, those properties will have to be functions under the hood. 
So T[new] operations will be a bit slower than for slices. For faster 
indexing, you'd probably want to do:

auto slice = a[];

and then operate on the slice.

Aug 09 2009

bearophile <bearophileHUGS lycos.com> writes:

Now there are four ways to create/use "arrays" in D2:
- the * syntax on a raw memory block, like in C.
- fixed sized arrays, that are just a pointer somewhere. they may even become
fully value types........
- the slice struct, about 2 * size_t.sizeof
- the new reference arrays, a pointer on the stack, plus a 3 items struct
somewhere, probably on the heap. If it doesn't escape the scope LDC may be able
to put it too on the stack. The GC may allocate a space of 4 items anyway for
it, so there's an item wasted.
:-)
In the meantime I have understood that the slice has to be the default syntax
to keep the compatibility with C :-)


Walter Bright:

 Yes. Clearly, those properties will have to be functions under the hood. 
 So T[new] operations will be a bit slower than for slices. For faster 
 indexing, you'd probably want to do:
 auto slice = a[];
 and then operate on the slice.

Let's hope such functions can be inlined. Assuming they can be inlined, a smart
compiler can remove some of those ifs where it knows the array surely exists
(virtual machines today are usually able to remove some array bound tests using
similar tricks). I don't hold my breath for D2 to become this smart...

Bye,
bearophile

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

By the way, T[new] is Andrei's idea, as well as it being his idea to 
make it a reference type.

Aug 09 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:

I like this general proposal, it sounds like something that can improve D a
little. There are many other things that have to be improved in D, but baby
steps are enough to go somewhere :-)

Slices will retain the old:
     T[] slice;
 syntax. Resizeable arrays will be declared as:
     T[new] array;

Such syntaxes have to be chosen wisely. I don't fully understand that syntax.
And maybe I don't fully like it.
I think the default (simpler) syntax has to be the most flexible and safer
construct, that is resizable arrays. Slices can be seen as an optimization, so
they can have a bit longer syntax.


 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:
      size_t length;
      T* ptr;
      size_t capacity;

Weren't you suggestion to use the start pointer - end pointer instead?


 2. make arrays implementable on .net


than D, and it's similar anyway. So I don't think people will use D on dotnet.
So even if creating a D for dotnet can be positive, I don't want D2 to change
its design to allow a better implementation on dotnet.

The question by Brad Roberts looks important, it has to find some kind of
answer:
What will happen with slices over a T[new] when the underlying T[new] must be
moved due to resizing?<

Bye,
bearophile

Aug 09 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 Walter Bright:
 
 I like this general proposal, it sounds like something that can improve D a
little. There are many other things that have to be improved in D, but baby
steps are enough to go somewhere :-)
 
 Slices will retain the old:
     T[] slice;
 syntax. Resizeable arrays will be declared as:
     T[new] array;

 
 Such syntaxes have to be chosen wisely. I don't fully understand that syntax.
And maybe I don't fully like it.

Why am I not surprised :o).

 I think the default (simpler) syntax has to be the most flexible and safer
construct, that is resizable arrays. Slices can be seen as an optimization, so
they can have a bit longer syntax.

Slices will remain ubiquitous, arrays probably not so much.


Andrei

Aug 09 2009

BCS <ao pathlink.com> writes:

Reply to bearophile,

 Walter Bright:

 2. make arrays implementable on .net
 


 better than D,



not as good in a few others (tools, libs, runtime refection).

 and it's similar anyway. So I don't think people will
 use D on dotnet.

This I agree with because D looses many if not all of its advantages when 
forced into the .NET/CLI/managed-code world

Aug 09 2009

grauzone <none example.net> writes:

bearophile wrote:
 2. make arrays implementable on .net

 

than D, and it's similar anyway. So I don't think people will use D on dotnet.
So even if creating a D for dotnet can be positive, I don't want D2 to change
its design to allow a better implementation on dotnet.

I see two things a dotnet implementation of D could have over native D:
- better garbage collector (the D one barely does its job...)
- better interoperability (*hint* OMF *hint*)

Aug 09 2009

bearophile <bearophileHUGS lycos.com> writes:

grauzone:

 I see two things a dotnet implementation of D could have over native D:
 - better garbage collector (the D one barely does its job...)

The dotnet GC is probably better than the current D GC, but I think it's
designed for mostly movable objects. Currently most (or all) D objects are
pinned (they can't be moved around in memory), so I don't know if the dotnet GC
will do much good.
D needs a GC designed for its peculiar characteristics. And I believe D will
also need some extra semantics to allow the creation of such efficient
half-movable half-pinned GC (I have explained such ideas one time in the past).

Bye,
bearophile

Aug 09 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

bearophile Wrote:

 grauzone:
 
 I see two things a dotnet implementation of D could have over native D:
 - better garbage collector (the D one barely does its job...)

 
 The dotnet GC is probably better than the current D GC, but I think it's
designed for mostly movable objects. Currently most (or all) D objects are
pinned (they can't be moved around in memory), so I don't know if the dotnet GC
will do much good.
 D needs a GC designed for its peculiar characteristics. And I believe D will
also need some extra semantics to allow the creation of such efficient
half-movable half-pinned GC (I have explained such ideas one time in the past).
 
 Bye,
 bearophile

I just finished reading about Bartosz's early entry about the shared qualifier
and how it could lead to per-thead memory heaps (as only shared allocations
should use the shared heap). This means the GC can maintain different heaps per
thread, and perform collection locally without pausing the world, and the
entire heap can just be trashed when the thread exits.

I think it could be much more efficient than a moving GC (as maintaining
different shared heaps and moving memory between them is a slow process). I am
really tempted to modify my custom GC to do just that heh.

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 2. make arrays implementable on .net

 

 better than D, and it's similar anyway. So I don't think people will
 use D on dotnet. So even if creating a D for dotnet can be positive,
 I don't want D2 to change its design to allow a better implementation
 on dotnet.

Even if you're correct that D.net is pointless, and I don't agree with 
that assessment, I think the problems implementing D arrays on .net will 
show up elsewhere in attempts to support other targets. So I think it's 
a "canary" issue rather than a .net one.

Aug 09 2009

"Robert Jacques" <sandford jhu.edu> writes:

On Sun, 09 Aug 2009 14:21:31 -0700, Walter Bright  
<newshound1 digitalmars.com> wrote:

 bearophile wrote:
 2. make arrays implementable on .net


 better than D, and it's similar anyway. So I don't think people will
 use D on dotnet. So even if creating a D for dotnet can be positive,
 I don't want D2 to change its design to allow a better implementation
 on dotnet.

 Even if you're correct that D.net is pointless, and I don't agree with  
 that assessment, I think the problems implementing D arrays on .net will  
 show up elsewhere in attempts to support other targets. So I think it's  
 a "canary" issue rather than a .net one.

But (IIRC) it's not actually a canary. .NET has slices and there's no  
'self-contained' implementation problem. The issue was that the .NET  
library wasn't written with slices (.NET, D or otherwise) in mind, so  
conversions to/from slices were required.

Aug 09 2009

div0 <div0 users.sourceforge.net> writes:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

bearophile wrote:
 Walter Bright:
 
 I like this general proposal, it sounds like something that can improve D a
little. There are many other things that have to be improved in D, but baby
steps are enough to go somewhere :-)
 
 Slices will retain the old:
     T[] slice;
 syntax. Resizeable arrays will be declared as:
     T[new] array;

 
 Such syntaxes have to be chosen wisely. I don't fully understand that syntax.
And maybe I don't fully like it.
 I think the default (simpler) syntax has to be the most flexible and safer
construct, that is resizable arrays. Slices can be seen as an optimization, so
they can have a bit longer syntax.
 

<snip>

Ya, not a nice syntax.
how about T[!static] ?! ;)

- --
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFKgHaiT9LetA9XoXwRAoBDAJ9WAHlAjkW7TXRUWqCz80N8m4WbwACfTCfP
9MnZfV5pifUKwNC3c1HgySM=
=J6pq
-----END PGP SIGNATURE-----

Aug 10 2009

bearophile <bearophileHUGS lycos.com> writes:

Walter Bright:
 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:

I have another question: are there some performance penalities in using such
arrays for normal random access operations?

Bye,
bearophile

Aug 09 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 Walter Bright:
 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:

 
 I have another question: are there some performance penalities in using such
arrays for normal random access operations?

One extra indirection, usually nearby.

Andrei

Aug 09 2009

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:h5ndt0$1blj$1 digitalmars.com...
 bearophile wrote:
 Walter Bright:
 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:

 I have another question: are there some performance penalities in using 
 such arrays for normal random access operations?

 One extra indirection, usually nearby.

 Andrei

If the size and capacity get added to the begining of the memory block 
containing the data, only an extra offset is needed. The compiler could emit 
"mov eax, [esi*4 + 8]" or something. Look ma, no extra indirection :)

In fact, this is what .NET does as well, if I'm not mistaken. But then, .NET 
always has an extra indirection because of the movable GC stuff, right?

L.

Aug 09 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lionello Lunesu wrote:
 
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:h5ndt0$1blj$1 digitalmars.com...
 bearophile wrote:
 Walter Bright:
 Under the hood, a T[new] will be a single pointer to a library 
 defined type. This library defined type will likely contain three 
 properties:

 I have another question: are there some performance penalities in 
 using such arrays for normal random access operations?

 One extra indirection, usually nearby.

 Andrei

 
 If the size and capacity get added to the begining of the memory block 
 containing the data, only an extra offset is needed. The compiler could 
 emit "mov eax, [esi*4 + 8]" or something. Look ma, no extra indirection :)

I wish too. The problem is that when you resize the array, the control 
block will not refer to the adjacent chunk.


Andrei

Aug 09 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from bearophile (bearophileHUGS lycos.com)'s article
 Walter Bright:
 Under the hood, a T[new] will be a single pointer to a library defined
 type. This library defined type will likely contain three properties:

 I have another question: are there some performance penalities in using such

arrays for normal random access operations?
 Bye,
 bearophile

import std.stdio, std.perf;

final class TNew {
    uint* foo;
    uint length;

    this(uint size) {
        foo = (new uint[size]).ptr;
        length = size;
    }

    ref uint opIndex(uint index) {
        return foo[index];
    }
}

void main() {
    auto pc = new PerformanceCounter;

    uint[] array = new uint[1];
    pc.start;
    foreach(i; 0..100_000_000) {
        array.ptr[0]++;
    }
    pc.stop;
    writeln("Direct:  ", pc.milliseconds);

    auto tnew = new TNew(1);
    pc.start;
    foreach(i; 0..100_000_000) {
        tnew[0]++;
    }
    pc.stop;
    writeln("Indirect:  ", pc.milliseconds);
}

Results:

Direct:  226
Indirect:  229

Aug 09 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Walter Bright wrote:
<snip>
 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:
 
     size_t length;
     T* ptr;
     size_t capacity;
 
 The usual array operations will work on T[new] as well as T[].

<snip>

Would new T[10] allocate this structure and the array data on a single 
GC block, or on two separate blocks?  And when the array is reallocated, 
will the structure move with it?

I suppose it depends on whether you want T[new] to be
(a) something whereby all references persist as the array is reallocated 
(b) merely a reference to an allocated array as opposed to an array slice

If (a), this is currently achievable with a T[]*.

If (b), what might work well is a structure like

     size_t length;
     size_t capacity;
     T[capacity] data;

meaning still only one allocation and only one level of indirection when 
one is used.  And the T[new] variable itself would simply hold &data[0].

Moreover, would whatever happens solve such const/invariant holes as bug 
2093?

Stewart.

Aug 09 2009

grauzone <none example.net> writes:

Stewart Gordon wrote:
 Moreover, would whatever happens solve such const/invariant holes as bug 
 2093?

Just what happens to the ~= operator anyway? Right now, it appends data 
inline.

My vote would be to make "a~=b" do the same as "a=a~b" (with types "T[] 
a" and "T[] b" or "T b"). T[new]'s ~= would still append inline.

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

Stewart Gordon wrote:
 Walter Bright wrote:
 <snip>
 Under the hood, a T[new] will be a single pointer to a library defined 
 type. This library defined type will likely contain three properties:

     size_t length;
     T* ptr;
     size_t capacity;

 The usual array operations will work on T[new] as well as T[].

 <snip>
 
 Would new T[10] allocate this structure and the array data on a single 
 GC block, or on two separate blocks?

That's up to the implementation.

 And when the array is reallocated, 
 will the structure move with it?

No, that would defeat the whole purpose of making T[new] a reference 
type. With it being a reference type:

T[new] a = ...;
T[new] b = a;

a.length = ...
... b.length changes too ...


 I suppose it depends on whether you want T[new] to be
 (a) something whereby all references persist as the array is reallocated 
 (b) merely a reference to an allocated array as opposed to an array slice
 
 If (a), this is currently achievable with a T[]*.
 
 If (b), what might work well is a structure like
 
     size_t length;
     size_t capacity;
     T[capacity] data;
 
 meaning still only one allocation and only one level of indirection when 
 one is used.  And the T[new] variable itself would simply hold &data[0].
 
 Moreover, would whatever happens solve such const/invariant holes as bug 
 2093?

I believe it does.

 
 Stewart.

Aug 09 2009

Leandro Lucarella <llucax gmail.com> writes:

Walter Bright, el  9 de agosto a las 13:29 me escribiste:
 D has a number of subtle problems (performance and semantic) that arise
 when arrays are resized. The solution is to separate resizeable array
 types from slices. Slices will retain the old:
 
    T[] slice;
 
 syntax. Resizeable arrays will be declared as:
 
    T[new] array;
 
 The new expression:
 
    new T[10]
 
 will return a T[new].
 
 T[new] will implicitly convert to T[], but not the other way.
 
 slice.length will become read-only.
 
 Under the hood, a T[new] will be a single pointer to a library defined
 type. This library defined type will likely contain three properties:
 
     size_t length;
     T* ptr;
     size_t capacity;
 
 The usual array operations will work on T[new] as well as T[].
 
 Doing this change will:
 
 1. fix many nasties at the edges of array semantics
 
 2. make arrays implementable on .net
 
 3. make clear in function signatures if the function can resize the
 array or not

What about writing a DIP? ;)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
VECINOS RESCATARON A CABALLITO ATROPELLADO
	-- Crónica TV

Aug 09 2009

Michel Fortin <michel.fortin michelf.com> writes:

On 2009-08-09 16:29:21 -0400, Walter Bright <newshound1 digitalmars.com> said:

 D has a number of subtle problems (performance and semantic) that arise 
 when arrays are resized. The solution is to separate resizeable array 
 types from slices. Slices will retain the old:
 
     T[] slice;
 
 syntax. Resizeable arrays will be declared as:
 
     T[new] array;
 
 The new expression:
 
     new T[10]
 
 will return a T[new].

I have nothing against the concept of container, but I dislike like the 
syntax. Is this really better than Array!T ?


 T[new] will implicitly convert to T[], but not the other way.
 
 slice.length will become read-only.

All fine by me.


 Doing this change will:
 
 1. fix many nasties at the edges of array semantics

Unfortunatly, not all. Solving them all could be done with unique 
semantics (unique slices can be expanded with no side effect).


 2. make arrays implementable on .net
 
 3. make clear in function signatures if the function can resize the 
 array or not

Unique and ref unique slices would be perfectly fine for that (assuming 
slices can be resized when unique).

But I know, unique isn't easy to implement to fit all the use cases 
we'd like to solve. I'm just sharing a dream.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Aug 09 2009

Walter Bright <newshound1 digitalmars.com> writes:

Michel Fortin wrote:
 But I know, unique isn't easy to implement to fit all the use cases we'd 
 like to solve. I'm just sharing a dream.

We explored unique at length, and trying to make it work would render 
the rest of the language nearly unusably complex.

Aug 09 2009

Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:

Walter Bright wrote:

 But I know, unique isn't easy to implement to fit all the use cases we'd
 like to solve. I'm just sharing a dream.

 
 We explored unique at length, and trying to make it work would render
 the rest of the language nearly unusably complex.

I've heard 'unique' mentioned on this group now and then. What does it mean
in the context of programming languages?

-- 
Michiel Helvensteijn

Aug 09 2009

Sergey Gromov <snake.scaly gmail.com> writes:

Mon, 10 Aug 2009 01:56:35 +0200, Michiel Helvensteijn wrote:

 Walter Bright wrote:
 
 But I know, unique isn't easy to implement to fit all the use cases we'd
 like to solve. I'm just sharing a dream.

 
 We explored unique at length, and trying to make it work would render
 the rest of the language nearly unusably complex.

 
 I've heard 'unique' mentioned on this group now and then. What does it mean
 in the context of programming languages?

Unique reference is a reference which is statically known to be the only
reference to the underlying data.  They have some interesting
properties:

- Modification of underlying data does not cause side effects
- Can be implicitly cast to immutable
- Can be implicitly cast to mutable

They prove to be rather tricky both to specify and to implement though.

Aug 09 2009

Kagamin <spam here.lot> writes:

Walter Bright Wrote:

 Resizeable arrays will be declared as:
 
     T[new] array;
 
 The new expression:
 
     new T[10]
 
 will return a T[new].
 
 T[new] will implicitly convert to T[], but not the other way.
 
 slice.length will become read-only.

Of what type will strings be? Of what type will be the result of concatenation?

Aug 10 2009

Walter Bright <newshound1 digitalmars.com> writes:

Kagamin wrote:
 Of what type will strings be?

immutable(char)[]

 Of what type will be the result of concatenation?

T[new]

Aug 10 2009

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

Walter Bright wrote:
 Kagamin wrote:
 Of what type will strings be?

 
 immutable(char)[]

I've always wondered: Why are strings of type immutable(char)[], and not 
immutable(char[])?

I mean, there is no point in allowing resizing of strings, when 
assigning to the new elements is impossible.

-Lars

Aug 10 2009

Walter Bright <newshound1 digitalmars.com> writes:

Lars T. Kyllingstad wrote:
 I've always wondered: Why are strings of type immutable(char)[], and not 
 immutable(char[])?

So:

    string a = "hello";
    a = "foo";

works.

Aug 10 2009

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

Walter Bright wrote:
 Lars T. Kyllingstad wrote:
 I've always wondered: Why are strings of type immutable(char)[], and 
 not immutable(char[])?

 
 So:
 
    string a = "hello";
    a = "foo";
 
 works.


Ah, of course. :) Thanks.

-Lars

Aug 10 2009

Jeremie Pelletier <jeremiep gmail.com> writes:

Lars T. Kyllingstad Wrote:

 Walter Bright wrote:
 Lars T. Kyllingstad wrote:
 I've always wondered: Why are strings of type immutable(char)[], and 
 not immutable(char[])?

 
 So:
 
    string a = "hello";
    a = "foo";
 
 works.

 
 
 Ah, of course. :) Thanks.
 
 -Lars

The problem with immutable(char)[] was that any string can be resized, even
slices.

Walter: what will the string types be aliased to now: still immutable(char)[]
or immutable(char)[new]?

I think it would be best to have them use the array [new] type. Functions which
do not modify the string length can mark the string as an in parameter, and
immutable(char)[] should be castable to const(immutable(char)[new]) in such
cases to still allow slices to be passed.

Aug 10 2009

Leandro Lucarella <llucax gmail.com> writes:

Jeremie Pelletier, el 10 de agosto a las 13:01 me escribiste:
 Lars T. Kyllingstad Wrote:
 
 Walter Bright wrote:
 Lars T. Kyllingstad wrote:
 I've always wondered: Why are strings of type immutable(char)[], and 
 not immutable(char[])?

 
 So:
 
    string a = "hello";
    a = "foo";
 
 works.

 
 
 Ah, of course. :) Thanks.
 
 -Lars

 
 The problem with immutable(char)[] was that any string can be resized, even
slices.
 
 Walter: what will the string types be aliased to now: still immutable(char)[]
or immutable(char)[new]?

From: Walter Bright <newshound1 digitalmars.com>
Date: Mon, 10 Aug 2009 01:01:56 -0700
Subject: Re: T[new]
User-Agent: Thunderbird 2.0.0.22 (Windows/20090605)

Kagamin wrote:
Of what type will strings be?

immutable(char)[]

Of what type will be the result of concatenation?

T[new]

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
El techo de mi cuarto lleno de galaxias

Aug 10 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Jeremie Pelletier wrote:
<snip>
 Walter: what will the string types be aliased to now: still immutable(char)[]
or immutable(char)[new]?
 
 I think it would be best to have them use the array [new] type.

It would be more efficient to cut out this middleman for things that 
aren't going to change the length.  Which means most things that operate 
on the string type (and similarly wstring/dstring), since these types 
are designed to be immutable.  The change to T[] that's part of this 
plan would take it a stage further by making the length immutable as well.

 Functions which do not modify the string length can mark the string
 as an in parameter, and immutable(char)[] should be castable to
 const(immutable(char)[new]) in such cases to still allow slices to be
 passed.

The conversion would create an extra memory allocation.

Stewart.

Aug 10 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jeremie Pelletier wrote:
 Walter: what will the string types be aliased to now: still
 immutable(char)[] or immutable(char)[new]?

immutable(char)[]

 I think it would be best to have them use the array [new] type.

The vast majority of uses will not need to be resizeable.

 Functions which do not modify the string length can mark the string
 as an in parameter, and immutable(char)[] should be castable to
 const(immutable(char)[new]) in such cases to still allow slices to be
 passed.

A slice won't be castable to a resizeable.

Aug 10 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Walter Bright wrote:
 Kagamin wrote:
 Of what type will strings be?

 
 immutable(char)[]
 
 Of what type will be the result of concatenation?

 
 T[new]

Hmmm, I see a problem.

auto s1 = "Hello";
auto s2 = " world";
auto s = s1 ~ s2;

Some might be surprised that the type of s is not the same as that of s1 
and s2.


Andrei

Aug 10 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 10 Aug 2009 10:11:52 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Walter Bright wrote:
 Kagamin wrote:
 Of what type will strings be?

  immutable(char)[]

 Of what type will be the result of concatenation?

  T[new]

 Hmmm, I see a problem.

 auto s1 = "Hello";
 auto s2 = " world";
 auto s = s1 ~ s2;

 Some might be surprised that the type of s is not the same as that of s1  
 and s2.

auto c1 = 'a';
auto c2 = 'b';

auto c = c1 + c2;

But s is better, because it implicitly casts back to typeof(s1).  I think  
it's the absolutely best behavior.

-Steve

Aug 10 2009

Sergey Gromov <snake.scaly gmail.com> writes:

Sun, 09 Aug 2009 13:29:21 -0700, Walter Bright wrote:

 D has a number of subtle problems (performance and semantic) that arise 
 when arrays are resized. The solution is to separate resizeable array 
 types from slices. Slices will retain the old:
 
     T[] slice;
 
 syntax. Resizeable arrays will be declared as:
 
     T[new] array;

Good news.  I'm all for it.  The T[new] syntax is a bit... weird... bit
I'm OK with it as well.  Thanks!

Aug 10 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 09 Aug 2009 16:29:21 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 D has a number of subtle problems (performance and semantic) that arise  
 when arrays are resized. The solution is to separate resizeable array  
 types from slices. Slices will retain the old:

     T[] slice;

 syntax. Resizeable arrays will be declared as:

     T[new] array;

 The new expression:

     new T[10]

 will return a T[new].

 T[new] will implicitly convert to T[], but not the other way.

 slice.length will become read-only.

 Under the hood, a T[new] will be a single pointer to a library defined  
 type. This library defined type will likely contain three properties:

      size_t length;
      T* ptr;
      size_t capacity;

 The usual array operations will work on T[new] as well as T[].

 Doing this change will:

 1. fix many nasties at the edges of array semantics

 2. make arrays implementable on .net

 3. make clear in function signatures if the function can resize the  
 array or not

Yay!

Will capacity be settable?  That is, can I replace this pattern:

char[] buf = new buf[1024];
buf.length = 0;

with something like this?:

char[new] buf;
buf.capacity = 1024; // ensure capacity is *at least* 1024

----

I read elsewhere that

T[new] x;

doesn't allocate until length (or maybe capacity) is non-zero.

So T[new] is a library defined type.  What does the underlying type get  
called with when x.length = 5 is entered?  Is the underlying type:

1. an object?
2. a templated type?

----

What happens when you do:

x = null;

where x is a T[new] type?

Is it the same as setting x.length = 0, or x.capacity = 0?

What does x == null mean?  What about x is null?

Will ptr be settable, or only readable?

Will the compiler continue to optimize foreach calls to arrays?  How will  
foreach be implemented, as a range or opApply?  The former implies some  
automated method to return a range, because you wouldn't want to modify  
the referenced array during iteration.

This is exciting :)

-Steve

Aug 10 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Sun, Aug 9, 2009 at 4:29 PM, Walter Bright<newshound1 digitalmars.com> w=
rote:
 D has a number of subtle problems (performance and semantic) that arise w=

hen
 arrays are resized. The solution is to separate resizeable array types fr=

om
 slices. Slices will retain the old:

 =A0 T[] slice;

 syntax. Resizeable arrays will be declared as:

 =A0 T[new] array;

 The new expression:

 =A0 new T[10]

 will return a T[new].

 T[new] will implicitly convert to T[], but not the other way.

 slice.length will become read-only.

 Under the hood, a T[new] will be a single pointer to a library defined ty=

pe.
 This library defined type will likely contain three properties:

 =A0 =A0size_t length;
 =A0 =A0T* ptr;
 =A0 =A0size_t capacity;

 The usual array operations will work on T[new] as well as T[].

 Doing this change will:

 1. fix many nasties at the edges of array semantics

 2. make arrays implementable on .net

 3. make clear in function signatures if the function can resize the array=

 or
 not

This is a great change.  MiniD's array semantics are very similar to
D's wrt. slicing, though I made the slight modification that array
objects keep track of whether or not they are a slice of another array
and behave slightly differently accordingly, and I've found it to be
more intuitive.  I like that D is doing the same thing, and statically
too ;)

The T[new] syntax isn't doing much for me, but then again, I don't
have any better suggestions.

Aug 10 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Since nobody's yet asked....

What type would .dup and .idup return?

The best choice I can see is to make .dup return T[new] and .idup return 
invariant(T)[].

Stewart.

Aug 10 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:

 Since nobody's yet asked....

 What type would .dup and .idup return?

 The best choice I can see is to make .dup return T[new] and .idup return  
 invariant(T)[].

This brings up another question.  Should something like immutable(T)[new]  
or const(T)[new] be even valid?  Consider this:

immutable(char)[new] str = "hello".idup; // not sure how you convert a  
literal to an array, but assume it works for now.

assert(str.capacity == 16); // assume GC still allocates 16 minimum blocks
str ~= " there";

auto slice = str[5..$];

str.length = 5;

str ~= " world";

assert(slice == " there"); // fails?

We most likely have the same issue here with immutable data as we do  
today, just not as obvious.  Although it's a corner case, it does violate  
immutability without casts.

There are several options:

1. immutable(T)[new] and const(T)[new] are invalid.  This limits us more  
than the current regime, but is a questionable construct to begin with,  
considering it allows appending in place (you can always create a new  
array with concatenation).
2. immutable(T)[new] and const(T)[new] are equivalent to immutable(T[new])  
and const(T[new]) respectively.  This simplifies generic programming and  
still prevents abuses.
3. when you shrink an array of immutable/const elements' length, the  
capacity automatically shrinks to prevent overwriting previous data.  This  
still leaves mutable data with the overwrite problem, but it doesn't  
violate const.


Also, implicit casting to const needs to be addressed, normally you can't  
do this with templated types (if that's how the underlying type is  
implemented).

-Steve

Aug 10 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steven Schveighoffer wrote:
 On Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon <smjg_1998 yahoo.com> 
 wrote:
 
 Since nobody's yet asked....

 What type would .dup and .idup return?

 The best choice I can see is to make .dup return T[new] and .idup 
 return invariant(T)[].

 
 This brings up another question.  Should something like 
 immutable(T)[new] or const(T)[new] be even valid?  Consider this:
 
 immutable(char)[new] str = "hello".idup; // not sure how you convert a 
 literal to an array, but assume it works for now.
 
 assert(str.capacity == 16); // assume GC still allocates 16 minimum blocks
 str ~= " there";
 
 auto slice = str[5..$];
 
 str.length = 5;
 
 str ~= " world";
 
 assert(slice == " there"); // fails?
 
 We most likely have the same issue here with immutable data as we do 
 today, just not as obvious.  Although it's a corner case, it does 
 violate immutability without casts.
 
 There are several options:
 
 1. immutable(T)[new] and const(T)[new] are invalid.  This limits us more 
 than the current regime, but is a questionable construct to begin with, 
 considering it allows appending in place (you can always create a new 
 array with concatenation).
 2. immutable(T)[new] and const(T)[new] are equivalent to 
 immutable(T[new]) and const(T[new]) respectively.  This simplifies 
 generic programming and still prevents abuses.
 3. when you shrink an array of immutable/const elements' length, the 
 capacity automatically shrinks to prevent overwriting previous data.  
 This still leaves mutable data with the overwrite problem, but it 
 doesn't violate const.

 
 Also, implicit casting to const needs to be addressed, normally you 
 can't do this with templated types (if that's how the underlying type is 
 implemented).

I think 1 is a good option. Essentially you can't have a resizeable 
array of immutable data. Functional languages don't have such a thing. 
If we go for (3), code that looks tighter is in fact more wasteful as 
every shrinking+expanding cycle causes a new allocation and a copy.


Andrei

Aug 10 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 10 Aug 2009 15:21:48 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Steven Schveighoffer wrote:
 On Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon  
 <smjg_1998 yahoo.com> wrote:

 Since nobody's yet asked....

 What type would .dup and .idup return?

 The best choice I can see is to make .dup return T[new] and .idup  
 return invariant(T)[].

  This brings up another question.  Should something like  
 immutable(T)[new] or const(T)[new] be even valid?  Consider this:
  immutable(char)[new] str = "hello".idup; // not sure how you convert a  
 literal to an array, but assume it works for now.
  assert(str.capacity == 16); // assume GC still allocates 16 minimum  
 blocks
 str ~= " there";
  auto slice = str[5..$];
  str.length = 5;
  str ~= " world";
  assert(slice == " there"); // fails?
  We most likely have the same issue here with immutable data as we do  
 today, just not as obvious.  Although it's a corner case, it does  
 violate immutability without casts.
  There are several options:
  1. immutable(T)[new] and const(T)[new] are invalid.  This limits us  
 more than the current regime, but is a questionable construct to begin  
 with, considering it allows appending in place (you can always create a  
 new array with concatenation).
 2. immutable(T)[new] and const(T)[new] are equivalent to  
 immutable(T[new]) and const(T[new]) respectively.  This simplifies  
 generic programming and still prevents abuses.
 3. when you shrink an array of immutable/const elements' length, the  
 capacity automatically shrinks to prevent overwriting previous data.   
 This still leaves mutable data with the overwrite problem, but it  
 doesn't violate const.

  Also, implicit casting to const needs to be addressed, normally you  
 can't do this with templated types (if that's how the underlying type  
 is implemented).

 I think 1 is a good option. Essentially you can't have a resizeable  
 array of immutable data. Functional languages don't have such a thing.  
 If we go for (3), code that looks tighter is in fact more wasteful as  
 every shrinking+expanding cycle causes a new allocation and a copy.

Consider this then:

template ArrayOf(T)
{
   alias T[new] ArrayOf;
}

Is ArrayOf!invariant(char) valid?  Should it be?  I'm not sure, this was  
my focus of option 2.

-Steve

Aug 10 2009

bearophile <bearophileHUGS lycos.com> writes:

dsimcha:

 import std.stdio, std.perf;

[...]
 Results:
 Direct:  226
 Indirect:  229

Thank you. Following your example I have tried to do a bit more realistic tests.


// test1.d, for D2, Phobos
import std.stdio: writeln;
import std.perf: PerformanceCounter;

final class TNew(T) {
    T* ptr;
    size_t length, capacity;

    this(size_t size) {
        this.ptr = (new T[size]).ptr;
        this.length = size;
        this.capacity = size;
    }

    ref T opIndex(size_t index) {
        return this.ptr[index];
    }
}

void main() {
    // increase N if you have a large L2 cache
    enum int N = 100_000;

    enum int NLOOP = 40_000;
    alias uint T;

    auto pc = new PerformanceCounter;
    auto a = new T[N];
    auto c = new TNew!T(N);

    pc.start;
    T* p = a.ptr;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            p[i]++;
    pc.stop;
    writeln("Direct: ", pc.milliseconds);

    pc.start;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            c[i]++;
    pc.stop;
    writeln("Class:  ", pc.milliseconds);

    pc.start;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            a[i]++;
    pc.stop;
    writeln("Array:  ", pc.milliseconds);
    writeln();


    pc.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            p[i]++;
    pc.stop;
    writeln("Direct: ", pc.milliseconds);

    pc.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            c[i]++;
    pc.stop;
    writeln("Class:  ", pc.milliseconds);

    pc.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            a[i]++;
    pc.stop;
    writeln("Array:  ", pc.milliseconds);
}


Timings on Windows, DMD v2.031, 2 GHz Core2 PC (use a larger N if you have a
large L2 CPU Cache):
Direct: 4654
Class:  4670
Array:  4095

Direct: 4650
Class:  4688
Array:  4089

It's curious how the "direct" version is slower than the normal dynamic array
(this doesn't happen in LDC, see below).

-----------------------------

Then I've tried to create a version that can be compiled with LDC too (with
Tango). The LDC that I have compiles D1 code, so I can't use ref T opIndex().
So I have created the following version, if you spot bugs or mistakes please
tell me:


// test2.d, for D1, Phobos and Tango
version (Tango) {
    import tango.stdc.stdio: printf;
    import tango.time.StopWatch: StopWatch;
} else {
    import std.c.stdio: printf;
    import std.perf: PerformanceCounter;
}

final class TNew(T) {
    T* ptr;
    size_t length, capacity;

    this(size_t size) {
        this.ptr = (new T[size]).ptr;
        this.length = size;
        this.capacity = size;
    }

    T* refItem(size_t index) {
        return &(this.ptr[index]);
    }
}

void main() {
    // increase N if you have a large L2 cache
    const int N = 100_000;

    const int NLOOP = 40_000;
    alias uint T;

    version (Tango)
        StopWatch timer;
    else
        auto timer = new PerformanceCounter;
    int deltaTime;

    auto a = new T[N];
    auto c = new TNew!(T)(N);

    timer.start;
    T* p = a.ptr;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            p[i]++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Direct: %d\n", deltaTime);

    timer.start;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            (*c.refItem(i))++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Class:  %d\n", deltaTime);

    timer.start;
    for (uint j; j < NLOOP; j++)
        for (uint i; i < N; i++)
            a[i]++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Array:  %d\n", deltaTime);
    printf("\n");


    timer.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            p[i]++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Direct: %d\n", deltaTime);

    timer.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            (*c.refItem(i))++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Class:  %d\n", deltaTime);

    timer.start;
    for (uint j; j < NLOOP; j++)
        for (uint i = N-1; i--; )
            a[i]++;
    version (Tango) {
        deltaTime = timer.microsec() / 1000;
    } else {
        timer.stop;
        deltaTime = cast(int)timer.milliseconds;
    }
    printf("Array:  %d\n", deltaTime);
}


Timings on Windows:
Direct: 4654
Class:  4810
Array:  4086

Direct: 4665
Class:  4598
Array:  4084


Timings on a 32 bit Linux (ldc -O3 -release -inline):
Direct: 4280
Class:  5000
Array:  4280

Direct: 4280
Class:  4860
Array:  4270

If such tests are correct, then the new resizable arrays of D2 will indeed have
some performance penality (something like 13%-17% in this test, but results may
change with other array sizes).

Bye,
bearophile

Aug 10 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Mon, Aug 10, 2009 at 6:53 PM, bearophile<bearophileHUGS lycos.com> wrote:

 If such tests are correct, then the new resizable arrays of D2 will indeed
have some performance penality (something like 13%-17% in this test, but
results may change with other array sizes).

I don't see why that's a big problem.  If you're using a resizable
array, you probably care more about *resizing* it.  That is, you're
probably only going to be using resizable arrays when you're doing ~=.
 Like Walter said, if you need fast access to the data, you can just
slice it and get a slice array.

Aug 10 2009

bearophile <bearophileHUGS lycos.com> writes:

Jarrett Billingsley:
I don't see why that's a big problem.<

Like St Thomas I touch first and believe (maybe) later :-)

Bye,
bearophile

Aug 11 2009

D Programming

C/C++ Programming

Other

digitalmars.D - T[new]