digitalmars.D - T[new]
- Walter Bright (22/22) Aug 09 2009 D has a number of subtle problems (performance and semantic) that arise
- Brad Roberts (4/41) Aug 09 2009 Yay. What will happen with slices over a T[new] when the underlying T[n...
- Walter Bright (17/20) Aug 09 2009 Great question! It's no different from what happens with any container
- Brad Roberts (12/39) Aug 09 2009 As expected, but like I said.. this needs to be clear in the spec. Beca...
- Andrei Alexandrescu (10/52) Aug 09 2009 Well yah but in all languages this is the case. C++'s iterators not only...
- Graham St Jack (1/1) Aug 09 2009 This sounds excellent.
- grauzone (9/9) Aug 09 2009 That's great. I think this would finally fix some of the more pressing
- Walter Bright (8/18) Aug 09 2009 Yes.
- bearophile (11/16) Aug 09 2009 Now there are four ways to create/use "arrays" in D2:
- Walter Bright (2/2) Aug 09 2009 By the way, T[new] is Andrei's idea, as well as it being his idea to
- bearophile (9/20) Aug 09 2009 Such syntaxes have to be chosen wisely. I don't fully understand that sy...
- Andrei Alexandrescu (4/15) Aug 09 2009 Slices will remain ubiquitous, arrays probably not so much.
- BCS (6/13) Aug 09 2009 I beg to differ. D is much better than c# in many ways (D's abstraction ...
- grauzone (4/7) Aug 09 2009 I see two things a dotnet implementation of D could have over native D:
- bearophile (5/7) Aug 09 2009 The dotnet GC is probably better than the current D GC, but I think it's...
- Jeremie Pelletier (3/13) Aug 09 2009 I just finished reading about Bartosz's early entry about the shared qua...
- Walter Bright (5/12) Aug 09 2009 Even if you're correct that D.net is pointless, and I don't agree with
- Robert Jacques (6/17) Aug 09 2009 But (IIRC) it's not actually a canary. .NET has slices and there's no
- div0 (16/28) Aug 10 2009 -----BEGIN PGP SIGNED MESSAGE-----
- bearophile (4/6) Aug 09 2009 I have another question: are there some performance penalities in using ...
- Andrei Alexandrescu (3/8) Aug 09 2009 One extra indirection, usually nearby.
- Lionello Lunesu (8/17) Aug 09 2009 If the size and capacity get added to the begining of the memory block
- Andrei Alexandrescu (4/23) Aug 09 2009 I wish too. The problem is that when you resize the array, the control
- dsimcha (34/40) Aug 09 2009 import std.stdio, std.perf;
-
Stewart Gordon
(19/27)
Aug 09 2009
- grauzone (5/7) Aug 09 2009 Just what happens to the ~= operator anyway? Right now, it appends data
- Walter Bright (9/44) Aug 09 2009 No, that would defeat the whole purpose of making T[new] a reference
- Leandro Lucarella (9/46) Aug 09 2009 What about writing a DIP? ;)
- Michel Fortin (14/39) Aug 09 2009 I have nothing against the concept of container, but I dislike like the
- Walter Bright (3/5) Aug 09 2009 We explored unique at length, and trying to make it work would render
- Michiel Helvensteijn (5/10) Aug 09 2009 I've heard 'unique' mentioned on this group now and then. What does it m...
- Sergey Gromov (8/18) Aug 09 2009 Unique reference is a reference which is statically known to be the only
- Kagamin (2/15) Aug 10 2009 Of what type will strings be? Of what type will be the result of concate...
- Walter Bright (3/5) Aug 10 2009 T[new]
- Lars T. Kyllingstad (6/10) Aug 10 2009 I've always wondered: Why are strings of type immutable(char)[], and not...
- Walter Bright (5/7) Aug 10 2009 So:
- Lars T. Kyllingstad (3/13) Aug 10 2009 Ah, of course. :) Thanks.
- Jeremie Pelletier (4/20) Aug 10 2009 The problem with immutable(char)[] was that any string can be resized, e...
-
Leandro Lucarella
(14/38)
Aug 10 2009
From: Walter Bright
- Stewart Gordon (9/16) Aug 10 2009 It would be more efficient to cut out this middleman for things that
- Walter Bright (4/11) Aug 10 2009 The vast majority of uses will not need to be resizeable.
- Andrei Alexandrescu (8/16) Aug 10 2009 Hmmm, I see a problem.
- Steven Schveighoffer (8/21) Aug 10 2009 auto c1 = 'a';
- Sergey Gromov (3/12) Aug 10 2009 Good news. I'm all for it. The T[new] syntax is a bit... weird... bit
- Steven Schveighoffer (30/52) Aug 10 2009 Yay!
- Jarrett Billingsley (14/36) Aug 10 2009 hen
- Stewart Gordon (5/5) Aug 10 2009 Since nobody's yet asked....
- Steven Schveighoffer (32/36) Aug 10 2009 This brings up another question. Should something like immutable(T)[new...
- Andrei Alexandrescu (6/55) Aug 10 2009 I think 1 is a good option. Essentially you can't have a resizeable
- Steven Schveighoffer (10/54) Aug 10 2009 Consider this then:
- bearophile (193/197) Aug 10 2009 Thank you. Following your example I have tried to do a bit more realisti...
- Jarrett Billingsley (6/7) Aug 10 2009 I don't see why that's a big problem. If you're using a resizable
- bearophile (4/5) Aug 11 2009 Like St Thomas I touch first and believe (maybe) later :-)
D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or not
Aug 09 2009
Yay. What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing? The behavior needs to be at least well specified, if not well defined. Walter Bright wrote:D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or not
Aug 09 2009
Brad Roberts wrote:Yay. What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing? The behavior needs to be at least well specified, if not well defined.Great question! It's no different from what happens with any container when its contents change and there are references to the old content. What won't happen is the slice won't be pointing to unallocated memory, thanks to the gc. Resizing the T[new] will not cause the old contents to be deleted. The slice will either point to the old content, if the resize caused a move, or the new content, if it was resized in place. So it will still be implementation-defined behavior. But the likelihood of getting caught by such behavior is much less likely with T[new], as one can clearly see in the code where the resizeables are, rather than the current situation where any slice could be resized by anyone. I think that taking a slice of a T[new], then resizing the T[new], will be rare (or at least should be). Normally, a T[new] will be built, and when it is all done, it is converted to a T[] and there is no longer any way to resize it.
Aug 09 2009
Walter Bright wrote:Brad Roberts wrote:As expected, but like I said.. this needs to be clear in the spec. Because something like this is just confusing (and yes, I know it's not new behavior): auto a = new int[10]; a[0] = 1; auto s1 = a[0..1]; a.length = <some value large enough to force a move>; auto s2 = a[0..1]; s2[0] = 2; assert(s1[0] == s2[1]); // fail Later, BradYay. What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing? The behavior needs to be at least well specified, if not well defined.Great question! It's no different from what happens with any container when its contents change and there are references to the old content. What won't happen is the slice won't be pointing to unallocated memory, thanks to the gc. Resizing the T[new] will not cause the old contents to be deleted. The slice will either point to the old content, if the resize caused a move, or the new content, if it was resized in place. So it will still be implementation-defined behavior. But the likelihood of getting caught by such behavior is much less likely with T[new], as one can clearly see in the code where the resizeables are, rather than the current situation where any slice could be resized by anyone. I think that taking a slice of a T[new], then resizing the T[new], will be rare (or at least should be). Normally, a T[new] will be built, and when it is all done, it is converted to a T[] and there is no longer any way to resize it.
Aug 09 2009
Brad Roberts wrote:Walter Bright wrote:Well yah but in all languages this is the case. C++'s iterators not only invalidate at the drop of a hat, but you can't really tell whether they're invalid. Java iterators can also be invalidated and may throw an exception. When resizing a D array, its underlying slices may be *orphaned*. That means they are still valid, they just don't refer to the same data as the array. I think that's a reasonable tradeoff between efficiency and safety. AndreiBrad Roberts wrote:As expected, but like I said.. this needs to be clear in the spec. Because something like this is just confusing (and yes, I know it's not new behavior): auto a = new int[10]; a[0] = 1; auto s1 = a[0..1]; a.length = <some value large enough to force a move>; auto s2 = a[0..1]; s2[0] = 2; assert(s1[0] == s2[1]); // fail Later, BradYay. What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing? The behavior needs to be at least well specified, if not well defined.Great question! It's no different from what happens with any container when its contents change and there are references to the old content. What won't happen is the slice won't be pointing to unallocated memory, thanks to the gc. Resizing the T[new] will not cause the old contents to be deleted. The slice will either point to the old content, if the resize caused a move, or the new content, if it was resized in place. So it will still be implementation-defined behavior. But the likelihood of getting caught by such behavior is much less likely with T[new], as one can clearly see in the code where the resizeables are, rather than the current situation where any slice could be resized by anyone. I think that taking a slice of a T[new], then resizing the T[new], will be rare (or at least should be). Normally, a T[new] will be built, and when it is all done, it is converted to a T[] and there is no longer any way to resize it.
Aug 09 2009
That's great. I think this would finally fix some of the more pressing issues of D. I have a question: will T[new] a; allocate something? Or will the allocation of the hidden library array object delayed until the length is set to non null? If the library array object is not allocated, will a.ptr, a.capacity and a.length simply return null?
Aug 09 2009
grauzone wrote:will T[new] a; allocate something?Nope.Or will the allocation of the hidden library array object delayed until the length is set to non null?Yes.If the library array object is not allocated, will a.ptr, a.capacity and a.length simply return null?Yes. Clearly, those properties will have to be functions under the hood. So T[new] operations will be a bit slower than for slices. For faster indexing, you'd probably want to do: auto slice = a[]; and then operate on the slice.
Aug 09 2009
Now there are four ways to create/use "arrays" in D2: - the * syntax on a raw memory block, like in C. - fixed sized arrays, that are just a pointer somewhere. they may even become fully value types........ - the slice struct, about 2 * size_t.sizeof - the new reference arrays, a pointer on the stack, plus a 3 items struct somewhere, probably on the heap. If it doesn't escape the scope LDC may be able to put it too on the stack. The GC may allocate a space of 4 items anyway for it, so there's an item wasted. :-) In the meantime I have understood that the slice has to be the default syntax to keep the compatibility with C :-) Walter Bright:Yes. Clearly, those properties will have to be functions under the hood. So T[new] operations will be a bit slower than for slices. For faster indexing, you'd probably want to do: auto slice = a[]; and then operate on the slice.Let's hope such functions can be inlined. Assuming they can be inlined, a smart compiler can remove some of those ifs where it knows the array surely exists (virtual machines today are usually able to remove some array bound tests using similar tricks). I don't hold my breath for D2 to become this smart... Bye, bearophile
Aug 09 2009
By the way, T[new] is Andrei's idea, as well as it being his idea to make it a reference type.
Aug 09 2009
Walter Bright: I like this general proposal, it sounds like something that can improve D a little. There are many other things that have to be improved in D, but baby steps are enough to go somewhere :-)Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array;Such syntaxes have to be chosen wisely. I don't fully understand that syntax. And maybe I don't fully like it. I think the default (simpler) syntax has to be the most flexible and safer construct, that is resizable arrays. Slices can be seen as an optimization, so they can have a bit longer syntax.Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity;Weren't you suggestion to use the start pointer - end pointer instead?2. make arrays implementable on .netthan D, and it's similar anyway. So I don't think people will use D on dotnet. So even if creating a D for dotnet can be positive, I don't want D2 to change its design to allow a better implementation on dotnet. The question by Brad Roberts looks important, it has to find some kind of answer:What will happen with slices over a T[new] when the underlying T[new] must be moved due to resizing?<Bye, bearophile
Aug 09 2009
bearophile wrote:Walter Bright: I like this general proposal, it sounds like something that can improve D a little. There are many other things that have to be improved in D, but baby steps are enough to go somewhere :-)Why am I not surprised :o).Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array;Such syntaxes have to be chosen wisely. I don't fully understand that syntax. And maybe I don't fully like it.I think the default (simpler) syntax has to be the most flexible and safer construct, that is resizable arrays. Slices can be seen as an optimization, so they can have a bit longer syntax.Slices will remain ubiquitous, arrays probably not so much. Andrei
Aug 09 2009
Reply to bearophile,Walter Bright:not as good in a few others (tools, libs, runtime refection).2. make arrays implementable on .netbetter than D,and it's similar anyway. So I don't think people will use D on dotnet.This I agree with because D looses many if not all of its advantages when forced into the .NET/CLI/managed-code world
Aug 09 2009
bearophile wrote:I see two things a dotnet implementation of D could have over native D: - better garbage collector (the D one barely does its job...) - better interoperability (*hint* OMF *hint*)2. make arrays implementable on .netthan D, and it's similar anyway. So I don't think people will use D on dotnet. So even if creating a D for dotnet can be positive, I don't want D2 to change its design to allow a better implementation on dotnet.
Aug 09 2009
grauzone:I see two things a dotnet implementation of D could have over native D: - better garbage collector (the D one barely does its job...)The dotnet GC is probably better than the current D GC, but I think it's designed for mostly movable objects. Currently most (or all) D objects are pinned (they can't be moved around in memory), so I don't know if the dotnet GC will do much good. D needs a GC designed for its peculiar characteristics. And I believe D will also need some extra semantics to allow the creation of such efficient half-movable half-pinned GC (I have explained such ideas one time in the past). Bye, bearophile
Aug 09 2009
bearophile Wrote:grauzone:I just finished reading about Bartosz's early entry about the shared qualifier and how it could lead to per-thead memory heaps (as only shared allocations should use the shared heap). This means the GC can maintain different heaps per thread, and perform collection locally without pausing the world, and the entire heap can just be trashed when the thread exits. I think it could be much more efficient than a moving GC (as maintaining different shared heaps and moving memory between them is a slow process). I am really tempted to modify my custom GC to do just that heh.I see two things a dotnet implementation of D could have over native D: - better garbage collector (the D one barely does its job...)The dotnet GC is probably better than the current D GC, but I think it's designed for mostly movable objects. Currently most (or all) D objects are pinned (they can't be moved around in memory), so I don't know if the dotnet GC will do much good. D needs a GC designed for its peculiar characteristics. And I believe D will also need some extra semantics to allow the creation of such efficient half-movable half-pinned GC (I have explained such ideas one time in the past). Bye, bearophile
Aug 09 2009
bearophile wrote:Even if you're correct that D.net is pointless, and I don't agree with that assessment, I think the problems implementing D arrays on .net will show up elsewhere in attempts to support other targets. So I think it's a "canary" issue rather than a .net one.2. make arrays implementable on .netbetter than D, and it's similar anyway. So I don't think people will use D on dotnet. So even if creating a D for dotnet can be positive, I don't want D2 to change its design to allow a better implementation on dotnet.
Aug 09 2009
On Sun, 09 Aug 2009 14:21:31 -0700, Walter Bright <newshound1 digitalmars.com> wrote:bearophile wrote:But (IIRC) it's not actually a canary. .NET has slices and there's no 'self-contained' implementation problem. The issue was that the .NET library wasn't written with slices (.NET, D or otherwise) in mind, so conversions to/from slices were required.Even if you're correct that D.net is pointless, and I don't agree with that assessment, I think the problems implementing D arrays on .net will show up elsewhere in attempts to support other targets. So I think it's a "canary" issue rather than a .net one.2. make arrays implementable on .netbetter than D, and it's similar anyway. So I don't think people will use D on dotnet. So even if creating a D for dotnet can be positive, I don't want D2 to change its design to allow a better implementation on dotnet.
Aug 09 2009
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 bearophile wrote:Walter Bright: I like this general proposal, it sounds like something that can improve D a little. There are many other things that have to be improved in D, but baby steps are enough to go somewhere :-)<snip> Ya, not a nice syntax. how about T[!static] ?! ;) - -- My enormous talent is exceeded only by my outrageous laziness. http://www.ssTk.co.uk -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFKgHaiT9LetA9XoXwRAoBDAJ9WAHlAjkW7TXRUWqCz80N8m4WbwACfTCfP 9MnZfV5pifUKwNC3c1HgySM= =J6pq -----END PGP SIGNATURE-----Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array;Such syntaxes have to be chosen wisely. I don't fully understand that syntax. And maybe I don't fully like it. I think the default (simpler) syntax has to be the most flexible and safer construct, that is resizable arrays. Slices can be seen as an optimization, so they can have a bit longer syntax.
Aug 10 2009
Walter Bright:Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:I have another question: are there some performance penalities in using such arrays for normal random access operations? Bye, bearophile
Aug 09 2009
bearophile wrote:Walter Bright:One extra indirection, usually nearby. AndreiUnder the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:I have another question: are there some performance penalities in using such arrays for normal random access operations?
Aug 09 2009
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:h5ndt0$1blj$1 digitalmars.com...bearophile wrote:If the size and capacity get added to the begining of the memory block containing the data, only an extra offset is needed. The compiler could emit "mov eax, [esi*4 + 8]" or something. Look ma, no extra indirection :) In fact, this is what .NET does as well, if I'm not mistaken. But then, .NET always has an extra indirection because of the movable GC stuff, right? L.Walter Bright:One extra indirection, usually nearby. AndreiUnder the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:I have another question: are there some performance penalities in using such arrays for normal random access operations?
Aug 09 2009
Lionello Lunesu wrote:"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message news:h5ndt0$1blj$1 digitalmars.com...I wish too. The problem is that when you resize the array, the control block will not refer to the adjacent chunk. Andreibearophile wrote:If the size and capacity get added to the begining of the memory block containing the data, only an extra offset is needed. The compiler could emit "mov eax, [esi*4 + 8]" or something. Look ma, no extra indirection :)Walter Bright:One extra indirection, usually nearby. AndreiUnder the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:I have another question: are there some performance penalities in using such arrays for normal random access operations?
Aug 09 2009
== Quote from bearophile (bearophileHUGS lycos.com)'s articleWalter Bright:arrays for normal random access operations?Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties:I have another question: are there some performance penalities in using suchBye, bearophileimport std.stdio, std.perf; final class TNew { uint* foo; uint length; this(uint size) { foo = (new uint[size]).ptr; length = size; } ref uint opIndex(uint index) { return foo[index]; } } void main() { auto pc = new PerformanceCounter; uint[] array = new uint[1]; pc.start; foreach(i; 0..100_000_000) { array.ptr[0]++; } pc.stop; writeln("Direct: ", pc.milliseconds); auto tnew = new TNew(1); pc.start; foreach(i; 0..100_000_000) { tnew[0]++; } pc.stop; writeln("Indirect: ", pc.milliseconds); } Results: Direct: 226 Indirect: 229
Aug 09 2009
Walter Bright wrote: <snip>Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[].<snip> Would new T[10] allocate this structure and the array data on a single GC block, or on two separate blocks? And when the array is reallocated, will the structure move with it? I suppose it depends on whether you want T[new] to be (a) something whereby all references persist as the array is reallocated (b) merely a reference to an allocated array as opposed to an array slice If (a), this is currently achievable with a T[]*. If (b), what might work well is a structure like size_t length; size_t capacity; T[capacity] data; meaning still only one allocation and only one level of indirection when one is used. And the T[new] variable itself would simply hold &data[0]. Moreover, would whatever happens solve such const/invariant holes as bug 2093? Stewart.
Aug 09 2009
Stewart Gordon wrote:Moreover, would whatever happens solve such const/invariant holes as bug 2093?Just what happens to the ~= operator anyway? Right now, it appends data inline. My vote would be to make "a~=b" do the same as "a=a~b" (with types "T[] a" and "T[] b" or "T b"). T[new]'s ~= would still append inline.
Aug 09 2009
Stewart Gordon wrote:Walter Bright wrote: <snip>That's up to the implementation.Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[].<snip> Would new T[10] allocate this structure and the array data on a single GC block, or on two separate blocks?And when the array is reallocated, will the structure move with it?No, that would defeat the whole purpose of making T[new] a reference type. With it being a reference type: T[new] a = ...; T[new] b = a; a.length = ... ... b.length changes too ...I suppose it depends on whether you want T[new] to be (a) something whereby all references persist as the array is reallocated (b) merely a reference to an allocated array as opposed to an array slice If (a), this is currently achievable with a T[]*. If (b), what might work well is a structure like size_t length; size_t capacity; T[capacity] data; meaning still only one allocation and only one level of indirection when one is used. And the T[new] variable itself would simply hold &data[0]. Moreover, would whatever happens solve such const/invariant holes as bug 2093?I believe it does.Stewart.
Aug 09 2009
Walter Bright, el 9 de agosto a las 13:29 me escribiste:D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or notWhat about writing a DIP? ;) -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- VECINOS RESCATARON A CABALLITO ATROPELLADO -- Crónica TV
Aug 09 2009
On 2009-08-09 16:29:21 -0400, Walter Bright <newshound1 digitalmars.com> said:D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new].I have nothing against the concept of container, but I dislike like the syntax. Is this really better than Array!T ?T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only.All fine by me.Doing this change will: 1. fix many nasties at the edges of array semanticsUnfortunatly, not all. Solving them all could be done with unique semantics (unique slices can be expanded with no side effect).2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or notUnique and ref unique slices would be perfectly fine for that (assuming slices can be resized when unique). But I know, unique isn't easy to implement to fit all the use cases we'd like to solve. I'm just sharing a dream. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Aug 09 2009
Michel Fortin wrote:But I know, unique isn't easy to implement to fit all the use cases we'd like to solve. I'm just sharing a dream.We explored unique at length, and trying to make it work would render the rest of the language nearly unusably complex.
Aug 09 2009
Walter Bright wrote:I've heard 'unique' mentioned on this group now and then. What does it mean in the context of programming languages? -- Michiel HelvensteijnBut I know, unique isn't easy to implement to fit all the use cases we'd like to solve. I'm just sharing a dream.We explored unique at length, and trying to make it work would render the rest of the language nearly unusably complex.
Aug 09 2009
Mon, 10 Aug 2009 01:56:35 +0200, Michiel Helvensteijn wrote:Walter Bright wrote:Unique reference is a reference which is statically known to be the only reference to the underlying data. They have some interesting properties: - Modification of underlying data does not cause side effects - Can be implicitly cast to immutable - Can be implicitly cast to mutable They prove to be rather tricky both to specify and to implement though.I've heard 'unique' mentioned on this group now and then. What does it mean in the context of programming languages?But I know, unique isn't easy to implement to fit all the use cases we'd like to solve. I'm just sharing a dream.We explored unique at length, and trying to make it work would render the rest of the language nearly unusably complex.
Aug 09 2009
Walter Bright Wrote:Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only.Of what type will strings be? Of what type will be the result of concatenation?
Aug 10 2009
Kagamin wrote:Of what type will strings be?immutable(char)[]Of what type will be the result of concatenation?T[new]
Aug 10 2009
Walter Bright wrote:Kagamin wrote:I've always wondered: Why are strings of type immutable(char)[], and not immutable(char[])? I mean, there is no point in allowing resizing of strings, when assigning to the new elements is impossible. -LarsOf what type will strings be?immutable(char)[]
Aug 10 2009
Lars T. Kyllingstad wrote:I've always wondered: Why are strings of type immutable(char)[], and not immutable(char[])?So: string a = "hello"; a = "foo"; works.
Aug 10 2009
Walter Bright wrote:Lars T. Kyllingstad wrote:Ah, of course. :) Thanks. -LarsI've always wondered: Why are strings of type immutable(char)[], and not immutable(char[])?So: string a = "hello"; a = "foo"; works.
Aug 10 2009
Lars T. Kyllingstad Wrote:Walter Bright wrote:The problem with immutable(char)[] was that any string can be resized, even slices. Walter: what will the string types be aliased to now: still immutable(char)[] or immutable(char)[new]? I think it would be best to have them use the array [new] type. Functions which do not modify the string length can mark the string as an in parameter, and immutable(char)[] should be castable to const(immutable(char)[new]) in such cases to still allow slices to be passed.Lars T. Kyllingstad wrote:Ah, of course. :) Thanks. -LarsI've always wondered: Why are strings of type immutable(char)[], and not immutable(char[])?So: string a = "hello"; a = "foo"; works.
Aug 10 2009
Jeremie Pelletier, el 10 de agosto a las 13:01 me escribiste:Lars T. Kyllingstad Wrote:From: Walter Bright <newshound1 digitalmars.com> Date: Mon, 10 Aug 2009 01:01:56 -0700 Subject: Re: T[new] User-Agent: Thunderbird 2.0.0.22 (Windows/20090605) Kagamin wrote:Walter Bright wrote:The problem with immutable(char)[] was that any string can be resized, even slices. Walter: what will the string types be aliased to now: still immutable(char)[] or immutable(char)[new]?Lars T. Kyllingstad wrote:Ah, of course. :) Thanks. -LarsI've always wondered: Why are strings of type immutable(char)[], and not immutable(char[])?So: string a = "hello"; a = "foo"; works.Of what type will strings be?immutable(char)[]Of what type will be the result of concatenation?T[new] -- Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/ ---------------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------------- El techo de mi cuarto lleno de galaxias
Aug 10 2009
Jeremie Pelletier wrote: <snip>Walter: what will the string types be aliased to now: still immutable(char)[] or immutable(char)[new]? I think it would be best to have them use the array [new] type.It would be more efficient to cut out this middleman for things that aren't going to change the length. Which means most things that operate on the string type (and similarly wstring/dstring), since these types are designed to be immutable. The change to T[] that's part of this plan would take it a stage further by making the length immutable as well.Functions which do not modify the string length can mark the string as an in parameter, and immutable(char)[] should be castable to const(immutable(char)[new]) in such cases to still allow slices to be passed.The conversion would create an extra memory allocation. Stewart.
Aug 10 2009
Jeremie Pelletier wrote:Walter: what will the string types be aliased to now: still immutable(char)[] or immutable(char)[new]?immutable(char)[]I think it would be best to have them use the array [new] type.The vast majority of uses will not need to be resizeable.Functions which do not modify the string length can mark the string as an in parameter, and immutable(char)[] should be castable to const(immutable(char)[new]) in such cases to still allow slices to be passed.A slice won't be castable to a resizeable.
Aug 10 2009
Walter Bright wrote:Kagamin wrote:Hmmm, I see a problem. auto s1 = "Hello"; auto s2 = " world"; auto s = s1 ~ s2; Some might be surprised that the type of s is not the same as that of s1 and s2. AndreiOf what type will strings be?immutable(char)[]Of what type will be the result of concatenation?T[new]
Aug 10 2009
On Mon, 10 Aug 2009 10:11:52 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Walter Bright wrote:auto c1 = 'a'; auto c2 = 'b'; auto c = c1 + c2; But s is better, because it implicitly casts back to typeof(s1). I think it's the absolutely best behavior. -SteveKagamin wrote:Hmmm, I see a problem. auto s1 = "Hello"; auto s2 = " world"; auto s = s1 ~ s2; Some might be surprised that the type of s is not the same as that of s1 and s2.Of what type will strings be?immutable(char)[]Of what type will be the result of concatenation?T[new]
Aug 10 2009
Sun, 09 Aug 2009 13:29:21 -0700, Walter Bright wrote:D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array;Good news. I'm all for it. The T[new] syntax is a bit... weird... bit I'm OK with it as well. Thanks!
Aug 10 2009
On Sun, 09 Aug 2009 16:29:21 -0400, Walter Bright <newshound1 digitalmars.com> wrote:D has a number of subtle problems (performance and semantic) that arise when arrays are resized. The solution is to separate resizeable array types from slices. Slices will retain the old: T[] slice; syntax. Resizeable arrays will be declared as: T[new] array; The new expression: new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined type. This library defined type will likely contain three properties: size_t length; T* ptr; size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array or notYay! Will capacity be settable? That is, can I replace this pattern: char[] buf = new buf[1024]; buf.length = 0; with something like this?: char[new] buf; buf.capacity = 1024; // ensure capacity is *at least* 1024 ---- I read elsewhere that T[new] x; doesn't allocate until length (or maybe capacity) is non-zero. So T[new] is a library defined type. What does the underlying type get called with when x.length = 5 is entered? Is the underlying type: 1. an object? 2. a templated type? ---- What happens when you do: x = null; where x is a T[new] type? Is it the same as setting x.length = 0, or x.capacity = 0? What does x == null mean? What about x is null? Will ptr be settable, or only readable? Will the compiler continue to optimize foreach calls to arrays? How will foreach be implemented, as a range or opApply? The former implies some automated method to return a range, because you wouldn't want to modify the referenced array during iteration. This is exciting :) -Steve
Aug 10 2009
On Sun, Aug 9, 2009 at 4:29 PM, Walter Bright<newshound1 digitalmars.com> w= rote:D has a number of subtle problems (performance and semantic) that arise w=henarrays are resized. The solution is to separate resizeable array types fr=omslices. Slices will retain the old: =A0 T[] slice; syntax. Resizeable arrays will be declared as: =A0 T[new] array; The new expression: =A0 new T[10] will return a T[new]. T[new] will implicitly convert to T[], but not the other way. slice.length will become read-only. Under the hood, a T[new] will be a single pointer to a library defined ty=pe.This library defined type will likely contain three properties: =A0 =A0size_t length; =A0 =A0T* ptr; =A0 =A0size_t capacity; The usual array operations will work on T[new] as well as T[]. Doing this change will: 1. fix many nasties at the edges of array semantics 2. make arrays implementable on .net 3. make clear in function signatures if the function can resize the array=ornotThis is a great change. MiniD's array semantics are very similar to D's wrt. slicing, though I made the slight modification that array objects keep track of whether or not they are a slice of another array and behave slightly differently accordingly, and I've found it to be more intuitive. I like that D is doing the same thing, and statically too ;) The T[new] syntax isn't doing much for me, but then again, I don't have any better suggestions.
Aug 10 2009
Since nobody's yet asked.... What type would .dup and .idup return? The best choice I can see is to make .dup return T[new] and .idup return invariant(T)[]. Stewart.
Aug 10 2009
On Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon <smjg_1998 yahoo.com> wrote:Since nobody's yet asked.... What type would .dup and .idup return? The best choice I can see is to make .dup return T[new] and .idup return invariant(T)[].This brings up another question. Should something like immutable(T)[new] or const(T)[new] be even valid? Consider this: immutable(char)[new] str = "hello".idup; // not sure how you convert a literal to an array, but assume it works for now. assert(str.capacity == 16); // assume GC still allocates 16 minimum blocks str ~= " there"; auto slice = str[5..$]; str.length = 5; str ~= " world"; assert(slice == " there"); // fails? We most likely have the same issue here with immutable data as we do today, just not as obvious. Although it's a corner case, it does violate immutability without casts. There are several options: 1. immutable(T)[new] and const(T)[new] are invalid. This limits us more than the current regime, but is a questionable construct to begin with, considering it allows appending in place (you can always create a new array with concatenation). 2. immutable(T)[new] and const(T)[new] are equivalent to immutable(T[new]) and const(T[new]) respectively. This simplifies generic programming and still prevents abuses. 3. when you shrink an array of immutable/const elements' length, the capacity automatically shrinks to prevent overwriting previous data. This still leaves mutable data with the overwrite problem, but it doesn't violate const. Also, implicit casting to const needs to be addressed, normally you can't do this with templated types (if that's how the underlying type is implemented). -Steve
Aug 10 2009
Steven Schveighoffer wrote:On Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon <smjg_1998 yahoo.com> wrote:I think 1 is a good option. Essentially you can't have a resizeable array of immutable data. Functional languages don't have such a thing. If we go for (3), code that looks tighter is in fact more wasteful as every shrinking+expanding cycle causes a new allocation and a copy. AndreiSince nobody's yet asked.... What type would .dup and .idup return? The best choice I can see is to make .dup return T[new] and .idup return invariant(T)[].This brings up another question. Should something like immutable(T)[new] or const(T)[new] be even valid? Consider this: immutable(char)[new] str = "hello".idup; // not sure how you convert a literal to an array, but assume it works for now. assert(str.capacity == 16); // assume GC still allocates 16 minimum blocks str ~= " there"; auto slice = str[5..$]; str.length = 5; str ~= " world"; assert(slice == " there"); // fails? We most likely have the same issue here with immutable data as we do today, just not as obvious. Although it's a corner case, it does violate immutability without casts. There are several options: 1. immutable(T)[new] and const(T)[new] are invalid. This limits us more than the current regime, but is a questionable construct to begin with, considering it allows appending in place (you can always create a new array with concatenation). 2. immutable(T)[new] and const(T)[new] are equivalent to immutable(T[new]) and const(T[new]) respectively. This simplifies generic programming and still prevents abuses. 3. when you shrink an array of immutable/const elements' length, the capacity automatically shrinks to prevent overwriting previous data. This still leaves mutable data with the overwrite problem, but it doesn't violate const. Also, implicit casting to const needs to be addressed, normally you can't do this with templated types (if that's how the underlying type is implemented).
Aug 10 2009
On Mon, 10 Aug 2009 15:21:48 -0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Steven Schveighoffer wrote:Consider this then: template ArrayOf(T) { alias T[new] ArrayOf; } Is ArrayOf!invariant(char) valid? Should it be? I'm not sure, this was my focus of option 2. -SteveOn Mon, 10 Aug 2009 14:38:24 -0400, Stewart Gordon <smjg_1998 yahoo.com> wrote:I think 1 is a good option. Essentially you can't have a resizeable array of immutable data. Functional languages don't have such a thing. If we go for (3), code that looks tighter is in fact more wasteful as every shrinking+expanding cycle causes a new allocation and a copy.Since nobody's yet asked.... What type would .dup and .idup return? The best choice I can see is to make .dup return T[new] and .idup return invariant(T)[].This brings up another question. Should something like immutable(T)[new] or const(T)[new] be even valid? Consider this: immutable(char)[new] str = "hello".idup; // not sure how you convert a literal to an array, but assume it works for now. assert(str.capacity == 16); // assume GC still allocates 16 minimum blocks str ~= " there"; auto slice = str[5..$]; str.length = 5; str ~= " world"; assert(slice == " there"); // fails? We most likely have the same issue here with immutable data as we do today, just not as obvious. Although it's a corner case, it does violate immutability without casts. There are several options: 1. immutable(T)[new] and const(T)[new] are invalid. This limits us more than the current regime, but is a questionable construct to begin with, considering it allows appending in place (you can always create a new array with concatenation). 2. immutable(T)[new] and const(T)[new] are equivalent to immutable(T[new]) and const(T[new]) respectively. This simplifies generic programming and still prevents abuses. 3. when you shrink an array of immutable/const elements' length, the capacity automatically shrinks to prevent overwriting previous data. This still leaves mutable data with the overwrite problem, but it doesn't violate const. Also, implicit casting to const needs to be addressed, normally you can't do this with templated types (if that's how the underlying type is implemented).
Aug 10 2009
dsimcha:import std.stdio, std.perf;[...]Results: Direct: 226 Indirect: 229Thank you. Following your example I have tried to do a bit more realistic tests. // test1.d, for D2, Phobos import std.stdio: writeln; import std.perf: PerformanceCounter; final class TNew(T) { T* ptr; size_t length, capacity; this(size_t size) { this.ptr = (new T[size]).ptr; this.length = size; this.capacity = size; } ref T opIndex(size_t index) { return this.ptr[index]; } } void main() { // increase N if you have a large L2 cache enum int N = 100_000; enum int NLOOP = 40_000; alias uint T; auto pc = new PerformanceCounter; auto a = new T[N]; auto c = new TNew!T(N); pc.start; T* p = a.ptr; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) p[i]++; pc.stop; writeln("Direct: ", pc.milliseconds); pc.start; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) c[i]++; pc.stop; writeln("Class: ", pc.milliseconds); pc.start; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) a[i]++; pc.stop; writeln("Array: ", pc.milliseconds); writeln(); pc.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) p[i]++; pc.stop; writeln("Direct: ", pc.milliseconds); pc.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) c[i]++; pc.stop; writeln("Class: ", pc.milliseconds); pc.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) a[i]++; pc.stop; writeln("Array: ", pc.milliseconds); } Timings on Windows, DMD v2.031, 2 GHz Core2 PC (use a larger N if you have a large L2 CPU Cache): Direct: 4654 Class: 4670 Array: 4095 Direct: 4650 Class: 4688 Array: 4089 It's curious how the "direct" version is slower than the normal dynamic array (this doesn't happen in LDC, see below). ----------------------------- Then I've tried to create a version that can be compiled with LDC too (with Tango). The LDC that I have compiles D1 code, so I can't use ref T opIndex(). So I have created the following version, if you spot bugs or mistakes please tell me: // test2.d, for D1, Phobos and Tango version (Tango) { import tango.stdc.stdio: printf; import tango.time.StopWatch: StopWatch; } else { import std.c.stdio: printf; import std.perf: PerformanceCounter; } final class TNew(T) { T* ptr; size_t length, capacity; this(size_t size) { this.ptr = (new T[size]).ptr; this.length = size; this.capacity = size; } T* refItem(size_t index) { return &(this.ptr[index]); } } void main() { // increase N if you have a large L2 cache const int N = 100_000; const int NLOOP = 40_000; alias uint T; version (Tango) StopWatch timer; else auto timer = new PerformanceCounter; int deltaTime; auto a = new T[N]; auto c = new TNew!(T)(N); timer.start; T* p = a.ptr; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) p[i]++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Direct: %d\n", deltaTime); timer.start; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) (*c.refItem(i))++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Class: %d\n", deltaTime); timer.start; for (uint j; j < NLOOP; j++) for (uint i; i < N; i++) a[i]++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Array: %d\n", deltaTime); printf("\n"); timer.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) p[i]++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Direct: %d\n", deltaTime); timer.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) (*c.refItem(i))++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Class: %d\n", deltaTime); timer.start; for (uint j; j < NLOOP; j++) for (uint i = N-1; i--; ) a[i]++; version (Tango) { deltaTime = timer.microsec() / 1000; } else { timer.stop; deltaTime = cast(int)timer.milliseconds; } printf("Array: %d\n", deltaTime); } Timings on Windows: Direct: 4654 Class: 4810 Array: 4086 Direct: 4665 Class: 4598 Array: 4084 Timings on a 32 bit Linux (ldc -O3 -release -inline): Direct: 4280 Class: 5000 Array: 4280 Direct: 4280 Class: 4860 Array: 4270 If such tests are correct, then the new resizable arrays of D2 will indeed have some performance penality (something like 13%-17% in this test, but results may change with other array sizes). Bye, bearophile
Aug 10 2009
On Mon, Aug 10, 2009 at 6:53 PM, bearophile<bearophileHUGS lycos.com> wrote:If such tests are correct, then the new resizable arrays of D2 will indeed have some performance penality (something like 13%-17% in this test, but results may change with other array sizes).I don't see why that's a big problem. If you're using a resizable array, you probably care more about *resizing* it. That is, you're probably only going to be using resizable arrays when you're doing ~=. Like Walter said, if you need fast access to the data, you can just slice it and get a slice array.
Aug 10 2009
Jarrett Billingsley:I don't see why that's a big problem.<Like St Thomas I touch first and believe (maybe) later :-) Bye, bearophile
Aug 11 2009