digitalmars.D - Decision on container design
- Andrei Alexandrescu (14/14) Jan 28 2011 Today after work I plan to start making one pass through std.container.
- bearophile (7/7) Jan 28 2011 Andrei:
- Michel Fortin (13/31) Jan 28 2011 Not my preferred choices (especially #1), but having containers in
- Andrei Alexandrescu (10/36) Jan 28 2011 Well if you brought forth some strong argument I'm all ears. What I see
- Michel Fortin (51/68) Jan 28 2011 We already argument this over and over in the past. First, I totally
- bearophile (17/23) Jan 28 2011 Built-in AAs are currently broken and in need to be fixed:
- spir (33/48) Jan 28 2011 Variation on the theme:
- Tomek =?UTF-8?B?U293acWEc2tp?= (10/59) Jan 28 2011 Is there anything implementation specific in the outer struct that provi...
- Michel Fortin (11/70) Jan 28 2011 You could provide an implementation-specific version of some functions
- Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (28/37) Jan 29 2011 by
- Simen kjaeraas (27/57) Jan 29 2011 s
- Andrei Alexandrescu (5/63) Feb 01 2011 I do something similar with RefCounted. There are problems - you need to...
- Simen kjaeraas (40/43) Feb 01 2011 Static functions can safely be called. Hence, this template:
- Denis Koroskin (28/91) Jan 28 2011 Unfortunately, this design has big issues:
- Michel Fortin (26/56) Jan 28 2011 That's indeed a problem. I don't think it's a fatal flaw however, given
- Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (4/9) Jan 29 2011 Or take an output range.
- Andrei Alexandrescu (15/67) Feb 01 2011 Yep, yep, I found myself wrestling with the same issues. All good
- Michel Fortin (19/82) Feb 01 2011 But are you not just pushing the aggravation elsewhere? If I need a by
- Andrei Alexandrescu (24/102) Feb 01 2011 If semantics are the primary concern, you could (and in fact Phobos
- Jonathan M Davis (8/101) Feb 01 2011 Java implements containers as classes (not that it really has any other ...
- bearophile (4/14) Feb 01 2011 I agree that most times a reference is better. This brings back the need...
- Michel Fortin (18/22) Feb 01 2011 What exactly is "most of the time"? In C++, you pass containers by '&'
- Simon Buerger (14/33) Feb 01 2011 Thats would not be "per-value" as most understand it.
- Steven Schveighoffer (27/98) Feb 01 2011 I've been thinking of making Appender.Impl public or at least its own
- Simon Buerger (8/117) Feb 01 2011 Just to note: the "correct" solution for the last problem is
- Steven Schveighoffer (15/28) Feb 01 2011 This is not a "solution". I cannot enforce that someone uses foreach wi...
- Steven Schveighoffer (16/40) Jan 31 2011 I see value in this idiom, and I have ideas on how to solve this with
- bearophile (8/10) Jan 28 2011 This page:
- Tomek =?ISO-8859-2?Q?Sowi=F1ski?= (9/23) Jan 29 2011 ve indicated large percentage improvements
- SHOO (14/30) Jan 29 2011 I think that it is a good policy.
- Simon Buerger (27/36) Jan 29 2011 Not perfectly what I would like, but a reasonable choice, and most
- bearophile (4/7) Jan 29 2011 This is good for a language like Python, or even Java, but for a system ...
- Jim (2/9) Jan 29 2011 There could be an interface named Set and various implementations of it,...
- Andrei Alexandrescu (4/13) Jan 29 2011 Thus approach has a few advantages but a lot of serious issues. I
- KennyTM~ (9/48) Jan 29 2011 A 'deque' just mean a double-ended queue, not necessarily doubly linked
- Jonathan M Davis (8/11) Jan 29 2011 It was decided previously that containers in std.container would specifi...
- spir (7/16) Jan 30 2011 +++ on all those points
- Steven Schveighoffer (3/40) Jan 31 2011 http://www.dsource.org/projects/dcollections
- Simon Buerger (6/8) Jan 31 2011 Well, seems not bad on a quick look. But source is updated 2 years
- David Nadlinger (5/6) Jan 31 2011 http://www.dsource.org/projects/dcollections/browser/branches/d2 – fou...
- Steven Schveighoffer (14/22) Jan 31 2011 latest meaningful change was 4 months ago:
- Simon Buerger (6/29) Jan 31 2011 Okay, my fault. Didnt realize you were the author, and the project is
- Steven Schveighoffer (7/12) Jan 31 2011 Sorry about that, I have had at least 3 people be confused at why trunk ...
- dsimcha (13/27) Jan 29 2011 In general I actually like this idea, but can we still have sealed
- Andrei Alexandrescu (3/6) Jan 29 2011 Where is it?
- dsimcha (2/8) Jan 29 2011
- Andrei Alexandrescu (5/15) Jan 29 2011 Well to create a proposal you'd need to generate the documentation and
- dsimcha (6/24) Jan 29 2011 I've uploaded the documentation to
- Andrei Alexandrescu (40/45) Feb 01 2011 Sorry for being slow in continuing this thread.
- spir (18/27) Feb 01 2011 An alternative, or a complementary approach, may be to delegate part of ...
- dsimcha (10/55) Feb 02 2011 I see your point. Some of the stuff I've posted lately (RandAA,
- dsimcha (5/50) Feb 02 2011 BTW, I thought the case for a sealed hash table with an optimized memory...
- Andrei Alexandrescu (4/8) Feb 02 2011 Perhaps a very good argument could be made with some benchmarks that
- Jonathan M Davis (14/32) Jan 29 2011 Overall, I think definitely support this approach. I definitely think th...
Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share. Andrei
Jan 28 2011
Andrei: As far as I understand the implications, I agree with your conclusions. Other people will be free to create an alternative D2 library like this one: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html but a standard library can't be too much hard to use. Bye, bearophile
Jan 28 2011
On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share.Phobos will certainly be an improvement over not having them. So go ahead! possible even if they fallback to (cheap) copy semantic when move isn't available. That way, if you have a type which is moveable but not copyable you can still put it in a container. Does that makes sense? -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jan 28 2011
On 1/28/11 3:05 PM, Michel Fortin wrote:On 2011-01-28 13:31:58 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Well if you brought forth some strong argument I'm all ears. What I see for now is that struct containers are just difficult to implement and need to have carefully explained semantics, whereas a lot of people know how classes behave from day one.Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share.Phobos will certainly be an improvement over not having them. So go ahead!possible even if they fallback to (cheap) copy semantic when move isn't available. That way, if you have a type which is moveable but not copyable you can still put it in a container. Does that makes sense?That's what I did up until now. It is tantamount to defining a bunch of methods (aliases or not) that add to the interface that the user must absorb, but that are seldom useful. It just seems that the entire move paraphernalia doesn't lift its weight. Andrei
Jan 28 2011
On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:On 1/28/11 3:05 PM, Michel Fortin wrote:We already argument this over and over in the past. First, I totally acknowledge that C++ style containers have a problem: they make it easier to copy the content than pass it by reference. On the other side of the spectrum, I think that class semantics makes it too easy to have null dereferences, it's easy to get lost when you have a container of containers. I have some experience with containers having class-style semantics: in Objective-C, I ended up creating a set of macro-like functions which I use to initialize containers whenever I use them in case they are null. And I had to do more of these utility functions to handle a particular data structure of mine which is a dictionary of arrays of objects. In C++, I'd have declared this as a "map< string, vector< Object > >" and be done with it; no need for special care initializing each vector, so much easier than in Objective-C. I agree that defining structs to have reference semantics as you have done is complicated. But I like the lazy initialization, and we have a precedent for that with AAs (ideally, AAs would be a compatible container too). Can't we just use the GC instead of reference counting? I'd make things much easier. Here is a implementation: struct Container { struct Impl { ... } private Impl* _impl; ref Impl impl() property { if (!impl) impl = new Impl; return *impl; } alias impl this; } I also believe reference semantics are not to be used everywhere, even though they're good most of the time. I'd like to have a way to bypass it and get a value-semantic container. With the above, it's easy as long as you keep Container.Impl public: void main() { Container lazyHeapAllocatedContainer; Container.Impl stackAllocatedContainer; } void MyObject { Container.Impl listOfObjects; }Phobos will certainly be an improvement over not having them. So go ahead!Well if you brought forth some strong argument I'm all ears. What I see for now is that struct containers are just difficult to implement and need to have carefully explained semantics, whereas a lot of people know how classes behave from day one.But could we limit this to say that only containers that can return elements by ref? Perhaps that won't help. You know the problem better than me, I don't really have anything more to say. -- Michel Fortin michel.fortin michelf.com http://michelf.com/possible even if they fallback to (cheap) copy semantic when move isn't available. That way, if you have a type which is moveable but not copyable you can still put it in a container. Does that makes sense?That's what I did up until now. It is tantamount to defining a bunch of methods (aliases or not) that add to the interface that the user must absorb, but that are seldom useful. It just seems that the entire move paraphernalia doesn't lift its weight.
Jan 28 2011
Michel Fortin:On the other side of the spectrum, I think that class semantics makes it too easy to have null dereferences,That's why I have asked for a suffix syntax+semantics for notnull reference types in D :-)I agree that defining structs to have reference semantics as you have done is complicated. But I like the lazy initialization, and we have a precedent for that with AAsBuilt-in AAs are currently broken and in need to be fixed: import std.stdio: writeln; void foo(int[int] aa, int n) { aa[n] = n; } void main() { int[int] a; foo(a, 0); writeln(a); a[1] = 1; foo(a, 2); writeln(a); } Bye, bearophile
Jan 28 2011
On 01/29/2011 01:01 AM, bearophile wrote:Built-in AAs are currently broken and in need to be fixed: import std.stdio: writeln; void foo(int[int] aa, int n) { aa[n] = n; } void main() { int[int] a; foo(a, 0); writeln(a); a[1] = 1; foo(a, 2); writeln(a); } Bye, bearophileVariation on the theme: void foo(int[int] aa, int n) { aa[n] = n; } void foo(int[] a, int n) { a ~= n; } void bar(ref int[int] aa, int n) { aa[n] = n; } void bar(ref int[] a, int n) { a ~= n; } unittest { int[int] aa; foo(aa, 3); writeln(aa.length); int[] a; foo(a, 3); writeln(a.length); int[int] bb; bar(bb, 3); writeln(bb.length); int[] b; bar(b, 3); writeln(b.length); } Denis -- _________________ vita es estrany spir.wikidot.com
Jan 28 2011
Michel Fortin napisa=C5=82:We already argument this over and over in the past. First, I totally=20 acknowledge that C++ style containers have a problem: they make it=20 easier to copy the content than pass it by reference. On the other side=20 of the spectrum, I think that class semantics makes it too easy to have=20 null dereferences, it's easy to get lost when you have a container of=20 containers. =20 I have some experience with containers having class-style semantics: in=20 Objective-C, I ended up creating a set of macro-like functions which I=20 use to initialize containers whenever I use them in case they are null.=20 And I had to do more of these utility functions to handle a particular=20 data structure of mine which is a dictionary of arrays of objects. In=20 C++, I'd have declared this as a "map< string, vector< Object > >" and=20 be done with it; no need for special care initializing each vector, so=20 much easier than in Objective-C. =20 I agree that defining structs to have reference semantics as you have=20 done is complicated. But I like the lazy initialization, and we have a=20 precedent for that with AAs (ideally, AAs would be a compatible=20 container too). Can't we just use the GC instead of reference counting?=20 I'd make things much easier. Here is a implementation: =20 struct Container { struct Impl { ... } =20 private Impl* _impl; ref Impl impl() property { if (!impl) impl =3D new Impl; return *impl; } =09 alias impl this; } =20 I also believe reference semantics are not to be used everywhere, even=20 though they're good most of the time. I'd like to have a way to bypass=20 it and get a value-semantic container. With the above, it's easy as=20 long as you keep Container.Impl public: =20 void main() { Container lazyHeapAllocatedContainer; Container.Impl stackAllocatedContainer; } =20 void MyObject { Container.Impl listOfObjects; }Is there anything implementation specific in the outer struct that provides= ref semantics to Impl? If not, Container could be generic, parametrized by= Impl type. Overall, I think a value-like implementation in a referency wrapper is a cl= ear-cut idiom, bringing order to otherwise messy struct-implemented ref-sem= antics. Do you know of a existing collection library that exploits this ide= a? --=20 Tomek
Jan 28 2011
On 2011-01-28 19:00:02 -0500, Tomek Sowiński <just ask.me> said:Michel Fortin napisał:You could provide an implementation-specific version of some functions as an optimization. For instance there is no need to create the Impl when asking for the length, if the pointer is null, length is zero. Typically, const function can be implemented in the outward container with a shortcut checking for null.We already argument this over and over in the past. First, I totally acknowledge that C++ style containers have a problem: they make it easier to copy the content than pass it by reference. On the other side of the spectrum, I think that class semantics makes it too easy to have null dereferences, it's easy to get lost when you have a container of containers. I have some experience with containers having class-style semantics: in Objective-C, I ended up creating a set of macro-like functions which I use to initialize containers whenever I use them in case they are null. And I had to do more of these utility functions to handle a particular data structure of mine which is a dictionary of arrays of objects. In C++, I'd have declared this as a "map< string, vector< Object > >" and be done with it; no need for special care initializing each vector, so much easier than in Objective-C. I agree that defining structs to have reference semantics as you have done is complicated. But I like the lazy initialization, and we have a precedent for that with AAs (ideally, AAs would be a compatible container too). Can't we just use the GC instead of reference counting? I'd make things much easier. Here is a implementation: struct Container { struct Impl { ... } private Impl* _impl; ref Impl impl() property { if (!impl) impl = new Impl; return *impl; } alias impl this; } I also believe reference semantics are not to be used everywhere, even though they're good most of the time. I'd like to have a way to bypass it and get a value-semantic container. With the above, it's easy as long as you keep Container.Impl public: void main() { Container lazyHeapAllocatedContainer; Container.Impl stackAllocatedContainer; } void MyObject { Container.Impl listOfObjects; }Is there anything implementation specific in the outer struct that provides ref semantics to Impl? If not, Container could be generic, parametrized by Impl type.Overall, I think a value-like implementation in a referency wrapper is a clear-cut idiom, bringing order to otherwise messy struct-implemented ref-semantics. Do you know of a existing collection library that exploits this idea?No. Only associative arrays in D do that, that I know of. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jan 28 2011
Michel Fortin napisa=B3:idesIs there anything implementation specific in the outer struct that prov=byref semantics to Impl? If not, Container could be generic, parametrized=I think the reference struct can still be orthogonal to the container. struct Ref(Impl) { private Impl* _impl; ref Impl impl() property { if (!impl) impl =3D new Impl; return *impl; } static if (hasLength!Impl) { auto length() property { return impl ? impl.length : 0; } } alias impl this; } Reusability lightens the burden of the container's author (less fuss for us= er implementations) and somewhat standardizes containers as they all must e= xhibit a certain API with certain semantics to be able to fit into Ref. The downside is that the syntax for the most common case (ref semantics) is= a little nosier than for value-like behavior. --=20 TomekImpl type. =20=20 You could provide an implementation-specific version of some functions=20 as an optimization. For instance there is no need to create the Impl=20 when asking for the length, if the pointer is null, length is zero.=20 Typically, const function can be implemented in the outward container=20 with a shortcut checking for null.
Jan 29 2011
Tomek Sowi=C5=84ski <just ask.me> wrote:Michel Fortin napisa=C5=82:=Is there anything implementation specific in the outer struct that =providesref semantics to Impl? If not, Container could be generic, =sparametrized byImpl type.You could provide an implementation-specific version of some function=as an optimization. For instance there is no need to create the Impl when asking for the length, if the pointer is null, length is zero. Typically, const function can be implemented in the outward container=with a shortcut checking for null.I think the reference struct can still be orthogonal to the container.=struct Ref(Impl) { private Impl* _impl; ref Impl impl() property { if (!impl) impl =3D new Impl; return *impl; } static if (hasLength!Impl) { auto length() property { return impl ? impl.length : 0; } } alias impl this; }Now, other functions may also exploit such a shortcut. How do you plan to implement that? Actually, by adopting another idiom, it can be done: struct Ref(Impl) { private Impl* _impl; property ref Impl impl() { return *(_impl =3D (_impl ? _impl : new Impl)); } alias impl this; auto opDispatch(string name, T...)(T args) { mixin("return impl."~name~"(_impl,args);"); } } struct ExampleImpl { static int length(ExampleImpl* that) { return that ? *that.actualLength : 0; } int actualLength( ) { return 42; } } Of course, I did not say it was a good idiom. :p -- = Simen
Jan 29 2011
On 1/29/11 5:01 AM, Simen kjaeraas wrote:Tomek Sowiński <just ask.me> wrote:I do something similar with RefCounted. There are problems - you need to know in advance which functions you can implement on a null container (empty and length are obvious candidates, but there could be others). AndreiMichel Fortin napisał:Now, other functions may also exploit such a shortcut. How do you plan to implement that? Actually, by adopting another idiom, it can be done: struct Ref(Impl) { private Impl* _impl; property ref Impl impl() { return *(_impl = (_impl ? _impl : new Impl)); } alias impl this; auto opDispatch(string name, T...)(T args) { mixin("return impl."~name~"(_impl,args);"); } } struct ExampleImpl { static int length(ExampleImpl* that) { return that ? *that.actualLength : 0; } int actualLength( ) { return 42; } } Of course, I did not say it was a good idiom. :pI think the reference struct can still be orthogonal to the container. struct Ref(Impl) { private Impl* _impl; ref Impl impl() property { if (!impl) impl = new Impl; return *impl; } static if (hasLength!Impl) { auto length() property { return impl ? impl.length : 0; } } alias impl this; }Is there anything implementation specific in the outer struct thatprovidesref semantics to Impl? If not, Container could be generic,parametrized byImpl type.You could provide an implementation-specific version of some functions as an optimization. For instance there is no need to create the Impl when asking for the length, if the pointer is null, length is zero. Typically, const function can be implemented in the outward container with a shortcut checking for null.
Feb 01 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:I do something similar with RefCounted. There are problems - you need to know in advance which functions you can implement on a null container (empty and length are obvious candidates, but there could be others).Static functions can safely be called. Hence, this template: import std.traits; template isStaticFunc(T, string fn) { enum isStaticFunc = is(typeof({ mixin("alias T."~fn~" func;"); ParameterTypeTuple!func args; mixin("T."~fn~"(args);"); })); } Now, Ref can look like this: struct Ref(Impl) { private Impl* _impl; property ref Impl impl() { return *(_impl = (_impl ? _impl : new Impl)); } //alias impl this; // Apparently, alias this takes precedence over opDispatch, so // using both doesn't work. Well, unless you put opDispatch in Impl. auto opDispatch(string name, T...)(T args) if ( isStaticFunc!(Impl, name)) { mixin("return impl."~name~"(_impl,args);"); } } struct ExampleImpl { static int length(ExampleImpl* that) { return that ? that.actualLength : 0; } int actualLength( ) { return 42; } } import std.stdio; void main( ) { Ref!ExampleImpl a; writeln( a.length ); } -- Simen
Feb 01 2011
On Sat, 29 Jan 2011 02:32:28 +0300, Michel Fortin <michel.fortin michelf.com> wrote:On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller. An explicit initialization is needed to work around this design issue. The worst thing is that in many cases it would work fine (you might have already initialized it indirectly) but sometimes you get unexpected result. I got hit by this in past, and it wasn't easy to trace down. As such, I strongly believe containers either need to have copy semantics, or be classes. However, copy semantics contradicts with the "cheap copy ctor" idiom because you need to copy all the elements from source container.On 1/28/11 3:05 PM, Michel Fortin wrote:We already argument this over and over in the past. First, I totally acknowledge that C++ style containers have a problem: they make it easier to copy the content than pass it by reference. On the other side of the spectrum, I think that class semantics makes it too easy to have null dereferences, it's easy to get lost when you have a container of containers. I have some experience with containers having class-style semantics: in Objective-C, I ended up creating a set of macro-like functions which I use to initialize containers whenever I use them in case they are null. And I had to do more of these utility functions to handle a particular data structure of mine which is a dictionary of arrays of objects. In C++, I'd have declared this as a "map< string, vector< Object > >" and be done with it; no need for special care initializing each vector, so much easier than in Objective-C. I agree that defining structs to have reference semantics as you have done is complicated. But I like the lazy initialization, and we have a precedent for that with AAs (ideally, AAs would be a compatible container too). Can't we just use the GC instead of reference counting? I'd make things much easier. Here is a implementation: struct Container { struct Impl { ... } private Impl* _impl; ref Impl impl() property { if (!impl) impl = new Impl; return *impl; } alias impl this; }Phobos will certainly be an improvement over not having them. So go ahead!Well if you brought forth some strong argument I'm all ears. What I see for now is that struct containers are just difficult to implement and need to have carefully explained semantics, whereas a lot of people know how classes behave from day one.I also believe reference semantics are not to be used everywhere, even though they're good most of the time. I'd like to have a way to bypass it and get a value-semantic container. With the above, it's easy as long as you keep Container.Impl public: void main() { Container lazyHeapAllocatedContainer; Container.Impl stackAllocatedContainer; } void MyObject { Container.Impl listOfObjects; }-- Using Opera's revolutionary email client: http://www.opera.com/mail/But could we limit this to say that only containers that can return elements by ref? Perhaps that won't help. You know the problem better than me, I don't really have anything more to say.possible even if they fallback to (cheap) copy semantic when move isn't available. That way, if you have a type which is moveable but not copyable you can still put it in a container. Does that makes sense?That's what I did up until now. It is tantamount to defining a bunch of methods (aliases or not) that add to the interface that the user must absorb, but that are seldom useful. It just seems that the entire move paraphernalia doesn't lift its weight.
Jan 28 2011
On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it. That's the issue with containers. The optimal semantics always change depending on the use case.An explicit initialization is needed to work around this design issue. The worst thing is that in many cases it would work fine (you might have already initialized it indirectly) but sometimes you get unexpected result. I got hit by this in past, and it wasn't easy to trace down. As such, I strongly believe containers either need to have copy semantics, or be classes. However, copy semantics contradicts with the "cheap copy ctor" idiom because you need to copy all the elements from source container.Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Jan 28 2011
Michel Fortin napisa=B3:As for the case of Appender... personally in the case above I'd be=20 tempted to use Appender.Impl directly (value semantics) and make fill=20 take a 'ref'. There's no point in having an extra heap allocation,=20 especially if you're calling test() in a loop or if there's a good=20 chance fill() has nothing to append to it.Or take an output range. --=20 Tomek
Jan 29 2011
On 1/28/11 8:12 PM, Michel Fortin wrote:On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it. That's the issue with containers. The optimal semantics always change depending on the use case.Neither values nor references are perfect indeed. For example, someone mentioned, hey, in STL I write set< vector<double> > and it Just Works(tm). On the other hand, if you swap the two names it still seems to work but it's awfully inefficient (something that may trip even experienced developers). AndreiAn explicit initialization is needed to work around this design issue. The worst thing is that in many cases it would work fine (you might have already initialized it indirectly) but sometimes you get unexpected result. I got hit by this in past, and it wasn't easy to trace down. As such, I strongly believe containers either need to have copy semantics, or be classes. However, copy semantics contradicts with the "cheap copy ctor" idiom because you need to copy all the elements from source container.Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers.
Feb 01 2011
On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:On 1/28/11 8:12 PM, Michel Fortin wrote:But are you not just pushing the aggravation elsewhere? If I need a by value container for some reason (performance or semantics) I'll have to write my own, and likely others will write their own too. Using classes for containers is just marginally better than making them by-value structs: you can use 'new' with a by-value struct if you want it to behave as a class-like by-reference container: struct Container { ... } auto c = new Container(); The only noticeable difference from a class container is that now c is now a Container*.On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it. That's the issue with containers. The optimal semantics always change depending on the use case.Isn't that solved by C++0x, using move semantics in swap? -- Michel Fortin michel.fortin michelf.com http://michelf.com/Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers.Neither values nor references are perfect indeed. For example, someone mentioned, hey, in STL I write set< vector<double> > and it Just Works(tm). On the other hand, if you swap the two names it still seems to work but it's awfully inefficient (something that may trip even experienced developers).
Feb 01 2011
On 2/1/11 10:44 AM, Michel Fortin wrote:On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:If semantics are the primary concern, you could (and in fact Phobos could) provide a Value!C template that automatically calls dup in this(this) etc. For performance I agree there is stuff that class containers leave on the table.On 1/28/11 8:12 PM, Michel Fortin wrote:But are you not just pushing the aggravation elsewhere? If I need a by value container for some reason (performance or semantics) I'll have to write my own, and likely others will write their own too.On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it. That's the issue with containers. The optimal semantics always change depending on the use case.Using classes for containers is just marginally better than making them by-value structs: you can use 'new' with a by-value struct if you want it to behave as a class-like by-reference container: struct Container { ... } auto c = new Container(); The only noticeable difference from a class container is that now c is now a Container*.Well one problem now is that if you have a Container* you don't know whether it's dynamically allocated or the address of some stack-allocated object. This is pretty big; a major issue that I believe C++ has is that you can seldom reason modularly about functions because C++ makes it impossible to represent reference semantics with local/remote/shared/no ownership without resorting to convention. A better solution is to define something like auto c = new Classify!Container; which transforms a value into a class object. With this, the question becomes a matter of choosing the right default: do we want values most of the time and occasional references, or vice versa? I think most of the time you need references, as witnessed by the many '&'s out there in code working on STL containers.This particular incarnation yes, but that doesn't automatically fix user code that forgets the cost of copying. But that took a large language change. My point was that values by default is not automatically a good choice. AndreiIsn't that solved by C++0x, using move semantics in swap?Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers.Neither values nor references are perfect indeed. For example, someone mentioned, hey, in STL I write set< vector<double> > and it Just Works(tm). On the other hand, if you swap the two names it still seems to work but it's awfully inefficient (something that may trip even experienced developers).
Feb 01 2011
On Tuesday 01 February 2011 09:07:55 Andrei Alexandrescu wrote:On 2/1/11 10:44 AM, Michel Fortin wrote:Java implements containers as classes (not that it really has any other choice), so all containers in Java have reference semantics, and I've _never_ found that to be a problem. I do think that there are rare cases where it makes sense for a container to be a value type, but I really do think that it's a rare case for the average programmer. On the other hand, having containers be value types in C++ is _frequently_ a problem. - Jonathan M DavisOn 2011-02-01 11:12:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:If semantics are the primary concern, you could (and in fact Phobos could) provide a Value!C template that automatically calls dup in this(this) etc. For performance I agree there is stuff that class containers leave on the table.On 1/28/11 8:12 PM, Michel Fortin wrote:But are you not just pushing the aggravation elsewhere? If I need a by value container for some reason (performance or semantics) I'll have to write my own, and likely others will write their own too.On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it. That's the issue with containers. The optimal semantics always change depending on the use case.Using classes for containers is just marginally better than making them by-value structs: you can use 'new' with a by-value struct if you want it to behave as a class-like by-reference container: struct Container { ... } auto c = new Container(); The only noticeable difference from a class container is that now c is now a Container*.Well one problem now is that if you have a Container* you don't know whether it's dynamically allocated or the address of some stack-allocated object. This is pretty big; a major issue that I believe C++ has is that you can seldom reason modularly about functions because C++ makes it impossible to represent reference semantics with local/remote/shared/no ownership without resorting to convention. A better solution is to define something like auto c = new Classify!Container; which transforms a value into a class object. With this, the question becomes a matter of choosing the right default: do we want values most of the time and occasional references, or vice versa? I think most of the time you need references, as witnessed by the many '&'s out there in code working on STL containers.
Feb 01 2011
Andrei:A better solution is to define something like auto c = new Classify!Container; which transforms a value into a class object. With this, the question becomes a matter of choosing the right default: do we want values most of the time and occasional references, or vice versa? I think most of the time you need references, as witnessed by the many '&'s out there in code working on STL containers.I agree that most times a reference is better. This brings back the need for a very good (efficient, syntactically readable, and even if not safe, able to spot most common errors, and RAII-safe) way to allocate class instances in-place, on the stack or inside another struct/class. Bye, bearophile
Feb 01 2011
On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:With this, the question becomes a matter of choosing the right default: do we want values most of the time and occasional references, or vice versa? I think most of the time you need references, as witnessed by the many '&'s out there in code working on STL containers.What exactly is "most of the time"? In C++, you pass containers by '&' for function parameters, using '&' elsewhere is rare. One thing I proposed some time ago to address this problem (and to which no one replied) was this: ref struct Container { ... } // new "ref struct" concept void func(Container c) { // c is implicitly a "ref Container" } Container a; // by value func(a); // implicitly passed by ref Containers would be stored by value, but always passed by ref in functions parameters. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Feb 01 2011
On 01.02.2011 20:01, Michel Fortin wrote:On 2011-02-01 12:07:55 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:Thats would not be "per-value" as most understand it. Container globalC; func(c); void func(Container paramC) { c.add(42); // modifies globalC in reference-semantics // leaves globalC as it was in value-semantics } Your Idea would be somehow truly value-based if you default not only to "ref" but to "const ref", because then the function would not be able to alter globalC. But making parameters default-const was not considered the right way for D. - KroxWith this, the question becomes a matter of choosing the right default: do we want values most of the time and occasional references, or vice versa? I think most of the time you need references, as witnessed by the many '&'s out there in code working on STL containers.What exactly is "most of the time"? In C++, you pass containers by '&' for function parameters, using '&' elsewhere is rare. One thing I proposed some time ago to address this problem (and to which no one replied) was this: ref struct Container { ... } // new "ref struct" concept void func(Container c) { // c is implicitly a "ref Container" } Container a; // by value func(a); // implicitly passed by ref Containers would be stored by value, but always passed by ref in functions parameters.
Feb 01 2011
On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin <michel.fortin michelf.com> wrote:On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I've been thinking of making Appender.Impl public or at least its own type. A stack-based appender makes a lot of sense when you are using it temporarily to build an array. But array-based containers really are in a separate class from node-based containers. It's tempting to conflate the two because they are both 'containers', but arrays allow many more optimizations/features that node-based containers simply can't do.On 1/28/11 8:12 PM, Michel Fortin wrote:On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it.foo(container.dup) ; // value semantics I'm sure some template guru can make a wrapper type for this for the rare occasions that you need it. We can work on solving the "auto-initialization" issue (where a nested container 'just works'), I think there are ways to do it. If that still doesn't help for your issues, then writing your own may be the only valid option.Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.But are you not just pushing the aggravation elsewhere? If I need a by value container for some reason (performance or semantics) I'll have to write my own, and likely others will write their own too.Using classes for containers is just marginally better than making them by-value structs: you can use 'new' with a by-value struct if you want it to behave as a class-like by-reference container: struct Container { ... } auto c = new Container(); The only noticeable difference from a class container is that now c is now a Container*.And doesn't get cleaned up by the GC properly. Plus, each member call must check if the container is 'instantiated', since we can have no default ctors. Yes, it's a trade-off, and I think by far class-based containers win for the common case.swap isn't the problem. foreach(s; myVectorSet) { // if s is by value, it must be copied for each iteration in the loop } -SteveIsn't that solved by C++0x, using move semantics in swap?Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers.Neither values nor references are perfect indeed. For example, someone mentioned, hey, in STL I write set< vector<double> > and it Just Works(tm). On the other hand, if you swap the two names it still seems to work but it's awfully inefficient (something that may trip even experienced developers).
Feb 01 2011
On 01.02.2011 18:08, Steven Schveighoffer wrote:On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin <michel.fortin michelf.com> wrote:Just to note: the "correct" solution for the last problem is foreach(const ref s; myVectorSet) which is working in current D. In a more value-based language you may even want to default to "const ref" for foreach-loop-values, and even for function-parameters. I suggested that a while ago, but wasn't liked much for D, for good reasons. - KroxOn 2011-02-01 11:12:13 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I've been thinking of making Appender.Impl public or at least its own type. A stack-based appender makes a lot of sense when you are using it temporarily to build an array. But array-based containers really are in a separate class from node-based containers. It's tempting to conflate the two because they are both 'containers', but arrays allow many more optimizations/features that node-based containers simply can't do.On 1/28/11 8:12 PM, Michel Fortin wrote:On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden gmail.com> said:Unfortunately, this design has big issues: void fill(Appender appender) { appender.put("hello"); appender.put("world"); } void test() { Appender<string> appender; fill(appender); // Appender is supposed to have reference semantics assert(appender.length != 0); // fails! } Asserting above fails because at the time you pass appender object to the fill method it isn't initialized yet (lazy initialization). As such, a null is passed, creating an instance at first appending, but the result isn't seen to the caller.That's indeed a problem. I don't think it's a fatal flaw however, given that the idiom already exists in AAs. That said, the nice thing about my proposal is that you can easily reuse the Impl to create a new container to build a new container wrapper with the semantics you like with no loss of efficiency. As for the case of Appender... personally in the case above I'd be tempted to use Appender.Impl directly (value semantics) and make fill take a 'ref'. There's no point in having an extra heap allocation, especially if you're calling test() in a loop or if there's a good chance fill() has nothing to append to it.foo(container.dup) ; // value semantics I'm sure some template guru can make a wrapper type for this for the rare occasions that you need it. We can work on solving the "auto-initialization" issue (where a nested container 'just works'), I think there are ways to do it. If that still doesn't help for your issues, then writing your own may be the only valid option.Yep, yep, I found myself wrestling with the same issues. All good points. On one hand containers are a target for optimization because many will use them. On the other hand you'd want to have reasonably simple and idiomatic code in the container implementation because you want people to understand them easily and also to write their own. I thought for a while of a layered approach in which you'd have both the value and the sealed reference version of a container... it's just too much aggravation.But are you not just pushing the aggravation elsewhere? If I need a by value container for some reason (performance or semantics) I'll have to write my own, and likely others will write their own too.Using classes for containers is just marginally better than making them by-value structs: you can use 'new' with a by-value struct if you want it to behave as a class-like by-reference container: struct Container { ... } auto c = new Container(); The only noticeable difference from a class container is that now c is now a Container*.And doesn't get cleaned up by the GC properly. Plus, each member call must check if the container is 'instantiated', since we can have no default ctors. Yes, it's a trade-off, and I think by far class-based containers win for the common case.swap isn't the problem. foreach(s; myVectorSet) { // if s is by value, it must be copied for each iteration in the loop } -SteveIsn't that solved by C++0x, using move semantics in swap?Personally, I'm really concerned by the case where you have a container of containers. Class semantics make things really complicated as you always have to initialize everything in the container explicitly; value semantics makes things semantically easier but quite inefficient as moving elements inside of the outermost container implies copying the containers. Making containers auto-initialize themselves on first use solves the case where containers are references-types; making containers capable of using move semantics solves the problem for value-type containers.Neither values nor references are perfect indeed. For example, someone mentioned, hey, in STL I write set< vector<double> > and it Just Works(tm). On the other hand, if you swap the two names it still seems to work but it's awfully inefficient (something that may trip even experienced developers).
Feb 01 2011
On Tue, 01 Feb 2011 14:26:26 -0500, Simon Buerger <krox gmx.net> wrote:On 01.02.2011 18:08, Steven Schveighoffer wrote:This is not a "solution". I cannot enforce that someone uses foreach with ref semantics. Unless the type is a reference type, like a class. Note that const ref isn't necessarily want, you could want to mutate the elements of the vector, meaning you really want ref. The whole point is, it's too easy to get it wrong, and the wrong thing looks innocuous and does not look like it will perform horribly. Compare this to someone who actually *wants* value semantics when iterating a reference type (which should be very rare): foreach(s; myVectorSet) { auto setIAmGoingToUse = s.dup; // clear intentions: "yes, I want to do an expensive copy" } -Steveswap isn't the problem. foreach(s; myVectorSet) { // if s is by value, it must be copied for each iteration in the loop }Just to note: the "correct" solution for the last problem is foreach(const ref s; myVectorSet) which is working in current D. In a more value-based language you may even want to default to "const ref" for foreach-loop-values, and even for function-parameters. I suggested that a while ago, but wasn't liked much for D, for good reasons.
Feb 01 2011
On Fri, 28 Jan 2011 18:32:28 -0500, Michel Fortin <michel.fortin michelf.com> wrote:On 2011-01-28 17:09:08 -0500, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> said:I see value in this idiom, and I have ideas on how to solve this with class-based containers. Essentially, I think we need a way to construct a "container of containers" where the owner container is the one who has class semantics, and the sub-containers only provide access through the owner. I haven't fleshed out how this will work, but it solves another really important problem that dcollections currently has -- over-allocation. The issue is, if you have a nested container (say a map of strings to linked-lists), the map container pre-allocates a page of elements, to ease stress on the GC. But so does each linked list! However, since the lists all are part of the map, they could potentially share the same allocator. If you have few elements in each list, this could significantly reduce the over-allocation of memory. -SteveOn 1/28/11 3:05 PM, Michel Fortin wrote:We already argument this over and over in the past. First, I totally acknowledge that C++ style containers have a problem: they make it easier to copy the content than pass it by reference. On the other side of the spectrum, I think that class semantics makes it too easy to have null dereferences, it's easy to get lost when you have a container of containers. I have some experience with containers having class-style semantics: in Objective-C, I ended up creating a set of macro-like functions which I use to initialize containers whenever I use them in case they are null. And I had to do more of these utility functions to handle a particular data structure of mine which is a dictionary of arrays of objects. In C++, I'd have declared this as a "map< string, vector< Object > >" and be done with it; no need for special care initializing each vector, so much easier than in Objective-C.Phobos will certainly be an improvement over not having them. So go ahead!Well if you brought forth some strong argument I'm all ears. What I see for now is that struct containers are just difficult to implement and need to have carefully explained semantics, whereas a lot of people know how classes behave from day one.
Jan 31 2011
Andrei:1. Containers will be classes.This page: http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1 A quotation:3) Everything is a monitor. In Java and the JVM, every object is a monitor, meaning that you can synchronize on any object. This is incredibly wasteful at the JVM level. Senior JVM guys have indicated large percentage improvements in JVM space and performance if we removed the requirement that every object can be synchronized on. (Instead, you would have specific classes like Java 5 Lock)<I have read similar comments in various other places. What about creating a nomonitor annotation, for D2 classes to not create a monitor for specific classes annotated with it? This may reduce some class overhead. Bye, bearophile
Jan 28 2011
bearophile napisa=B3:This page: http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1 =20 A quotation: =20or, meaning that you can synchronize on any3) Everything is a monitor. In Java and the JVM, every object is a monit=ve indicated large percentage improvementsobject. This is incredibly wasteful at the JVM level. Senior JVM guys ha=ject can be synchronized on. (Instead, youin JVM space and performance if we removed the requirement that every ob=a monitor for specific classes annotatedwould have specific classes like Java 5 Lock)<=20 I have read similar comments in various other places. =20 What about creating a nomonitor annotation, for D2 classes to not create=with it? This may reduce some class overhead.Better just remove it, it's not used often. Besides, there are different lo= cks, one size doesn't fit all. --=20 Tomek
Jan 29 2011
(2011/01/29 3:31), Andrei Alexandrescu wrote:Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share. Andrei1, 2I think that it is a good policy. There is hardly the superiority that a containers are structs. :-) In the first place I had doubt towards treating structs as classes (OOP). When I use it by such a way, the lack of the default constructor is fatal.3, 4If speed is accompanied, the sealing up is useful, but if it is late, there are many unnecessary cases. Because it is thought that a lot of repetition access to the element and life cycles occurs in the case of a container, handling it by the element unit should be high-performance. In addition, for usability, a thing of simple behavior is preferable rather than a complicated thing. However, on the other hand, the design that wrong use is impossible should be considered enough.
Jan 29 2011
On 28.01.2011 19:31, Andrei Alexandrescu wrote:1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives.Not perfectly what I would like, but a reasonable choice, and most important to actually have a mature container-lib. But there are other choices remaining: what containers will be there and what will they be called? My suggestion is * Set, MulitSet, Map, MultiMap (hash-table based) * OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based) * Sequence (like stl-Deque. the name is just more intuitive. Funny enough, the stl-deque implemenation has nothing to do with a "doubly linked list") * Array (like stl-vector. I think "vector" is a kinda strange name, but that may be only my impression) * List (linked list) * Stack/Queue/PriorityQueue should be done on top of an other class, with a "impl"-template-param, like the stl-ones Things to note: * container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation. * unordered sets are used more often than ordered. So it should be "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two characters less typing *g*) * opEqual should work between different types of Sets (or Maps, or sequences). Nothing wrong with comparing an ordered to an unordered one, or a list to an array. just my 2 cents, Krox
Jan 29 2011
Simon Buerger:* container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation.This is good for a language like Python, or even Java, but for a system language I want to know the algorithms I'm using under the hood. So I am not sure. Bye, bearophile
Jan 29 2011
bearophile Wrote:Simon Buerger:There could be an interface named Set and various implementations of it, e.g. HashSet. I think this would be a good principle in general for all abstract collection types.* container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation.This is good for a language like Python, or even Java, but for a system language I want to know the algorithms I'm using under the hood. So I am not sure.
Jan 29 2011
On 01/29/2011 12:44 PM, Jim wrote:bearophile Wrote:Thus approach has a few advantages but a lot of serious issues. I decided to not define a hierarchy for std.container. AndreiSimon Buerger:There could be an interface named Set and various implementations of it, e.g. HashSet. I think this would be a good principle in general for all abstract collection types.* container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation.This is good for a language like Python, or even Java, but for a system language I want to know the algorithms I'm using under the hood. So I am not sure.
Jan 29 2011
On Jan 30, 11 01:09, Simon Buerger wrote:On 28.01.2011 19:31, Andrei Alexandrescu wrote:A 'deque' just mean a double-ended queue, not necessarily doubly linked list. I don't like the name Sequence since it doesn't specify the container can be modified from both ends. 'Deque' is just fine.1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives.Not perfectly what I would like, but a reasonable choice, and most important to actually have a mature container-lib. But there are other choices remaining: what containers will be there and what will they be called? My suggestion is * Set, MulitSet, Map, MultiMap (hash-table based) * OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based) * Sequence (like stl-Deque. the name is just more intuitive. Funny enough, the stl-deque implemenation has nothing to do with a "doubly linked list")* Array (like stl-vector. I think "vector" is a kinda strange name, but that may be only my impression) * List (linked list) * Stack/Queue/PriorityQueue should be done on top of an other class, with a "impl"-template-param, like the stl-ones Things to note: * container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation.Except that there is already a std.container.RedBlackTree named in this fashion :).* unordered sets are used more often than ordered. So it should be "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two characters less typing *g*) * opEqual should work between different types of Sets (or Maps, or sequences). Nothing wrong with comparing an ordered to an unordered one, or a list to an array.This breaks transitivity of '==' unnecessarily. I don't see the benefit of comparing two different kinds of containers. If you really need to compare them, use std.algorithm.equal.just my 2 cents, Krox
Jan 29 2011
On Saturday 29 January 2011 09:09:00 Simon Buerger wrote:* container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation.It was decided previously that containers in std.container would specifically be named based on the type of data structure that they are rather than what they're used for. Hence why the newest container to std.container is called RedBlackTree. If you want to call it TreeSet or something similar, then all you have to do is alias it. So, it's easy to get names based on usage if that's what you want, but that's not the naming scheme that std.container is using. - Jonathan M Davis
Jan 29 2011
On 01/29/2011 06:09 PM, Simon Buerger wrote:Things to note: * container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation. * unordered sets are used more often than ordered. So it should be "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two characters less typing *g*) * opEqual should work between different types of Sets (or Maps, or sequences). Nothing wrong with comparing an ordered to an unordered one, or a list to an array.+++ on all those points Denis -- _________________ vita es estrany spir.wikidot.com
Jan 30 2011
On Sat, 29 Jan 2011 12:09:00 -0500, Simon Buerger <krox gmx.net> wrote:On 28.01.2011 19:31, Andrei Alexandrescu wrote:http://www.dsource.org/projects/dcollections -Steve1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives.Not perfectly what I would like, but a reasonable choice, and most important to actually have a mature container-lib. But there are other choices remaining: what containers will be there and what will they be called? My suggestion is * Set, MulitSet, Map, MultiMap (hash-table based) * OrderedSet, OrderedMultiSet, OrderedMap, OrderedMultiMap (tree-based) * Sequence (like stl-Deque. the name is just more intuitive. Funny enough, the stl-deque implemenation has nothing to do with a "doubly linked list") * Array (like stl-vector. I think "vector" is a kinda strange name, but that may be only my impression) * List (linked list) * Stack/Queue/PriorityQueue should be done on top of an other class, with a "impl"-template-param, like the stl-ones Things to note: * container should be named with respect to their use, not the implementation. "HashSet" is a bad name, because the user shouldnt care about the implemenation. * unordered sets are used more often than ordered. So it should be "Set/OrderedSet", and not "UnorderedSet/Set" (also, the first one is two characters less typing *g*) * opEqual should work between different types of Sets (or Maps, or sequences). Nothing wrong with comparing an ordered to an unordered one, or a list to an array.
Jan 31 2011
On 31.01.2011 17:53, Steven Schveighoffer wrote:http://www.dsource.org/projects/dcollections -SteveWell, seems not bad on a quick look. But source is updated 2 years ago, so I doubt it would compile with current dmd. Anyway, the topic here is the std-behaviour of the std-lib. But sure, always nice to have custom alternatives. Krox
Jan 31 2011
On 1/31/11 6:48 PM, Simon Buerger wrote:Well, seems not bad on a quick look. But source is updated 2 years ago,…http://www.dsource.org/projects/dcollections/browser/branches/d2 – four months are not quite as long as two years. Oh, and Steve is the very author of dcollections. ;) David
Jan 31 2011
On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger <krox gmx.net> wrote:On 31.01.2011 17:53, Steven Schveighoffer wrote:latest meaningful change was 4 months ago: http://www.dsource.org/projects/dcollections/changeset/102 It should compile on the latest DMD (if not, file a ticket). BTW, it changes very little mostly because I haven't had many complaints about it (maybe not a good sign?) and I haven't had much opportunity to use it, as my day job does not allow using D unfortunately. It was proposed as a possibility for the std lib, but Andrei and I couldn't come to an agreement on what the collections should look like. Ironically, his latest decision moves std.container closer to dcollections in design. However, it should be relatively compatible with phobos' container lib (in fact RedBlackTree is a direct port of dcollections' version). -Stevehttp://www.dsource.org/projects/dcollections -SteveWell, seems not bad on a quick look. But source is updated 2 years ago, so I doubt it would compile with current dmd. Anyway, the topic here is the std-behaviour of the std-lib. But sure, always nice to have custom alternatives.
Jan 31 2011
Okay, my fault. Didnt realize you were the author, and the project is still active. The "2 years" came from here: http://www.dsource.org/projects/dcollections/browser/trunk/dcollections. I thought, that "trunk" was the most recent version. Added a bookmark, and will definitely take a closer look later. Thx for mentioning. On 31.01.2011 19:09, Steven Schveighoffer wrote:On Mon, 31 Jan 2011 12:48:06 -0500, Simon Buerger <krox gmx.net> wrote:On 31.01.2011 17:53, Steven Schveighoffer wrote:latest meaningful change was 4 months ago: http://www.dsource.org/projects/dcollections/changeset/102 It should compile on the latest DMD (if not, file a ticket). BTW, it changes very little mostly because I haven't had many complaints about it (maybe not a good sign?) and I haven't had much opportunity to use it, as my day job does not allow using D unfortunately. It was proposed as a possibility for the std lib, but Andrei and I couldn't come to an agreement on what the collections should look like. Ironically, his latest decision moves std.container closer to dcollections in design. However, it should be relatively compatible with phobos' container lib (in fact RedBlackTree is a direct port of dcollections' version). -Stevehttp://www.dsource.org/projects/dcollections -SteveWell, seems not bad on a quick look. But source is updated 2 years ago, so I doubt it would compile with current dmd. Anyway, the topic here is the std-behaviour of the std-lib. But sure, always nice to have custom alternatives.
Jan 31 2011
On Mon, 31 Jan 2011 13:36:57 -0500, Simon Buerger <krox gmx.net> wrote:Okay, my fault. Didnt realize you were the author, and the project is still active. The "2 years" came from here: http://www.dsource.org/projects/dcollections/browser/trunk/dcollections. I thought, that "trunk" was the most recent version. Added a bookmark, and will definitely take a closer look later. Thx for mentioning.Sorry about that, I have had at least 3 people be confused at why trunk wasn't the D2 version. I really need to swap those (make the 1.x version the branch). Unfortunately, it will screw up the D1 documentation, since it's auto-generated. But that should probably be moot since the D1 version isn't changing... -Steve
Jan 31 2011
In general I actually like this idea, but can we still have sealed ref-counted arrays and associative arrays somewhere in Phobos (not necessarily any more esoteric containers than these and not necessarily in std.container, maybe in a new module called std.refcounted or something)? For huge arrays/AAs I think ref counting is an important optimization, especially with the GC in its current state but to a lesser extent even when we get a better GC. GC tends to be slow compared to manual memory management in cases where most allocated blocks are large. Also, I like reference counting for huge arrays/AAs enough that I submitted a ref counted hash table implementation to the Phobos list for review, which noone seemed to notice. Please comment. On 1/28/2011 1:31 PM, Andrei Alexandrescu wrote:Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share. Andrei
Jan 29 2011
On 01/29/2011 12:54 PM, dsimcha wrote:Also, I like reference counting for huge arrays/AAs enough that I submitted a ref counted hash table implementation to the Phobos list for review, which noone seemed to notice. Please comment.Where is it? Andrei
Jan 29 2011
http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:On 01/29/2011 12:54 PM, dsimcha wrote:Also, I like reference counting for huge arrays/AAs enough that I submitted a ref counted hash table implementation to the Phobos list for review, which noone seemed to notice. Please comment.Where is it? Andrei
Jan 29 2011
On 01/29/2011 01:29 PM, dsimcha wrote:http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:Well to create a proposal you'd need to generate the documentation and put it somewhere and write a note to this group motivating the poposal and asking for a review manager. AndreiOn 01/29/2011 12:54 PM, dsimcha wrote:Also, I like reference counting for huge arrays/AAs enough that I submitted a ref counted hash table implementation to the Phobos list for review, which noone seemed to notice. Please comment.Where is it? Andrei
Jan 29 2011
I've uploaded the documentation to http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on the mailing list. The documentation is pretty sparse because interface-wise it's just a standard hash table. More generally, though, are we still interested in sealed/ref counted containers? On 1/29/2011 2:52 PM, Andrei Alexandrescu wrote:On 01/29/2011 01:29 PM, dsimcha wrote:http://www.dsource.org/projects/aa/browser/trunk/randAA/randaasealed.d On 1/29/2011 2:18 PM, Andrei Alexandrescu wrote:Well to create a proposal you'd need to generate the documentation and put it somewhere and write a note to this group motivating the poposal and asking for a review manager. AndreiOn 01/29/2011 12:54 PM, dsimcha wrote:Also, I like reference counting for huge arrays/AAs enough that I submitted a ref counted hash table implementation to the Phobos list for review, which noone seemed to notice. Please comment.Where is it? Andrei
Jan 29 2011
On 1/29/11 3:36 PM, dsimcha wrote:I've uploaded the documentation to http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on the mailing list. The documentation is pretty sparse because interface-wise it's just a standard hash table. More generally, though, are we still interested in sealed/ref counted containers?Sorry for being slow in continuing this thread. Regarding the general issue that someone makes an informal proposal (either here, as a DIP, or on the Phobos mailing list), followed by a thundering silence: I believe that a good technique is to formalize the proposal review process, which has been a homerun for Boost. The disadvantage of that is that almost without exception this is very taxing to library submitters. This means the submitter must put a lot of thought and a lot of work into motivating, polishing, and documenting an artifact without any guarantee that it would lead to inclusion in the target library. I've seen very, VERY elaborate Boost submissions fail - literally months of work gone to waste. I'm not sure how to save people from doing work up front in hope of an uncertain outcome in the future. I do know what does _not_ work: the take it or leave it approach: "Hey, I have this code for abstraction XYZ that I extracted from a project of mine and I think it may be of general interest. It's at http://site.com/path/to/code.{d,html}. It needs polishing here and there, it's largely undocumented, but I'm sure the ideas shine through. Eh?" The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in between. It is clear you have a good understanding of sealing and hash containers, but let me ask you this - if you wanted to sell this to someone, what would you do? Probably you'd show some relevant benchmarks putting the built-in hashes to shame. Maybe you'd have some good examples - yes, we know it's a hash, but it doesn't hurt to see some code pulp over there. Maybe you'd explain sealing and discuss its relative advantages and disadvantages (which have not yet been documented anywhere - a great opportunity). Maybe you'd even show some numbers showing how sealing does as well/better/worse than reference leaking. This is a good amount of upfront work for little promise. Again, I don't know yet how to optimize for minimizing that. What I did see works on Boost is a request for interest in the form of a discussion (usually with NO source, only USAGE examples) asking if there are people who are interested in such a notion. In this particular case, you'd need numbers to make a strong case which means that code must be already written. For something like e.g. "how about a Finite State Automaton library?" perhaps upfront code wouldn't be necessary for gauging interest. Andrei
Feb 01 2011
On 02/01/2011 05:00 PM, Andrei Alexandrescu wrote:Regarding the general issue that someone makes an informal proposal (either here, as a DIP, or on the Phobos mailing list), followed by a thundering silence: I believe that a good technique is to formalize the proposal review process, which has been a homerun for Boost. The disadvantage of that is that almost without exception this is very taxing to library submitters. This means the submitter must put a lot of thought and a lot of work into motivating, polishing, and documenting an artifact without any guarantee that it would lead to inclusion in the target library. I've seen very, VERY elaborate Boost submissions fail - literally months of work gone to waste.An alternative, or a complementary approach, may be to delegate part of your responsability. In this case, I'm thinking at a pool of people which "mission" would be to obviously show interest (or lack of) for proposals made on the mailing list --whatever their advancement, formality, code quality... This would provide a valuable indicator while removing some load from your shoulders, I guess. Such people may be chosen by cooptation. Note this approach is not exclusive of formal & heavy adoption process like Boost's; instead it can be a complementary or preliminary way of judging interest for proposals. A similar principle may indeed be used for other purpose: specification & evolution of D-the-language, implementation & bug removal of the reference compiler,... Denis -- _________________ vita es estrany spir.wikidot.com
Feb 01 2011
I see your point. Some of the stuff I've posted lately (RandAA, CsvRange) is in a usable but admittedly rough state, and was written partially for myself and partially for Phobos. If this came off as take-it-or-leave-it, this was unintended. These posts were supposed to be informal requests for comment. My main goal in these posts was to gauge whether there's enough interest, whether the design was good enough that it's worth polishing up for Phobos, and whether I missed any key features. If there was sufficient interest I had every intention of polishing/fleshing out these libraries. On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:On 1/29/11 3:36 PM, dsimcha wrote:I've uploaded the documentation to http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on the mailing list. The documentation is pretty sparse because interface-wise it's just a standard hash table. More generally, though, are we still interested in sealed/ref counted containers?Sorry for being slow in continuing this thread. Regarding the general issue that someone makes an informal proposal (either here, as a DIP, or on the Phobos mailing list), followed by a thundering silence: I believe that a good technique is to formalize the proposal review process, which has been a homerun for Boost. The disadvantage of that is that almost without exception this is very taxing to library submitters. This means the submitter must put a lot of thought and a lot of work into motivating, polishing, and documenting an artifact without any guarantee that it would lead to inclusion in the target library. I've seen very, VERY elaborate Boost submissions fail - literally months of work gone to waste. I'm not sure how to save people from doing work up front in hope of an uncertain outcome in the future. I do know what does _not_ work: the take it or leave it approach: "Hey, I have this code for abstraction XYZ that I extracted from a project of mine and I think it may be of general interest. It's at http://site.com/path/to/code.{d,html}. It needs polishing here and there, it's largely undocumented, but I'm sure the ideas shine through. Eh?" The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in between. It is clear you have a good understanding of sealing and hash containers, but let me ask you this - if you wanted to sell this to someone, what would you do? Probably you'd show some relevant benchmarks putting the built-in hashes to shame. Maybe you'd have some good examples - yes, we know it's a hash, but it doesn't hurt to see some code pulp over there. Maybe you'd explain sealing and discuss its relative advantages and disadvantages (which have not yet been documented anywhere - a great opportunity). Maybe you'd even show some numbers showing how sealing does as well/better/worse than reference leaking. This is a good amount of upfront work for little promise. Again, I don't know yet how to optimize for minimizing that. What I did see works on Boost is a request for interest in the form of a discussion (usually with NO source, only USAGE examples) asking if there are people who are interested in such a notion. In this particular case, you'd need numbers to make a strong case which means that code must be already written. For something like e.g. "how about a Finite State Automaton library?" perhaps upfront code wouldn't be necessary for gauging interest. Andrei
Feb 02 2011
BTW, I thought the case for a sealed hash table with an optimized memory allocation scheme was self-evident to most heavy D users, given how the builtin hash table leaks memory/fragments the heap (I think it's a little of both) and destroys multithreaded performance. On 2/1/2011 11:00 AM, Andrei Alexandrescu wrote:On 1/29/11 3:36 PM, dsimcha wrote:I've uploaded the documentation to http://cis.jhu.edu/~dsimcha/randaasealed.html and mentioned it again on the mailing list. The documentation is pretty sparse because interface-wise it's just a standard hash table. More generally, though, are we still interested in sealed/ref counted containers?Sorry for being slow in continuing this thread. Regarding the general issue that someone makes an informal proposal (either here, as a DIP, or on the Phobos mailing list), followed by a thundering silence: I believe that a good technique is to formalize the proposal review process, which has been a homerun for Boost. The disadvantage of that is that almost without exception this is very taxing to library submitters. This means the submitter must put a lot of thought and a lot of work into motivating, polishing, and documenting an artifact without any guarantee that it would lead to inclusion in the target library. I've seen very, VERY elaborate Boost submissions fail - literally months of work gone to waste. I'm not sure how to save people from doing work up front in hope of an uncertain outcome in the future. I do know what does _not_ work: the take it or leave it approach: "Hey, I have this code for abstraction XYZ that I extracted from a project of mine and I think it may be of general interest. It's at http://site.com/path/to/code.{d,html}. It needs polishing here and there, it's largely undocumented, but I'm sure the ideas shine through. Eh?" The doc at http://cis.jhu.edu/~dsimcha/randaasealed.html is somewhere in between. It is clear you have a good understanding of sealing and hash containers, but let me ask you this - if you wanted to sell this to someone, what would you do? Probably you'd show some relevant benchmarks putting the built-in hashes to shame. Maybe you'd have some good examples - yes, we know it's a hash, but it doesn't hurt to see some code pulp over there. Maybe you'd explain sealing and discuss its relative advantages and disadvantages (which have not yet been documented anywhere - a great opportunity). Maybe you'd even show some numbers showing how sealing does as well/better/worse than reference leaking. This is a good amount of upfront work for little promise. Again, I don't know yet how to optimize for minimizing that. What I did see works on Boost is a request for interest in the form of a discussion (usually with NO source, only USAGE examples) asking if there are people who are interested in such a notion. In this particular case, you'd need numbers to make a strong case which means that code must be already written. For something like e.g. "how about a Finite State Automaton library?" perhaps upfront code wouldn't be necessary for gauging interest. Andrei
Feb 02 2011
On 2/2/11 8:56 AM, dsimcha wrote:BTW, I thought the case for a sealed hash table with an optimized memory allocation scheme was self-evident to most heavy D users, given how the builtin hash table leaks memory/fragments the heap (I think it's a little of both) and destroys multithreaded performance.Perhaps a very good argument could be made with some benchmarks that show how your table is better than others in casual use and in heavy use. Andrei
Feb 02 2011
On Friday 28 January 2011 10:31:58 Andrei Alexandrescu wrote:Today after work I plan to start making one pass through std.container. After having thought of things for a long time, my conclusions are as follows: 1. Containers will be classes. 2. Most of the methods in existing containers will be final. It's up to the container to make a method final or not. 3. Containers and their ranges decide whether they give away references to their objects. Sealing is a great idea but it makes everybody's life too complicated. I'll defer sealing to future improvements in the language and/or the reflection subsystem. 4. Containers will assume that objects are cheap to copy so they won't worry about moving primitives. Any showstoppers, please share.Overall, I think definitely support this approach. I definitely think that containers should be reference types, and I'm not sure that I care much whether they end up being structs or classes. My one concern would be the ability to inline the container's function calls, and marking them as final fixes that problem. I'm not sure that I know enough about the sealing issue to comment on it, but if what we're doing doesn't preclude having it later, then that sounds like it's a good compromise. And I definitely think that make it so that containers don't have to worry about moving primitives is the way to go. So, I'm essentially 100% behind this, and I don't see any real issues with it. Containers is one of the places that Phobos is generally behind on and which really needs to be taken care if we really want people to be using D and Phobos. So, it's great to see that progress is being made with regards to containers. - Jonathan M Davis
Jan 29 2011