
digitalmars.D - Paralysis of analysis

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I kept on literally losing sleep about a number of issues involving 
containers, sealing, arbitrary-cost copying vs. reference counting and 
copy-on-write, and related issues. This stops me from making rapid 
progress on defining D containers and other artifacts in the standard 
library.

Clearly we need to break this paralysis, and just as clearly whatever 
decision taken now will influence the prevalent D style going forward. 
So a decision needs to be made soon, just not hastily. Easier said than 
done!

I continue to believe that containers should have reference semantics, 
just like classes. Copying a container wholesale is not something you 
want to be automatic.

I also continue to believe that controlled lifetime (i.e. 
reference-counted implementation) is important for a container. 
Containers tend to be large compared to other objects, so exercising 
strict control over their allocated storage makes a lot of sense. What 
has recently shifted in my beliefs is that we should attempt to 
implement controlled lifetime _outside_ the container definition, by 
using introspection. (Currently some containers use reference counting 
internally, which makes their implementation more complicated than it 
could be.)

Finally, I continue to believe that sealing is worthwhile. In brief, a 
sealing container never gives out addresses of its elements so it has 
great freedom in controlling the data layout (e.g. pack 8 bools in one 
ubyte) and in controlling the lifetime of its own storage. Currently I'm 
not sure whether that decision should be taken by the container, by the 
user of the container, or by an introspection-based wrapper around an 
unsealed container.
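
To make the sealing idea concrete, here is a minimal sketch of a sealed container that packs 8 bools into each ubyte. The name PackedBools and its members are hypothetical and appear in no library; the point is only that elements go in and out by value, so the container is free to choose its own layout.

```d
// Hypothetical sealed container: packs 8 bools into each ubyte.
// Elements are read and written by value; no address is ever handed out.
struct PackedBools
{
    private ubyte[] bits;   // storage owned exclusively by the container
    private size_t len;

    this(size_t n)
    {
        len = n;
        bits = new ubyte[(n + 7) / 8];
    }

    @property size_t length() const { return len; }
    @property size_t storageBytes() const { return bits.length; }

    // Returns a copy of the element, never a ref or a pointer.
    bool opIndex(size_t i) const
    {
        assert(i < len);
        return ((bits[i / 8] >> (i % 8)) & 1) != 0;
    }

    // Writes go through the container as well.
    void opIndexAssign(bool value, size_t i)
    {
        assert(i < len);
        immutable ubyte mask = cast(ubyte)(1 << (i % 8));
        if (value) bits[i / 8] |= mask;
        else       bits[i / 8] &= ~cast(int) mask;
    }
}

void main()
{
    auto flags = PackedBools(10);
    flags[3] = true;
    flags[9] = true;
    assert(flags[3] && flags[9] && !flags[4]);
    assert(flags.storageBytes == 2);   // 10 bools fit in 2 bytes
    // auto p = &flags[3];             // rejected: opIndex returns an rvalue
}
```

Because opIndex returns a plain bool, client code cannot hold on to an element's address, which is what would let such a container repack or release its storage freely.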

* * *

That all being said, I'd like to make a motion that should simplify 
everyone's life - if only for a bit. I'm thinking of making all 
containers classes (either final classes or at a minimum classes with 
only final methods). Currently containers are implemented as structs 
that are engineered to have reference semantics. Some collections use 
reference counting to keep track of the memory used.

Advantages of the change:

- Clear, self-documented reference semantics

- Uses the right tool (classes) for the job (define a type with 
reference semantics)

- Pushes deterministic lifetime issues outside the containers 
(simplifying them) and factors such issues into reusable wrappers a la 
RefCounted.

Disadvantages:

- Containers must be dynamically allocated to do anything - even calling 
empty requires allocation.

- There's a two-words overhead associated with any class object.

- Containers cannot do certain optimizations that depend on container's 
control over its own storage.


What say you?

Andrei
Dec 14 2010
next sibling parent reply so <so so.do> writes:
Could you please elaborate the disadvantages part?

Thanks!

Dec 14 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/14/10 1:19 PM, so wrote:
 Could you please elaborate the disadvantages part?

 Thanks!
Consider the empty() property. A struct using a pointer internally can return true from empty if the pointer is null. A class cannot do that.

    struct Array {
        Impl * p;
        @property bool empty() { return !p || p.empty; }
    }

vs.

    class Array {
        @property final bool empty() { ... }
    }

Whatever empty() does, it must be called against an already-allocated reference to an Array.

The two words overhead comes from the vtable and the mutex.

A struct that has control over its own storage can be more aggressive about releasing unused memory. A class does not have that kind of control because it doesn't know how many references are out there to the same object.

Andrei
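
A runnable sketch of the difference described above. The names ArrayS, ArrayC and the contents of Impl are hypothetical stand-ins chosen for illustration, not the actual std.container types.

```d
// Hypothetical payload shared by both flavors.
struct Impl
{
    int[] data;
    @property bool empty() const { return data.length == 0; }
}

// Struct handle: a null pointer simply means "empty", no allocation needed.
struct ArrayS
{
    Impl* p;
    @property bool empty() const { return !p || p.empty; }
}

// Class flavor: empty() can only run on an already-allocated object.
final class ArrayC
{
    private Impl impl;
    @property bool empty() const { return impl.empty; }
}

void main()
{
    ArrayS s;              // default-initialized, nothing allocated
    assert(s.empty);

    ArrayC c;              // null reference
    assert(c is null);     // testing the reference itself needs no allocation
    // Calling c.empty here would dereference a null reference.
    c = new ArrayC;        // one allocation, after which any call is fine
    assert(c.empty);
}
```

The allocation is needed once per container, not once per call to empty.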
Dec 14 2010
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 The two words overhead comes from the vtable and the mutex.
I don't think that overhead is a problem. For small numbers of values, one should use an array. The more complex containers are for larger numbers of values, where 2 words is insignificant.
Dec 14 2010
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 14 Dec 2010 15:13:36 -0500, Walter Bright  
<newshound2 digitalmars.com> wrote:

 Andrei Alexandrescu wrote:
 The two words overhead comes from the vtable and the mutex.
I don't think that overhead is a problem. For small numbers of values, one should use an array. The more complex containers are for larger numbers of values, where 2 words is insignificant.
The place I see it being an issue is for things like a map where the value type is a linked list. It's quite conceivable that a map type like this could have thousands of one-element linked lists, with a few scattered 2 or more element linked lists. I don't think it's a common thing, but I think there can be solutions that work around this issue. I agree the 2 words are not enough to dissuade using classes for container types. -Steve
Dec 14 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 14/12/2010 19:30, Andrei Alexandrescu wrote:
 On 12/14/10 1:19 PM, so wrote:
 Could you please elaborate the disadvantages part?

 Thanks!
Whatever empty() does, it must be called against an already-allocated reference to an Array.
Phew... That seems quite acceptable. For a moment, when I read your original words: "- Containers must be dynamically allocated to do anything - even calling empty requires allocation. " it almost reads as you saying that any and each call to empty requires an allocation... :S -- Bruno Medeiros - Software Engineer
Jan 27 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 14 Dec 2010 14:02:34 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I continue to believe that containers should have reference semantics,  
 just like classes. Copying a container wholesale is not something you  
 want to be automatic.
I agree.
 I also continue to believe that controlled lifetime (i.e.  
 reference-counted implementation) is important for a container.  
 Containers tend to be large compared to other objects, so exercising  
 strict control over their allocated storage makes a lot of sense. What  
 has recently shifted in my beliefs is that we should attempt to  
 implement controlled lifetime _outside_ the container definition, by  
 using introspection. (Currently some containers use reference counting  
 internally, which makes their implementation more complicated than it  
 could be.)
I think ref counting needs to be fleshed out more before we use it. I'm not of the mind that phobos should use concepts that are not properly implementable based on the current compiler/runtime design in hopes that the design gets better. I'd rather design it to work now, and redesign later if the opportunity becomes available.
 Finally, I continue to believe that sealing is worthwhile. In brief, a  
 sealing container never gives out addresses of its elements so it has  
 great freedom in controlling the data layout (e.g. pack 8 bools in one  
 ubyte) and in controlling the lifetime of its own storage. Currently I'm  
 not sure whether that decision should be taken by the container, by the  
 user of the container, or by an introspection-based wrapper around an  
 unsealed container.
I agree that a sealed container is worthwhile. I think it needs to be the container's decision (for instance, the pack bools into bits must be a container decision).
 That all being said, I'd like to make a motion that should simplify  
 everyone's life - if only for a bit. I'm thinking of making all  
 containers classes (either final classes or at a minimum classes with  
 only final methods). Currently containers are implemented as structs  
 that are engineered to have reference semantics. Some collections use  
 reference counting to keep track of the memory used.
I think this is the right move. Responding to pros/cons below:
 Advantages of the change:

 - Clear, self-documented reference semantics

 - Uses the right tool (classes) for the job (define a type with  
 reference semantics)

 - Pushes deterministic lifetime issues outside the containers  
 (simplifying them) and factors such issues into reusable wrappers a la  
 RefCounted.
- exposes the issue of default initialization by disallowing that. This is the problem of passing an uninitialized struct into a function and having the function not be able to affect the original. A class has a more defined and better understood lifetime cycle -- nothing exists until new is used.

- no more need to "check if it's valid" in every member function.
 Disadvantages:

 - Containers must be dynamically allocated to do anything - even calling  
 empty requires allocation.
Can't emplace work to fix this? At least for cases where you don't need the container to live beyond the scope of a function.
 - There's a two-words overhead associated with any class object.
I assume this is in response to containers of containers? It's actually 96 bits, because the minimal memory block size is 16 bytes. Therefore, a container which could potentially have a 1-word footprint must have 4 words. For 64-bit, I'm unsure of the proposed GC implementation. I have some ideas to solve this, but they are abstract in my head, I haven't solidified them enough to start a discussion yet. Short story -- I think if we clearly separate the implementation from the container, we might be able to combine implementations in a minimal way.
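
One way to read the 96-bit figure, sketched under assumptions: on a 32-bit target a class holding a single word needs 2 header words plus 1 payload word (12 bytes), which a 16-byte minimum block rounds up to 4 words, so the cost beyond the 1-word payload is 3 words, or 96 bits. The class names and the blockSize helper are hypothetical, and the 16-byte minimum is taken from the post rather than from any particular GC version.

```d
class Empty {}                          // no fields: header only
final class OneWord { void* payload; }  // header plus one word

// Rounds a request up to a multiple of 16 bytes, the minimum block
// size quoted in the post (an assumption about the allocator).
size_t blockSize(size_t bytes) { return (bytes + 15) / 16 * 16; }

void main()
{
    enum word = size_t.sizeof;
    // The class header is two words (vtable pointer and monitor).
    static assert(__traits(classInstanceSize, Empty) == 2 * word);
    static assert(__traits(classInstanceSize, OneWord) == 3 * word);

    // 32-bit arithmetic from the post: 4-byte words.
    enum size_t word32 = 4;
    immutable instance = 3 * word32;          // 12 bytes
    immutable block = blockSize(instance);    // 16 bytes, i.e. 4 words
    assert(block == 16);
    assert((block - word32) * 8 == 96);       // bits beyond the 1-word payload
}
```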
 - Containers cannot do certain optimizations that depend on container's  
 control over its own storage.
Can you explain this further? -Steve
Dec 14 2010
parent "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.vnpx3oioeav7ka steve-laptop...
 On Tue, 14 Dec 2010 14:02:34 -0500, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:

 Advantages of the change:

 - Clear, self-documented reference semantics

 - Uses the right tool (classes) for the job (define a type with 
 reference semantics)

 - Pushes deterministic lifetime issues outside the containers 
 (simplifying them) and factors such issues into reusable wrappers a la 
 RefCounted.
- exposes the issue of default initialization by disallowing that. This is the problem of passing an uninitialized struct into a function and having the function not be able to affect the original. A class has a more defined and better understood lifetime cycle -- nothing exists until new is used.

- no more need to "check if it's valid" in every member function.
 Disadvantages:

 - Containers must be dynamically allocated to do anything - even calling 
 empty requires allocation.
(First of all, Disclaimer: I might not even know what the hell I'm talking about...) I'd be surprised if typical usage would cause this to be an issue. It's not like people are going to be allocating a new container *every* time something like "empty" is called. And there's always checking if the reference itself is null - that doesn't require allocation.
 - There's a two-words overhead associated with any class object.
Since, like you said, containers usually carry a fair amount of data, I'd be surprised if this would really be an issue. If there is ever a need for lots of containers with very small data one could still just do it all manually with structs. And maybe some wrappers could be used to get it to play nice with existing stuff that expects a standard class-based container? Although, if this overhead would be associated with, for instance, every node in a tree or graph, then it might be more of an issue.
 - Containers cannot do certain optimizations that depend on container's 
 control over its own storage.
This seems like it could be more of an issue than the other two drawbacks. I wonder how often those optimizations would be needed? If only on occasion, then forcing a manual solution in those cases might be worth it for the rather compelling advantages.
Dec 14 2010
prev sibling next sibling parent reply Simon Buerger <krox gmx.net> writes:
On 14.12.2010 20:02, Andrei Alexandrescu wrote:
 I kept on literally losing sleep about a number of issues involving
 containers, sealing, arbitrary-cost copying vs. reference counting and
 copy-on-write, and related issues. This stops me from making rapid
 progress on defining D containers and other artifacts in the standard
 library.

 Clearly we need to break this paralysis, and just as clearly whatever
 decision taken now will influence the prevalent D style going forward.
 So a decision needs to be made soon, just not hastily. Easier said
 than done!

 I continue to believe that containers should have reference semantics,
 just like classes. Copying a container wholesale is not something you
 want to be automatic.

 I also continue to believe that controlled lifetime (i.e.
 reference-counted implementation) is important for a container.
 Containers tend to be large compared to other objects, so exercising
 strict control over their allocated storage makes a lot of sense. What
 has recently shifted in my beliefs is that we should attempt to
 implement controlled lifetime _outside_ the container definition, by
 using introspection. (Currently some containers use reference counting
 internally, which makes their implementation more complicated than it
 could be.)

 Finally, I continue to believe that sealing is worthwhile. In brief, a
 sealing container never gives out addresses of its elements so it has
 great freedom in controlling the data layout (e.g. pack 8 bools in one
 ubyte) and in controlling the lifetime of its own storage. Currently
 I'm not sure whether that decision should be taken by the container,
 by the user of the container, or by an introspection-based wrapper
 around an unsealed container.

 * * *

 That all being said, I'd like to make a motion that should simplify
 everyone's life - if only for a bit. I'm thinking of making all
 containers classes (either final classes or at a minimum classes with
 only final methods). Currently containers are implemented as structs
 that are engineered to have reference semantics. Some collections use
 reference counting to keep track of the memory used.

 Advantages of the change:

 - Clear, self-documented reference semantics

 - Uses the right tool (classes) for the job (define a type with
 reference semantics)

 - Pushes deterministic lifetime issues outside the containers
 (simplifying them) and factors such issues into reusable wrappers a la
 RefCounted.

 Disadvantages:

 - Containers must be dynamically allocated to do anything - even
 calling empty requires allocation.

 - There's a two-words overhead associated with any class object.

 - Containers cannot do certain optimizations that depend on
 container's control over its own storage.


 What say you?

 Andrei
I continue to believe that containers should be value types. In order to prevent useless copying you can use something like "Impl * impl" and reference counting. Then you only do a copy on actual change. This is the approach I'm currently implementing in my own container classes.

But I see the point in making them reference types, because copying is so rare in the real world. Though I find the expression "new Set()" most strange, you are definitely right in the following: if you make them reference types, they should be classes, not structs (and final, to prevent strange overloading).

Krox
Dec 14 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/14/10 1:38 PM, Simon Buerger wrote:
 I continue to believe that containers should be value types. In order to
 prevent useless copying you can use something like "Impl * impl" and
 reference counting. Then you only do a copy on actual change. This is
 the approach I'm currently implementing in my own container classes.

 But I see the point in making them reference types, because copying is
 so rare in the real world. Though I find the expression "new Set()" most
 strange, you are definitely right in the following: if you make them
 reference types, they should be classes, not structs (and final, to
 prevent strange overloading).
Coming from an STL background I was also very comfortable with the notion of value. Walter pointed to me that in the STL what you worry about most of the time is to _undo_ the propensity of objects getting copied at the drop of a hat. For example, think of the common n00b error of passing containers by value.

So since we have the opportunity to decide now for eternity the right thing, I think reference semantics works great with containers.

Andrei
Dec 14 2010
next sibling parent reply Simon Buerger <krox gmx.net> writes:
On 14.12.2010 20:53, Andrei Alexandrescu wrote:
 Coming from an STL background I was also very comfortable with the
 notion of value. Walter pointed to me that in the STL what you worry
 about most of the time is to _undo_ the propensity of objects getting
 copied at the drop of a hat. For example, think of the common n00b
 error of passing containers by value.
True thing, C++/STL does much work to prevent the copy mechanism, but it can be circumvented by using the indirection+refCount trick. Then it doesn't matter how you pass it, it gets copied lazily when the first actual change occurs. That places some overhead:

1) increasing/decreasing the refcount on every argument passing
2) checking for refCount>1 on every modifying method call (not on the reading methods)

I'm pretty sure (1) is insignificant. (2) I'm not sure about. For a very simple list container it might be a problem, but for sophisticated structures like hashtables or trees this one check is probably insignificant.
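
The indirection+refCount trick can be sketched roughly as follows. CowArray and its members are hypothetical names, and the sketch leaves the storage itself to the garbage collector instead of freeing it when the count drops to zero, so it shows only where the two costs arise.

```d
struct CowArray
{
    private static struct Impl
    {
        int[] data;
        size_t refs = 1;
    }
    private Impl* impl;

    // Cost (1): bump the count whenever the handle is copied...
    this(this)
    {
        if (impl) ++impl.refs;
    }

    // ...and drop it when a copy goes away (storage is left to the GC).
    ~this()
    {
        if (impl) --impl.refs;
    }

    // Cost (2): one check at the start of every mutating method.
    private void detach()
    {
        if (impl is null) { impl = new Impl; return; }
        if (impl.refs > 1)
        {
            --impl.refs;
            auto fresh = new Impl;          // refs starts at 1
            fresh.data = impl.data.dup;     // the deferred copy happens here
            impl = fresh;
        }
    }

    void append(int x) { detach(); impl.data ~= x; }
    void opIndexAssign(int x, size_t i) { detach(); impl.data[i] = x; }

    // Readers skip the check entirely.
    int opIndex(size_t i) const { return impl.data[i]; }
    @property size_t length() const { return impl ? impl.data.length : 0; }
}

void main()
{
    CowArray a;
    a.append(1);
    a.append(2);

    auto b = a;          // cheap: shares storage, count becomes 2
    assert(b[0] == 1);   // reading does not copy

    b[0] = 42;           // first write detaches b from a
    assert(a[0] == 1);
    assert(b[0] == 42);
    assert(a.length == 2 && b.length == 2);
}
```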
 So since we have the opportunity to decide now for eternity the right
 thing, I think reference semantics works great with containers.
Indeed. Whichever way to go, you need a good reason. I hope a similar discussion will take place for the actual interface of the container lib. (Which template params should there be? T, Allocator, Comp are the three most classic ones, but more or fewer are possible, and what kinds of containers should be there at all? Anyway, that doesn't belong here now.)

Krox
Dec 14 2010
parent spir <denis.spir gmail.com> writes:
On Tue, 14 Dec 2010 22:11:59 +0100
Simon Buerger <krox gmx.net> wrote:

 So since we have the opportunity to decide now for eternity the right
 thing, I think reference semantics works great with containers.
 Indeed. Whichever way to go, you need a good reason.
There is not any good technical answer.

The only answer is semantic, on a per-application basis: it depends on what the collection actually represents. Every container, just like a composite element (struct vs class (*)), can be required in both value & ref versions. That's why we cannot decide; there will always be people on both sides based on personal preferences and previous experiences.

Value vs ref has nothing to do with the data type. We could go on arguing on this question until the end of times ;-)

Denis

(*) That's why I regret that D structs (unlike e.g. Oberon's record) do not have the full expressiveness of classes (they miss extension/inheritance and method dispatch according to runtime type).
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 15 2010
prev sibling parent reply spir <denis.spir gmail.com> writes:
On Tue, 14 Dec 2010 13:53:39 -0600
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Coming from an STL background I was also very comfortable with the
 notion of value. Walter pointed to me that in the STL what you worry
 about most of the time is to _undo_ the propensity of objects getting
 copied at the drop of a hat. For example, think of the common n00b error
 of passing containers by value.

 So since we have the opportunity to decide now for eternity the right
 thing, I think reference semantics works great with containers.
The issue for me in your reasoning is that what you are here talking about, and what your choice is based on, is _not_ reference _semantics_, but something like "indirection efficiency". This is optimization that clearly belongs to implementation and has nothing to do with semantics. Now, I totally agree it is very important (esp. avoiding useless copies).

Reference semantics has something to do with semantics, namely that an element in the program represents a "thing", some kind of entity in the model that has a proper "identity" (selfsameness), left unchanged in time however its form changes, and that can be multiply referenced.

Confusion arises (esp. in languages of the C line) because pointers used for implementation (of variable-size elements like dyn arrays) & performance (avoid copy) are sometimes called "references"; and references themselves are most commonly implemented as pointers.

The choice whether a program element should be made plain value/data or thing/entity/ref depends, from the semantic point of view, only on what it represents in the model. In languages of the C line that expose many implementation issues to the programmer, other considerations may then enter the dance and contradict semantics in some cases. In other words, the value/ref criterion is orthogonal to the common notion of type.

We may be forced to paradoxically "ref" elements that represent plain information, like color values, just to avoid useless copies, for instance, because the compiler won't ref it under the hood when possible.

I am convinced this efficiency can be automagic in the compiler, and the programmer would not have to care about that. Actually, the only problematic case is the one of (input-only) _value_ parameters. The aim is for the compiler to pass them by ref for efficiency, when (1) they are heavy & (2) they are left unchanged. In an ideal world, parameters would be read-only e basta! But since this seems to be impossible in a C-like language, the compiler would have to check whether a value parameter is changed (1 per thousand of all cases?), and copy it only in this case. I do not know how complicated this is, anyway it is certainly doable.

On the other hand, if arguments leave your position unchanged that containers must behave like refs, then I fully agree they should be implemented as classes.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 15 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/15/10 8:34 AM, spir wrote:
 On Tue, 14 Dec 2010 13:53:39 -0600 Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 Coming from an STL background I was also very comfortable with the
 notion of value. Walter pointed to me that in the STL what you
 worry about most of the time is to _undo_ the propensity of objects
 getting copied at the drop of a hat. For example, think of the
 common n00b error of passing containers by value.

 So since we have the opportunity to decide now for eternity the
 right thing, I think reference semantics works great with
 containers.
The issue for me in your reasoning is that what you are here talking about, and what your choice is based on, is _not_ reference _semantics_, but something like "indirection efficiency". This is optimization that clearly belongs to implementation and has nothing to do with semantics.
Optimization (or pessimization) is a concern, but not my primary one. My concern is: most of the time, do you want to work on a container or on a copy of the container? Consider this path-of-least-resistance code:

    void fun(Container!int c) {
        ...
        c[5] += 42;
        ...
    }

Question is, what's the most encountered activity? Should fun operate on whatever container it was passed, or on a copy of it? Based on extensive experience with the STL, I can say that in the overwhelming majority of cases you want the function to mess with the container, or look without touch (by means of const). It is so overwhelming, any code reviewer in an STL-based environment will raise a flag when seeing the C++ equivalent to the code above - ironically, even if fun actually does need a copy of its input! (The common idiom is to pass the container by constant reference and then create a copy of it inside fun, which is suboptimal.)

In contrast, most of the time you want to work on a copy of a string, so strings are commonly not containers. (This is nicely effected by string being defined as arrays of immutable characters.) However, you sometimes do need to mutate a string, which is why char[] is useful on occasion.

Andrei
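
The question of what fun operates on can be made concrete with two toy containers. RefBox and ValBox are hypothetical stand-ins for Container!int, one with reference semantics and one with value semantics; neither is a real library type.

```d
// Hypothetical reference-type container (a final class, as proposed).
final class RefBox
{
    int[] items;
    this(size_t n) { items = new int[n]; }
}

// Hypothetical value-type container: copying it duplicates the storage.
struct ValBox
{
    int[] items;
    this(size_t n) { items = new int[n]; }
    this(this) { items = items.dup; }
}

void funRef(RefBox c) { c.items[5] += 42; }   // mutates the caller's container
void funVal(ValBox c) { c.items[5] += 42; }   // mutates a private copy

void main()
{
    auto r = new RefBox(10);
    funRef(r);
    assert(r.items[5] == 42);   // the change is visible to the caller

    auto v = ValBox(10);
    funVal(v);
    assert(v.items[5] == 0);    // the change was made to a copy and lost
}
```

With the value type, the same path-of-least-resistance signature pays for a full copy and then discards the result.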
Dec 15 2010
parent reply spir <denis.spir gmail.com> writes:
On Wed, 15 Dec 2010 09:56:36 -0600
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Optimization (or pessimization) is a concern, but not my primary one. My
 concern is: most of the time, do you want to work on a container or on a
 copy of the container? Consider this path-of-least-resistance code:

 void fun(Container!int c) {
      ...
      c[5] += 42;
      ...
 }

 Question is, what's the most encountered activity? Should fun operate on
 whatever container it was passed, or on a copy of it? Based on extensive
 experience with the STL, I can say that in the overwhelming majority of
 cases you want the function to mess with the container, or look without
 touch (by means of const). It is so overwhelming, any code reviewer in
 an STL-based environment will raise a flag when seeing the C++
 equivalent to the code above - ironically, even if fun actually does
 need a copy of its input! (The common idiom is to pass the container by
 constant reference and then create a copy of it inside fun, which is
 suboptimal.)
I do agree. When a container is passed as parameter
* either it is a value in meaning and should be left unchanged (--> so that the compiler can pass it as "constant reference")
* or it means an entity with identity, it makes sense to change it, and it should be implemented as a ref.

What I'm trying to fight is being forced to implement semantic values as concrete ref elements. This is very bad, a kind of conceptual distortion (the author of XL calls this semantic mismatch) that leads to much confusion.

Example of semantic distinction:

Take a palette of predefined colors (red, green, ...) used to draw visual widgets. In the simple case, colors are plain information (=values), and the palette (a collection) as well. In this case, every widget holds its own subset of colors used for each part of itself. Meaning copies. Changing a given color assigned to a widget should not and does not affect others.

Now, imagine this palette can be edited "live" by the user, meaning redefining the components of red, green, ... This time, the semantics may well be that such changes should affect all widgets, including already defined ones. For this, the palette must be implemented as an "entity", and each color as well. But the reason for this is that the palette does not mean the same thing at all: instead of information about an aspect (color) of every widget, we have now a kind of container of color _sources_. Instead of color values, the widget fields point to kinds of paint pots; these fields should not be called "color".

[It is not always that simple to find real-world metaphors helping us to correctly understand what we have to model and write into programs. A program's world is not at all reality, not even similar to it, even in the (minority of) cases where it models reality. In this case, "color" is misleading.]

In the first case, palette must be a value, in the second case it must be a ref. There is no way to escape the dilemma about having value or ref collections. Conceptually, we absolutely need both. Again the ref/value semantic duality is independent from data types. If the language provides one kind only, we have to hack, to cheat with it.

There is a special case in non-OO-only circumstances: sometimes an element is passed as parameter while it is conceptually the "object" (in common sense) on which an operation applies (~ OO receiver). In OO, it would be passed by ref precisely to allow it being changed, even if it is a plain value (this prevents creating a new value at every tiny change, as opposed to immutability). But this relevant distinction between object of an operation (what) and true parameters (how) does not exist in plain function-based style:

    func(object, param1, param12);

So that we have to pass the object by ref when the operation is precisely here to modify it. But conceptually it is not a parameter.
 In contrast, most of the time you want to work on a copy of a string, so
 strings are commonly not containers. (This is nicely effected by string
 being defined as arrays of immutable characters.) However, you sometimes
 do need to mutate a string, which is why char[] is useful on occasion.
I agree with this as well. D does the right thing for strings.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 15 2010
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/15/10 11:05 AM, spir wrote:
 What I'm trying to fight is being forced to implement semantic
 values as concrete ref elements. This is very bad, a kind of
 conceptual distortion (the author of XL calls this semantic mismatch)
 that leads to much confusion.
[snip example]
 Conceptually, we absolutely need both.
 Again the ref/value semantic duality is independent from data types.
 If the language provides one kind only, we have to hack, to cheat
 with it.
Both are good. The question is which should be the "default" one and what should be the "other" one.

Your example is from a class of examples that basically say: a mutable reference object in a struct with value semantics is trouble. That is:

    struct Widget // value type
    {
        ...
        Array!Color colorMap; // oops, undue aliasing
    }

That is correct. There are two solutions I envision:

1. Define a Value wrapper in std.container or std.typecons:

    struct Widget // value type
    {
        ...
        Value!(Array!Color) colorMap; // clones upon copying
    }

2. Define this(this)

    struct Widget // value type
    {
        ...
        Array!Color colorMap; // manually cloned upon copying
        this(this) {
            colorMap = colorMap.clone;
        }
    }

Note that if Widget is a class there is no such problem. The entire issue applies to designing value types.
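
Both solutions can be tried out with a toy reference container. ColorArray stands in for Array!Color, and the Value template below is a hypothetical sketch of such a wrapper, not an existing std.typecons or std.container type; it assumes only that the wrapped class offers a clone method.

```d
// Stand-in for a reference-semantics container offering a clone operation.
final class ColorArray
{
    uint[] rgb;
    this(uint[] colors) { rgb = colors.dup; }
    ColorArray clone() { return new ColorArray(rgb); }
}

// Hypothetical Value wrapper (solution 1): clones its payload when copied.
struct Value(T)
{
    T payload;
    this(this) { if (payload !is null) payload = payload.clone(); }
}

struct WidgetA   // value type using the wrapper
{
    Value!ColorArray colorMap;
}

struct WidgetB   // value type using an explicit postblit (solution 2)
{
    ColorArray colorMap;
    this(this) { if (colorMap !is null) colorMap = colorMap.clone(); }
}

void main()
{
    WidgetA a1;
    a1.colorMap.payload = new ColorArray([0xFF0000u, 0x00FF00u]);
    WidgetA a2 = a1;                        // the field's postblit clones the map
    a2.colorMap.payload.rgb[0] = 0x0000FFu;
    assert(a1.colorMap.payload.rgb[0] == 0xFF0000u);

    WidgetB b1;
    b1.colorMap = new ColorArray([0xFF0000u]);
    WidgetB b2 = b1;                        // explicit postblit clones the map
    b2.colorMap.rgb[0] = 0x0000FFu;
    assert(b1.colorMap.rgb[0] == 0xFF0000u);
    assert(b1.colorMap !is b2.colorMap);
}
```

The wrapper keeps the cloning policy in one reusable place, while the explicit postblit must be repeated in every value type that embeds a container.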
 There is a special case in non-OO-only circumstances: sometimes an
 element is passed as parameter while it is conceptually the "object"
 (in common sense) on which an operation applies (~ OO receiver). In
 OO, it would be passed by ref precisely to allow it being changed,
 even if it is a plain value (this prevents creating a new value at
 every tiny change, as opposed to immutability). But this relevant
 distinction between object of an operation (what) and true parameters
 (how) does not exist in plain function-based style: func(object,
 param1, param12); So that we have to pass the object by ref when the
 operation is precisely here to modify it. But conceptually it is not
 a parameter.
I don't understand this part. Andrei
Dec 15 2010
parent spir <denis.spir gmail.com> writes:
On Wed, 15 Dec 2010 11:57:32 -0600
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 12/15/10 11:05 AM, spir wrote:
 What I'm trying to fight is beeing forced to implement semantics
 values as concrete ref elements. This is very bad, a kind of
 conceptual distortion (the author of XL calls this semantic mismatch)
 that leads to much confusion.
[snip example] =20
 Conceptually, we absolutely need both.
 Again the ref/value semantic duality is independant from data types.
 If the language provides one kind only, we have to hack, to cheat
 with it.
=20 Both are good. The question is which should be the "default" one and=20 what should be the "other" one. =20 Your example is from a class of examples that basically say: a mutable=20 reference object in a struct with value semantics is trouble. That is: =20 struct Widget // value type { ... Array!Color colorMap; // oops, undue aliasing } =20 That is correct. There are two solutions I envision:
I agree this is also an issue, but this is not the one I had in mind (sorry for the unclear expression).
 1. Define a Value wrapper in std.container or std.typecons:

 struct Widget // value type
 {
      ...
      Value!(Array!Color) colorMap; // clones upon copying
 }

 2. Define this(this)

 struct Widget // value type
 {
      ...
      Array!Color colorMap; // manually cloned upon copying
      this(this) {
          colorMap = colorMap.clone;
      }
 }

 Note that if Widget is a class there is no such problem. The entire
 issue applies to designing value types.
Actually, that is not what I meant. The actual "nature" (class/struct) of Widget is not the problem I tried to point out. Rather, it is having colorMap's type defined as a class when its meaning (in the model) is that of a plain value (often to avoid useless copies); or conversely (e.g. to avoid the cost of instantiation on the heap).

Ideally, I would like to have the ref vs value distinction orthogonal to the whole type system, meaning "entity with identity" vs "plain data". For this, the language must
(1) properly cope with implementation issues (read: code efficiency),
(2) provide a "ref-ing" syntax similar to "pointing".

I won't dream of the latter, but I guess the first feature can well be done in D. It requires the compiler to detect when a value parameter is not touched in a func body, then pass it by ref behind the scenes. This would allow defining conceptual values as instances of value types without fearing inefficiency. For the converse issue, I have no idea.
 [...]
I don't understand this part.
Not that important [a bit off-topic].

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 15 2010
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 15/12/2010 17:05, spir wrote:
 On Wed, 15 Dec 2010 09:56:36 -0600
 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 Optimization (or pessimization) is a concern, but not my primary one. My
 concern is: most of the time, do you want to work on a container or on a
 copy of the container? Consider this path-of-least-resistance code:

 void fun(Container!int c) {
       ...
       c[5] += 42;
       ...
 }

 Question is, what's the most encountered activity? Should fun operate on
 whatever container it was passed, or on a copy of it? Based on extensive
 experience with the STL, I can say that in the overwhelming majority of
 cases you want the function to mess with the container, or look without
 touch (by means of const). It is so overwhelming, any code reviewer in
 an STL-based environment will raise a flag when seeing the C++
 equivalent to the code above - ironically, even if fun actually does
 need a copy of its input! (The common idiom is to pass the container by
 constant reference and then create a copy of it inside fun, which is
 suboptimal.)
I do agree. When a container is passed as parameter
* either it is a value in meaning and should be left unchanged (--> so that the compiler can pass it as "constant reference")
* or it means an entity with identity, it makes sense to change it, and it should be implemented as a ref.

What I'm trying to fight is being forced to implement semantic values as concrete ref elements. This is very bad, a kind of conceptual distortion (the author of XL calls this semantic mismatch) that leads to much confusion.
As someone who takes conceptual cleanliness very seriously, I had to chime in, as I don't quite agree with your points.
 Example of semantic distinction:
 Take a palette of predefined colors (red, green,..) used to draw visual
widgets. In the simple case, colors are plain information (=values), and the
palette (a collection) as well. In this case, every widget holds its own subset
of colors used for each part of itself. Meaning copies. Changing a given color
assigned to a widget should not, and does not, affect others.
 Now, imagine this palette can be edited "live" by the user, meaning redefining
the components of red, green,... This time, the semantics may well be that such
changes should affect all widgets, including already defined ones. For this,
the palette must be implemented as an "entity", and each color as well. But the
reason for this is that the palette does not mean the same thing at all:
instead of information about an aspect (color) of every widget, we have now a
kind of container of color _sources_. Instead of color values, the widget
fields point to kinds of paint pots; these fields should not be called "color".
 [It is not always that simple to find real-world metaphors helping us
correctly understand what we have to model and write into programs. A program's
world is not at all reality, not even similar to it, even in the (minority of)
cases where it models reality. In this case, "color" is misleading.]

 In the first case, palette must be a value, in the second case it must be a
ref. There is no way to escape the dilemma about having value or ref
collections. Conceptually, we absolutely need both. Again the ref/value
semantic duality is independent from data types. If the language provides one
kind only, we have to hack, to cheat with it.
The discussion here is simply what should be the common, default case. Who said we can't have both? What's the justification for "If the language provides one kind only, we have to hack, to cheat with it."?

Also, the things you say about collections being ref or value are badly worded. First of all, considering a collection on its own, there is no right answer to whether the collection /should/ have value or reference semantics. The statement is meaningless. Only when a collection is associated with some other object does this question make sense. Is the collection part-of/owned-by the object, or is it merely referenced by it?

So taking your example, does each Widget have its own Palette of Colors, or is there only one common Palette? The answer depends on your domain; this is a modeling/design problem, not a language design one. The only thing the language should strive for is being able to represent/code both possible designs as well as possible (in a clear way, less bug prone, etc.).

--
Bruno Medeiros - Software Engineer
Jan 27 2011
prev sibling parent reply Jonathan =?UTF-8?B?U2NobWlkdC1Eb21pbsOp?= <devel the-user.org> writes:
 I continue to believe that containers should be value-types. In order
 to prevent useless copying you can use something like "Impl * impl"
 and reference-counting. Then you only do a copy on actual change. This
 is the way I'm currently implementing in my own container-classes.
From my point of view reference counting is not very elegant. The compiler should take care (or give possibilities to take care!!) that no unnecessary copies are made. It could be much simpler than reference counting, but D is currently simply not powerful enough to allow this in a generic way. As I said in another thread some minutes before: I agree that containers should definitely be by value.

The User
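For readers who have not seen the "Impl * impl" idiom from the quoted post, here is a rough sketch of copy-on-write with a reference count. Everything in it (CowArray, Impl, detach, sharesWith) is a made-up name for illustration, and the sketch leaves the actual memory to the GC, so the count only decides when a private copy is needed, not when storage is freed.

```d
// Hypothetical copy-on-write value type; not an existing library container.
struct CowArray
{
    private static struct Impl { int[] data; size_t refs; }
    private Impl* impl;               // shared between copies until a write

    this(this) { if (impl) ++impl.refs; }   // copying only bumps the count
    ~this()    { if (impl) --impl.refs; }   // memory itself is left to the GC

    size_t length() const { return impl ? impl.data.length : 0; }

    int opIndex(size_t i) const
    {
        assert(impl !is null && i < impl.data.length);
        return impl.data[i];
    }

    void opIndexAssign(int value, size_t i)
    {
        detach();
        impl.data[i] = value;
    }

    void opOpAssign(string op)(int value) if (op == "~")
    {
        detach();
        impl.data ~= value;
    }

    bool sharesWith(ref const CowArray other) const { return impl is other.impl; }

    // Make sure this instance owns its storage exclusively before writing.
    private void detach()
    {
        if (impl is null)
            impl = new Impl(null, 1);
        else if (impl.refs > 1)
        {
            --impl.refs;                        // leave the shared block
            impl = new Impl(impl.data.dup, 1);  // and take a private copy
        }
    }
}

unittest
{
    CowArray a;
    a ~= 1;
    a ~= 2;
    auto b = a;                 // cheap: no element is copied yet
    assert(a.sharesWith(b));
    b[0] = 10;                  // first write triggers the real copy
    assert(!a.sharesWith(b));
    assert(a[0] == 1 && b[0] == 10);
}
```

Note the cost this hides: every mutating operation has to go through detach(), which is part of why the approach is debated in this thread.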
Dec 14 2010
parent Jonathan =?UTF-8?B?U2NobWlkdC1Eb21pbsOp?= <devel the-user.org> writes:
Jonathan Schmidt-Dominé wrote:

 I continue to believe that containers should be value-types. In order
 to prevent useless copying you can use something like "Impl * impl"
 and reference-counting. Then you only do a copy on actual change. This
 is the way I'm currently implementing in my own container-classes.
From my point of view reference counting is not very elegant.
However, maybe reference counting is a feasible way to go until better times arrive.
Dec 14 2010
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 14.12.2010 22:02, Andrei Alexandrescu wrote:
 I kept on literally losing sleep about a number of issues involving 
 containers, sealing, arbitrary-cost copying vs. reference counting and 
 copy-on-write, and related issues. This stops me from making rapid 
 progress on defining D containers and other artifacts in the standard 
 library.

 Clearly we need to break this paralysis, and just as clearly whatever 
 decision taken now will influence the prevalent D style going forward. 
 So a decision needs to be made soon, just not hastily. Easier said 
 than done!

 I continue to believe that containers should have reference semantics, 
 just like classes. Copying a container wholesale is not something you 
 want to be automatic.
Sure thing.
 I also continue to believe that controlled lifetime (i.e. 
 reference-counted implementation) is important for a container. 
 Containers tend to be large compared to other objects, so exercising 
 strict control over their allocated storage makes a lot of sense. What 
 has recently shifted in my beliefs is that we should attempt to 
 implement controlled lifetime _outside_ the container definition, by 
 using introspection. (Currently some containers use reference counting 
 internally, which makes their implementation more complicated than it 
 could be.)
What challenges do we face with this approach? Can you please outline the mechanics of that controlled lifetime outside the container part, e.g. is it by usage of some tricky wrappers?
 Finally, I continue to believe that sealing is worthwhile. In brief, a 
 sealing container never gives out addresses of its elements so it has 
 great freedom in controlling the data layout (e.g. pack 8 bools in one 
 ubyte) and in controlling the lifetime of its own storage. Currently 
 I'm not sure whether that decision should be taken by the container, 
 by the user of the container, or by an introspection-based wrapper 
 around an unsealed container.
Your change looks like going with third option, am I correct?
 * * *

 That all being said, I'd like to make a motion that should simplify 
 everyone's life - if only for a bit. I'm thinking of making all 
 containers classes (either final classes or at a minimum classes with 
 only final methods). Currently containers are implemented as structs 
 that are engineered to have reference semantics. Some collections use 
 reference counting to keep track of the memory used.

 Advantages of the change:

 - Clear, self-documented reference semantics

 - Uses the right tool (classes) for the job (define a type with 
 reference semantics)

 - Pushes deterministic lifetime issues outside the containers 
 (simplifying them) and factors such issues into reusable wrappers a la 
 RefCounted.

 Disadvantages:

 - Containers must be dynamically allocated to do anything - even 
 calling empty requires allocation.
I was under the impression that you could allocate class instances almost anywhere (with the help of emplace); it's just that the heap is the safe default.
 - There's a two-words overhead associated with any class object.

 - Containers cannot do certain optimizations that depend on 
 container's control over its own storage.
That must have something to do with sealed containers being wrappers over unsealed ones, so as I observe, your change implies not only a change to final classes. Clearly something is missing in your post; can you please be more specific about that change?
 What say you?

 Andrei
-- Dmitry Olshansky
Dec 14 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/14/10 1:42 PM, Dmitry Olshansky wrote:
 On 14.12.2010 22:02, Andrei Alexandrescu wrote:
 I also continue to believe that controlled lifetime (i.e.
 reference-counted implementation) is important for a container.
 Containers tend to be large compared to other objects, so exercising
 strict control over their allocated storage makes a lot of sense. What
 has recently shifted in my beliefs is that we should attempt to
 implement controlled lifetime _outside_ the container definition, by
 using introspection. (Currently some containers use reference counting
 internally, which makes their implementation more complicated than it
 could be.)
What challenges do we face with this approach? Can you please outline the mechanics of that controlled lifetime outside the container part, e.g. is it by usage of some tricky wrappers?
Usage of wrappers, yes. Essentially you'd use e.g. RBTree as a class or RefCounted!RBTree, which calls clear() against the object when the reference count goes down to zero.
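As a rough illustration of the wrapper idea, the sketch below uses std.typecons.RefCounted, which exists in Phobos but only accepts non-class payloads, so a small struct stands in for the container's storage. TreeStorage and the released counter are invented for the example; the destructor plays the role of the clear() call mentioned above.

```d
import std.typecons : RefCounted;

int released;                    // counts payloads that were torn down

struct TreeStorage               // hypothetical stand-in for container storage
{
    int id;
    // Only count live payloads; a default-initialized one has id == 0.
    ~this() { if (id != 0) ++released; }
}

void lifetimeDemo()
{
    released = 0;
    {
        auto a = RefCounted!TreeStorage(7);
        {
            auto b = a;          // copying shares the payload, count goes up
            assert(a.refCountedStore.refCount == 2);
            assert(b.id == 7);
        }
        assert(a.refCountedStore.refCount == 1);
        assert(released == 0);   // still referenced, nothing released yet
    }
    assert(released == 1);       // last reference gone, payload destroyed
}

unittest { lifetimeDemo(); }
```

The container code itself contains no counting logic here, which is the simplification being argued for.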
 Finally, I continue to believe that sealing is worthwhile. In brief, a
 sealing container never gives out addresses of its elements so it has
 great freedom in controlling the data layout (e.g. pack 8 bools in one
 ubyte) and in controlling the lifetime of its own storage. Currently
 I'm not sure whether that decision should be taken by the container,
 by the user of the container, or by an introspection-based wrapper
 around an unsealed container.
Your change looks like going with third option, am I correct?
Steve correctly pointed out that sealing must belong in the container.
 - Containers must be dynamically allocated to do anything - even
 calling empty requires allocation.
I was under the impression that you could allocate class instances almost anywhere (with the help of emplace); it's just that the heap is the safe default.
Most people would simply call new.
 - There's a two-words overhead associated with any class object.

 - Containers cannot do certain optimizations that depend on
 container's control over its own storage.
That must have something to do with sealed containers being wrappers over unsealed ones, so as I observe, your change implies not only a change to final classes. Clearly something is missing in your post; can you please be more specific about that change?
I withdraw that comment because I don't have good examples aside from deterministic memory release, which I already discussed. Andrei
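On the emplace remark earlier in this exchange: the following is a minimal sketch of constructing a class instance in malloc'ed memory instead of calling new. Counter is a made-up class standing in for a container; std.conv.emplace and __traits(classInstanceSize) are real. The sketch assumes the class holds no GC references, otherwise the block would also have to be made known to the GC.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

final class Counter              // hypothetical stand-in for a container class
{
    private size_t count;
    bool empty() const { return count == 0; }
    void add() { ++count; }
}

bool emptyWithoutGC()
{
    enum size = __traits(classInstanceSize, Counter);
    void[] chunk = malloc(size)[0 .. size];
    assert(chunk.ptr !is null, "out of memory");
    scope(exit) free(chunk.ptr);        // runs last

    auto c = emplace!Counter(chunk);    // construct in place, no GC allocation
    scope(exit) destroy(c);             // runs first: finalize before freeing

    return c.empty;
}

unittest { assert(emptyWithoutGC()); }
```

This works, but it also shows why most people would simply call new: the caller now owns both the memory and the object's lifetime.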
Dec 14 2010
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 That all being said, I'd like to make a motion that should simplify 
 everyone's life - if only for a bit. I'm thinking of making all 
 containers classes (either final classes or at a minimum classes with 
 only final methods). Currently containers are implemented as structs 
 that are engineered to have reference semantics. Some collections use 
 reference counting to keep track of the memory used.
Thinking about this I've found an interesting issue:
---
void foo(int[int] aa)
{
  aa[2]=2;
}

int main()
{
  int[int] aa;
  //aa[1]=1; //uncomment this and it will work
  foo(aa);
  assert(aa[2]==2);
  return 0;
}
---
Dec 14 2010
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/14/10 1:56 PM, Kagamin wrote:
 Andrei Alexandrescu Wrote:

 That all being said, I'd like to make a motion that should simplify
 everyone's life - if only for a bit. I'm thinking of making all
 containers classes (either final classes or at a minimum classes with
 only final methods). Currently containers are implemented as structs
 that are engineered to have reference semantics. Some collections use
 reference counting to keep track of the memory used.
 Thinking about this I've found an interesting issue:
 ---
 void foo(int[int] aa)
 {
   aa[2]=2;
 }

 int main()
 {
   int[int] aa;
   //aa[1]=1; //uncomment this and it will work
   foo(aa);
   assert(aa[2]==2);
   return 0;
 }
 ---
Yah, this has been discussed many times. Essentially AAs have class-like semantics with null in tow. Andrei
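A small sketch of that "null in tow" behavior, with one common workaround (passing by ref). The function names fill and fillByRef are invented for the example.

```d
void fill(int[int] aa)          { aa[2] = 2; }  // receives a copy of the handle
void fillByRef(ref int[int] aa) { aa[2] = 2; }  // receives the caller's handle

unittest
{
    int[int] a;                 // null: no table allocated yet
    fill(a);                    // callee allocates a table the caller never sees
    assert(2 !in a);

    int[int] b;
    b[1] = 1;                   // table now exists and is shared by copies
    fill(b);
    assert(b[2] == 2);

    int[int] c;                 // null again, but passed by ref
    fillByRef(c);
    assert(c[2] == 2);
}
```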
Dec 14 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 14 Dec 2010 14:56:55 -0500, Kagamin <spam here.lot> wrote:

 Andrei Alexandrescu Wrote:

 That all being said, I'd like to make a motion that should simplify
 everyone's life - if only for a bit. I'm thinking of making all
 containers classes (either final classes or at a minimum classes with
 only final methods). Currently containers are implemented as structs
 that are engineered to have reference semantics. Some collections use
 reference counting to keep track of the memory used.
 Thinking about this I've found an interesting issue:
 ---
 void foo(int[int] aa)
 {
   aa[2]=2;
 }

 int main()
 {
   int[int] aa;
   //aa[1]=1; //uncomment this and it will work
   foo(aa);
   assert(aa[2]==2);
   return 0;
 }
 ---
That's been discussed very much in the past. There is no good solution, and it's one of the good reasons to make collections classes with clearly defined lifetimes. I can't find the thread that talks about this, but I think it was over a year ago that I brought up this subtlety when people were wondering what the correct implementation for containers should be -- struct or class. -Steve
Dec 14 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, December 14, 2010 11:02:34 Andrei Alexandrescu wrote:
 I kept on literally losing sleep about a number of issues involving
 containers, sealing, arbitrary-cost copying vs. reference counting and
 copy-on-write, and related issues. This stops me from making rapid
 progress on defining D containers and other artifacts in the standard
 library.
 
 Clearly we need to break this paralysis, and just as clearly whatever
 decision taken now will influence the prevalent D style going forward.
 So a decision needs to be made soon, just not hastily. Easier said than
 done!
 
 I continue to believe that containers should have reference semantics,
 just like classes. Copying a container wholesale is not something you
 want to be automatic.
 
 I also continue to believe that controlled lifetime (i.e.
 reference-counted implementation) is important for a container.
 Containers tend to be large compared to other objects, so exercising
 strict control over their allocated storage makes a lot of sense. What
 has recently shifted in my beliefs is that we should attempt to
 implement controlled lifetime _outside_ the container definition, by
 using introspection. (Currently some containers use reference counting
 internally, which makes their implementation more complicated than it
 could be.)
 
 Finally, I continue to believe that sealing is worthwhile. In brief, a
 sealing container never gives out addresses of its elements so it has
 great freedom in controlling the data layout (e.g. pack 8 bools in one
 ubyte) and in controlling the lifetime of its own storage. Currently I'm
 not sure whether that decision should be taken by the container, by the
 user of the container, or by an introspection-based wrapper around an
 unsealed container.
 
 * * *
 
 That all being said, I'd like to make a motion that should simplify
 everyone's life - if only for a bit. I'm thinking of making all
 containers classes (either final classes or at a minimum classes with
 only final methods). Currently containers are implemented as structs
 that are engineered to have reference semantics. Some collections use
 reference counting to keep track of the memory used.
 
 Advantages of the change:
 
 - Clear, self-documented reference semantics
 
 - Uses the right tool (classes) for the job (define a type with
 reference semantics)
 
 - Pushes deterministic lifetime issues outside the containers
 (simplifying them) and factors such issues into reusable wrappers a la
 RefCounted.
 
 Disadvantages:
 
 - Containers must be dynamically allocated to do anything - even calling
 empty requires allocation.
 
 - There's a two-words overhead associated with any class object.
 
 - Containers cannot do certain optimizations that depend on container's
 control over its own storage.
 
 
 What say you?
One concern that I would have would be inlining. Containers need to be efficient, and if their functions can't be inlined, that could be problematic. I expect that if a container is a class and its functions are final (and possibly the class itself), then the functions wouldn't be virtual, and then the inliner can do its job. But if the container's functions are virtual, then inlining won't work. How much of a problem that would be in practice, I don't know, but I think that it's something that needs to be considered. - Jonathan M Davis
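To illustrate the distinction, here is a small sketch with two made-up classes. In D, class methods are virtual by default; marking them final (or marking the whole class final) removes the virtual dispatch, which is what gives an inliner a chance. Whether a given compiler actually inlines the call is not something this sketch can show.

```d
class OpenStack                  // hypothetical: methods are virtual by default
{
    private int[] items;
    void push(int v) { items ~= v; }
    size_t length() const { return items.length; }
}

class SealedStack                // hypothetical: every method is final
{
    private int[] items;
    final void push(int v) { items ~= v; }
    final size_t length() const { return items.length; }
}

// The compiler itself reports which calls go through the vtable.
static assert( __traits(isVirtualMethod, OpenStack.push));
static assert(!__traits(isVirtualMethod, SealedStack.push));

unittest
{
    auto s = new SealedStack;
    s.push(1);
    s.push(2);
    assert(s.length == 2);
}
```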
Dec 14 2010
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2010-12-14 14:02:34 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 What say you?
I'd prefer them to be value types. And I agree that if you want to give them reference semantics it's much cleaner if they're implemented as a class. I fear the null pointers however... I understand your paralysis. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Dec 14 2010
prev sibling parent reply "Craig Black" <craigblack2 cox.net> writes:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
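For anyone wondering what "an empty array only requires a single pointer" can look like in D, here is a rough sketch. It is not my actual container, just an illustration under assumptions: CompactArray and Header are invented names, the element type is fixed to int to keep it short, and length and capacity live in a header in front of the elements in one malloc'ed block.

```d
import core.stdc.stdlib : realloc, free;

// Hypothetical sketch: a growable int array whose empty state is one null pointer.
struct CompactArray
{
    private static struct Header { size_t length, capacity; }
    private Header* impl;            // null while the array is empty

    @disable this(this);             // no copies, so no double free
    ~this() { free(impl); }          // free(null) is a no-op

    size_t length() const { return impl ? impl.length : 0; }
    bool empty() const { return length == 0; }

    // Elements are stored right after the header in the same block.
    private inout(int)* data() inout { return cast(inout(int)*)(impl + 1); }

    void put(int value)
    {
        if (impl is null || impl.length == impl.capacity)
        {
            immutable cap = impl ? impl.capacity * 2 : 4;
            auto grown = cast(Header*) realloc(impl, Header.sizeof + cap * int.sizeof);
            assert(grown !is null, "out of memory");
            if (impl is null)
                grown.length = 0;    // fresh block: header starts as garbage
            grown.capacity = cap;
            impl = grown;
        }
        data()[impl.length++] = value;
    }

    int opIndex(size_t i) const
    {
        assert(i < length);
        return data()[i];
    }
}

static assert(CompactArray.sizeof == (void*).sizeof);

unittest
{
    CompactArray a;                  // no allocation happens here
    assert(a.empty);
    foreach (i; 0 .. 10)
        a.put(i * i);
    assert(a.length == 10 && a[9] == 81);
}
```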
Dec 14 2010
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, December 14, 2010 16:35:34 Craig Black wrote:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
Dynamic arrays are already on the GC heap... - Jonathan M Davis
Dec 14 2010
next sibling parent "Craig Black" <craigblack2 cox.net> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1005.1292374292.21107.digitalmars-d puremagic.com...
 On Tuesday, December 14, 2010 16:35:34 Craig Black wrote:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
Dynamic arrays are already on the GC heap... - Jonathan M Davis
Using built-in D arrays, yes. Using a templated struct, they don't have to be. C++ std::vector works just fine without GC. The same can be done in D.
Dec 14 2010
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 15.12.2010 3:50, Jonathan M Davis wrote:
 On Tuesday, December 14, 2010 16:35:34 Craig Black wrote:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
Dynamic arrays are already on the GC heap... - Jonathan M Davis
Hm,
(cast(T*)malloc(1024*T.sizeof))[0..size];
works. Just needs careful initialization of each field, since they are filled with trash ...
And you can even do slicing. Just don't append to them and keep track of the initial reference ;)

--
Dmitry Olshansky
Dec 15 2010
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 15 Dec 2010 14:18:20 -0500, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 15.12.2010 3:50, Jonathan M Davis wrote:
 On Tuesday, December 14, 2010 16:35:34 Craig Black wrote:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
Dynamic arrays are already on the GC heap... - Jonathan M Davis
Hm, (cast(T*)malloc(1024*T.sizeof))[0..size]; works. Just needs careful initialization of each field, since they are filled with trash ... And you can even do slicing. Just don't append to them and keep track of the initial reference ;)
You can append to them. The append code will recognize that it's not a GC block and reallocate. More importantly, depending on the type of T, you may need to register the block as a root in the GC. Otherwise, if T contains GC references, those could be collected prematurely.

-Steve
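To spell out the registration step, here is a sketch using core.memory. GC.addRange and GC.removeRange are the druntime calls that make the collector scan (and later stop scanning) a block it did not allocate; Node, allocNodes and freeNodes are names made up for the example.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

struct Node { int[] payload; }   // hypothetical element type holding a GC reference

Node[] allocNodes(size_t n)
{
    auto p = cast(Node*) malloc(n * Node.sizeof);
    assert(p !is null, "out of memory");
    auto nodes = p[0 .. n];
    nodes[] = Node.init;                 // overwrite the garbage malloc returns
    GC.addRange(p, n * Node.sizeof);     // let the collector scan this block
    return nodes;
}

void freeNodes(Node[] nodes)
{
    GC.removeRange(nodes.ptr);           // unregister before releasing the memory
    free(nodes.ptr);
}

unittest
{
    auto nodes = allocNodes(2);
    scope(exit) freeNodes(nodes);
    nodes[0].payload = [1, 2, 3];        // GC memory referenced only from malloc'ed block
    GC.collect();
    assert(nodes[0].payload == [1, 2, 3]);
}
```

For plain data structs without pointers none of this is needed.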
Dec 15 2010
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 15.12.2010 22:52, Steven Schveighoffer wrote:
 On Wed, 15 Dec 2010 14:18:20 -0500, Dmitry Olshansky 
 <dmitry.olsh gmail.com> wrote:

 On 15.12.2010 3:50, Jonathan M Davis wrote:
 On Tuesday, December 14, 2010 16:35:34 Craig Black wrote:
 What say you?
I feel like the odd man out here since my perspective is so different. I use custom container classes even in C++, partly because I can usually get better performance that way, and because I can customize the container however I like. So I will probably be doing my own containers if/when I use D. Beyond that, my own personal preferences seem so different that I hesitate to mention them. I use dynamic arrays by far the most out of all container classes. I use them so much that I cringe at the thought of allocating them on the GC heap. My code is very high performance and I would like to keep it that way. Also, my usage of arrays is such that most of them are empty, so it is important to me that the empty arrays are stored efficiently. Using my custom container class, an empty array does not require a heap allocation, and only requires a single pointer to be allocated. Not sure if these requirements are important to anyone else, but I don't mind making my own custom containers if I need to.
Dynamic arrays are already on the GC heap... - Jonathan M Davis
Hm, (cast(T*)malloc(1024*T.sizeof))[0..size]; works. Just needs careful initialization of each field, since they are filled with trash ... And you can even do slicing. Just don't append to them and keep track of the initial reference ;)
You can append to them. The append code will recognize that it's not a GC block and reallocate.
Good to know.
 More importantly, depending on the type of T, you may need to
 register the block as a root in the GC.  Otherwise, if
 T contains GC references, those could be collected prematurely.
Right, this is very important! I just checked, and luckily I did this only with plain data structs.
 -Steve
-- Dmitry Olshansky
Dec 15 2010
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 15 Dec 2010 15:19:34 -0500, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 15.12.2010 22:52, Steven Schveighoffer wrote:
 On Wed, 15 Dec 2010 14:18:20 -0500, Dmitry Olshansky  
 <dmitry.olsh gmail.com> wrote:
 Hm,
 (cast(T*)malloc(1024*T.sizeof))[0..size];
 works. Just needs careful initialization of each field, since they are  
 filled with trash ...
 And you can even do slicing. Just don't append to them and keep track  
 of the initial reference ;)
You can append to them. The append code will recognize that it's not a GC block and reallocate.
Good to know.
I should also note, if you do this:

auto x = (cast(T*)malloc(1024*T.sizeof))[0..size];
x ~= T.init;

You have now lost the original reference to the data (because x now points to the GC allocated block), so it will leak! So while appending does work, you have to take care to still keep track of the original data. I'd recommend something like this if it's a temporary:

auto x = (cast(T*)malloc(1024*T.sizeof))[0..size];
auto origdata = x.ptr; // keep it mutable: free() takes a void*
scope(exit) free(origdata);

-Steve
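Here is the same pattern as a small self-contained sketch, with int in place of T and a made-up function name, showing that the slice really does move away from the malloc'ed block on append while the saved pointer still lets us free it.

```d
import core.stdc.stdlib : malloc, free;

// Returns true if appending moved the slice off the malloc'ed block.
bool appendMovesSlice()
{
    enum n = 4;
    auto x = (cast(int*) malloc(n * int.sizeof))[0 .. n];
    assert(x.ptr !is null, "out of memory");
    auto origdata = x.ptr;          // remember the block we must free
    scope(exit) free(origdata);

    x[] = 0;                        // malloc'ed memory starts as garbage
    x ~= 42;                        // not a GC block, so this reallocates

    return x.ptr !is origdata && x.length == n + 1 && x[n] == 42;
}

unittest { assert(appendMovesSlice()); }
```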
Dec 15 2010