www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Kinds of containers

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I'm finally getting the cycles to get to thinking about Design by 
Introspection containers. First off, there are several general 
categories of containers. I think D should support all properly. One 
question is which we go for in the standard library.

1. Functional containers.

These are immutable; once created, neither their topology nor their 
elements may be observably changed. Manipulating a container entails 
creating an entire new container, often based on an existing container 
(e.g. append takes a container and an element and creates a whole new 
container).

Internally, functional containers take advantage of common substructure 
and immutability to share actual data. The classic resource for defining 
and implementing functional containers is 
http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.

2. Reference containers.

These have classic reference semantics (à la Java). Internally, they may 
be implemented either as class objects or as reference counted structs.

They're by default mutable. Qualifiers should apply to them gracefully.

3. Eager value containers.

These are STL-style. Somewhat surprisingly I think these are the worst 
of the pack; they expensively duplicate at the drop of a hat and need to 
be carefully passed around by reference lest performance silently drops. 
Nevertheless, when used as members inside other data structures value 
semantics might be the appropriate choice. Also, thinking of them as 
values often makes code simpler.

By default eager value containers are mutable. They should support 
immutable and const meaningfully.

4. Copy-on-write containers.

These combine the advantages of value and reference containers: you get 
to think of them as values, yet they're not expensive to copy. Copying 
only occurs by necessity upon the first attempt to change them.

The disadvantage is implementations get somewhat complicated. Also, they 
are shunned in C++ because there is no proper support for COW; for 
example, COW strings have been banned starting with C++11 which is quite 
the bummer.

Together with Scott Meyers, Walter figured out a way to change D to 
support COW properly. The language change consists of two attributes.

=======

I'll attempt to implement a few versions of each and see what they look 
like. The question here is what containers are of interest for D's 
standard library.


Andrei
Oct 21 2015
next sibling parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 22/10/15 12:05 AM, Andrei Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design by
 Introspection containers. First off, there are several general
 categories of containers. I think D should support all properly. One
 question is which we go for in the standard library.

 1. Functional containers.

 These are immutable; once created, neither their topology nor their
 elements may be observably changed. Manipulating a container entails
 creating an entire new container, often based on an existing container
 (e.g. append takes a container and an element and creates a whole new
 container).

 Internally, functional containers take advantage of common substructure
 and immutability to share actual data. The classic resource for defining
 and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.


 2. Reference containers.

 These have classic reference semantics (à la Java). Internally, they may
 be implemented either as class objects or as reference counted structs.

 They're by default mutable. Qualifiers should apply to them gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are the worst
 of the pack; they expensively duplicate at the drop of a hat and need to
 be carefully passed around by reference lest performance silently drops.
 Nevertheless, when used as members inside other data structures value
 semantics might be the appropriate choice. Also, thinking of them as
 values often makes code simpler.

 By default eager value containers are mutable. They should support
 immutable and const meaningfully.

 4. Copy-on-write containers.

 These combine the advantages of value and reference containers: you get
 to think of them as values, yet they're not expensive to copy. Copying
 only occurs by necessity upon the first attempt to change them.

 The disadvantage is implementations get somewhat complicated. Also, they
 are shunned in C++ because there is no proper support for COW; for
 example, COW strings have been banned starting with C++11 which is quite
 the bummer.

 Together with Scott Meyers, Walter figured out a way to change D to
 support COW properly. The language change consists of two attributes.

 =======

 I'll attempt to implement a few versions of each and see what they look
 like. The question here is what containers are of interest for D's
 standard library.


 Andrei
I've got a list and a map implementation using allocators. They are pretty primitive but they may lead to insight so links provided. https://github.com/rikkimax/alphaPhobos/blob/master/source/std/experimental/internal/containers/list.d https://github.com/rikkimax/alphaPhobos/blob/master/source/std/experimental/internal/containers/map.d
Oct 21 2015
prev sibling next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2015-10-21 at 07:05 -0400, Andrei Alexandrescu via Digitalmars-
d wrote:
=20
[=E2=80=A6]
 1. Functional containers.
=20
 These are immutable; once created, neither their topology nor their=20
 elements may be observably changed. Manipulating a container entails=20
 creating an entire new container, often based on an existing
 container=20
 (e.g. append takes a container and an element and creates a whole new
 container).
=20
 Internally, functional containers take advantage of common
 substructure=20
 and immutability to share actual data. The classic resource for
 defining=20
 and implementing functional containers is=20
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0
 521663504.
I believe (currently, I am open to being re-educated) that most of the rest of the world calls these persistent data structures, despite the title of Okasaki's thesis. I am not sure I like either of the labels "functional" or "persistent", there is little affordance with the concept underneath.
 2. Reference containers.
=20
 These have classic reference semantics (=C3=A0 la Java). Internally, they
 may=20
 be implemented either as class objects or as reference counted
 structs.
=20
 They're by default mutable. Qualifiers should apply to them
 gracefully.
=20
 3. Eager value containers.
=20
 These are STL-style. Somewhat surprisingly I think these are the
 worst=20
 of the pack; they expensively duplicate at the drop of a hat and need
 to=20
 be carefully passed around by reference lest performance silently
 drops.=20
 Nevertheless, when used as members inside other data structures value
 semantics might be the appropriate choice. Also, thinking of them as=20
 values often makes code simpler.
=20
 By default eager value containers are mutable. They should support=20
 immutable and const meaningfully.
Is there a need for separate eager structures explicity? Is it not enough for all structures to be lazy with there being a pull operation?
 4. Copy-on-write containers.
=20
 These combine the advantages of value and reference containers: you
 get=20
 to think of them as values, yet they're not expensive to copy.
 Copying=20
 only occurs by necessity upon the first attempt to change them.
=20
 The disadvantage is implementations get somewhat complicated. Also,
 they=20
 are shunned in C++ because there is no proper support for COW; for=20
 example, COW strings have been banned starting with C++11 which is
 quite=20
 the bummer.
=20
 Together with Scott Meyers, Walter figured out a way to change D to=20
 support COW properly. The language change consists of two attributes.
=20
 =3D=3D=3D=3D=3D=3D=3D
=20
 I'll attempt to implement a few versions of each and see what they
 look=20
 like. The question here is what containers are of interest for D's=20
 standard library.
N-dimensional arrays, N-dimensional dynamic array, singly-linked list, doubly-linked list, hash map (aka dictionary, associative array,=E2=80=A6), red-black tree, 2-3 tree, doubly-linked tree, tree map, b-tree. And no doubt a few others will be proposed since they are classic, appear in undergraduate courses, and are actually used for real. However these are classic sequential data structures. We should be looking to support data structures that can be used in a multi-threaded=20 context without having explicit locks. The obvious exemplars to indicate the way are ConcurrentHashMap in Java and NumPy array. (cf. other thread talking about these quasi-weird array data structures.) NumPy array is an opaque data structure that should never be accessed in any way other than the functions and higher-order functions from the library. It is a nice design, works well enough for data scientists, quants, bioinformatics people to think it is great, but fundamentally is seriously slow. If D could provide data structures that did NumPy/SciPy/Pandas things significantly faster than NumPy/SciPy/Pandas, there would be traction.=C2=A0 In this sense std.concurrent, std.parallelism, and std.datastructures (?) whilst almost certainly remaining separate, must be able to interwork. The alternative is the sort of hack that goes on in Java where data structures are sequential but you can provide thread-safe versions. This is fine for small data structures, but is not acceptable for large ones in a multicore situation. =C2=A0Hence ConcurrentHashMap, and the ilke. Then there is the question whether to try and do the sort of thing Chapel and X10 are doing. They have the ideas of domain and places which allow for partitioned arrays that can be mapped to clusters of multicore, multiprocessor machines =E2=80=94 Partitioned Global Address Spa= ce. IBM have now released their APGAS stuff over Hazelcast for Java users. This will begin to challenge Apache Spark (and Hadoop). Apache Spark is in many ways like NumPy in it's architecture, an opaque data N-dimensional array operated on by functions and higher-order functions. Obviously I am conflicted here: I use Go for networking (but mostly because I earn money running learning workshops), quite like Rust because it is the fastest language currently on my microbenchmarks, like D because it's nice, love Chapel because it is great (better than D in so many ways), sort of like X10 (because it is a bit Scala like and can be JVM or native), but end up having to work on some C++ codebases. I am not entirely sure D can compete with Chapel in HPC and increasingly the Python-verse. Laeeth and others are championing D instead of Python, but really the competition is Python/Chapel vs. Python/NumPy/SciPy/Pandas. Unless that is D can get the Pandas style data structures. Which brings my ramblings back on point.=C2=A0 --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 09:01 AM, Russel Winder via Digitalmars-d wrote:
 I believe (currently, I am open to being re-educated) that most of the
 rest of the world calls these persistent data structures, despite the
 title of Okasaki's thesis. I am not sure I like either of the labels
 "functional" or "persistent", there is little affordance with the
 concept underneath.
OK. We should go with the more widespread name.
 By default eager value containers are mutable. They should support
 immutable and const meaningfully.
Is there a need for separate eager structures explicity? Is it not enough for all structures to be lazy with there being a pull operation?
Well it is but I wanted to clarify where STL-style containers fit.
 N-dimensional arrays, N-dimensional dynamic array, singly-linked list,
 doubly-linked list, hash map (aka dictionary, associative array,…),
 red-black tree, 2-3 tree, doubly-linked tree, tree map, b-tree. And no
 doubt a few others will be proposed since they are classic, appear in
 undergraduate courses, and are actually used for real.
These are orthogonal on the kinds I discussed. So we have container topologies (which you enumerate a few of) and container kinds, which concerns copying semantics of the entire containers.
 However these are classic sequential data structures.
Hashes and some of the trees would actually be associative data structures.
 We should be
 looking to support data structures that can be used in a multi-threaded
 context without having explicit locks. The obvious exemplars to
 indicate the way are ConcurrentHashMap in Java and NumPy array. (cf.
 other thread talking about these quasi-weird array data structures.)
 NumPy array is an opaque data structure that should never be accessed
 in any way other than the functions and higher-order functions from the
 library. It is a nice design, works well enough for data scientists,
 quants, bioinformatics people to think it is great, but fundamentally
 is seriously slow. If D could provide data structures that did
 NumPy/SciPy/Pandas things significantly faster than NumPy/SciPy/Pandas,
 there would be traction.
Cool thoughts, thanks. Those would have their own APIs, separate from mainstream containers.
 Then there is the question whether to try and do the sort of thing
 Chapel and X10 are doing. They have the ideas of domain and places
 which allow for partitioned arrays that can be mapped to clusters of
 multicore, multiprocessor machines — Partitioned Global Address Space.
 IBM have now released their APGAS stuff over Hazelcast for Java users.
 This will begin to challenge Apache Spark (and Hadoop).
We don't support PGAS at language level, but I'll need to look into how APGAS works with Java.
 Obviously I am conflicted here: I use Go for networking (but mostly
 because I earn money running learning workshops), quite like Rust
 because it is the fastest language currently on my microbenchmarks,
 like D because it's nice, love Chapel because it is great (better than
 D in so many ways), sort of like X10 (because it is a bit Scala like
 and can be JVM or native), but end up having to work on some C++
 codebases. I am not entirely sure D can compete with Chapel in HPC and
 increasingly the Python-verse. Laeeth and others are championing D
 instead of Python, but really the competition is Python/Chapel vs.
 Python/NumPy/SciPy/Pandas. Unless that is D can get the Pandas style
 data structures. Which brings my ramblings back on point.
My finance folks also rave about Pandas. I wish I could fork myself to look into it. Nevertheless, what I'm working on now should help libraries such as Pandas for D. We have a large language but are unclear on recipe-style idioms for writing library code. Andrei
Oct 21 2015
next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2015-10-21 at 09:11 -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
=20
[=E2=80=A6]
 My finance folks also rave about Pandas. I wish I could fork myself to=
=20
 look into it.
Pandas is what makes Python a viable competitor to R. Most data science people will use one or the other these days. Though some (with budgets) wil= l use Matlab or Mathematica.=C2=A0
 Nevertheless, what I'm working on now should help libraries such as=20
 Pandas for D. We have a large language but are unclear on recipe-style=
=20
 idioms for writing library code.
=20
Can I suggest that we avoid "should" and ensure that we have more of a "will". =C2=A0Instead of a bottom up approach, we perhaps need a top-down approach: Applications programmers do X, Y, Z, that are best support by dat= a structures A, B, C, and A, B, and C are realized by library architecture = =CE=B1, =CE=B2, =CE=B3. --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t:+44 20 7585 2200 voip:sip: russel.winder ekiga.net 41 Buckmaster Road m:+44 7770 465 077 xmpp:russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype:russel_winder
Oct 21 2015
parent jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 21 October 2015 at 15:44:36 UTC, Russel Winder 
wrote:
 On Wed, 2015-10-21 at 09:11 -0400, Andrei Alexandrescu via 
 Digitalmars-d wrote:
 
[…]
 My finance folks also rave about Pandas. I wish I could fork 
 myself to look into it.
Pandas is what makes Python a viable competitor to R. Most data science people will use one or the other these days. Though some (with budgets) will use Matlab or Mathematica.
Yeah, R has had the functionality of what Pandas provides for a while. Data frames and zoo/xts cover most of Pandas. I think there has made some effort to get some of the functionality of dplyr as well (blaze).
Oct 21 2015
prev sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Oct 21, 2015 at 09:11:38AM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 My finance folks also rave about Pandas. I wish I could fork myself to
 look into it.
[...] Unfortunately, forking a human process takes 9 months to spawn the new process (slow OS, y'know), and the new process's module constructor takes about 18 or so years to run before it's ready for use. :-D Also, the specs are ambiguous in some key areas of the semantics, so implementations in practice don't quite manage to replicate the original process exactly. T -- Chance favours the prepared mind. -- Louis Pasteur
Oct 21 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 12:12 PM, H. S. Teoh via Digitalmars-d wrote:
 On Wed, Oct 21, 2015 at 09:11:38AM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 My finance folks also rave about Pandas. I wish I could fork myself to
 look into it.
[...] Unfortunately, forking a human process takes 9 months to spawn the new process (slow OS, y'know), and the new process's module constructor takes about 18 or so years to run before it's ready for use. :-D Also, the specs are ambiguous in some key areas of the semantics, so implementations in practice don't quite manage to replicate the original process exactly.
As one with two spawned forks underway, I totally agree. Besides, early versions tend to consume a lot of resources, emit many nonmaskable interrupts, and leak a lot. -- Andrei
Oct 21 2015
prev sibling next sibling parent reply Andrea Fontana <nospam example.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 I'll attempt to implement a few versions of each and see what 
 they look like. The question here is what containers are of 
 interest for D's standard library.


 Andrei
Tree (binary, quad, octo, r/b, etc...), graph (directed, undirected, sparse, weighted, etc...) have my votes. They enable a lot of algorithms to be implemented.
Oct 21 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 09:13 AM, Andrea Fontana wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
 I'll attempt to implement a few versions of each and see what they
 look like. The question here is what containers are of interest for
 D's standard library.


 Andrei
Tree (binary, quad, octo, r/b, etc...), graph (directed, undirected, sparse, weighted, etc...) have my votes. They enable a lot of algorithms to be implemented.
As I also replied to Russel, I am now discussing semantics of container values. -- Andrei
Oct 21 2015
prev sibling next sibling parent reply Robert burner Schadek <rburners gmail.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design 
 by Introspection containers.
What do you have in mind about the Design by Introspection (DyI, DbI is taken) for the container? How do you plan to use the allocators? Have the container create and handle the allocator, pass it at container construction, or (YOUR IDEA HERE)? anyway, container yuhu http://www.alfabetajuega.com/multimedia/imagenes/201310/53112.homer_yuhu_simpsons.jpg
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 09:46 AM, Robert burner Schadek wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design by
 Introspection containers.
What do you have in mind about the Design by Introspection (DyI, DbI is taken) for the container?
Containers will obey subsets of a nomenclature of primitives.
 How do you plan to use the allocators? Have the container create and
 handle the allocator, pass it at container construction, or (YOUR IDEA
 HERE)?
Container will pick the current allocator reference, or get it as a creation parameters, and store it for the life of the container. Andrei
Oct 21 2015
parent reply Robert burner Schadek <rburners gmail.com> writes:
On Wednesday, 21 October 2015 at 14:20:23 UTC, Andrei 
Alexandrescu wrote:
 Containers will obey subsets of a nomenclature of primitives.
Just to be crystal clear, something like this? void fun(Container)(ref Container c) if (hasAppend!Container) { // append stuff to c }
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 10:51 AM, Robert burner Schadek wrote:
 On Wednesday, 21 October 2015 at 14:20:23 UTC, Andrei Alexandrescu wrote:
 Containers will obey subsets of a nomenclature of primitives.
Just to be crystal clear, something like this? void fun(Container)(ref Container c) if (hasAppend!Container) { // append stuff to c }
Even simpler, hasMethod!(Container, "append") -- Andrei
Oct 21 2015
parent reply Robert burner Schadek <rburners gmail.com> writes:
On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei 
Alexandrescu wrote:
 Even simpler, hasMethod!(Container, "append") -- Andrei
I know this goes against your talk at DConf, but having to write string parameters does not feel good. I'm will properly not be the only one who will mistype "apend" and wonder why the other template function will be chosen. I rather have the compiler scream at me telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
Oct 21 2015
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 18:24:30 UTC, Robert burner 
Schadek wrote:
 On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei 
 Alexandrescu wrote:
 Even simpler, hasMethod!(Container, "append") -- Andrei
I know this goes against your talk at DConf, but having to write string parameters does not feel good. I'm will properly not be the only one who will mistype "apend" and wonder why the other template function will be chosen. I rather have the compiler scream at me telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
The other concern is that hasMethod isn't going to be checking anything other than the name, whereas a hasAppend could actually check that the function accepts the right arguments and returns the correct type. And it's not like the list of the functions to check for is going to be infinite. It needs to be a known list of functions where each of those functions has a signature that meets certain requirements. You can't just be checking for a random function name and expect it to do what you want. So, having a list of explicit traits to use for testing for the expected set of functions will both allow for the tests to be more thorough than simply checking the name, _and_ it will serve as a way to help document what the list of expected functions is for a particular domain. I really think that using hasMethod by itself like that is getting too ad hoc and doesn't really benefit us over having an explicit list of traits for the specific domain that we're dealing with. I totally buy that testing for each of the specific functions rather than trying to put all of that information in the type itself (e.g. ContainerWithAppendAndRemove) simply isn't going to scale, but I don't at all buy that using hasMethod to test for the existence of member functions is a good way to go. - Jonathan M Davis
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 02:32 PM, Jonathan M Davis wrote:
 On Wednesday, 21 October 2015 at 18:24:30 UTC, Robert burner Schadek wrote:
 On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei Alexandrescu wrote:
 Even simpler, hasMethod!(Container, "append") -- Andrei
I know this goes against your talk at DConf, but having to write string parameters does not feel good. I'm will properly not be the only one who will mistype "apend" and wonder why the other template function will be chosen. I rather have the compiler scream at me telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
The other concern is that hasMethod isn't going to be checking anything other than the name, whereas a hasAppend could actually check that the function accepts the right arguments and returns the correct type. And it's not like the list of the functions to check for is going to be infinite. It needs to be a known list of functions where each of those functions has a signature that meets certain requirements. You can't just be checking for a random function name and expect it to do what you want. So, having a list of explicit traits to use for testing for the expected set of functions will both allow for the tests to be more thorough than simply checking the name, _and_ it will serve as a way to help document what the list of expected functions is for a particular domain. I really think that using hasMethod by itself like that is getting too ad hoc and doesn't really benefit us over having an explicit list of traits for the specific domain that we're dealing with. I totally buy that testing for each of the specific functions rather than trying to put all of that information in the type itself (e.g. ContainerWithAppendAndRemove) simply isn't going to scale, but I don't at all buy that using hasMethod to test for the existence of member functions is a good way to go.
I'd say let's first have a Pope before becoming more Catholic than him. -- Andrei
Oct 21 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 19:19:23 UTC, Andrei 
Alexandrescu wrote:
 I'd say let's first have a Pope before becoming more Catholic 
 than him. -- Andrei
I confess that I really don't understand that one. - Jonathan M Davis
Oct 21 2015
next sibling parent MobPassenger <MobPassenger nowhere.fi> writes:
On Wednesday, 21 October 2015 at 19:38:37 UTC, Jonathan M Davis 
wrote:
 On Wednesday, 21 October 2015 at 19:19:23 UTC, Andrei 
 Alexandrescu wrote:
 I'd say let's first have a Pope before becoming more Catholic 
 than him. -- Andrei
I confess that I really don't understand that one. - Jonathan M Davis
It sounds like he say that you are a bit too orthodox when you argue about the way to test the traits that a Container have.
Oct 21 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 03:38 PM, Jonathan M Davis wrote:
 On Wednesday, 21 October 2015 at 19:19:23 UTC, Andrei Alexandrescu wrote:
 I'd say let's first have a Pope before becoming more Catholic than
 him. -- Andrei
I confess that I really don't understand that one.
"More Catholic than the Pope" = overdoing something all too pedantically. What I mean is: let's get things started and if there's any need for isXxx we'll add them. -- Andrei
Oct 21 2015
next sibling parent Robert burner Schadek <rburners gmail.com> writes:
On Wednesday, 21 October 2015 at 20:06:41 UTC, Andrei 
Alexandrescu wrote:
 "More Catholic than the Pope" = overdoing something all too 
 pedantically. What I mean is: let's get things started and if 
 there's any need for isXxx we'll add them. -- Andrei
+1
Oct 21 2015
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 20:06:41 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 03:38 PM, Jonathan M Davis wrote:
 On Wednesday, 21 October 2015 at 19:19:23 UTC, Andrei 
 Alexandrescu wrote:
 I'd say let's first have a Pope before becoming more Catholic 
 than
 him. -- Andrei
I confess that I really don't understand that one.
"More Catholic than the Pope" = overdoing something all too pedantically. What I mean is: let's get things started and if there's any need for isXxx we'll add them. -- Andrei
In principle, I agree with the sentiment, but I would have thought that our experience thus far already showed that being imprecise with code involving compile time introspection was problematic at best. This sort of thing in static ifs probably isn't as bad as it would likely be in template constraints, but it seems like way too often we end up with problems like user code compiling at first and then not compiling later due to a small change in Phobos or dmd, just because someone did something differently from what we expected, and the template constraints or static ifs didn't take it into account properly. - Jonathan M Davis
Oct 22 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 02:24 PM, Robert burner Schadek wrote:
 On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei Alexandrescu wrote:
 Even simpler, hasMethod!(Container, "append") -- Andrei
I know this goes against your talk at DConf, but having to write string parameters does not feel good. I'm will properly not be the only one who will mistype "apend" and wonder why the other template function will be chosen. I rather have the compiler scream at me telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
Yah, but defining isXxx for dozens of Xxx doesn't sit well either. -- Andrei
Oct 21 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 19:18:44 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 02:24 PM, Robert burner Schadek wrote:
 On Wednesday, 21 October 2015 at 17:23:15 UTC, Andrei 
 Alexandrescu wrote:
 Even simpler, hasMethod!(Container, "append") -- Andrei
I know this goes against your talk at DConf, but having to write string parameters does not feel good. I'm will properly not be the only one who will mistype "apend" and wonder why the other template function will be chosen. I rather have the compiler scream at me telling me there is no symbol hasApend. And hasAppend!Container is less typing than hasMethod!(Container, "append").
Yah, but defining isXxx for dozens of Xxx doesn't sit well either. -- Andrei
So, instead we make it ad hoc where only the name is tested rather than anything about what the function is or does, and we make it that much harder for folks to know what the vocabulary list of functions is, because there's no list of traits to test for them and just a table in some piece of documentation somewhere? I just don't see this scaling well if we're not more rigorous about it than hasMethod!(Container, "append"). If all of the containers are in Phobos and nothing outside of Phobos is actually testing for any of these methods, then it's a much smaller issue, but if any of this method testing is going to happen outside of Phobos, then I definitely think that it's a bad idea to be this ad hoc about it. - Jonathan M Davis
Oct 21 2015
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design 
 by Introspection containers. First off, there are several 
 general categories of containers. I think D should support all 
 properly. One question is which we go for in the standard 
 library.

 1. Functional containers.
I fully expect that these have their place, but I honestly have no idea what they would be. When I've used functional languages, I've only ever found these attributes to be a royal pain to deal with, not a benefit. From what I've seen, containers are the sort of thing that almost always always need to be mutable to be of any real benefit. Upon occasion, constructing a container up front and then never tweaking it after that is useful, but that's pretty rare from what I've seen. The only cases that I can think of where this could be really beneficial would be something like strings, and we're using arrays for those currently (though they are arguably functional containers given that they have immutable elements).
 2. Reference containers.
These are where it's at IMHO. 99.999999% this is what makes sense.
 3. Eager value containers.
_Maybe_ this makes sense for specific container types, but in general, it's a horrible idea IMHO. Almost all containers involve allocating something on the heap internally, so the fact that they're treated as values doesn't avoid heap allocations, and reference-counting reference containers solves the problem of the containers living past the end of the lifetime of whatever owns them. And having value type containers is incredibly error-prone - both from an efficiency standpoint and correctness standpoint. It's _way_ too easy to copy them when you don't mean to, and simple stuff like vector<T> foo() {...} for(auto iter = foo().begin(); iter != foo().end(); ++iter) {...} ends up doing nasty things, because you have iterators (or ranges) to a container that's already been destroyed. Honestly, unless you're counting something like a point or tuple as a container, I really don't see any value in these sort of containers. They're just too error-prone without adding any real value.
 4. Copy-on-write containers.
This could be interesting for some applications (probably more so than functional containers), but I would expect that it would only really be useful for very specific containers. I certainly wouldn't want to end up with a copy of my vector or list because I added a value to it. COW for strings makes a lot of sense, because they don't tend to get mutated much (which is also why string is immutable(char)[]). But since we're already using arrays for strings, a COW string would then be for special cases only. Overall, I think that we should focus on getting good reference containers while leaving the door open for cool stuff from the other container types. It's the reference containers that most programs are going to need, whereas I expect that the others are more likely to be nice-to-haves for a minority of applications and outright useless for the majority. - Jonathan M Davis
Oct 21 2015
next sibling parent reply Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 14:06:43 UTC, Jonathan M Davis 
wrote:
 1. Functional containers.
I fully expect that these have their place, but I honestly have no idea what they would be. When I've used functional languages, I've only ever found these attributes to be a royal pain to deal with, not a benefit. From what I've seen, containers are the sort of thing that almost always always need to be mutable to be of any real benefit. Upon occasion, constructing a container up front and then never tweaking it after that is useful, but that's pretty rare from what I've seen. The only cases that I can think of where this could be really beneficial would be something like strings, and we're using arrays for those currently (though they are arguably functional containers given that they have immutable elements).
I've found immutable containers useful when working with configuration files. There are only a few places in a program where you want to actively change values in a configuration files (configure the values). For these instances it's useful for the GUI or whatever to grab a mutable container to work with. For all other areas, immutable containers are very helpful in ensuring nobody accidentally modifies something they shouldn't be outside of the provided sandboxes.
Oct 21 2015
parent Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 15:18:08 UTC, Ice Cream Man 
wrote:
 On Wednesday, 21 October 2015 at 14:06:43 UTC, Jonathan M Davis 
 wrote:
 [...]
I've found immutable containers useful when working with configuration files. There are only a few places in a program where you want to actively change values in a configuration files (configure the values). For these instances it's useful for the GUI or whatever to grab a mutable container to work with. For all other areas, immutable containers are very helpful in ensuring nobody accidentally modifies something they shouldn't be outside of the provided sandboxes.
Actually, I think functional containers are probably more often than not the right tool for whatever it is you are doing. They just aren't convenient, because you might have to rethinking the architecture.
Oct 21 2015
prev sibling next sibling parent reply Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2015-10-21 at 14:06 +0000, Jonathan M Davis via Digitalmars-d wrote=
:
=20
[=E2=80=A6]
 1. Functional containers.
=20 I fully expect that these have their place, but I honestly have=20 no idea what they would be. When I've used functional languages,=20 I've only ever found these attributes to be a royal pain to deal=20 with, not a benefit. From what I've seen, containers are the sort=20 of thing that almost always always need to be mutable to be of=20 any real benefit. Upon occasion, constructing a container up=20 front and then never tweaking it after that is useful, but that's=20 pretty rare from what I've seen. =20 The only cases that I can think of where this could be really=20 beneficial would be something like strings, and we're using=20 arrays for those currently (though they are arguably functional=20 containers given that they have immutable elements).
I am confused as to why you would find these functional/persistent data structures to be a problem in functional languages. The whole point of them is they can be easily shared without any locks. In effect each amendment of the structure creates a fork (effectively a copy but without the copy bit) of it. This goes hand in hand with the functional, and especially tail recursive computational model. The Groovy team is currently debating adding persistent data structures to the Groovy armoury. Clojure and Scala have had persistent data structures for ages. The Clojure ones are really good. The Scala ones have some issues so there is currently discussion of a third rewrite.
 2. Reference containers.
=20 These are where it's at IMHO. 99.999999% this is what makes sense.
Except in a concurrent and parallel world! For big mutable data structures you end up creating agents (or so we can use hyper trendy nomenclature, pico-services) to avoid messing around with locks, mutexes, etc. the use of which generally kills performance.
 3. Eager value containers.
=20 =20
[=E2=80=A6] As noted in an earlier email, I am not yes convinced by this one.
 4. Copy-on-write containers.
=20 This could be interesting for some applications (probably more so=20 than functional containers), but I would expect that it would=20 only really be useful for very specific containers. I certainly=20 wouldn't want to end up with a copy of my vector or list because=20 I added a value to it. COW for strings makes a lot of sense,=20 because they don't tend to get mutated much (which is also why=20 string is immutable(char)[]). But since we're already using=20 arrays for strings, a COW string would then be for special cases=20 only.
I'd say the opposite. I would suggest that persistent/functional data structures would be more generally useful than COW ones. =C2=A0
 Overall, I think that we should focus on getting good reference=20
 containers while leaving the door open for cool stuff from the=20
 other container types. It's the reference containers that most=20
 programs are going to need, whereas I expect that the others are=20
 more likely to be nice-to-haves for a minority of applications=20
 and outright useless for the majority.
I am not convinced by this argument. Yes these are the type of containers C++/D/=E2=80=A6 programmers are used to and have a lot of code using, but a= re they really the ones that should be used. Experience from Clojure and Scala is that persistent/functional data structures lead to much nicer/easier concurrent and parallel code.=C2=A0 --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t:+44 20 7585 2200 voip:sip: russel.winder ekiga.net 41 Buckmaster Road m:+44 7770 465 077 xmpp:russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype:russel_winder
Oct 21 2015
next sibling parent Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 15:34:14 UTC, Russel Winder 
wrote:
 On Wed, 2015-10-21 at 14:06 +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 [...]
[…]
 [...]
I am confused as to why you would find these functional/persistent data structures to be a problem in functional languages. The whole point of them is they can be easily shared without any locks. In effect each amendment of the structure creates a fork (effectively a copy but without the copy bit) of it. This goes hand in hand with the functional, and especially tail recursive computational model. The Groovy team is currently debating adding persistent data structures to the Groovy armoury. Clojure and Scala have had persistent data structures for ages. The Clojure ones are really good. The Scala ones have some issues so there is currently discussion of a third rewrite.
 [...]
Except in a concurrent and parallel world! For big mutable data structures you end up creating agents (or so we can use hyper trendy nomenclature, pico-services) to avoid messing around with locks, mutexes, etc. the use of which generally kills performance.
 [...]
[…] As noted in an earlier email, I am not yes convinced by this one.
 [...]
I'd say the opposite. I would suggest that persistent/functional data structures would be more generally useful than COW ones.
 [...]
I am not convinced by this argument. Yes these are the type of containers C++/D/… programmers are used to and have a lot of code using, but are they really the ones that should be used. Experience from Clojure and Scala is that persistent/functional data structures lead to much nicer/easier concurrent and parallel code.
 [...]
These are where it's at IMHO. 99.999999% this is what makes sense.
Except in a concurrent and parallel world! For big mutable data structures you end up creating agents (or so we can use hyper trendy nomenclature, pico-services) to avoid messing around with locks, mutexes, etc. the use of which generally kills performance. shouldn't be all of the time. Which makes errors so much more prevalent. Every property that returns a List<T> is frequently exposed for the world to use. The best workaround I've found is to use interfaces that restrict what methods they have access to, but even then people resort to just casting it away. I'd prefer immutable by default. Personally.
Oct 21 2015
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 15:34:14 UTC, Russel Winder 
wrote:
 On Wed, 2015-10-21 at 14:06 +0000, Jonathan M Davis via 
 Digitalmars-d wrote:
 
[…]
 1. Functional containers.
I fully expect that these have their place, but I honestly have no idea what they would be. When I've used functional languages, I've only ever found these attributes to be a royal pain to deal with, not a benefit. From what I've seen, containers are the sort of thing that almost always always need to be mutable to be of any real benefit. Upon occasion, constructing a container up front and then never tweaking it after that is useful, but that's pretty rare from what I've seen. The only cases that I can think of where this could be really beneficial would be something like strings, and we're using arrays for those currently (though they are arguably functional containers given that they have immutable elements).
I am confused as to why you would find these functional/persistent data structures to be a problem in functional languages. The whole point of them is they can be easily shared without any locks. In effect each amendment of the structure creates a fork (effectively a copy but without the copy bit) of it. This goes hand in hand with the functional, and especially tail recursive computational model.
My experience with immutable containers is that their performance is trash precisely because you can't mutate them. They end up being copied just to add elements or to change elements. And even if their internals are smart and share data as much as possible (though that starts moving into COW pretty quickly), the efficiency is still worse than having a properly mutable container would be. I'm sure that there are use cases where they would be useful, but I've sure never had one. I've found that functional languages can be great from an algorithmic perspective, but for data structures? Not so much. And I'm by no means against having data structures like this in Phobos, but I think that having normal, reference containers is far more useful for the common case. Certainly, it's what most programs currently use and what most programmers expect. I'd much prefer that we get a solid set of reference containers working than spend a lot of time worrying about more exotic containers up front. IMHO, they can wait. We'll see what the future may hold, and I could certainly end up being surprised, but I fully expect that I'll never use any functional containers if they're put into Phobos. They're just too unwieldy and inefficient. - Jonathan M Davis
Oct 21 2015
next sibling parent reply Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 16:25:19 UTC, Jonathan M Davis 
wrote:
 They're just too unwieldy and inefficient.

 - Jonathan M Davis
Only if you don't know how to use them. They are very wieldy, efficient AND safe, if you do know how to swing the ax. Its all about learning to work with them.
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 12:34 PM, Ice Cream Man wrote:
 On Wednesday, 21 October 2015 at 16:25:19 UTC, Jonathan M Davis wrote:
 They're just too unwieldy and inefficient.

 - Jonathan M Davis
Only if you don't know how to use them. They are very wieldy, efficient AND safe, if you do know how to swing the ax. Its all about learning to work with them.
Yep, and part of learning is knowing when they shouldn't be used. -- Andrei
Oct 21 2015
parent Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 18:50:05 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 12:34 PM, Ice Cream Man wrote:
 On Wednesday, 21 October 2015 at 16:25:19 UTC, Jonathan M 
 Davis wrote:
 They're just too unwieldy and inefficient.

 - Jonathan M Davis
Only if you don't know how to use them. They are very wieldy, efficient AND safe, if you do know how to swing the ax. Its all about learning to work with them.
Yep, and part of learning is knowing when they shouldn't be used. -- Andrei
I definitely won't disagree with that.
Oct 21 2015
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 06:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their performance is
 trash precisely because you can't mutate them. They end up being copied
 just to add elements or to change elements.
I don't think this is what's being proposed here. Updates are fast.
 And even if their internals
 are smart and share data as much as possible (though that starts moving
 into COW pretty quickly),
COW copies the entire container.
 the efficiency is still worse than having a
 properly mutable container would be.
Not if you need access to older versions. If this is the case, then the "properly" mutable container carries a lot of runtime and memory overhead due to copying. Also, the difference is not very large for good implementations.
 I'm sure that there are use cases
 where they would be useful, but I've sure never had one. I've found that
 functional languages can be great from an algorithmic perspective, but
 for data structures? Not so much.
It's not tied to the language.
Oct 21 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Oct 21 2015
next sibling parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Wed, 2015-10-21 at 14:49 -0400, Andrei Alexandrescu via Digitalmars-
d wrote:
=20
[=E2=80=A6]
 That's actually the experience in the Scala community. Over and again
 people start with immutable containers all over the place because=20
 they're cool, and end up with mutable containers because they work. -
=20
For some applications, I am sure this true. For the majority no. I have no doubt that for some people fashion and trendy are decision making criteria when clearly they should be thinking about efficacy and efficiency. However it is important to remember that persistent/functional data structures are generally both efficacious and efficient in a fine-grain parallel context. Sadly a lot of this is advocacy research because there is no systematic, statistically significant research that is credible. Scala might have been a nice context to try some experimentation but they still do not have really good data structure implementations. Hence the recent notification of another rewrite. --=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel winder.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
Oct 22 2015
prev sibling parent reply Ulrich =?UTF-8?B?S8O8dHRsZXI=?= <kuettler gmail.com> writes:
On Wednesday, 21 October 2015 at 18:49:26 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their 
 performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Ranges and loops. Same story. Ranges are cool, loops get stuff done.
Oct 26 2015
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/26/2015 08:31 PM, Ulrich Küttler wrote:
 On Wednesday, 21 October 2015 at 18:49:26 UTC, Andrei Alexandrescu wrote:
 On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Ranges and loops. Same story. Ranges are cool, loops get stuff done.
This kind of reasoning sounds cool but is ultimately misguided. (I don't think the stories are even analogous.) Persistent containers have a /different interface/ than mutable containers and hence can be used in different algorithms, where they can help getting low asymptotic resource usage. Obviously there is not much point in using persistent containers just because one considers them "cool" if what one really wants is a mutable container. Ranges can be iterated, loops iterate them. If one wants to get stuff done, one uses library primitives when they apply and implements the remaining functionality oneself. When one notices that certain patterns begin to repeat, one will sometimes take a short time to factor them out in order to make progress faster. If they are patterns of iteration, this means writing an opApply or sometimes a range.
Oct 26 2015
parent Ulrich =?UTF-8?B?S8O8dHRsZXI=?= <kuettler gmail.com> writes:
On Monday, 26 October 2015 at 20:46:21 UTC, Timon Gehr wrote:
 On 10/26/2015 08:31 PM, Ulrich Küttler wrote:
 On Wednesday, 21 October 2015 at 18:49:26 UTC, Andrei 
 Alexandrescu wrote:
 On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their 
 performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Ranges and loops. Same story. Ranges are cool, loops get stuff done.
This kind of reasoning sounds cool but is ultimately misguided. (I don't think the stories are even analogous.)
Nobody argues against ranges. The argument is: Designing range-based code results in something very different from traditional loop-based code. There are very good reasons why the effort is worthwhile, still, the effort is real. (See the calendar example.) Even those library primitives do not come for free. The same is true about containers. The container variant is a design choice, not an implementation detail. Now, are persistent containers worth the effort? What would a design based on persistent containers look like? Is there a good way to implement them in D? Who knows.
Oct 26 2015
prev sibling next sibling parent rsw0x <anonymous anonymous.com> writes:
On Monday, 26 October 2015 at 19:31:26 UTC, Ulrich Küttler wrote:
 On Wednesday, 21 October 2015 at 18:49:26 UTC, Andrei 
 Alexandrescu wrote:
 On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their 
 performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Ranges and loops. Same story. Ranges are cool, loops get stuff done.
Heavily disagree. If ranges are only cool, java 8 streams wouldn't be so massively successful and ranges wouldn't be part of C++17. D's ranges are one of the main reasons I use D.
Oct 26 2015
prev sibling parent lobo <swamplobo gmail.com> writes:
On Monday, 26 October 2015 at 19:31:26 UTC, Ulrich Küttler wrote:
 On Wednesday, 21 October 2015 at 18:49:26 UTC, Andrei 
 Alexandrescu wrote:
 On 10/21/2015 12:25 PM, Jonathan M Davis wrote:
 My experience with immutable containers is that their 
 performance is
 trash precisely because you can't mutate them.
That's actually the experience in the Scala community. Over and again people start with immutable containers all over the place because they're cool, and end up with mutable containers because they work. -- Andrei
Ranges and loops. Same story. Ranges are cool, loops get stuff done.
You are conflating ranges and iteration. A range needs something to drive it, whether it be an explicit loop or some higher abstraction from std.algorithm. bye, lobo
Oct 26 2015
prev sibling parent Kagamin <spam here.lot> writes:
On Wednesday, 21 October 2015 at 16:25:19 UTC, Jonathan M Davis 
wrote:
 My experience with immutable containers is that their 
 performance is trash precisely because you can't mutate them.
Well, in context of concurrency immutable containers compete with concurrent containers, not with mutable single-threaded ones. On Wednesday, 21 October 2015 at 14:06:43 UTC, Jonathan M Davis wrote:
 3. Eager value containers.
_Maybe_ this makes sense for specific container types, but in general, it's a horrible idea IMHO. Almost all containers involve allocating something on the heap internally, so the fact that they're treated as values doesn't avoid heap allocations, and reference-counting reference containers solves the problem of the containers living past the end of the lifetime of whatever owns them. And having value type containers is incredibly error-prone - both from an efficiency standpoint and correctness standpoint.
I suppose stl containers are value types because everything is value type in C++, they're just consistent. And then you have 3 or so ways to create a value type and 6 ways to pass it around by reference, and you choose what you want.
Oct 22 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design by
 Introspection containers. First off, there are several general
 categories of containers. I think D should support all properly. One
 question is which we go for in the standard library.

 1. Functional containers.

 These are immutable; once created, neither their topology nor their
 elements may be observably changed. Manipulating a container entails
 creating an entire new container, often based on an existing container
 (e.g. append takes a container and an element and creates a whole new
 container).

 Internally, functional containers take advantage of common substructure
 and immutability to share actual data. The classic resource for defining
 and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.

 ...
I still think those should be mutable by default in order to have painless interchangeability with other value type containers. Why should corresponding ephemeral and persistent containers have different interfaces? I assume you envision code using those to look as follows? FunSet!int a; a=a.with_(1); auto b=a; a=a.with_(2); a=a.with_(3); // b = {1}, a={1,2,3}; I think this should be allowed too: FunSet!int a; a.insert(1); auto b=a; a.insert(2); a.insert(3); // b = {1}, a={1,2,3}; One of the two versions should be automatically implemented via UFCS.
 2. Reference containers.

 These have classic reference semantics (à la Java). Internally, they may be
implemented either as class objects or as reference counted structs.

 They're by default mutable. Qualifiers should apply to them gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are the worst of the
pack; they expensively duplicate at the drop of a hat and need to be carefully
passed around by reference lest performance silently drops. Nevertheless, when
used as members inside other data structures value semantics might be the
appropriate choice. Also, thinking of them as values often makes code simpler.

 By default eager value containers are mutable. They should support immutable
and const meaningfully.
 4. Copy-on-write containers.

 These combine the advantages of value and reference containers: you get
 to think of them as values, yet they're not expensive to copy. Copying
 only occurs by necessity upon the first attempt to change them.
 ...
IMO "1." ought to combine the advantages of value and reference containers as well, just without any expensive copying at all, even when updates happen.
 The disadvantage is implementations get somewhat complicated. Also, they
 are shunned in C++ because there is no proper support for COW; for
 example, COW strings have been banned starting with C++11 which is quite
 the bummer.

 Together with Scott Meyers, Walter figured out a way to change D to
 support COW properly. The language change consists of two attributes.

 =======

 I'll attempt to implement a few versions of each and see what they look
 like. The question here is what containers are of interest for D's
 standard library.
 ...
List: - forward iteration - bidirectional iteration Stack: - basic stack - ordered stack [0] Queue: - basic queue - heap Set: - hash set - ordered set - accumulating set [1] - trie/radix tree + multiset versions Map: - hash map - ordered map - accumulating map [1] - accumulating map with range update [2] - trie/radix tree - accumulating trie [1] - accumulating trie with range update [2] + multimap versions - array - accumulating array [1] - accumulating array with range update [2] + O(1) reset versions [3] - rope - accumulating rope [1] - accumulating rope with range update [2] [0] ordered stack: Push operations automatically pop the minimal number of elements off the stack prior to pushing, such as to guarantee that the elements on the stack remain ordered. The stack should expose a sorted range in order to support binary search. [1] accumulation: The (ordered!) data structure allows fast queries for the result of some binary associative operations on the elements in a certain range. (the allowed operations are determined in advance and some intermediate results are automatically maintained). (for map, the operation would be just on the values, not on the keys.) This is usually quite easy support, but very useful. [2] range update: here, the idea is that the data structure allows all elements in a certain range to be updated. the updates are performed lazily and have to be compatible with the associative operations (if any). [3] fast reset: here the idea is that the map allows fast reset of its values at the cost of some small additional overhead per lookup. (destructors are called lazily.)
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 10:21 AM, Timon Gehr wrote:

 I still think those should be mutable by default in order to have
 painless interchangeability with other value type containers. Why should
 corresponding ephemeral and persistent containers have different
 interfaces?

 I assume you envision code using those to look as follows?

 FunSet!int a;
 a=a.with_(1);
 auto b=a;
 a=a.with_(2);
 a=a.with_(3);
 // b = {1}, a={1,2,3};

 I think this should be allowed too:

 FunSet!int a;
 a.insert(1);
 auto b=a;
 a.insert(2);
 a.insert(3);
 // b = {1}, a={1,2,3};

 One of the two versions should be automatically implemented via UFCS.
I recall you and I discussed this briefly in the past, but forgot the conclusion. So are you saying that a.insert(1) is really rebinding a to be its former value plus a new slot? I.e a.insert(1) is the same as a = a.with_(1)? That would work, but only if former reads of a refer to the old data. E.g.: FunSet!int a; auto b = a; assert(a.empty && b.empty); a.insert(1); assert(!a.empty && b.empty); Right? Andrei
Oct 21 2015
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 04:36 PM, Andrei Alexandrescu wrote:
 On 10/21/2015 10:21 AM, Timon Gehr wrote:

 I still think those should be mutable by default in order to have
 painless interchangeability with other value type containers. Why should
 corresponding ephemeral and persistent containers have different
 interfaces?

 I assume you envision code using those to look as follows?

 FunSet!int a;
 a=a.with_(1);
 auto b=a;
 a=a.with_(2);
 a=a.with_(3);
 // b = {1}, a={1,2,3};

 I think this should be allowed too:

 FunSet!int a;
 a.insert(1);
 auto b=a;
 a.insert(2);
 a.insert(3);
 // b = {1}, a={1,2,3};

 One of the two versions should be automatically implemented via UFCS.
I recall you and I discussed this briefly in the past, but forgot the conclusion. ...
IIRC the conclusion was postponed. :-)
 So are you saying that a.insert(1) is really rebinding a to be its
 former value plus a new slot? I.e a.insert(1) is the same as a =
 a.with_(1)? That would work, but only if former reads of a refer to the
 old data. E.g.:

 FunSet!int a;
 auto b = a;
 assert(a.empty && b.empty);
 a.insert(1);
 assert(!a.empty && b.empty);

 Right?


 Andrei
Exactly. There would be no mutable aliasing. This way, the persistent data type can behave like a mutable value type, such as a COW or eager copy variant, but with nicer/different performance characteristics. It would be great if exchanging different implementation strategies for value semantics will be as seamless as possible. (If the initially chosen strategy turns out to be wrong, this makes the fix easy.) Of course, the implementation of the persistent data structure will be in terms of non-mutating and possibly O(1)-COW operations only.
Oct 21 2015
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 05:08 PM, Timon Gehr wrote:
 There would be no mutable aliasing.
Here, I mean within the data structure itself. There is nothing wrong with: class Cell{ int x=0; } FunSet!Cell a; a.insert(new Cell()); auto b=a; foreach(c;a) c.x=1; assert(b.x==1); This is analogous to: struct SingletonFunSet(T){ T element; } auto a=SingletonFunSet!Cell(new Cell()); auto b=a; a.element.x=1; assert(b.x==1); Here, SingletonFunSet!Cell is a value type, but it's constituents might not be.
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 11:13 AM, Timon Gehr wrote:
 On 10/21/2015 05:08 PM, Timon Gehr wrote:
 There would be no mutable aliasing.
Here, I mean within the data structure itself. There is nothing wrong with: class Cell{ int x=0; } FunSet!Cell a; a.insert(new Cell()); auto b=a; foreach(c;a) c.x=1; assert(b.x==1); This is analogous to: struct SingletonFunSet(T){ T element; } auto a=SingletonFunSet!Cell(new Cell()); auto b=a; a.element.x=1; assert(b.x==1); Here, SingletonFunSet!Cell is a value type, but it's constituents might not be.
It seems to me that's a departure from traditional persistent data structures. Those have immutable elements; far as I can tell you discuss containers that only have immutable topology. -- Andrei
Oct 21 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 07:25 PM, Andrei Alexandrescu wrote:
 It seems to me that's a departure from traditional persistent data
 structures.
I don't think so.
 Those have immutable elements;
Immutable insofar as the elements themselves don't change. It is easy to create a persistent list of immutable references to mutable data in Haskell, for instance.
 far as I can tell you discuss
 containers that only have immutable topology. -- Andrei
The topology as well as all the elements are immutable, just not in the 'transitive qualifier' way. Immutable references to mutable data are useful, they just don't have built-in language support. Persistent data structures work perfectly fine with those.
Oct 21 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 11:08 AM, Timon Gehr wrote:
 Exactly. There would be no mutable aliasing. This way, the persistent
 data type can behave like a mutable value type, such as a COW or eager
 copy variant, but with nicer/different performance characteristics. It
 would be great if exchanging different implementation strategies for
 value semantics will be as seamless as possible. (If the initially
 chosen strategy turns out to be wrong, this makes the fix easy.) Of
 course, the implementation of the persistent data structure will be in
 terms of non-mutating and possibly O(1)-COW operations only.
Makes sense. Now that we got to talk - did you submit the bugs you found with DIP25? -- Andrei
Oct 21 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 07:24 PM, Andrei Alexandrescu wrote:
 On 10/21/2015 11:08 AM, Timon Gehr wrote:
 Exactly. There would be no mutable aliasing. This way, the persistent
 data type can behave like a mutable value type, such as a COW or eager
 copy variant, but with nicer/different performance characteristics. It
 would be great if exchanging different implementation strategies for
 value semantics will be as seamless as possible. (If the initially
 chosen strategy turns out to be wrong, this makes the fix easy.) Of
 course, the implementation of the persistent data structure will be in
 terms of non-mutating and possibly O(1)-COW operations only.
Makes sense. Now that we got to talk - did you submit the bugs you found with DIP25? -- Andrei
Yup: https://issues.dlang.org/buglist.cgi?email1=timon&emailreporter1=1&emailtype1=substring&list_id=204001&query_format=advanced&resolution=---&short_desc=DIP25&short_desc_type=allwordssubstr
Oct 21 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
 2. Reference containers.

 These have classic reference semantics (à la Java). Internally, they may
 be implemented either as class objects or as reference counted structs.

 They're by default mutable. Qualifiers should apply to them gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are the worst
 of the pack; they expensively duplicate at the drop of a hat and need to
 be carefully passed around by reference lest performance silently drops.
 Nevertheless, when used as members inside other data structures value
 semantics might be the appropriate choice. Also, thinking of them as
 values often makes code simpler.

 By default eager value containers are mutable. They should support
 immutable and const meaningfully.
For which containers we want to support is "2." not a (wrapper around a) pointer to "3."?
Oct 21 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 11:58 AM, Timon Gehr wrote:
 For which containers we want to support is "2." not a (wrapper around a)
 pointer to "3."?
For those that need reference counting. -- Andrei
Oct 21 2015
prev sibling next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
5. Lock-free data structures.
Oct 21 2015
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 21-Oct-2015 19:13, Ola Fosheim Grøstad wrote:
 5. Lock-free data structures.
More generally - concurrent. It may have fine-grained locking, it may be obstruction-free or even wait-free. -- Dmitry Olshansky
Oct 21 2015
prev sibling next sibling parent reply Brad Anderson <eco gnuk.net> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
[snip]
 2. Reference containers.

 These have classic reference semantics (à la Java). Internally, 
 they may be implemented either as class objects or as reference 
 counted structs.

 They're by default mutable. Qualifiers should apply to them 
 gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are 
 the worst of the pack; they expensively duplicate at the drop 
 of a hat and need to be carefully passed around by reference 
 lest performance silently drops. Nevertheless, when used as 
 members inside other data structures value semantics might be 
 the appropriate choice. Also, thinking of them as values often 
 makes code simpler.

 By default eager value containers are mutable. They should 
 support immutable and const meaningfully.
Having both reference and value semantics for containers would be great. I don't understand why reference semantics would be implemented by the container themselves though. Why not a general purpose RC! (or RefCounted! if the current design is deemed sufficient) that can apply to anything, including containers? Then you'd only need to implement the value semantic containers (and maybe throw in some RC version aliases to promote the use of the RC versions so the option isn't overlooked). It seems kind of crazy that anything in D that wants to be reference counted would need to implement the logic themselves. If there are performance advantages (I haven't thought of any but perhaps there are) to bake the RC right into the container it might also be possible to use DbI take advantage of it in RC! when appropriate. It just seems so wrong to implement reference counting dozens of times independently, especially when that means implementing all the containers twice too. Perhaps I'm misreading what you meant though and this is what you intended all along.
 4. Copy-on-write containers.

 These combine the advantages of value and reference containers: 
 you get to think of them as values, yet they're not expensive 
 to copy. Copying only occurs by necessity upon the first 
 attempt to change them.

 The disadvantage is implementations get somewhat complicated. 
 Also, they are shunned in C++ because there is no proper 
 support for COW; for example, COW strings have been banned 
 starting with C++11 which is quite the bummer.

 Together with Scott Meyers, Walter figured out a way to change 
 D to support COW properly. The language change consists of two 
 attributes.
Nice to have this option. I don't think I'd use it much though.
Oct 21 2015
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 16:36:46 UTC, Brad Anderson 
wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
 Alexandrescu wrote:
 [snip]
 2. Reference containers.

 These have classic reference semantics (à la Java). 
 Internally, they may be implemented either as class objects or 
 as reference counted structs.

 They're by default mutable. Qualifiers should apply to them 
 gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are 
 the worst of the pack; they expensively duplicate at the drop 
 of a hat and need to be carefully passed around by reference 
 lest performance silently drops. Nevertheless, when used as 
 members inside other data structures value semantics might be 
 the appropriate choice. Also, thinking of them as values often 
 makes code simpler.

 By default eager value containers are mutable. They should 
 support immutable and const meaningfully.
Having both reference and value semantics for containers would be great. I don't understand why reference semantics would be implemented by the container themselves though. Why not a general purpose RC! (or RefCounted! if the current design is deemed sufficient) that can apply to anything, including containers? Then you'd only need to implement the value semantic containers (and maybe throw in some RC version aliases to promote the use of the RC versions so the option isn't overlooked). It seems kind of crazy that anything in D that wants to be reference counted would need to implement the logic themselves. If there are performance advantages (I haven't thought of any but perhaps there are) to bake the RC right into the container it might also be possible to use DbI take advantage of it in RC! when appropriate. It just seems so wrong to implement reference counting dozens of times independently, especially when that means implementing all the containers twice too.
If we had value type containers and reference type containers, I would assume that they would at least share implementation, and maybe the reference types would just be wrappers around the value types. However, I completely fail to understand why you'd ever want a container that was a value type. In my experience, it's very error-prone and adds no value. It just makes it too easy to accidentally copy a container, and it can be pretty easy to have an iterator, range, etc. referring to a container that's already been destroyed (similar to having a dynamic array referring to a static array that's left scope). As long as the containers have a dup method (or whatever we call it) so that they can be copied when you do want to copy them, I would think that that was more than enough. What do you get with a value type container that you consider better than a reference type? It's not like it lives on the stack as a value type. Most of the container's guts are on the heap regardless. - Jonathan M Davis
Oct 21 2015
next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis 
wrote:
 However, I completely fail to understand why you'd ever want a 
 container that was a value type. In my experience, it's very 
 error-prone and adds no value.
Are you saying there isn't a reason to use static arrays?
Oct 21 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 18:14:39 UTC, jmh530 wrote:
 On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis 
 wrote:
 However, I completely fail to understand why you'd ever want a 
 container that was a value type. In my experience, it's very 
 error-prone and adds no value.
Are you saying there isn't a reason to use static arrays?
A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do. If there's a container that lives entirely on the stack, then maybe it would make sense for it to be a value type, but _very_ few containers fall in that category, and all of the classic containers like vector, linked list, map, etc. have no business being value types IMHO. It's just error-prone. Heck, static arrays are quite error-prone thanks to the fact that they convert to dynamic arrays, but they do serve a purpose. So, maybe there are containers that fall in the same category, but I expect that such containers are pretty obviously value types and not reference types, because their nature makes them that way. Regardless, I don't see how it's reasonable in general to make a container be a value type. It's just asking for trouble. If there's any question at all whether a container should be a value type or a reference type, IMHO, it should be a reference type. - Jonathan M Davis
Oct 21 2015
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Oct 21, 2015 at 06:38:08PM +0000, Jonathan M Davis via Digitalmars-d
wrote:
 On Wednesday, 21 October 2015 at 18:14:39 UTC, jmh530 wrote:
[...]
Are you saying there isn't a reason to use static arrays?
A static array is of a fixed size, which almost no other containers are. It also lives entirely on the stack, which almost no other containers do.
[...] You forget that static arrays can also be struct/class members, and in the latter case they are not necessarily on the stack. You do have a point that they are of fixed size, though, and as such are of limited use as containers. T -- "I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
Oct 21 2015
prev sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 21 October 2015 at 18:38:10 UTC, Jonathan M Davis 
wrote:
 A static array is of a fixed size, which almost no other 
 containers are. It also lives entirely on the stack, which 
 almost no other containers do. If there's a container that 
 lives entirely on the stack, then maybe it would make sense for 
 it to be a value type, but _very_ few containers fall in that 
 category, and all of the classic containers like vector, linked 
 list, map, etc. have no business being value types IMHO. It's 
 just error-prone. Heck, static arrays are quite error-prone 
 thanks to the fact that they convert to dynamic arrays, but 
 they do serve a purpose. So, maybe there are containers that 
 fall in the same category, but I expect that such containers 
 are pretty obviously value types and not reference types, 
 because their nature makes them that way. Regardless, I don't 
 see how it's reasonable in general to make a container be a 
 value type. It's just asking for trouble. If there's any 
 question at all whether a container should be a value type or a 
 reference type, IMHO, it should be a reference type.
I do find myself mixing up static and dynamic arrays... I'll generally defer to you. I suppose within the context of Andrei working on containers with std.allocator, then containers that only live on the stack seems beyond the scope anyway. Your statement just had struck me as weird as I had thought of people like manu saying at various points that they only kept data structures on the stack. My next thought was that static arrays are usually on the stack and are value types. So I figured that whatever the performance people are using were similar.
Oct 21 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 20:13:31 UTC, jmh530 wrote:
 On Wednesday, 21 October 2015 at 18:38:10 UTC, Jonathan M Davis 
 wrote:
 A static array is of a fixed size, which almost no other 
 containers are. It also lives entirely on the stack, which 
 almost no other containers do. If there's a container that 
 lives entirely on the stack, then maybe it would make sense 
 for it to be a value type, but _very_ few containers fall in 
 that category, and all of the classic containers like vector, 
 linked list, map, etc. have no business being value types 
 IMHO. It's just error-prone. Heck, static arrays are quite 
 error-prone thanks to the fact that they convert to dynamic 
 arrays, but they do serve a purpose. So, maybe there are 
 containers that fall in the same category, but I expect that 
 such containers are pretty obviously value types and not 
 reference types, because their nature makes them that way. 
 Regardless, I don't see how it's reasonable in general to make 
 a container be a value type. It's just asking for trouble. If 
 there's any question at all whether a container should be a 
 value type or a reference type, IMHO, it should be a reference 
 type.
I do find myself mixing up static and dynamic arrays... I'll generally defer to you. I suppose within the context of Andrei working on containers with std.allocator, then containers that only live on the stack seems beyond the scope anyway. Your statement just had struck me as weird as I had thought of people like manu saying at various points that they only kept data structures on the stack. My next thought was that static arrays are usually on the stack and are value types. So I figured that whatever the performance people are using were similar.
Static arrays are a bit of a special case, because they're essentially a chunk of memory on the stack. They can be extremely useful and efficient, but you have to be careful with them. Most containers don't act anything like them. But like I said, there are some container types which inherently make sense on a stack, but most don't. For the most part, the kinds of containers that we're talking about here - vectors/array lists, linked lists, maps, sets, etc. - have to have most of their guts living on the heap. You'd have to _really_ contort things to put them entirely on the stack, and you'd be seriously limiting how many elements that they could hold if you did. If you make a vector or a linked list a value type, then it'll have a few members on the stack, but most of them will be on the heap, and it'll be a value type only because its postblit constructor explicitly copies the portion that's on the heap. It's not really naturally a value type. It's naturally more of a reference type or pseudo-reference type. If the vector is implemented as a reference type, then the few member variables that were on the stack in the value type version end up on the heap with the rest of the container's guts, so you use slightly more memory for them, but it's much more natural that way, since the main guts of the container were on the heap anyway. And if the container is a value type, you risk copying unnecessarily and/or keeping around references/pointers/iterators/ranges to its elements after it's been destroyed, whereas with a reference type, it'll only be copied when you specifically request it. Treating them as value types is inefficient, error-prone, and unnatural. Now, folks like Manu do do some crazy stuff eke out every bit of performance that they can, and they do sometimes use stuff like alloca to put stuff that would otherwise be on the heap on the stack. But a lot of what they do isn't very general purpose either, and they generally avoid stuff like the containers in standard libraries, because they use the heap too much and/or aren't tailored to their needs. They also tend to do stuff that's way more error-prone in an effort to eke out that extra bit of performance. If we can do stuff to support them, then great, but in general - especially when you're talking about the library and not the language - the kinds of stuff that they want don't work for other folks. If someone like Manu wants an efficient container, he _might_ want one that's not a full-on reference type simply so that less of it is on the heap, but he sure isn't going to want to copy it except when he absolutely has to. So, the fact that it's a value type isn't necessarily useful. Having it simply live where it's constructed (be it on the stack or as a member variable) with only the stuff that has to be on the heap on the heap but have the container have a disabled postblit would probably be just as useful in most cases, and scoped or region allactors should be able to help with that. I'm not super familiar with all of the tricks that they pull to make their container use efficient, but I have a hard time seeing how std::vector's semantics are desirable to them. Certainly, when they care about efficiency to the point that they're using static arrays everywhere, the fact that vector has its guts on the heap means that it's in a different class of containers entirely from static arrays and does not have the same desirable properties, even if it is a value type like static arrays. And even then, the fact that static arrays are value types isn't really the gain. It's more that they're on the stack, and that naturally turns them into value types. In any case, I'm pretty sure that we do have a region allocator in std.experimental.allocator, so if someone wants to try and put a container that normally lives on the heap on the stack, it should be possible (at least, with a limited number of elements). And it might actually be easier to do that with a container that's entirely on the heap instead of part of it living on the stack and part of it living on the heap (as occurs with the C++ containers). If Manu (or others like him) have strong feelings about what kinds of containers would be best for them, they can speak up and point out what they need, but I don't see how having value type containers would generally be useful to them except in the cases where the container actually, fully lives on the stack (which is almost never the case), and it seems even more questionable for everyone else. - Jonathan M Davis
Oct 21 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 11:02 PM, Jonathan M Davis wrote:
 Static arrays are a bit of a special case, because they're essentially a
 chunk of memory on the stack. They can be extremely useful and
 efficient, but you have to be careful with them. Most containers don't
 act anything like them.

 But like I said, there are some container types which inherently make
 sense on a stack, but most don't. For the most part, the kinds of
 containers that we're talking about here - vectors/array lists, linked
 lists, maps, sets, etc. - have to have most of their guts living on the
 heap. You'd have to _really_ contort things to put them entirely on the
 stack,
In many applications, most containers are small. Small containers can have all their guts on the stack.
 and you'd be seriously limiting how many elements that they could
 hold if you did.
Not really. Just allocate on the heap only if there are too many elements.
 If you make a vector or a linked list a value type,
 then it'll have a few members on the stack, but most of them will be on
 the heap, and it'll be a value type only because its postblit
 constructor explicitly copies the portion that's on the heap. It's not
 really naturally a value type.
Value semantics does not mean slow copy, I assume you conflate value semantics and eager copy?
 It's naturally more of a reference type
 or pseudo-reference type.
What you describe is "naturally" (which I interpret to mean: if postblit must be O(1)) a unique reference type. I.e. the most natural way out would be to add disable this(this). But implicit copy can be convenient. Maybe have implicit copy as a wrapper?
Oct 21 2015
prev sibling next sibling parent reply Brad Anderson <eco gnuk.net> writes:
On Wednesday, 21 October 2015 at 18:05:07 UTC, Jonathan M Davis 
wrote:
 On Wednesday, 21 October 2015 at 16:36:46 UTC, Brad Anderson 
 wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
 Alexandrescu wrote:
 [snip]
 2. Reference containers.

 These have classic reference semantics (à la Java). 
 Internally, they may be implemented either as class objects 
 or as reference counted structs.

 They're by default mutable. Qualifiers should apply to them 
 gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are 
 the worst of the pack; they expensively duplicate at the drop 
 of a hat and need to be carefully passed around by reference 
 lest performance silently drops. Nevertheless, when used as 
 members inside other data structures value semantics might be 
 the appropriate choice. Also, thinking of them as values 
 often makes code simpler.

 By default eager value containers are mutable. They should 
 support immutable and const meaningfully.
Having both reference and value semantics for containers would be great. I don't understand why reference semantics would be implemented by the container themselves though. Why not a general purpose RC! (or RefCounted! if the current design is deemed sufficient) that can apply to anything, including containers? Then you'd only need to implement the value semantic containers (and maybe throw in some RC version aliases to promote the use of the RC versions so the option isn't overlooked). It seems kind of crazy that anything in D that wants to be reference counted would need to implement the logic themselves. If there are performance advantages (I haven't thought of any but perhaps there are) to bake the RC right into the container it might also be possible to use DbI take advantage of it in RC! when appropriate. It just seems so wrong to implement reference counting dozens of times independently, especially when that means implementing all the containers twice too.
If we had value type containers and reference type containers, I would assume that they would at least share implementation, and maybe the reference types would just be wrappers around the value types. However, I completely fail to understand why you'd ever want a container that was a value type.
Well they are more efficient when used correctly (no inc/decs). Using them correctly does take some additional knowledge and consideration though. C++11 did take care of a whole class of implicit copying and RVO has long taken care of another class of them. I don't think it happens (in C++) quite as often as you are making it out to be (at least in my experience) but I'll admit I have had a bug or two caused by modifying a copy when I thought I was changing a reference. I could just as easily have bugs caused by modifying a reference when I thought I was modifying a copy though. Either approach comes with its own considerations.
 In my experience, it's very error-prone and adds no value.
Unless we end up with compiler assisted inc/dec elision they are always going to be faster. Perhaps marginally faster but still faster and the realtime guys are always going to demand the fastest option.
 It just makes it too easy to accidentally copy a container, and 
 it can be pretty easy to have an iterator, range, etc. 
 referring to a container that's already been destroyed (similar 
 to having a dynamic array referring to a static array that's 
 left scope).
Having the ranges of a container keep the reference count would make the speed advantage of the value types even more of a factor. That might actually change the performance advantage away from being just subtle to being strong. We'd need to benchmark to say for sure, of course. Just a sidenote regarding iterator invalidation: C++, when following the new C++ Core Guidelines, is actually able to do static analysis of iterators for memory safety without ref count overhead. Herb Sutter showed a preview version of MSVC doing it during his keynote this month at CppCon[1]. It blew me away that they were about to do that with C++.
 As long as the containers have a dup method (or whatever we 
 call it) so that they can be copied when you do want to copy 
 them, I would think that that was more than enough. What do you 
 get with a value type container that you consider better than a 
 reference type? It's not like it lives on the stack as a value 
 type. Most of the container's guts are on the heap regardless.

 - Jonathan M Davis
All that said, I agree with you that the implicit copying isn't really all that desirable. Perhaps there is a better way. What if instead of value semantics we had unique containers (akin to std.typecons.Unique or unique_ptr in C++) and required people to always state whether it was a copy or move? That takes care of the error prone implicit copying while retaining the performance characteristics. 1. https://www.youtube.com/watch?v=hEx5DNLWGgA
Oct 21 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 18:48:18 UTC, Brad Anderson 
wrote:
 All that said, I agree with you that the implicit copying isn't 
 really all that desirable. Perhaps there is a better way. What 
 if instead of value semantics we had unique containers (akin to 
 std.typecons.Unique or unique_ptr in C++) and required people 
 to always state whether it was a copy or move? That takes care 
 of the error prone implicit copying while retaining the 
 performance characteristics.
Well, we could always have a wrapper for reference type containers that turns them into a value type, though I'm not sure that that really does much in terms of the efficiency you were talking about. But it could means that you'd end up passing around a pointer to that rather than the container itself, in which case, maybe that solves the problem for those who don't want a container to be managed by the GC and don't want it to be ref-counted when passing it around? I don't know. Even if we're talking about a reference type container that's just straight up ref-counted or managed by the GC, there are still ways to pass around pointers or pseudo pointers to it which don't involve ref-counting if that's the real concern. Though honestly, I'd be concerned if a container was being passed around so much that ref-counting was a concern. A range would be better in most cases - but you really wouldn't want to be passing around the container as a value type in either case, because that would just copy it. So, if ref-counting is the concern, I wouldn't think that it would be all that hard for someone who cares to pass around something else which referred to the container and wasn't ref-counted. - Jonathan M Davis
Oct 21 2015
parent Ice Cream Man <thegreatone google.com> writes:
On Wednesday, 21 October 2015 at 19:07:50 UTC, Jonathan M Davis 
wrote:
 On Wednesday, 21 October 2015 at 18:48:18 UTC, Brad Anderson 
 wrote:
 All that said, I agree with you that the implicit copying 
 isn't really all that desirable. Perhaps there is a better 
 way. What if instead of value semantics we had unique 
 containers (akin to std.typecons.Unique or unique_ptr in C++) 
 and required people to always state whether it was a copy or 
 move? That takes care of the error prone implicit copying 
 while retaining the performance characteristics.
Well, we could always have a wrapper for reference type containers that turns them into a value type,
ugh...that sounds horrible.
Oct 21 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 02:05 PM, Jonathan M Davis wrote:
 If we had value type containers and reference type containers, I would
 assume that they would at least share implementation, and maybe the
 reference types would just be wrappers around the value types.
Well, I thought the same. I was surprised.
 However,
 I completely fail to understand why you'd ever want a container that was
 a value type.
Sometimes you want a value container as a member of a struct which in turn is a value type. Andrei
Oct 21 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, 21 October 2015 at 19:16:28 UTC, Andrei 
Alexandrescu wrote:
 Sometimes you want a value container as a member of a struct 
 which in turn is a value type.
Sure, but at least in theory, the struct's postblit should be able to copy a reference type container via dup or whatnot if that's what you really want (though the issues with const and immutable interacting badly with postblit remain), and creating a whole set of value type variants of containers just for cases like that (which are sometimes useful but usually a bad idea) seems like overkill. - Jonathan M Davis
Oct 21 2015
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 12:36 PM, Brad Anderson wrote:
 I don't understand why reference semantics would be implemented by the
 container themselves though. Why not a general purpose RC! (or
 RefCounted! if the current design is deemed sufficient) that can apply
 to anything, including containers?
Two reasons: 1. A generic RC wrapper cannot be made safe (or at least I don't know how). 2. A container designed for RC can make advantageous layout decisions. Andrei
Oct 21 2015
prev sibling next sibling parent reply Ulrich Kuettler <kuettler gmail.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design 
 by Introspection containers. First off, there are several 
 general categories of containers. I think D should support all 
 properly. One question is which we go for in the standard 
 library.

 1. Functional containers.
Very much in favor of functional (or persistent) containers. Immutable. Value semantics. This might take some getting used to, but if properly designed these containers would be killer for D. A dream come true.
 These are immutable; once created, neither their topology nor 
 their elements may be observably changed. Manipulating a 
 container entails creating an entire new container, often based 
 on an existing container (e.g. append takes a container and an 
 element and creates a whole new container).

 Internally, functional containers take advantage of common 
 substructure and immutability to share actual data. The classic 
 resource for defining and implementing functional containers is 
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
An insanely beautiful example is clojure's PersistantVector. AFAIK it made its way into Scala, too.
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 01:42 PM, Ulrich Kuettler wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design by
 Introspection containers. First off, there are several general
 categories of containers. I think D should support all properly. One
 question is which we go for in the standard library.

 1. Functional containers.
Very much in favor of functional (or persistent) containers. Immutable. Value semantics. This might take some getting used to, but if properly designed these containers would be killer for D. A dream come true.
I agree they're cool.
 These are immutable; once created, neither their topology nor their
 elements may be observably changed. Manipulating a container entails
 creating an entire new container, often based on an existing container
 (e.g. append takes a container and an element and creates a whole new
 container).

 Internally, functional containers take advantage of common
 substructure and immutability to share actual data. The classic
 resource for defining and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
An insanely beautiful example is clojure's PersistantVector. AFAIK it made its way into Scala, too.
Far as I understand from http://hypirion.com/musings/understanding-persistent-vector-pt-1, it's that tree thing with carefully chosen slot sizes for cache friendliness? Andrei
Oct 21 2015
parent Ulrich =?UTF-8?B?S8O8dHRsZXI=?= <kuettler gmail.com> writes:
On Wednesday, 21 October 2015 at 18:55:10 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 01:42 PM, Ulrich Kuettler wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
 Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about 
 Design by
 Introspection containers. First off, there are several general
 categories of containers. I think D should support all 
 properly. One
 question is which we go for in the standard library.

 1. Functional containers.
Very much in favor of functional (or persistent) containers. Immutable. Value semantics. This might take some getting used to, but if properly designed these containers would be killer for D. A dream come true.
I agree they're cool.
 These are immutable; once created, neither their topology nor 
 their
 elements may be observably changed. Manipulating a container 
 entails
 creating an entire new container, often based on an existing 
 container
 (e.g. append takes a container and an element and creates a 
 whole new
 container).

 Internally, functional containers take advantage of common
 substructure and immutability to share actual data. The 
 classic
 resource for defining and implementing functional containers 
 is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
An insanely beautiful example is clojure's PersistantVector. AFAIK it made its way into Scala, too.
Far as I understand from http://hypirion.com/musings/understanding-persistent-vector-pt-1, it's that tree thing with carefully chosen slot sizes for cache friendliness?
Yes. It is about cache friendliness, memory efficiency and speed. Since each node consists of 32 pointers, there are just three inner levels to store 1M values. The bit operations to navigate to the leaf node are a thing of beauty: https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/PersistentVector.java#L147-L163 It is a remarkable design on the JVM. Of course, it serves a very particular purpose. D's design space is very different. Obviously.
Oct 21 2015
prev sibling next sibling parent reply Zz <zz nospam.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 I'm finally getting the cycles to get to thinking about Design 
 by Introspection containers. First off, there are several 
 general categories of containers. I think D should support all 
 properly. One question is which we go for in the standard 
 library.

 1. Functional containers.

 These are immutable; once created, neither their topology nor 
 their elements may be observably changed. Manipulating a 
 container entails creating an entire new container, often based 
 on an existing container (e.g. append takes a container and an 
 element and creates a whole new container).

 Internally, functional containers take advantage of common 
 substructure and immutability to share actual data. The classic 
 resource for defining and implementing functional containers is 
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.

 2. Reference containers.

 These have classic reference semantics (à la Java). Internally, 
 they may be implemented either as class objects or as reference 
 counted structs.

 They're by default mutable. Qualifiers should apply to them 
 gracefully.

 3. Eager value containers.

 These are STL-style. Somewhat surprisingly I think these are 
 the worst of the pack; they expensively duplicate at the drop 
 of a hat and need to be carefully passed around by reference 
 lest performance silently drops. Nevertheless, when used as 
 members inside other data structures value semantics might be 
 the appropriate choice. Also, thinking of them as values often 
 makes code simpler.

 By default eager value containers are mutable. They should 
 support immutable and const meaningfully.

 4. Copy-on-write containers.

 These combine the advantages of value and reference containers: 
 you get to think of them as values, yet they're not expensive 
 to copy. Copying only occurs by necessity upon the first 
 attempt to change them.

 The disadvantage is implementations get somewhat complicated. 
 Also, they are shunned in C++ because there is no proper 
 support for COW; for example, COW strings have been banned 
 starting with C++11 which is quite the bummer.

 Together with Scott Meyers, Walter figured out a way to change 
 D to support COW properly. The language change consists of two 
 attributes.

 =======

 I'll attempt to implement a few versions of each and see what 
 they look like. The question here is what containers are of 
 interest for D's standard library.


 Andrei
While looking at containers take a look at Jiri Soukup's work some good ideas could come from there. http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&amp;qid=1386946808&amp;sr=8-1&amp;keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank Intrusive Data Structures. http://www.codefarms.com/home http://www.codefarms.com/dol http://www.codefarms.com/ppf http://www.codefarms.com/ptl Zz
Oct 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 02:18 PM, Zz wrote:
 While looking at containers take a look at Jiri Soukup's work some good
 ideas could come from there.

 http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&amp;qid=1386946808&amp;sr=8-1&amp;keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank


 Intrusive Data Structures.
 http://www.codefarms.com/home
 http://www.codefarms.com/dol
 http://www.codefarms.com/ppf
 http://www.codefarms.com/ptl
Ah, Codefarms is still around. Nice. I remember I've been reading Soukup's articles back in the days of the C++ Report magazine. I couldn't make sense of them, nor could any of my friends. -- Andrei
Oct 21 2015
parent reply Zz <zz nospam.com> writes:
On Wednesday, 21 October 2015 at 19:17:52 UTC, Andrei 
Alexandrescu wrote:
 On 10/21/2015 02:18 PM, Zz wrote:
 While looking at containers take a look at Jiri Soukup's work 
 some good
 ideas could come from there.

 http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&amp;qid=1386946808&amp;sr=8-1&amp;keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank


 Intrusive Data Structures.
 http://www.codefarms.com/home
 http://www.codefarms.com/dol
 http://www.codefarms.com/ppf
 http://www.codefarms.com/ptl
Ah, Codefarms is still around. Nice. I remember I've been reading Soukup's articles back in the days of the C++ Report magazine. I couldn't make sense of them, nor could any of my friends. -- Andrei
The following links give an out line of DOL (The main library), most of the articles talked about the newer PTL (at that time). For DOL http://www.codefarms.com/dolclasses http://www.codefarms.com/docs/dol/index.htm for the list of available organisations. As for PTL the following gives an outline. http://www.codefarms.com/docs/ptl/chap4.htm DOL was Macro based while PTL used templates. Zz
Oct 21 2015
parent reply Zz <zz nospam.com> writes:
On Wednesday, 21 October 2015 at 22:12:32 UTC, Zz wrote:
 On Wednesday, 21 October 2015 at 19:17:52 UTC, Andrei 
 Alexandrescu wrote:
 On 10/21/2015 02:18 PM, Zz wrote:
 While looking at containers take a look at Jiri Soukup's work 
 some good
 ideas could come from there.

 http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&amp;qid=1386946808&amp;sr=8-1&amp;keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank


 Intrusive Data Structures.
 http://www.codefarms.com/home
 http://www.codefarms.com/dol
 http://www.codefarms.com/ppf
 http://www.codefarms.com/ptl
Ah, Codefarms is still around. Nice. I remember I've been reading Soukup's articles back in the days of the C++ Report magazine. I couldn't make sense of them, nor could any of my friends. -- Andrei
The following links give an out line of DOL (The main library), most of the articles talked about the newer PTL (at that time). For DOL http://www.codefarms.com/dolclasses http://www.codefarms.com/docs/dol/index.htm for the list of available organisations. As for PTL the following gives an outline. http://www.codefarms.com/docs/ptl/chap4.htm DOL was Macro based while PTL used templates. Zz
Sorry the list of available organisations is here: http://www.codefarms.com/docs/dol/ch11aorg.htm Zz
Oct 21 2015
parent reply KlausO <oberhofer users.sf.net> writes:
Am 22.10.2015 um 00:15 schrieb Zz:
 On Wednesday, 21 October 2015 at 22:12:32 UTC, Zz wrote:
 On Wednesday, 21 October 2015 at 19:17:52 UTC, Andrei Alexandrescu wrote:
 On 10/21/2015 02:18 PM, Zz wrote:
 While looking at containers take a look at Jiri Soukup's work some good
 ideas could come from there.

 http://www.amazon.ca/Serialization-Persistent-Objects-Structures-Efficient/dp/3642393225/ref=sr_1_1?ie=UTF8&amp;qid=1386946808&amp;sr=8-1&amp;keywords=SERIALIZATION+AND+PERSISTENT+OBJECTS#productPromotions%22%20target=%22_blank



 Intrusive Data Structures.
 http://www.codefarms.com/home
 http://www.codefarms.com/dol
 http://www.codefarms.com/ppf
 http://www.codefarms.com/ptl
Ah, Codefarms is still around. Nice. I remember I've been reading Soukup's articles back in the days of the C++ Report magazine. I couldn't make sense of them, nor could any of my friends. -- Andrei
The following links give an out line of DOL (The main library), most of the articles talked about the newer PTL (at that time). For DOL http://www.codefarms.com/dolclasses http://www.codefarms.com/docs/dol/index.htm for the list of available organisations. As for PTL the following gives an outline. http://www.codefarms.com/docs/ptl/chap4.htm DOL was Macro based while PTL used templates. Zz
Sorry the list of available organisations is here: http://www.codefarms.com/docs/dol/ch11aorg.htm Zz
Intrusive data structures have their strengths especially when nodes are part of several containers. I implemented some of the intrusive containers back in D1 times. See http://dsource.org/projects/nova/browser/trunk/nova/ds/intrusive KlausO
Oct 22 2015
parent reply safety0ff <safety0ff.dev gmail.com> writes:
On Thursday, 22 October 2015 at 07:13:58 UTC, KlausO wrote:
 Intrusive data structures have their strengths especially when 
 nodes are
 part of several containers.
 I implemented some of the intrusive containers back in D1 times.
 See

 http://dsource.org/projects/nova/browser/trunk/nova/ds/intrusive

 KlausO
I also like having an intrusive container library in my toolbox: they don't limit membership to one container and they don't "bake" memory management into the container type. http://www.boost.org/doc/libs/1_59_0/doc/html/intrusive.html
Oct 22 2015
parent safety0ff <safety0ff.dev gmail.com> writes:
On Thursday, 22 October 2015 at 14:14:09 UTC, safety0ff wrote:
 I also like having an intrusive container library in my 
 toolbox: they don't limit membership to one container and they 
 don't "bake" memory management into the container type.
Also wanted to mention that this allows you to store variable sized data directly in the container without being forced to use a fixed size structure with a pointer to the variable portion. Which is useful for improving cache locality and reducing memory usage.
Oct 22 2015
prev sibling next sibling parent ponce <contact gam3sfrommars.fr> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 The question here is what containers are of interest for D's 
 standard library.
I think the ones in the STL work well. So I'd like something along these lines. Something like std::vector would fit 90% of my needs (eg: a slice using allocators). No interest in lock-free or immutable containers from here.
Oct 21 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
 I'll attempt to implement a few versions of each and see what they look
 like. The question here is what containers are of interest for D's
 standard library.
There should probably also be wrappers that implement optimizations for small containers.
Oct 21 2015
parent reply Brad Anderson <eco gnuk.net> writes:
On Wednesday, 21 October 2015 at 20:19:27 UTC, Timon Gehr wrote:
 On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
 I'll attempt to implement a few versions of each and see what 
 they look
 like. The question here is what containers are of interest for 
 D's
 standard library.
There should probably also be wrappers that implement optimizations for small containers.
I think the allocators can probably deal with that detail. If I remember correctly there is an in-place allocator. I've been using Boost.Container's small_vector lately (basically a generalization of the now ubiquitous short string optimization...though I'm not sure if it repurposes the pointer when the item count is sufficiently small). It's a nice option to have. The Container library now also has static_vector (fixed capacity, changeable size) that fills a use gap between std::array and std::vector;
Oct 21 2015
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= writes:
On Wednesday, 21 October 2015 at 21:53:11 UTC, Brad Anderson 
wrote:
 to have. The Container library now also has static_vector 
 (fixed capacity, changeable size) that fills a use gap between 
 std::array and std::vector;
Yes, in C++ I reimplement this over and over... Trivial, but boring.
Oct 21 2015
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/21/2015 11:53 PM, Brad Anderson wrote:
 On Wednesday, 21 October 2015 at 20:19:27 UTC, Timon Gehr wrote:
 On 10/21/2015 01:05 PM, Andrei Alexandrescu wrote:
 I'll attempt to implement a few versions of each and see what they look
 like. The question here is what containers are of interest for D's
 standard library.
There should probably also be wrappers that implement optimizations for small containers.
I think the allocators can probably deal with that detail. If I remember correctly there is an in-place allocator. ...
Good point, but might it not be the case that alternative implementations of some operations are faster for a small number of elements?
Oct 21 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/21/2015 06:04 PM, Timon Gehr wrote:
 Good point, but might it not be the case that alternative
 implementations of some operations are faster for a small number of
 elements?
A container could take advantage of the knowledge for both layout and primitives implementation. As I like to say in one of my seminar: "Few elements mean structure". -- Andrei
Oct 21 2015
prev sibling next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
I needed containers, so I implemented a few:
https://github.com/bitwise-github/d-collections

Currently included:

List(T) - contiguous list like std::vector
LinkedList(T) - doubly linked list like std::list
Queue(T) - double-ended queue/contiguous circular buffer
Stack(T) - contiguous fifo stack

I would like to eventually add some kind(s) of hash table(s), but 
it's a huge topic, and I don't immediately need one.

I wrote all of the containers from scratch, so they are an 
accurate/current representation of what I would like to see.

The highlights:

1) both value and reference usage is possible.

example: <list.d>
struct ListBase { ... }   // value type list
alias RefCounted!ListBase List;   // ref counted list

List!int  list1;             // list1 is ref counted
ListBase!int  list2;    // list2 is a value type

I initially had this reversed, where List(T) was a value type, 
and ListRef(T) was a reference type, but I flipped it the other 
way to avoid unneeded copying for naive usage. I'm still not sure 
this is the right direction though. Optimization is a job of it's 
own, and it's a job for a pro. I believe the amount of effort 
spent to idiot-proof(for lack of a better term) the containers 
should be limited. Especially with the power of D's ranges, I 
think containers should generally(and in my code, usually do) 
stay put. IMO, If someone wants to pass something around, they 
should pass a range, not the container itself. My solution, 
however, does support both(ref/value), if needed.

What's currently being done in std.container.Array(T) is very 
limiting. With my approach, you have the option of value type or 
reference type. You do not have this option with the current 
"reference type" containers. Also, while you're okay if you get a 
range to the container, there is extra indirection every time you 
access the container through opIndex(), back(), front(), etc.. I 
shouldn't have to pay for this extra indirection/allocation, 
especially when the containers are about to be designed from 
scratch with this issue fully known. By embedding a reference 
count in the container, you limit the maximum achievable 
performance for what should be(IMO) an obscure use case(passing a 
container around). Again though, my solution allows both usages.

2) Cursors(iterators)

My Cursors are similar to C++ iterators, but have several key 
differences.

a) There is no begin/end/rbegin/rend/cbegin/cend/crbegin/crend... 
The cursor knows it limits, and is considered "alive" until it 
has been advanced(by one) beyond either end of the container. 
Similar to an empty range, an invalidated Cursor is gone for 
good, but can be copied ahead of time, before iteration.

b) cursors can be joined to form a range. Unlike C++ however, 
both cursors must point to elements in the container since there 
is no begin/end, and the join is inclusive. There is no "past the 
end of the container" type of Cursor to refer to. Joining two 
cursors can, at the minimum, result in a Range with a length of 1.

example:

List!int a = [1, 2, 3];
auto cur1 = a.first;
auto cur2 = a.last;
auto rng1 = cur1 ~ cur2; // join two cursors
assert(rng1.equal([1, 2, 3]));
auto rng2 = ~cur1; // convert cursor to range
assert(rng2.equal([1]));
assert(cur1 && cur1.alive);
--cur1;
assert(!cur1 && !cur1.alive);

for(auto c = a.first; c; ++c)
	writeln(*c);

My containers' find() methods return cursors, not ranges. The 
result is that code like this actually removes the element you 
intended to remove, and nothing else:

a.remove( a.find(1) );

find() should not return elements that do not match the query the 
programmer supplied as an argument, which is currently the case 
with range-based find().

I'm still considering whether the traditional C++ 
approach(begin/end/etc..) is a better idea, but haven't reach a 
conclusion. I do know, however, that ranges alone are not enough.

3) Gravy.

My containers contain, by default, most of what you'll need to 
use them. The counter argument is code-reuse, but my containers 
are lightyears away from prohibitively complex, so I think the 
way I've done things is fine. Including core functionality into 
containers allows for more efficient implementations in some 
cases. For example, a LinkedList(T) sort. This approach is also 
much friendlier to documentation and intellisense, and allows a 
programmer to find everything they need in one place.

--

There is still work to do, of course.


On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
Design by Introspection.
I prefer simple, predictable constructs.
 2. Reference containers.
 3. Eager value containers.
Both. Bit
Oct 21 2015
parent reply deadalnix <deadalnix gmail.com> writes:
On Thursday, 22 October 2015 at 02:13:12 UTC, bitwise wrote:
 I needed containers, so I implemented a few:
 https://github.com/bitwise-github/d-collections

 Currently included:

 List(T) - contiguous list like std::vector
 LinkedList(T) - doubly linked list like std::list
 Queue(T) - double-ended queue/contiguous circular buffer
 Stack(T) - contiguous fifo stack
You got such a good start with the first 2 ones, but somehow missed on consistency. You should name the 2 other ones QueueList and StackList.
 I initially had this reversed, where List(T) was a value type, 
 and ListRef(T) was a reference type, but I flipped it the other 
 way to avoid unneeded copying for naive usage. I'm still not 
 sure this is the right direction though. Optimization is a job 
 of it's own, and it's a job for a pro. I believe the amount of 
 effort spent to idiot-proof(for lack of a better term) the 
 containers should be limited. Especially with the power of D's 
 ranges, I think containers should generally(and in my code, 
 usually do) stay put. IMO, If someone wants to pass something 
 around, they should pass a range, not the container itself. My 
 solution, however, does support both(ref/value), if needed.
Reference types or COW are definitively the way to go for the naive road. Think about it, 40% to 50% of chrome memory is std::string . Eager copy is not a healthy default.
 There is still work to do, of course.
The elephant in the room: make the template parameter's type qualifier transitive with the collection's qualifier.
Oct 21 2015
next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 22 October 2015 at 05:09:38 UTC, deadalnix wrote:
 On Thursday, 22 October 2015 at 02:13:12 UTC, bitwise wrote:
 I needed containers, so I implemented a few:
 https://github.com/bitwise-github/d-collections

 Currently included:

 List(T) - contiguous list like std::vector
 LinkedList(T) - doubly linked list like std::list
 Queue(T) - double-ended queue/contiguous circular buffer
 Stack(T) - contiguous fifo stack
You got such a good start with the first 2 ones, but somehow missed on consistency. You should name the 2 other ones QueueList and StackList.
Actually, I was thinking of calling them QueueSeq and StackSeq ;)
 I initially had this reversed, where List(T) was a value type, 
 and ListRef(T) was a reference type, but I flipped it the 
 other way to avoid unneeded copying for naive usage. I'm still 
 not sure this is the right direction though. Optimization is a 
 job of it's own, and it's a job for a pro. I believe the 
 amount of effort spent to idiot-proof(for lack of a better 
 term) the containers should be limited. Especially with the 
 power of D's ranges, I think containers should generally(and 
 in my code, usually do) stay put. IMO, If someone wants to 
 pass something around, they should pass a range, not the 
 container itself. My solution, however, does support 
 both(ref/value), if needed.
Reference types or COW are definitively the way to go for the naive road. Think about it, 40% to 50% of chrome memory is std::string . Eager copy is not a healthy default.
I can't ignore how many times I've heard people talking about eager-copy containers in C++ and how they're a pain. I believe the value-type option should be available though. Maybe the idea of default-ref-type containers could be taken a step further to hide the value types from casual users: struct List(T) { struct Core { ... } // specified as a fully useable container on it's own RefCounted!Core core; alias core this; } List!int.Core list; // value type container Unlike the current std.container.Array(T) though, the payload inside the container should be fully useable on it's own. I don't understand exactly how a COW container would work. Is this what's done with D strings right now? I've got a bad feeling about this idea. Containers have been around a long time, and I don't think it would be prudent to stray to far from traditional approaches.
 There is still work to do, of course.
The elephant in the room: make the template parameter's type qualifier transitive with the collection's qualifier.
Not sure exactly what you mean by this. Currently, Ranges and Cursors are templated on the type of the container, so If the container is const, you get a const Range or Cursor. I haven't gone over the containers with a fine tooth comb yet, so if you can point anything out, it would be helpful. Bit
Oct 22 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 22 October 2015 at 15:26:30 UTC, bitwise wrote:
 On Thursday, 22 October 2015 at 05:09:38 UTC, deadalnix wrote:
 The elephant in the room: make the template parameter's type 
 qualifier transitive with the collection's qualifier.
Not sure exactly what you mean by this. Currently, Ranges and Cursors are templated on the type of the container, so If the container is const, you get a const Range or Cursor. I haven't gone over the containers with a fine tooth comb yet, so if you can point anything out, it would be helpful.
LOL. const and ranges do _not_ mix well - in part because popFront inherently requires that the range be mutable - but mostly because of templates. If you have const(T)[], and you then you slap another const on it - const(const(T)[]) - the compiler knows full-well that that's equivalent to const(T[]). In fact, it even knows that if you slice a const(T[]), it's perfectly safe for the result to be const(T)[], because it's a different array, and by keeping the elements const, it won't screw with the elements of the one it was sliced from (since they're the same elements). But what about with ranges? If you have Range!(const T) and then you do const Range!(const T), is that equivalent to const(Range!T)? Maybe, maybe not. The compiler certainly doesn't treat it as such - and it can't. Similarly, if you have const(Range!T) or const(Range!(const T)), and you slice the range, the compiler doesn't magically know that the slice can be Range!(const T). As far as the compiler is concerned, we're dealing with different types here. After all, what if the Range type had something like this in it struct Range(T) { static if(is(T == const)) { // do something differently here } } The entire guts of Range could be completely different simply due to a difference in constness. Template constraints could also be used to overload on constness. So, unlike with arrays, there really is no guarantee that two instantiations of the same template are at all related to one another even when they seem to only differ by constness. Heck, to even attempt to have a range type which has an opSlice which returns a tail-const slice _requires_ that you have static ifs checking for constness, or you end up with a recursive template instantiation. So, even without getting containers involved, const and ranges do _not_ mix well, even though const and arrays get along fantastically. And when you have to start considering what a containers opSlice is going to return an how it's going to deal with const, things get that much worse (e.g. is it even going to work to have opSlice be const). And depending on the container's internals, having a range only have const access to its underlying container could be problematic. IIRC, std.container.Array has had some issues with const related to its internals not having been designed in a way that worked well with D's const. So, const throws a very interesting wrench into the mix when dealing with either ranges or containers, and the fact that const(Foo!T) is not necessarily the same as const(Foo!(const T)) can definitely be problematic. - Jonathan M Davis
Oct 22 2015
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 22 October 2015 at 15:53:21 UTC, Jonathan M Davis 
wrote:
 On Thursday, 22 October 2015 at 15:26:30 UTC, bitwise wrote:
 On Thursday, 22 October 2015 at 05:09:38 UTC, deadalnix wrote:
 The elephant in the room: make the template parameter's type 
 qualifier transitive with the collection's qualifier.
Not sure exactly what you mean by this. Currently, Ranges and Cursors are templated on the type of the container, so If the container is const, you get a const Range or Cursor. I haven't gone over the containers with a fine tooth comb yet, so if you can point anything out, it would be helpful.
LOL. const and ranges do _not_ mix well - in part because popFront inherently requires that the range be mutable - but mostly because of templates. If you have const(T)[], and you then you slap another const on it - const(const(T)[]) - the compiler knows full-well that that's equivalent to const(T[]). In fact, it even knows that if you slice a const(T[]), it's perfectly safe for the result to be const(T)[], because it's a different array, and by keeping the elements const, it won't screw with the elements of the one it was sliced from (since they're the same elements). But what about with ranges? If you have Range!(const T) and then you do const Range!(const T), is that equivalent to const(Range!T)? Maybe, maybe not. The compiler certainly doesn't treat it as such - and it can't. Similarly, if you have const(Range!T) or const(Range!(const T)), and you slice the range, the compiler doesn't magically know that the slice can be Range!(const T). As far as the compiler is concerned, we're dealing with different types here. After all, what if the Range type had something like this in it struct Range(T) { static if(is(T == const)) { // do something differently here } } The entire guts of Range could be completely different simply due to a difference in constness. Template constraints could also be used to overload on constness. So, unlike with arrays, there really is no guarantee that two instantiations of the same template are at all related to one another even when they seem to only differ by constness. Heck, to even attempt to have a range type which has an opSlice which returns a tail-const slice _requires_ that you have static ifs checking for constness, or you end up with a recursive template instantiation. So, even without getting containers involved, const and ranges do _not_ mix well, even though const and arrays get along fantastically. And when you have to start considering what a containers opSlice is going to return an how it's going to deal with const, things get that much worse (e.g. is it even going to work to have opSlice be const). And depending on the container's internals, having a range only have const access to its underlying container could be problematic. IIRC, std.container.Array has had some issues with const related to its internals not having been designed in a way that worked well with D's const. So, const throws a very interesting wrench into the mix when dealing with either ranges or containers, and the fact that const(Foo!T) is not necessarily the same as const(Foo!(const T)) can definitely be problematic. - Jonathan M Davis
Maybe look at the code next time before you LOL...... Bit
Oct 22 2015
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, 22 October 2015 at 16:29:19 UTC, bitwise wrote:
 Maybe look at the code next time before you LOL......
My point would be the same regardless. Range!(const T) and const(Range!T) - and Container!(const T) and const(Container!T) - have no relation as far as the compiler is concerned, and that makes dealing with const correctly a royal pain - particularly if you want a range to act like an array like it's supposed to be able to do (at least insofar as they have the same operations). You asked what deadalnix meant by
 The elephant in the room: make the template parameter's type 
 qualifier transitive
 with the collection's qualifier.
and I tried to explain. - Jonathan M Davis
Oct 22 2015
parent bitwise <bitwise.pvt gmail.com> writes:
On Thursday, 22 October 2015 at 17:13:48 UTC, Jonathan M Davis 
wrote:
 On Thursday, 22 October 2015 at 16:29:19 UTC, bitwise wrote:
 Maybe look at the code next time before you LOL......
My point would be the same regardless. Range!(const T) and const(Range!T) - and Container!(const T) and const(Container!T) - have no relation as far as the compiler is concerned, and that makes dealing with const correctly a royal pain - particularly if you want a range to act like an array like it's supposed to be able to do (at least insofar as they have the same operations). You asked what deadalnix meant by
 The elephant in the room: make the template parameter's type 
 qualifier transitive
 with the collection's qualifier.
and I tried to explain. - Jonathan M Davis
I'm not even sure you're talking about the same thing as deadalnix, and I think I have already done what he's asking. example: List!int a = [1, 2]; writeln( typeof(a[]).stringof ); writeln( typeof(a[].front).stringof ); writeln( is(typeof({ a[].popFront(); })).stringof ); const(List!int) b = [1, 2]; writeln( typeof(b[]).stringof ); writeln( typeof(b[].front).stringof ); writeln( is(typeof({ b[].popFront(); })).stringof ); output: Range!(ListBase!int) int true Range!(const(ListBase!int)) const(int) true Of course, neither of the following are possible because of the design of D's const. import std.container; Array!(const(int)) a; import collections; List!(const(int)) b; Both will produce compilation errors. Bit
Oct 22 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/22/15 1:09 AM, deadalnix wrote:
 The elephant in the room: make the template parameter's type qualifier
 transitive with the collection's qualifier.
Could you please give more detail on this? Thanks! -- Andrei
Oct 23 2015
parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu 
wrote:
 On 10/22/15 1:09 AM, deadalnix wrote:
 The elephant in the room: make the template parameter's type 
 qualifier
 transitive with the collection's qualifier.
Could you please give more detail on this? Thanks! -- Andrei
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. Collection!T and Collection!const(T) are 2 completely different types. There is a good reason for this : static if (is(T == const)) { ... } . As a result thing like : void foo(T)(const Collection!const(T) c) {} void main() { Collection!T c; foo(c); // Error, GTFO ! } With the different qualifiers and implicit conversion, thing become quite tricky. You can simulate them partially with a set of carefully crafted alias this (but not having multiple alias this doesn't make things any simpler) but there is the monster of mutually recursive template instanciation that is lurking. As far as i know, jmdavis had some good work done on this, but it was far from perfect.
Oct 23 2015
next sibling parent bitwise <bitwise.pvt gmail.com> writes:
On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:

 Sure. We have a problem when it come to collection in the fact 
 that type qualifier do not turtle down as one would expect.

 Collection!T and Collection!const(T) are 2 completely different 
 types. There is a good reason for this : static if (is(T == 
 const)) { ... } . As a result thing like :

 void foo(T)(const Collection!const(T) c) {}
 void main() {
   Collection!T c;
   foo(c); // Error, GTFO !
 }

 With the different qualifiers and implicit conversion, thing 
 become quite tricky. You can simulate them partially with a set 
 of carefully crafted alias this (but not having multiple alias 
 this doesn't make things any simpler) but there is the monster 
 of mutually recursive template instanciation that is lurking.

 As far as i know, jmdavis had some good work done on this, but 
 it was far from perfect.
Why not just pass a range to foo?
Oct 23 2015
prev sibling next sibling parent reply bigsandwich <bigsandwich gmail.com> writes:
On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu 
 wrote:
 [...]
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. [...]
Its not just type qualifiers. Containers of derived types have the same problem. This is also a problem in C++.
Oct 23 2015
next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Friday, 23 October 2015 at 23:21:31 UTC, bigsandwich wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei 
 Alexandrescu wrote:
 [...]
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. [...]
Its not just type qualifiers. Containers of derived types have the same problem. This is also a problem in C++.
Thinking deeper about this, it _should_ be the case that deadalnix's example doesn't compile. struct Container(T) { static if(is(T == const)) int changesLayout; // etc... int stuff; } Bit
Oct 23 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/24/2015 01:36 AM, bitwise wrote:
 On Friday, 23 October 2015 at 23:21:31 UTC, bigsandwich wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu wrote:
 [...]
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. [...]
Its not just type qualifiers. Containers of derived types have the same problem. This is also a problem in C++.
Thinking deeper about this, it _should_ be the case that deadalnix's example doesn't compile. struct Container(T) { static if(is(T == const)) int changesLayout; // etc... int stuff; } Bit
One could introduce a way to indicate that const-conversions should be performed for instantiations of a given templated aggregate with identical layouts.
Oct 24 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/23/2015 07:21 PM, bigsandwich wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu wrote:
 [...]
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. [...]
Its not just type qualifiers. Containers of derived types have the same problem. This is also a problem in C++.
This is something easy to live with. In fact, mutable containers are not supposed to even convert to containers of base objects. -- Andrei
Oct 25 2015
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/25/2015 08:31 PM, Andrei Alexandrescu wrote:
 On 10/23/2015 07:21 PM, bigsandwich wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu wrote:
 [...]
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. [...]
Its not just type qualifiers. Containers of derived types have the same problem. This is also a problem in C++.
This is something easy to live with. In fact, mutable containers are not supposed to even convert to containers of base objects. -- Andrei
This is true for containers with reference semantics, but not for containers with value semantics. This compiles (D code): class A{} class B: A{} void main(){ A[2] a; B[2] b; a=b; } This does not compile (C++ code): class A{}; class B: A{}; int main(){ vector<A*> a; vector<B*> b; a=b; } However, the conversion would be safe. For persistent and COW containers, the copy would even be fast.
Oct 25 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/25/2015 05:06 PM, Timon Gehr wrote:
 class A{};
 class B: A{};

 int main(){
      vector<A*> a;
      vector<B*> b;
      a=b;
 }

 However, the conversion would be safe.
Agreed. I don't see that as an important matter though; it's after all a coercion so a function call is plenty fine. -- Andrei
Oct 25 2015
prev sibling next sibling parent reply David Nadlinger <code klickverbot.at> writes:
On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 Collection!T and Collection!const(T) are 2 completely different 
 types.
Isn't this also required anyway because of covariance vs. contravariance considerations? — David
Oct 24 2015
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/24/2015 09:33 PM, David Nadlinger wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 Collection!T and Collection!const(T) are 2 completely different types.
Isn't this also required anyway because of covariance vs. contravariance considerations? — David
(I'm assuming 'this' is referring to a solution to the problem.) Yes, basically one often wants bit-by-bit conversions between structs whose fields corecursively convert to each other bit-by-bit. If we had opImplicitCastTo, this could probably be implemented (somewhat inefficiently) in the library, but a targeted language feature might be in order.
Oct 24 2015
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Saturday, 24 October 2015 at 19:33:03 UTC, David Nadlinger 
wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 Collection!T and Collection!const(T) are 2 completely 
 different types.
Isn't this also required anyway because of covariance vs. contravariance considerations? — David
It is close, but not exactly the same. Covariance/contravariance can be emutalted via alias this without too much trouble for a container (however, it is hard to ensure correctness, but I'd assume not too hard). On the other hand, the qualifier thing turtle from the collection to the element in the collection, which is not that easy to achieve.
Oct 24 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, 25 October 2015 at 05:37:02 UTC, deadalnix wrote:
 On Saturday, 24 October 2015 at 19:33:03 UTC, David Nadlinger 
 wrote:
 On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 Collection!T and Collection!const(T) are 2 completely 
 different types.
Isn't this also required anyway because of covariance vs. contravariance considerations? — David
It is close, but not exactly the same. Covariance/contravariance can be emutalted via alias this without too much trouble for a container (however, it is hard to ensure correctness, but I'd assume not too hard). On the other hand, the qualifier thing turtle from the collection to the element in the collection, which is not that easy to achieve.
The bigger problem is probably ranges rather than containers. We get tail-const slices from arrays all the time, but opSlice with no arguments isn't really a range operation (none of the range traits require it), and it's pretty painful to get it to make tail-const work with opSlice anyway (though I know that the EMSI guys were jumping through all kinds of hoops to make it work, so their containers make actually handle it reasonably well). Plus there's the fun problem of how declaring opSlice on a range causes foreach to call it, which essentially means that your range can get saved accidentally, which can be problematic (especially for ranges like RefRange). It's been a while since I sat down and worked through the various issues that we have here. I should probably do that soon and see if I can distill them down well enough to come up with a DIP to resolve them - or at least to clearly present them so that they can be discussed. What we have right now mostly works, but it falls apart in some corner cases - especially if you want to be able to operate on a range like you would an array. - Jonathan M Davis
Oct 25 2015
prev sibling next sibling parent Kagamin <spam here.lot> writes:
On Friday, 23 October 2015 at 17:44:55 UTC, deadalnix wrote:
 void foo(T)(const Collection!const(T) c) {}
 void main() {
   Collection!T c;
   foo(c); // Error, GTFO !
 }
How about this? void f(T)(const Collection!T c) { const Collection!(Unqual!T) cc = c; }
Oct 24 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/23/2015 01:44 PM, deadalnix wrote:
 On Friday, 23 October 2015 at 11:03:37 UTC, Andrei Alexandrescu wrote:
 On 10/22/15 1:09 AM, deadalnix wrote:
 The elephant in the room: make the template parameter's type qualifier
 transitive with the collection's qualifier.
Could you please give more detail on this? Thanks! -- Andrei
Sure. We have a problem when it come to collection in the fact that type qualifier do not turtle down as one would expect. Collection!T and Collection!const(T) are 2 completely different types. There is a good reason for this : static if (is(T == const)) { ... } . As a result thing like : void foo(T)(const Collection!const(T) c) {} void main() { Collection!T c; foo(c); // Error, GTFO ! }
Makes sense. The way I like to think about it is in terms of compatibility with built-in types such slices. This code works: void foo(T)(const T[] c) {} void main() { int[] c; foo(c); // fine }
 With the different qualifiers and implicit conversion, thing become
 quite tricky. You can simulate them partially with a set of carefully
 crafted alias this (but not having multiple alias this doesn't make
 things any simpler) but there is the monster of mutually recursive
 template instanciation that is lurking.

 As far as i know, jmdavis had some good work done on this, but it was
 far from perfect.
Thanks, I see. The way I see it is as the work of "alias this"; any failure can be ascribed as a burden on the alias this definition. Jonathan, do you have a link to your work? Andrei
Oct 25 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, 25 October 2015 at 15:42:02 UTC, Andrei Alexandrescu 
wrote:
 Jonathan, do you have a link to your work?
Sorry, but no. I haven't mucked around with this issue recently, and whatever I have left on the topic is either buried in a newsgroup post somewhere or buried on my hard drive somewhere where I wouldn't know where to find it. Searching on the newsgroup for discussions on tail-const would find stuff related to this if you want to go spelunking, since it's the tail-const issues where the fact that const(Foo!T) and Foo!(const T) aren't related typically comes up and likely causes the largest problems. The EMSI containers may have some interesting stuff with regards to the problem, since I know that those guys tried very hard to be able to support tail-const ranges. The main thing that I recall is that if you want to be able to get a tail-const range from a const range, you have to use a static if to protect the const opSlice declaration so that you don't end up with a recursive template instantiation (since it has to return another instantiation of the range's template). And even then, an opSlice with no arguments isn't technically a standard range function, because none of the range traits require it (so you can't rely on a range having it). Rather, it's a container function for getting a range. And even if it were a range function, it wouldn't get the special treatment that arrays get when being passed to a templated function (IFTI infers an array's type as being a tail-const slice, which doesn't happen with anything else), and even if it did, a const range wouldn't implicitly convert to its tail-const variant without an alias this, which is likely to create a whole other set of problems - particularly since implicit conversions tend to wreak havoc in generic code. Handling tail-const with ranges in a manner consistent with arrays and supporting $ with ranges are probably the two big places that ranges can't currently emulate the behavior of arrays. The $ problem is technically solvable as-is, but without some form of https://issues.dlang.org/show_bug.cgi?id=7177 being implemented, changing hasSlicing or isRandomAccessRange to require that a range works with $ would likely break too much code, so no range-based code can really using $ unless it explicitly checks for it. However, I'm not sure that the tail-const problems is solvable without further language improvements. We likely need some sort of standard way to convert a const(Foo!T) to a Foo!(const T) - possibly via opSlice, since it wouldn't make sense for something that wasn't emulating an array. And we might need to make changes to IFTI so that it infers tail-const for ranges (at least if they provide such a conversion), but I'm not sure what the implications of that are. There's also the issue of accidentally slicing ranges (e.g. https://issues.dlang.org/show_bug.cgi?id=14619 ) which can cause incorrect behavior in at least some cases, depending on the range type - and if a range is a reference type, then slicing it would be akin to saving it, which could mean allocating. So, unlike with arrays, we probably don't want ranges in general to get sliced just because they're passed to a function, meaning that we may just want IFTI to not play fun games with ranges and let arrays be special there. In any case, I should probably try and find time soon to sit down and at least go over the issues in detail again so that I'm clearer on them and could possibly present a DIP intended to resolve them. I haven't dug through them recently, and my general approach up until now when I've needed to actually get work done has been to just not use const or inout with ranges. - Jonathan M Davis
Oct 25 2015
prev sibling next sibling parent =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 Andrei
Exiting, times ahead! One thing that has caught my attention lately: I believe one way of making `std.experimental.allocator` usage (more) automatic is to add subtypes of containers for specific limited access patterns. For instance, some graph algorithms, such a calculating subgraphs, for instance: https://github.com/nordlow/justd/blob/master/knet/setops.d#L10 internally uses hashsets (currently implemented as a bool[Value]) for instance: https://github.com/nordlow/justd/blob/master/knet/setops.d#L15 only require the two primitives: - bool insert(Value value): returns true if value was stored, not already existed - bool contains(Value value) In this case (when no references are leaked), `HashSet` could be sub-typed to, say `ExpandableConstantHashSet`, which can safely and automatically make use of a `RegionAllocator` for super-performance. In this way, even complete beginners in D, could safely use Phobos containers to beat C++ performance. Destroy!
Oct 22 2015
prev sibling next sibling parent reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei 
Alexandrescu wrote:
 Internally, functional containers take advantage of common 
 substructure and immutability to share actual data. The classic 
 resource for defining and implementing functional containers is 
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
Online at: https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
Oct 22 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/22/2015 03:39 AM, Nordlöw wrote:
 On Wednesday, 21 October 2015 at 11:05:12 UTC, Andrei Alexandrescu wrote:
 Internally, functional containers take advantage of common
 substructure and immutability to share actual data. The classic
 resource for defining and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
Online at: https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
That's the doctoral dissertation; I assume the book is more approachable. -- Andrei
Oct 22 2015
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2015-10-21 13:05, Andrei Alexandrescu wrote:

 1. Functional containers.

 These are immutable; once created, neither their topology nor their
 elements may be observably changed. Manipulating a container entails
 creating an entire new container, often based on an existing container
 (e.g. append takes a container and an element and creates a whole new
 container).

 Internally, functional containers take advantage of common substructure
 and immutability to share actual data. The classic resource for defining
 and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
Can these be implemented by the user just declaring a regular container as immutable? The implement will recognize if it's declared as immutable and adapt. -- /Jacob Carlborg
Oct 24 2015
next sibling parent TheFlyingFiddle <kurtyan student.chalmers.se> writes:
On Saturday, 24 October 2015 at 09:22:37 UTC, Jacob Carlborg 
wrote:
 Can these be implemented by the user just declaring a regular 
 container as immutable? The implement will recognize if it's 
 declared as immutable and adapt.
How can a type know it's qualifier? struct Container(T) { //.... access qualifier somehow do stuff with it } alias SM = Container!int; alias SIM = immutable Container!int; It was my understanding that S!int only gets instantiated one time here? Or does the compiler instantiate (immutable S!int) and (S!int)? Or were you thinking of something else?
Oct 24 2015
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/24/2015 11:22 AM, Jacob Carlborg wrote:
 On 2015-10-21 13:05, Andrei Alexandrescu wrote:

 1. Functional containers.

 These are immutable; once created, neither their topology nor their
 elements may be observably changed. Manipulating a container entails
 creating an entire new container, often based on an existing container
 (e.g. append takes a container and an element and creates a whole new
 container).

 Internally, functional containers take advantage of common substructure
 and immutability to share actual data. The classic resource for defining
 and implementing functional containers is
 http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504.
Can these be implemented by the user just declaring a regular container as immutable? The implement will recognize if it's declared as immutable and adapt.
Even if this was possible, it would not be a very good idea. Persistent data structures have use cases that would be hindered by required transitive immutability.
Oct 24 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/24/15 3:19 PM, Timon Gehr wrote:
 Even if this was possible, it would not be a very good idea. Persistent
 data structures have use cases that would be hindered by required
 transitive immutability.
This part I don't quite get. Are there any languages that offer containers with immutable topology but mutable slots, and call them persistent data structures? -- Andrei
Oct 24 2015
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/24/2015 09:22 PM, Andrei Alexandrescu wrote:
 On 10/24/15 3:19 PM, Timon Gehr wrote:
 Even if this was possible, it would not be a very good idea. Persistent
 data structures have use cases that would be hindered by required
 transitive immutability.
This part I don't quite get.
The slots are not mutable, but they are not /transitively/ immutable either. Note that this does not require any special effort, nor does it /prevent/ stored elements from being (transitively) immutable. Scala does it this way. (Haskell as well, basically.)
 Are there any languages that offer
 containers with immutable topology but mutable slots, and call them
 persistent data structures? -- Andrei
Not that I know of, unless you count the hidden updates Haskell implementations may perform in order to provide lazy semantics (which 'immutable' in D prevents, this alone is a sufficient reason not to force it on users). However, that would be useful as well (as a potential performance optimization if the identity of the stored elements does not matter). This paper of Sarnak and Tarjan discusses a version of what they call persistent rb-trees that only allows updates on the last version (which is often sufficient). (The point of the restriction is to lower space usage.) It changes color information on existing nodes and it keeps mutable lists in the nodes (i.e. making the nodes immutable would not work.) http://www.link.cs.cmu.edu/15859-f07/papers/point-location.pdf One might now say that those implementation details don't matter, and that the slots should still be transitively immutable. Well, one may want to use such data structures _within_ other persistent data structures. (Just consider e.g. the case where, given a 3D point cloud of size n, and a number of axis-aligned boxes, you want a way to count the number of points in each box, given that the boxes arrive in an online fashion, while the points are known from the start. (Here, we want O(n log² n) preprocessing time, O(n log n) space and O(log² n) query time). One way to achieve that involves storing instances of an augmented version of their persistent rb-tree (they should support the range queries I have mentioned in my list of useful data structures) within a persistent augmented tree with predetermined O(log n)-depth topology (this container I had missed to mention).) You can't do that if slots are transitively immutable.
Oct 24 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/24/2015 07:03 PM, Timon Gehr wrote:
 On 10/24/2015 09:22 PM, Andrei Alexandrescu wrote:
 On 10/24/15 3:19 PM, Timon Gehr wrote:
 Even if this was possible, it would not be a very good idea. Persistent
 data structures have use cases that would be hindered by required
 transitive immutability.
This part I don't quite get.
The slots are not mutable, but they are not /transitively/ immutable either. Note that this does not require any special effort, nor does it /prevent/ stored elements from being (transitively) immutable. Scala does it this way. (Haskell as well, basically.)
I see. Well, this will be unpleasant to implement in D. One simple way to do so would be to have accessors return rvalues: T front() { ... } Then you get to change the indirections of T, if any, but not what's stored in the container directly. Problem is accessing every element by rvalue is likely to be inefficient in the general case (even on data without copy ctors). Andrei
Oct 25 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/25/2015 08:33 PM, Andrei Alexandrescu wrote:
 On 10/24/2015 07:03 PM, Timon Gehr wrote:
 On 10/24/2015 09:22 PM, Andrei Alexandrescu wrote:
 On 10/24/15 3:19 PM, Timon Gehr wrote:
 Even if this was possible, it would not be a very good idea. Persistent
 data structures have use cases that would be hindered by required
 transitive immutability.
This part I don't quite get.
The slots are not mutable, but they are not /transitively/ immutable either. Note that this does not require any special effort, nor does it /prevent/ stored elements from being (transitively) immutable. Scala does it this way. (Haskell as well, basically.)
I see. Well, this will be unpleasant to implement in D. One simple way to do so would be to have accessors return rvalues: T front() { ... } Then you get to change the indirections of T, if any, but not what's stored in the container directly. Problem is accessing every element by rvalue is likely to be inefficient in the general case (even on data without copy ctors). Andrei
As I mentioned, the cheap way out for performance would be to provide what you suggested (persistent topology, arbitrary reference access to slots). Users can then use type qualifiers at their own discretion. There could probably even be short aliases to automatically have immutable elements. What's important is that use cases for mutable elements are not ruled out. They don't necessarily need to be on the path of least resistance.
Oct 25 2015