digitalmars.D - Collections question

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
There's this oddity of built-in hash tables: a reference to a non-empty 
hash table can be copied and then both references refer to the same hash 
table object. However, if the hash table is null, copying the reference 
won't track the same object later on.
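
A minimal sketch of the oddity with a built-in associative array may help; the variable names are only for illustration:

```d
void main()
{
    // Case 1: the copy is taken while the table is still null.
    int[string] a;
    auto b = a;              // copies a null reference
    a["x"] = 1;              // the first insertion allocates a table for `a` only
    assert("x" in a);
    assert("x" !in b);       // `b` never sees the new table

    // Case 2: the copy is taken after the table exists.
    int[string] c = ["y": 2];
    auto d = c;              // both refer to the same table
    c["z"] = 3;
    assert("z" in d);
}
```

The two cases differ only in whether the table had been allocated at the time of the copy.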

Fast-forward to general collections. If we want to support things like 
reference containers, clearly that oddity must be addressed. There are 
two typical approaches:

1. Factory function:

struct MyCollection(T)
{
     static MyCollection make(U...)(auto ref U args);
     ...
}

So then client code is:

auto c1 = MyCollection!(int).make(1, 2, 3);
auto c2 = MyCollection!(int).make();
auto c3 = c2; // refers to the same collection as c2
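
One way the factory could be filled in, as a rough sketch only: the `Payload` class, `insert`, and `length` are hypothetical names that do not come from the post, and the payload is simply GC-allocated here.

```d
struct MyCollection(T)
{
    // Hypothetical shared state; every copy of the struct points at it.
    private static class Payload { T[] items; }
    private Payload payload;   // null for MyCollection.init

    static MyCollection make(U...)(auto ref U args)
    {
        MyCollection c;
        c.payload = new Payload;       // allocated eagerly, even with no arguments
        foreach (arg; args)
            c.payload.items ~= arg;
        return c;
    }

    void insert(T value) { payload.items ~= value; }

    size_t length() const
    {
        return payload is null ? 0 : payload.items.length;
    }
}

void main()
{
    auto c1 = MyCollection!(int).make(1, 2, 3);
    auto c2 = MyCollection!(int).make();
    auto c3 = c2;              // refers to the same collection as c2
    c2.insert(42);
    assert(c1.length == 3);
    assert(c3.length == 1);
}
```

Because `make()` allocates the payload up front, a copy taken right after construction already shares state with the original.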

2. The opCall trick:

struct MyCollection(T)
{
     static MyCollection opCall(U...)(auto ref U args);
     ...
}

with the client code:

auto c1 = MyCollection!(int)(1, 2, 3);
auto c2 = MyCollection!(int)();
auto c3 = c2; // refers to the same collection as c2
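
A sketch of what the trick implies for declarations, assuming a non-template, zero-argument `static opCall` for simplicity; `Payload` and `isNull` are hypothetical helpers:

```d
struct MyCollection(T)
{
    private static class Payload { T[] items; }
    private Payload payload;

    static MyCollection opCall()
    {
        MyCollection c;            // plain default initialization, no recursion
        c.payload = new Payload;
        return c;
    }

    bool isNull() const { return payload is null; }
}

void main()
{
    MyCollection!int a;            // MyCollection.init, static opCall is not involved
    auto b = MyCollection!int();   // goes through static opCall
    assert(a.isNull);
    assert(!b.isNull);
}
```

The two declarations look nearly identical but leave the collection in different states.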

There's some experience in various libraries with both approaches. Which 
would you prefer?


Andrei
Nov 27 2015
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 1. Factory function:
This is my preference for zero arg at least because the opCall thing is commonly misunderstood and confused with C++ default construction and we don't need to encourage that.
     static MyCollection opCall(U...)(auto ref U args);
 auto c1 = MyCollection!(int)(1, 2, 3);
That syntax is the same as constructors... if that's what you want it to look like, we ought to actually use a constructor for all but the zero-argument case, for which I'd use a static named function (perhaps .make, or perhaps .makeEmpty too). But opCall ought to be discouraged in cases where it conflicts with constructors.
Nov 27 2015
parent reply Jakob Ovrum <jakobovrum gmail.com> writes:
On Friday, 27 November 2015 at 20:25:12 UTC, Adam D. Ruppe wrote:
 That syntax is the same as constructors... if that's what you 
 want it to look like, we ought to actually use a constructor 
 for all but the zero-argument ones which I'd use a static named 
 function for (perhaps .make or perhaps .makeEmpty too)
While I think this would be nice and explicit, it's bad for generic code, which would have to specialize to correctly call the nullary version.
Nov 27 2015
parent reply Kagamin <spam here.lot> writes:
On Saturday, 28 November 2015 at 06:26:03 UTC, Jakob Ovrum wrote:
 On Friday, 27 November 2015 at 20:25:12 UTC, Adam D. Ruppe 
 wrote:
 That syntax is the same as constructors... if that's what you 
 want it to look like, we ought to actually use a constructor 
 for all but the zero-argument ones which I'd use a static 
 named function for (perhaps .make or perhaps .makeEmpty too)
While I think this would be nice and explicit, it's bad for generic code, which would have to specialize to correctly call the nullary version.
Well... doesn't work: http://dpaste.dzfl.pl/2c69cc3584b8
Nov 28 2015
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 28 November 2015 at 18:38:37 UTC, Kagamin wrote:
 Well... doesn't work: http://dpaste.dzfl.pl/2c69cc3584b8
I don't understand... of course you can't call what is returned by makeEmpty.
Nov 28 2015
parent Kagamin <spam here.lot> writes:
On Saturday, 28 November 2015 at 18:43:33 UTC, Adam D. Ruppe 
wrote:
 On Saturday, 28 November 2015 at 18:38:37 UTC, Kagamin wrote:
 Well... doesn't work: http://dpaste.dzfl.pl/2c69cc3584b8
I don't understand... of course you can't call what is returned by makeEmpty.
Recently someone complained that it was a mistake to use round braces for template syntax. I noticed it too, that sometimes it's hard to tell where's template instantiation and where is function invocation.
Nov 28 2015
prev sibling next sibling parent Minas Mina <minas_0 hotmail.co.uk> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.

 [...]
2. The opCall() one.
Nov 27 2015
prev sibling next sibling parent Luís Marques <luis luismarques.eu> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.
I keep hoping that that design decision would be changed...
 1. Factory function:
Something I find deeply unsatisfying about D structs is their inability to reliably set non-trivial invariants, due to the lack of custom default ctors. If you are careful, you disable this(), and provide a factory function that sets the invariant. But then, you aren't doing much more than renaming this() to make() or whatever. The issues with .init could be addressed without prohibiting a default ctor...
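
The pattern described here could look roughly like this; `Positive`, `make`, and `get` are made-up names for illustration, and `Positive.init` can still bypass the factory, which is part of the complaint:

```d
struct Positive
{
    private int value;

    @disable this();                   // forbids `Positive p;`

    private this(int v) { value = v; }

    // Named factory that establishes the invariant.
    static Positive make(int v = 1)
    {
        assert(v > 0, "value must be positive");
        return Positive(v);
    }

    int get() const { return value; }

    invariant() { assert(value > 0); }
}

void main()
{
    // Default construction is rejected at compile time.
    static assert(!__traits(compiles, { Positive p; }));

    auto p = Positive.make();
    assert(p.get == 1);
}
```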
Nov 27 2015
prev sibling next sibling parent reply bitwise <bitwise.pvt gmail.com> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.

 Fast-forward to general collections. If we want to support 
 things like reference containers, clearly that oddity must be 
 addressed. There are two typical approaches:

 1. Factory function:

 struct MyCollection(T)
 {
     static MyCollection make(U...)(auto ref U args);
     ...
 }

 So then client code is:

 auto c1 = MyCollection!(int).make(1, 2, 3);
 auto c2 = MyCollection!(int).make();
 auto c3 = c2; // refers to the same collection as c2

 2. The opCall trick:

 struct MyCollection(T)
 {
     static MyCollection opCall(U...)(auto ref U args);
     ...
 }

 with the client code:

 auto c1 = MyCollection!(int)(1, 2, 3);
 auto c2 = MyCollection!(int)();
 auto c3 = c2; // refers to the same collection as c2

 There's some experience in various libraries with both 
 approaches. Which would you prefer?


 Andrei
Classes/real-ref-types don't act as you're describing, so why should these fake struct wrapper ref things act this way? This will likely achieve the exact opposite of what you're aiming for, by making something that's supposed to act like a reference type have different behaviour from D's built-in ref types.

   Bit
Nov 27 2015
next sibling parent Jakob Ovrum <jakobovrum gmail.com> writes:
On Saturday, 28 November 2015 at 06:59:35 UTC, bitwise wrote:
 Classes/real-ref-types dont act as you're describing
They do, actually.

class Collection(E) { ... }

Collection!E a; // null reference
auto b = new Collection!E(); // reference to empty collection

The only outlier is the associative array, which lazily initializes when operations are performed on null references.
Nov 27 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/28/15 1:59 AM, bitwise wrote:
 Classes/real-ref-types dont act as you're describing, so why should
 these fake struct wrapper ref things act this way? This will likely
 achieve the exact opposite of what you're aiming for, by making
 something that's supposed to act like a reference type have different
 behaviour from D's built in ref types.
So what would work for you? -- Andrei
Nov 28 2015
parent reply bitwise <bitwise.pvt gmail.com> writes:
On Saturday, 28 November 2015 at 13:39:35 UTC, Andrei 
Alexandrescu wrote:
 On 11/28/15 1:59 AM, bitwise wrote:
 Classes/real-ref-types dont act as you're describing, so why 
 should
 these fake struct wrapper ref things act this way? This will 
 likely
 achieve the exact opposite of what you're aiming for, by making
 something that's supposed to act like a reference type have 
 different
 behaviour from D's built in ref types.
So what would work for you? -- Andrei
Sorry if that response seemed a tad flippant, but I have to be honest... I am completely against this design, to put it mildly. I have my own containers to use, but on top of the fact that I would prefer something which is collaboratively maintained, I don't want to be forced to deal with, or support, these "reference" containers, which will most likely happen if they get added to Phobos.

I'm really not sure where to begin tearing this idea apart. The principle I have a problem with is much more fundamental than this one decision. In general, there is a lot in D that is very hackish. I understand that you don't want eager copying of containers, but when I weigh predictability, simplicity, clarity, and flexibility against that concern, there is no way I'm agreeing with you, when you can simply wrap a proper container in a RefCounted(T) or something.

A class is a reference type, and a struct is a value type. If a user sees a struct, they should expect a value type which will copy on assign, and if they see a class, they should expect a reference. In D, the differentiation between value and reference types is clearly specified, and D users _should_ be, and should be expected to be, aware of it.

If you really want reference containers, they should be implemented either as value-type structs, or classes that can work with RefCounted(T). Baking the reference count directly into the container is limiting, and buys nothing.

I really don't see a problem with GC'ed classes if you really want reference types. It's going to be forever, if ever, before you can actually turn off the GC when using Phobos. At least, if it's a class, you can use Scoped(T) or RefCounted(T) on it (assuming RefCounted(T) is fixed up to work with classes at some point), which seems like a better path than baking ref counting into a container implementation.

I'm feeling a bit repetitive at this point, and wondering if I should have responded to this at all, and I'm sure you know exactly what I'm talking about, and that it's a matter of choice at this point, but there you have it.

   Bit
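
For illustration, the "value container wrapped in RefCounted" arrangement might look like the sketch below. `Stack` is a hypothetical value-type container invented for this example; `RefCounted` is the existing `std.typecons.RefCounted`.

```d
import std.typecons : RefCounted;

// Hypothetical value-type container: copying the struct copies the slice handle.
struct Stack(T)
{
    T[] items;
    void push(T value) { items ~= value; }
}

void main()
{
    // Reference semantics are opted into at the use site.
    auto a = RefCounted!(Stack!int)(Stack!int());
    auto b = a;               // shares the same payload, bumps the count
    a.push(1);                // forwarded to the payload via alias this
    assert(b.items.length == 1);
}
```

The container itself stays a plain struct, and the sharing policy is chosen by whoever declares the variable.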
Nov 30 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/30/2015 12:56 PM, bitwise wrote:
 On Saturday, 28 November 2015 at 13:39:35 UTC, Andrei Alexandrescu wrote:
 On 11/28/15 1:59 AM, bitwise wrote:
 Classes/real-ref-types dont act as you're describing, so why should
 these fake struct wrapper ref things act this way? This will likely
 achieve the exact opposite of what you're aiming for, by making
 something that's supposed to act like a reference type have different
 behaviour from D's built in ref types.
So what would work for you? -- Andrei
 Sorry if that response seemed a tad flippant, but I have to be honest... I am completely against this design, to put it mildly. I have my own containers to use, but on top of the fact that I would prefer something which is collaboratively maintained, I don't want to be forced to deal with, or support, these "reference" containers, which will most likely happen if they get added to Phobos.

 I'm really not sure where to begin tearing this idea apart. The principle I have a problem with is much more fundamental than this one decision. In general, there is a lot in D that is very hackish. I understand that you don't want eager copying of containers, but when I weigh predictability, simplicity, clarity, and flexibility against that concern, there is no way I'm agreeing with you, when you can simply wrap a proper container in a RefCounted(T) or something.

 A class is a reference type, and a struct is a value type. If a user sees a struct, they should expect a value type which will copy on assign, and if they see a class, they should expect a reference. In D, the differentiation between value and reference types is clearly specified, and D users _should_ be, and should be expected to be, aware of it.

 If you really want reference containers, they should be implemented either as value-type structs, or classes that can work with RefCounted(T). Baking the reference count directly into the container is limiting, and buys nothing.

 I really don't see a problem with GC'ed classes if you really want reference types. It's going to be forever, if ever, before you can actually turn off the GC when using Phobos. At least, if it's a class, you can use Scoped(T) or RefCounted(T) on it (assuming RefCounted(T) is fixed up to work with classes at some point), which seems like a better path than baking ref counting into a container implementation.

 I'm feeling a bit repetitive at this point, and wondering if I should have responded to this at all, and I'm sure you know exactly what I'm talking about, and that it's a matter of choice at this point, but there you have it.
Thanks, your response is appreciated! Let me make sure I understand. So, in your opinion:

* Value containers plus a way to wrap them with RefCounted is a better solution than containers with built-in reference semantics.

* The design supported by D most naturally is: classes have reference semantics and structs have value semantics.

* Reference semantics for containers seem to work best with GC. Pursuing reference containers with baked-in RC seems nonproductive.

This is all sensible. Here are a couple of follow-up questions and considerations:

* I couldn't integrate this with the rest of your post: "The principle I have a problem with is much more fundamental than this one decision. In general, there is a lot in D that is very hackish." Could you please elaborate?

* The one matter with the value/RefCounted approach is that RefCounted cannot be made @safe. One core design decision I made was to aim for @safe containers. I do agree that if safety is off the table, your design would be a very good choice (probably the best I can think of, and I'd start an implementation using it).

Thanks,

Andrei
Nov 30 2015
parent reply Marc Schütz <schuetzm gmx.net> writes:
On Monday, 30 November 2015 at 18:18:38 UTC, Andrei Alexandrescu 
wrote:
 * The one matter with the value/RefCounted approach is that 
 RefCounted cannot be made @safe.
That's just as true for internal refcounting.
Dec 01 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/1/15 4:55 AM, Marc Schütz wrote:
 On Monday, 30 November 2015 at 18:18:38 UTC, Andrei Alexandrescu wrote:
 * The one matter with the value/RefCounted approach is that RefCounted
 cannot be made @safe.
That's just as true for internal refcounting.
I don't think that's the case. The way I wrote the code, safety can be achieved with a few controlled insertions of @trusted. -- Andrei
Dec 01 2015
parent reply Marc Schütz <schuetzm gmx.net> writes:
On Tuesday, 1 December 2015 at 14:15:47 UTC, Andrei Alexandrescu 
wrote:
 On 12/1/15 4:55 AM, Marc Schütz wrote:
 On Monday, 30 November 2015 at 18:18:38 UTC, Andrei 
 Alexandrescu wrote:
 * The one matter with the value/RefCounted approach is that 
 RefCounted
 cannot be made @safe.
That's just as true for internal refcounting.
 I don't think that's the case. The way I wrote the code, safety can be achieved with a few controlled insertions of @trusted. -- Andrei
As long as you can pass the container and one of its elements by mutable ref, it's unsafe (see the RCArray discussion [1]). If you can only access the elements by value (i.e. opIndex returns a copy), this precondition isn't fulfilled, but otherwise, I see no way to prevent it with the current language.

[1] http://forum.dlang.org/post/huspgmeupgobjubtsmfe forum.dlang.org
Dec 01 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/01/2015 09:35 AM, Marc Schütz wrote:
 On Tuesday, 1 December 2015 at 14:15:47 UTC, Andrei Alexandrescu wrote:
 On 12/1/15 4:55 AM, Marc Schütz wrote:
 On Monday, 30 November 2015 at 18:18:38 UTC, Andrei Alexandrescu wrote:
 * The one matter with the value/RefCounted approach is that RefCounted
 cannot be made @safe.
That's just as true for internal refcounting.
 I don't think that's the case. The way I wrote the code, safety can be achieved with a few controlled insertions of @trusted. -- Andrei
 As long as you can pass the container and one of its elements by mutable ref, it's unsafe (see the RCArray discussion [1]). If you can only access the elements by value (i.e. opIndex returns a copy), this precondition isn't fulfilled, but otherwise, I see no way to prevent it with the current language.

 [1] http://forum.dlang.org/post/huspgmeupgobjubtsmfe forum.dlang.org
Ah, the good old assignment to reference. We need to prevent that from happening in safe code. Got any fresh ideas? -- Andrei
Dec 01 2015
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei Alexandrescu 
wrote:
 Ah, the good old assignment to reference. We need to prevent 
 that from happening in safe code. Got any fresh ideas? -- Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
Dec 01 2015
next sibling parent ZombineDev <valid_email he.re> writes:
On Wednesday, 2 December 2015 at 06:45:33 UTC, deadalnix wrote:
 On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei 
 Alexandrescu wrote:
 Ah, the good old assignment to reference. We need to prevent 
 that from happening in safe code. Got any fresh ideas? -- 
 Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
+1
Dec 02 2015
prev sibling next sibling parent Marc Schütz <schuetzm gmx.net> writes:
On Wednesday, 2 December 2015 at 06:45:33 UTC, deadalnix wrote:
 On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei 
 Alexandrescu wrote:
 Ah, the good old assignment to reference. We need to prevent 
 that from happening in safe code. Got any fresh ideas? -- 
 Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
Making it const is enough; it doesn't need to be disabled completely. (Except if you want to go full Rust with uniqueness etc.)

Maybe there's a way to relax it further. Not all modifications of the owner are necessarily bad... If we limit the restriction to types with indirections (or types with destructors?), and only apply it in @safe functions, this might not even break much code. I suspect that almost all @trusted code, and probably most @safe code, will already be written in a way that conforms to the new rules.
Dec 02 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 12/02/2015 01:45 AM, deadalnix wrote:
 On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei Alexandrescu wrote:
 Ah, the good old assignment to reference. We need to prevent that from
 happening in safe code. Got any fresh ideas? -- Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
What does "disable owner" mean? Thx! -- Andrei
Dec 02 2015
next sibling parent Marc Schütz <schuetzm gmx.net> writes:
On Wednesday, 2 December 2015 at 15:58:19 UTC, Andrei 
Alexandrescu wrote:
 On 12/02/2015 01:45 AM, deadalnix wrote:
 On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei 
 Alexandrescu wrote:
 Ah, the good old assignment to reference. We need to prevent 
 that from
 happening in safe code. Got any fresh ideas? -- Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
What does "disable owner" mean? Thx! -- Andrei
(He probably means: The owner is the object to which the reference points. Disabling means disallowing any access to it, at compile time.)

But your question gave me another idea: Instead of making the owner const, the compiler can insert calls to `owner.opFreeze()` and `owner.opThaw()` at the beginning/end of each borrowing, and leave the owner mutable. It's then up to the implementer to handle things in a way they like. For example, opFreeze() could just set a flag and assert that the underlying memory isn't freed during borrowing, or it could increment/decrement a reference count, or it could queue up any releases of the underlying storage to happen after the last borrow has expired (the idea you proposed as a solution for RCArray).

It's helpful in this case if the operators have the following signatures:

T opFreeze();
void opThaw(T cookie);

For the refcounting solution, opFreeze() can increment the refcount and return a pointer to it, and opThaw() can decrement it again. The methods need to be called each time a borrow starts/ends.
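
A hand-written sketch of the idea, with the calls the compiler would insert written out manually. Neither `opFreeze` nor `opThaw` is an existing D feature; `Owner`, `activeBorrows`, and `tryRelease` are hypothetical names, and the cookie here is just the current borrow count.

```d
struct Owner
{
    int[] storage;
    private size_t activeBorrows;

    // Would be inserted by the compiler where a borrow starts (hypothetical).
    size_t opFreeze() { return ++activeBorrows; }

    // Would be inserted where that borrow ends (hypothetical).
    void opThaw(size_t cookie) { --activeBorrows; }

    // Refuses to drop the storage while any borrow is still live.
    bool tryRelease()
    {
        if (activeBorrows != 0)
            return false;
        storage = null;
        return true;
    }
}

void main()
{
    auto owner = Owner([1, 2, 3]);

    auto cookie = owner.opFreeze();   // a borrow of owner.storage[0] starts
    assert(!owner.tryRelease());      // release is rejected during the borrow
    owner.opThaw(cookie);             // the borrow ends

    assert(owner.tryRelease());       // now the storage may go away
}
```

This variant takes the "set a flag and check" route; a refcounting or deferred-release implementation would change only the bodies of the three methods.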
Dec 02 2015
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 2 December 2015 at 15:58:19 UTC, Andrei 
Alexandrescu wrote:
 On 12/02/2015 01:45 AM, deadalnix wrote:
 On Tuesday, 1 December 2015 at 17:27:20 UTC, Andrei 
 Alexandrescu wrote:
 Ah, the good old assignment to reference. We need to prevent 
 that from
 happening in safe code. Got any fresh ideas? -- Andrei
Disable owner when borrowing 'mutably', and not when borrowing 'constly'.
What does "disable owner" mean? Thx! -- Andrei
I mean you can't use it for the lifetime of the borrowing.
Dec 02 2015
prev sibling next sibling parent Jakob Ovrum <jakobovrum gmail.com> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's some experience in various libraries with both 
 approaches. Which would you prefer?
Well, I think we should recognize that they're the same thing but with different names. I don't have a strong preference for either, but I think the opCall approach might invite questions like "why is T t; different from auto t = T(); with this collection type?".

The current container library's `make` function has a neat feature (well, I'm biased here) where the element type doesn't have to be specified when construction arguments are provided:

auto arr = make!Array(1, 2, 3); // element type inferred to be `int`
auto arr = make!Array([42]); // Also for range construction

Naturally this doesn't work with nullary construction, but I think it's worth mentioning because this is not nearly as practical with static member functions.

Of course the current model is not usable as-is because `make` uses an ugly hack when "making" empty struct containers.
Nov 27 2015
prev sibling next sibling parent reply Jakob Ovrum <jakobovrum gmail.com> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's some experience in various libraries with both 
 approaches. Which would you prefer?
Another thing: wouldn't providing a custom allocator require a separate primitive? I am assuming that the allocator type won't be a template parameter, or at least that it will support IAllocator.

alias Allocator = ...; // some std.allocator type
Allocator alloc;

// empty collection of int with user-specified allocator
auto c = Collection!int.makeCustomAllocation(alloc);

// collection of one allocator using theAllocator
auto c = Collection!Allocator.make(alloc);

// empty collection of Allocator with user-specified allocator
auto c = Collection!Allocator.makeCustomAllocation(alloc);

The last two don't look like they could use the same name. A way around this would be to forego variadic construction in favour of range construction, but it would necessitate copying of those elements, whether with `only` or putting in an array etc.

If we need two names, then opCall becomes less attractive.
Nov 27 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/28/15 2:26 AM, Jakob Ovrum wrote:
 Another thing: wouldn't providing a custom allocator require a separate
 primitive?
All collections will work with IAllocator. -- Andrei
Nov 28 2015
parent Jakob Ovrum <jakobovrum gmail.com> writes:
On Saturday, 28 November 2015 at 13:41:10 UTC, Andrei 
Alexandrescu wrote:
 On 11/28/15 2:26 AM, Jakob Ovrum wrote:
 Another thing: wouldn't providing a custom allocator require a 
 separate
 primitive?
All collections will work with IAllocator. -- Andrei
Yes, I assumed as much. So how would this be handled? Is this not a relevant question to the construction issue?
Nov 28 2015
prev sibling next sibling parent Dicebot <public dicebot.lv> writes:
Factory please. Static opCall has always been nothing but trouble 
in my experience.
Nov 28 2015
prev sibling next sibling parent default0 <Kevin.Labschek gmx.de> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.

 Fast-forward to general collections. If we want to support 
 things like reference containers, clearly that oddity must be 
 addressed. There are two typical approaches:

 1. Factory function:

 struct MyCollection(T)
 {
     static MyCollection make(U...)(auto ref U args);
     ...
 }

 So then client code is:

 auto c1 = MyCollection!(int).make(1, 2, 3);
 auto c2 = MyCollection!(int).make();
 auto c3 = c2; // refers to the same collection as c2

 2. The opCall trick:

 struct MyCollection(T)
 {
     static MyCollection opCall(U...)(auto ref U args);
     ...
 }

 with the client code:

 auto c1 = MyCollection!(int)(1, 2, 3);
 auto c2 = MyCollection!(int)();
 auto c3 = c2; // refers to the same collection as c2

 There's some experience in various libraries with both 
 approaches. Which would you prefer?


 Andrei
This is probably naive and silly, but: can't you just put a dummy element in the hash table on creation of the struct, set a flag "containsDummyElement" and then have all methods you implement from that struct (count, range implementation, etc) check that flag and ignore the one element it contains when the flag is set? Then when the first real element gets added, just remove the element and reset the flag. When the last element gets removed, put the dummy element back. I feel like I'm either misunderstanding the problem or misunderstanding built-in associative arrays, so sorry if what I said above is really stupid and cannot (or should not) be done for reasons everyone else here knows about.
Nov 28 2015
prev sibling next sibling parent Tobias Pankrath <tobias pankrath.net> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.

 Fast-forward to general collections. [...]

 Andrei
I'd prefer the factory method, and we shouldn't allow lazy initialization. It's only confusing if it sometimes works and sometimes doesn't. A null container should throw.
Nov 28 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/27/2015 09:14 PM, Andrei Alexandrescu wrote:
 There's this oddity of built-in hash tables: a reference to a non-empty
 hash table can be copied and then both references refer to the same hash
 table object. However, if the hash table is null, copying the reference
 won't track the same object later on.

 Fast-forward to general collections. If we want to support things like
 reference containers, clearly that oddity must be addressed. There are
 two typical approaches:

 1. Factory function:
...

 2. The opCall trick:
...
3. (Non-internal) factory function:

auto c1 = myCollection(1,2,3);
auto c2 = myCollection!int();
auto c3 = c2; // refers to the same collection as c2
Nov 28 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, 28 November 2015 at 12:20:36 UTC, Timon Gehr wrote:
 3. (Non-internal) factory function:

 auto c1 = myCollection(1,2,3);
 auto c2 = myCollection!int();
 auto c3 = c2; // refers to the same collection as c2
Yeah. In general, I prefer that approach. It's what we currently do with RedBlackTree. It's more flexible (e.g. it can infer the element type like we do with dynamic arrays) and less verbose. The only downside that I can think of is that it doesn't work as well in generic code that's creating a container (as in where the container type is a template argument), but that's not something that's done normally. And if the factory function is just making using templated constructors cleaner, then generic code that's constructing such a container can still use the constructors. It just wouldn't be as nice as using the factory function. But for almost all cases, a non-internal factory function named after the container is less verbose and more flexible. - Jonathan M Davis
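
For reference, this is how the existing `redBlackTree` helper in `std.container.rbtree` behaves; the variable names are illustrative only:

```d
import std.algorithm.comparison : equal;
import std.container.rbtree : RedBlackTree, redBlackTree;

void main()
{
    // The element type is inferred from the arguments.
    auto t = redBlackTree(3, 1, 2);
    static assert(is(typeof(t) == RedBlackTree!int));
    assert(equal(t[], [1, 2, 3]));

    // RedBlackTree is a class, so copying the variable copies the reference.
    auto u = t;
    t.insert(4);
    assert(u.length == 4);
}
```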
Nov 28 2015
prev sibling next sibling parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 1. Factory function:
 2. The opCall trick:
1. Factory

Shouldn't opCall be used when you want something to (only) behave as a function? E.g. functors.
Nov 28 2015
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu 
wrote:
 There's this oddity of built-in hash tables: a reference to a 
 non-empty hash table can be copied and then both references 
 refer to the same hash table object. However, if the hash table 
 is null, copying the reference won't track the same object 
 later on.

 [...]
static opCall is just confusing IMHO. Factory function please.

Atila
Nov 29 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/29/15 5:06 AM, Atila Neves wrote:
 On Friday, 27 November 2015 at 20:14:21 UTC, Andrei Alexandrescu wrote:
 There's this oddity of built-in hash tables: a reference to a
 non-empty hash table can be copied and then both references refer to
 the same hash table object. However, if the hash table is null,
 copying the reference won't track the same object later on.

 [...]
 static opCall is just confusing IMHO. Factory function please.
Thanks all. I'll go with the factory function. -- Andrei
Nov 29 2015
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, 29 November 2015 at 15:06:49 UTC, Andrei Alexandrescu 
wrote:
 Thanks all. I'll go with the factory function. -- Andrei
As Timon suggested, I'd encourage you to go for a free factory function named after the container like RedBlackTree does with redBlackTree rather than having a static factory function, since it's less verbose and allows for type inference, though there's no reason why we couldn't have both:

auto c1 = MyCollection!int.make(1, 2, 3);
auto c2 = myCollection!int(1, 2, 3);
auto c3 = myCollection(1, 2, 3);

Regardless, it's better to avoid static opCall.

- Jonathan M Davis
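
A possible shape for having both, sketched under a few assumptions: the static `make` takes a typesafe variadic array rather than the template signature from the original post, and `Payload`, `insert`, and `length` are hypothetical.

```d
struct MyCollection(T)
{
    private static class Payload { T[] items; }
    private Payload payload;

    static MyCollection make(T[] items...)
    {
        MyCollection c;
        c.payload = new Payload;
        c.payload.items = items.dup;   // variadic storage may live on the stack
        return c;
    }

    void insert(T value) { payload.items ~= value; }

    size_t length() const
    {
        return payload is null ? 0 : payload.items.length;
    }
}

// Free factory: T is inferred from the arguments, or given explicitly
// when there are none to infer from.
auto myCollection(T)(T[] items...)
{
    return MyCollection!T.make(items);
}

void main()
{
    auto c1 = MyCollection!int.make(1, 2, 3);
    auto c2 = myCollection!int(1, 2, 3);
    auto c3 = myCollection(1, 2, 3);
    static assert(is(typeof(c3) == MyCollection!int));

    auto empty = myCollection!int();   // nullary case needs the explicit type
    auto same = empty;
    empty.insert(7);
    assert(same.length == 1);
    assert(c1.length == 3 && c2.length == 3);
}
```

A single free function covers the inferred, explicit, and empty cases here, in the same way `redBlackTree` does.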
Nov 29 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/27/15 3:14 PM, Andrei Alexandrescu wrote:
 There's this oddity of built-in hash tables: a reference to a non-empty
 hash table can be copied and then both references refer to the same hash
 table object. However, if the hash table is null, copying the reference
 won't track the same object later on.

 Fast-forward to general collections. If we want to support things like
 reference containers, clearly that oddity must be addressed. There are
 two typical approaches:

 1. Factory function:

 struct MyCollection(T)
 {
      static MyCollection make(U...)(auto ref U args);
      ...
 }

 So then client code is:

 auto c1 = MyCollection!(int).make(1, 2, 3);
 auto c2 = MyCollection!(int).make();
 auto c3 = c2; // refers to the same collection as c2

 2. The opCall trick:

 struct MyCollection(T)
 {
      static MyCollection opCall(U...)(auto ref U args);
      ...
 }

 with the client code:

 auto c1 = MyCollection!(int)(1, 2, 3);
 auto c2 = MyCollection!(int)();
 auto c3 = c2; // refers to the same collection as c2

 There's some experience in various libraries with both approaches. Which
 would you prefer?
How do you prevent the AA behavior? In other words, what happens here:

MyCollection!(int) c1;
auto c2 = c1;
c1 ~= 1;

assert(c2.contains(1)); // pass? fail?

BTW, I third Jonathan's and Timon's suggestion -- go with an external factory function. Use IFTI to its fullest!

-Steve
Nov 30 2015
parent reply Tobias Pankrath <tobias pankrath.net> writes:
On Monday, 30 November 2015 at 16:06:43 UTC, Steven Schveighoffer 
wrote:
 MyCollection!(int) c1;
 auto c2 = c1;
 c1 ~= 1;

 assert(c2.contains(1)); // pass? fail?

 BTW, I third Jonathan's and Timon's suggestion -- go with an 
 external factory function. Use IFTI to its fullest!

 -Steve
That should throw, because you're using an uninitialised reference (c1). It's the equivalent to:

class C { .. }

C c1;
C c2 = c1;
c1.foo(); // call via nullptr

Or it needs to pass, but that's probably not worth it.
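
The "should throw" behaviour could be sketched as follows, assuming the collection checks its payload with `enforce`; `Payload`, `insert`, and `contains` are hypothetical members, and a real design might prefer an assertion or a null dereference instead of an exception.

```d
import std.exception : enforce;

struct MyCollection(T)
{
    private static class Payload { T[] items; }
    private Payload payload;   // null until make() is called

    static MyCollection make()
    {
        MyCollection c;
        c.payload = new Payload;
        return c;
    }

    void insert(T value)
    {
        // No lazy creation: an uninitialized collection is an error.
        enforce(payload !is null, "collection is uninitialized; use make()");
        payload.items ~= value;
    }

    bool contains(T value) const
    {
        enforce(payload !is null, "collection is uninitialized; use make()");
        foreach (item; payload.items)
            if (item == value)
                return true;
        return false;
    }
}

void main()
{
    MyCollection!int c1;       // never initialized
    bool threw = false;
    try
    {
        c1.insert(1);
    }
    catch (Exception)
    {
        threw = true;
    }
    assert(threw);

    auto c3 = MyCollection!int.make();
    auto c4 = c3;              // copy taken while empty still shares state
    c3.insert(1);
    assert(c4.contains(1));
}
```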
Nov 30 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/30/15 11:21 AM, Tobias Pankrath wrote:
 On Monday, 30 November 2015 at 16:06:43 UTC, Steven Schveighoffer wrote:
 MyCollection!(int) c1;
 auto c2 = c1;
 c1 ~= 1;

 assert(c2.contains(1)); // pass? fail?

 BTW, I third Jonathan's and Timon's suggestion -- go with an external
 factory function. Use IFTI to its fullest!
 That should throw, because you're using an uninitialised reference (c1). It's the equivalent to:

 class C { .. }

 C c1;
 C c2 = c1;
 c1.foo(); // call via nullptr

 Or it needs to pass, but that's probably not worth it.
It means such a collection won't operate in the same way that associative arrays do. If that's the case, I'm OK with that. Technically, a wrapper could be constructed that performed the "lazy creation". But my point to Andrei was that the functions he suggests don't actually address the "oddity" of copying AAs. Addressing it, if it's done in the way you say, is as simple as not worrying about null pointers. -Steve
Nov 30 2015