digitalmars.D - Containers
- deadalnix (44/44) Aug 31 2021 I want to start working on a container library. Interestingly
- rikki cattermole (35/40) Aug 31 2021 Something I'm doing with my own containers is to create a "custom
- Paul Backus (22/32) Aug 31 2021 As far as I know, the best approach anyone's come up with for
- deadalnix (18/21) Aug 31 2021 ```d
- deadalnix (9/11) Aug 31 2021 So I read it and played with the concept a bit. If I'm honest, it
- Kagamin (3/5) Aug 31 2021 Declare opAssign on Vector!(const T) that takes const Vector!T,
- Alexandru Ermicioi (44/72) Aug 31 2021 Converting from Vector!T to Vector!(const T) should be possible
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (23/25) Aug 31 2021 Good initiative. I'd be happy to assist.
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (6/10) Aug 31 2021 Should be
- deadalnix (5/17) Aug 31 2021 There is a ton of good work in there!
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (25/29) Sep 01 2021 Yes. No objections afaict. All good points.
- deadalnix (22/40) Sep 01 2021 I think COW should be the default. It is not difficult to build
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (2/7) Sep 01 2021
- rikki cattermole (3/8) Sep 01 2021 Unrolled linked list, a linked list with X number of entries per node.
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (7/9) Sep 01 2021 Yes, where each node typically is sized to fit in a single
- rikki cattermole (12/21) Sep 01 2021 All depends on how the vector is implemented. If its just an array that
- Steven Schveighoffer (23/66) Aug 31 2021 This is the crux of the tail-const problem. The language allows
- Paul Backus (29/38) Aug 31 2021 One problem I can forsee with this approach is that, from the
- jmh530 (9/37) Aug 31 2021 I think you make some good points here.
- Paul Backus (13/17) Aug 31 2021 It looks like `opHeadMutable` is basically a restricted form of
- Steven Schveighoffer (13/26) Sep 01 2021 It's possible to forward, i.e.:
- Paul Backus (15/27) Sep 01 2021 This runs directly into the issue 1807 problem. The compiler has
- Steven Schveighoffer (18/43) Sep 01 2021 Right, we have to fix that problem anyway.
- Paul Backus (4/14) Sep 01 2021 As long as the type author does not use unsafe casts, the
- Kagamin (12/16) Sep 02 2021 This compiles:
- deadalnix (5/10) Sep 01 2021 It still wouldn't work when manipulating things by reference.
- Paul Backus (13/24) Sep 01 2021 It doesn't work when manipulating built-in types by reference
- deadalnix (4/16) Sep 01 2021 No you right, this is not what I meant (but this is def what I
- Paul Backus (12/15) Sep 01 2021 Are you sure?
- deadalnix (12/29) Sep 01 2021 Yes, const(int[][]) is effectively const(const(const(int)[])[]).
- deadalnix (4/17) Sep 01 2021 If the type is unrelated, then you are back to the heuristic
- deadalnix (7/9) Aug 31 2021 You seem to be confused. The C++11 standard prohibits std::string
- Steven Schveighoffer (6/15) Sep 01 2021 Isn't the reason they prohibited COW is because of the problems with it?
- deadalnix (13/20) Sep 01 2021 Well I never experienced any problem with them, so did many other
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (3/8) Sep 01 2021 In what ways was it a disaster? Because most strings are small
- deadalnix (7/15) Sep 01 2021 It was a disaster because gcc's standard lib used a COW string
- max haughton (13/58) Aug 31 2021 It's good that you are doing this Amaury, thanks.
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (24/30) Sep 01 2021 Speaking of which, here's my D implementation of struct of arrays
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (5/16) Sep 01 2021 Great point. Game engines extensively use such data layouts when
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (6/8) Sep 01 2021 You give Razvan and Andrei a ping aswell. They have done deep
- deadalnix (3/12) Sep 01 2021 I started looking into this after a short discussion with Andrei
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (4/6) Sep 01 2021 Also highly related is Atila's work on doing
- Per =?UTF-8?B?Tm9yZGzDtnc=?= (10/12) Sep 01 2021 Moreover, Rust collections provides makes rich of static members
- deadalnix (3/15) Sep 01 2021 I have adopted that style a lot lately and this is indeed much
- vit (38/47) Sep 01 2021 Try `this` template parameter:
- ikod (6/16) Sep 02 2021 My collections library[1] @safety inherited from the stored data
- Kagamin (2/3) Sep 02 2021 Do you want to expose the COW detail in the interface?
- deadalnix (5/8) Sep 02 2021 It doesn't need to be explicit, but in many instances, this is
I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front. I'll describe here what are the design I intend to go for, and the rationales for it, but before, there is an unsolved problem that I don't know how to work around that is somewhat of a blocker. If one of you guys have an idea how to solve it, I'm taking. If I'm being honest, this is probably the most important problem that need to be solved for D right now. That problem is turtling down type qualifiers. For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers. To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out. Now onto the design: - Value types. It is very easy to turn a value into a reference type, not so much the other way around. reference type also come with a bag of edges cases, like the collection being null vs empty, that cause confusion and problems in practice. - COW. Value type collection can be very expensive to work with if the eagerly copy everything. On the other hand, COW ensures copies are cheap. There are also very common use case that are well served by COW collections, for instance, when one wants value types for collection that are infrequently changes - but you don't know or can't prove that they won't. - Small collection features. As every engineer, but no mathematician will tell you, is that most numbers are small. It is also true of collection sizes. In order to speed things up for collection that are expected to be small, being able to specify Vector!(T, 8) to reserve 8 slots for elements, in the base collection, and only allocate on heap if the collection size grows past this. LLVM has a collection library doing so and it is the most useful thing in the world (see https://www.youtube.com/watch?v=vElZc6zSIXM for more details). - Not expose much of its internals. This like iterators being raw pointers (or the result of using the in operator) for instance, impose constraint on the internal that almost always end up costing a ton of perf as new techniques are discovered (recent exemples include open addressing with SSE probing for instance). All of this needs to be wrapped. I have a good idea how to solve all these problems. I have no idea how to solve the type qualifier one. I need help.
Aug 31 2021
On 01/09/2021 5:21 AM, deadalnix wrote:- Not expose much of its internals. This like iterators being raw pointers (or the result of using the in operator) for instance, impose constraint on the internal that almost always end up costing a ton of perf as new techniques are discovered (recent exemples include open addressing with SSE probing for instance). All of this needs to be wrapped.Something I'm doing with my own containers is to create a "custom reference" around the internals. ``` struct Thing { this(RCISharedAllocator allocator) { this.impl = allocator.make!ThingImpl(allocator); } this(ref Thing other) { this.impl = other.impl; if (this.impl !is null) atomicOp!"+="(this.impl.refCount, 1); } ~this() { if (this.impl !is null) { if (atomicOp!"-="(this.impl.refCount, 1) == 0) { RCISharedAllocator allocator = this.impl.allocator; allocator.dispose(this.impl); } } } safe: bool isNull() { return impl is null; } private: ThingImpl* impl; } private: struct ThingImpl { safe: RCISharedAllocator allocator; shared(size_t) refCount; } ``` I've found this to work quite well.
Aug 31 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers.As far as I know, the best approach anyone's come up with for this is the `headMutable` pattern described in this blog post: https://dlang.org/blog/2020/06/25/a-pattern-for-head-mutable-structures/ This gives us: * `Vector!T` -> `const(Vector!T)` (built in) * `const(Vector!T)` -> `Vector!(const(T))` (via `headMutable`) * `Vector!(const(T))` -> `const(Vector!(const(T)))` (built in) * `const(Vector!(const(T)))` -> `Vector!(const(T))` (via `headMutable`) The only conversion missing is * `const(Vector!(const(T)))` -> `const(Vector!T)` ...which would have to be implemented as another method (`tailUnqual`?). Of course many of these conversions are explicit, not implicit, but I don't think there's any way around that given D's current capabilities. The only user-defined implicit conversions in D are via inheritance and `alias this`, and neither can do what we need here.- COW. Value type collection can be very expensive to work with if the eagerly copy everything. On the other hand, COW ensures copies are cheap. There are also very common use case that are well served by COW collections, for instance, when one wants value types for collection that are infrequently changes - but you don't know or can't prove that they won't.This sounds similar to Clojure's immutable data structures. I agree these would be a good thing to have, although I think traditional mutable containers should also exist alongside them.
Aug 31 2021
On Tuesday, 31 August 2021 at 18:11:04 UTC, Paul Backus wrote:This sounds similar to Clojure's immutable data structures. I agree these would be a good thing to have, although I think traditional mutable containers should also exist alongside them.```d class RefContainer { Container c; } ``` Here you go :) There is obviously some boilerplate missing, but I'm sure you get the idea. One nice bonus is that you get a practically free clone method, that will only do so when you actually mutate things. You'll note that in practice, you don't need to clone the container every time you write, you need to keep a ref count and only clone when the number of references to the container is greater than 1. Lastly, my experience tells me that container being traditionally references was a mistakes. I have many scars to show for it. Thank you for the Head mutable article. I'll dig into this and see if I can make it work.
Aug 31 2021
On Tuesday, 31 August 2021 at 18:24:54 UTC, deadalnix wrote:Thank you for the Head mutable article. I'll dig into this and see if I can make it work.So I read it and played with the concept a bit. If I'm honest, it fall very much short of where it needs to be. It is inconvenient - but workable - when using things by value, however, the shortcoming become very apparent when it is used as a reference, or within a more complex type construction. This similarly go for other proposed solution in the thread. I don't think we'll ever have a set of container that are as good as they should without this being nailed down.
Aug 31 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I have a good idea how to solve all these problems. I have no idea how to solve the type qualifier one. I need help.Declare opAssign on Vector!(const T) that takes const Vector!T, then the assignment will compile.
Aug 31 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front. I'll describe here what are the design I intend to go for, and the rationales for it, but before, there is an unsolved problem that I don't know how to work around that is somewhat of a blocker. If one of you guys have an idea how to solve it, I'm taking. If I'm being honest, this is probably the most important problem that need to be solved for D right now. That problem is turtling down type qualifiers. For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers.Converting from Vector!T to Vector!(const T) should be possible through copy constructors and opAssign overloads, same for const(Vector!T) to const(Vector!(const T)). having something like this should suffice for initialization for example: ``` static if (is(T == const)) { this(Vector!(Unqual!T) mutable) { this.array = mutable.array; // Or you can do a complete copy. } static if (is(this == const)) { // ... well case for const T and const this } } ```To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out. Now onto the design: - Value types. It is very easy to turn a value into a reference type, not so much the other way around.Would be nice if even as value types different implementations say of a list would follow a well defined interface, same for other type of collections.reference type also come with a bag of edges cases, like the collection being null vs empty, that cause confusion and problems in practice.Imho, there is no problem here. null pointer to collection class means no collection present, while empty collection is just an empty collection. There should not be any conflation between these notions, in general, and could be made clear in the documentation.- Not expose much of its internals. This like iterators being raw pointers (or the result of using the in operator) for instance, impose constraint on the internal that almost always end up costing a ton of perf as new techniques are discovered (recent exemples include open addressing with SSE probing for instance). All of this needs to be wrapped.It would be nice if the collection api would have rich and consistent api for working with it. I.e have common interfaces for lists, sets, queues and etc. that value and reference type collections implement. And that is not only about the list of methods they implement, but also the behavior they exhibit. Java for example has a good api for collections, from which your collection library can take inspiration, and ofc with D twist of features. I also tried poking around with a collection implementation, though not value types but ref types, and found it being problematic to implement, due to limitations of how safety annotations work. You basically cannot force the interface being safe/trusted/system based on element it contains without doing extensive compile time introspection... So if you plan to add ref wrappers this one should be also resolved somehow to have a safe lib, otherwise I'm afraid class wrappers will be only for system code. Best regards, Alexandru.
Aug 31 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.Good initiative. I'd be happy to assist. I have a bunch of value-type containers (struct + dup) lying around [here](https://github.com/nordlow/phobos-next/tree/master/src/nxt) being ` safe pure nothrow nogc` whenever possible. It supports significantly more members than emsi-containers at least. It also tries to be dip-1000 compliant but there are like bugs lurking here and there. [Here](https://github.com/nordlow/phobos-next#containers)'s a likely incomplete list of the containers. Feel free to give me a ping if you need any guidance. You might find `nxt.trie` of special interest. Beware of its size, though. There's also a benchmark at benchmarks/containers you can try out via either benchmarks/containers/test-dmd-debug or benchmarks/containers/test-ldc-release . It benchmarks a subset of the containers together with D arrays and AAs. Somewhat related, I've also experimented with a run-time variant of Rust's ownership and borrowing rules at [borrown.d](https://github.com/nordlow/phobos-next/blob/master/src/nxt/borrown.d). Until later, Per
Aug 31 2021
On Tuesday, 31 August 2021 at 21:36:25 UTC, Per Nordlöw wrote:you can try out via either benchmarks/containers/test-dmd-debug or benchmarks/containers/test-ldc-releaseShould be ./test-dmd-debug or ./test-ldc-release under `benchmarks/containers`.
Aug 31 2021
On Tuesday, 31 August 2021 at 21:36:25 UTC, Per Nordlöw wrote:On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:There is a ton of good work in there! Do you agree with the direction explicated? I expected to focus on Vector/Set/Map initially. This should serve most use case in practice.I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.Good initiative. I'd be happy to assist. I have a bunch of value-type containers (struct + dup) lying around [here](https://github.com/nordlow/phobos-next/tree/master/src/nxt) being ` safe pure nothrow nogc` whenever possible. It supports significantly more members than emsi-containers at least. It also tries to be dip-1000 compliant but there are like bugs lurking here and there. [Here](https://github.com/nordlow/phobos-next#containers)'s a likely incomplete list of the containers. Feel free to give me a ping if you need any guidance. You might find `nxt.trie` of special interest. Beware of its size, though.
Aug 31 2021
On Tuesday, 31 August 2021 at 23:24:57 UTC, deadalnix wrote:There is a ton of good work in there!Yes, and I would really like it to come of use upstream. :)Do you agree with the direction explicated?Yes. No objections afaict. All good points. - CoW: sounds like the future direction in other languages such as Swift so no objections on that matter: - Should we enforce CoW or make it opt-in via storages fed as template parameters? - Will you be supporting - Will you be using ranges or iterators? - What about allocators? If you do can you please use std.experimental.allocator instead of stdx.allocator (used by emsi-containers and libdparse and dsymbol) is more than 2 years old now. std.experimental.allocator has received much love since, for instance `allocateZeroed` for the allocators that support it. - An interesting emsi container that was new to me is `UnrolledList`. It's used extensively in dsymbol.I expected to focus on Vector/Set/Map initially. This should serve most use case in practice.You should probably use robin-hood hashing if you pick open-addressing. I'm not using that (yet) in my hash sets/maps. It has very good asymptotic complexity. You should probably have a look at the state-of-the-art robin-hood implementations in C++ aswell. To run my own C++-benchmark for containers you can do ./test under benchmarks/containers-cxx .
Sep 01 2021
On Wednesday, 1 September 2021 at 10:12:23 UTC, Per Nordlöw wrote:- CoW: sounds like the future direction in other languages such as Swift so no objections on that matter: - Should we enforce CoW or make it opt-in via storages fed as template parameters?I think COW should be the default. It is not difficult to build reference container from COW ones, so we probably don't need it to be optional, simply provide wrappr around the COW container to provide references ones.- Will you be supportingI'm not sure what you mean here. If it is bugfixes and all, I don't have much problem with that. Container are fairly straightforward, so I don't expect the workload to be very high anyways.- Will you be using ranges or iterators?This is D, we must provide ranges. But in operator and similar kind forces us to do both.- What about allocators? If you do can you please use std.experimental.allocator instead of stdx.allocator (used by emsi-containers and libdparse and dsymbol) is more than 2 years old now. std.experimental.allocator has received much love since, for instance `allocateZeroed` for the allocators that support it.I have to admit I have not put too much thought into this.- An interesting emsi container that was new to me is `UnrolledList`. It's used extensively in dsymbol.I have no idea what that is.Yes, that being said, I would like to not expose the implementation. PHP and python were able to move from a tree based approach to open addressing for dictionaries, for great benefit. They could do so because they didn't expose the innards of the container. There are very interesting approach leveraging vector capabilities of modern CPU such as F14: https://engineering.fb.com/2019/04/25/developer-tools/f14/ and exposing the innards would prevent moving to them later on.I expected to focus on Vector/Set/Map initially. This should serve most use case in practice.You should probably use robin-hood hashing if you pick open-addressing.
Sep 01 2021
On Wednesday, 1 September 2021 at 10:32:45 UTC, deadalnix wrote:Ignore that. An editing mistake.- Will you be supportingI'm not sure what you mean here. If it is bugfixes and all, I don't have much problem with that. Container are fairly straightforward, so I don't expect the workload to be very high anyways.
Sep 01 2021
On 01/09/2021 10:32 PM, deadalnix wrote:Unrolled linked list, a linked list with X number of entries per node. Very good for cache efficiency especially if you need append only.- An interesting emsi container that was new to me is `UnrolledList`. It's used extensively in dsymbol.I have no idea what that is.
Sep 01 2021
On Wednesday, 1 September 2021 at 12:09:19 UTC, rikki cattermole wrote:Unrolled linked list, a linked list with X number of entries per node.Yes, where each node typically is sized to fit in a single cache-line. Suitable for small element types. I don't know why an UnrolledList is preferred over a traditional vector type bundled with an allocator that does the same grouping of allocations, though.
Sep 01 2021
On 02/09/2021 8:55 AM, Per Nordlöw wrote:On Wednesday, 1 September 2021 at 12:09:19 UTC, rikki cattermole wrote:All depends on how the vector is implemented. If its just an array that is resized as required with extra capacity, then obviously there is memory fragmentation issues that can result in allocation failing. But I think each have different memory patterns associated with it, so it may not be clear cut. For example I'm working on a color palette data structure using an unrolled linked list as the basis. Insertions, removals are both quite common here, but so is only appending. And of course quite importantly you want fairly fast skipping of entries (hence unrolled rather than a straight linked list) to find the one you want.Unrolled linked list, a linked list with X number of entries per node.Yes, where each node typically is sized to fit in a single cache-line. Suitable for small element types. I don't know why an UnrolledList is preferred over a traditional vector type bundled with an allocator that does the same grouping of allocations, though.
Sep 01 2021
On 8/31/21 1:21 PM, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front. I'll describe here what are the design I intend to go for, and the rationales for it, but before, there is an unsolved problem that I don't know how to work around that is somewhat of a blocker. If one of you guys have an idea how to solve it, I'm taking. If I'm being honest, this is probably the most important problem that need to be solved for D right now. That problem is turtling down type qualifiers. For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers.This is the crux of the tail-const problem. The language allows tail-const for only 2 types, T[] and T*. Everything else is fully const or fully mutable.To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out.I had an idea a long time ago, but ashamedly never brought it to fruition. But long story short, I think we need a way to specify "tail" for modifiers, and allow implicitly casting the head. Then you'd have `tailconst(Vector!int)`, and you are good. I actually had a decent syntax for it, but I'm not sure it would fly anyway.Now onto the design: - Value types. It is very easy to turn a value into a reference type, not so much the other way around. reference type also come with a bag of edges cases, like the collection being null vs empty, that cause confusion and problems in practice.Null vs. empty isn't a problem for non-class reference types. Of course, without tail-const, you are kinda screwed.- COW. Value type collection can be very expensive to work with if the eagerly copy everything. On the other hand, COW ensures copies are cheap. There are also very common use case that are well served by COW collections, for instance, when one wants value types for collection that are infrequently changes - but you don't know or can't prove that they won't.I don't agree that COW is a good idea for generic collections. IIRC, C++ tried this with std::string and it was a disaster.- Small collection features. As every engineer, but no mathematician will tell you, is that most numbers are small. It is also true of collection sizes. In order to speed things up for collection that are expected to be small, being able to specify Vector!(T, 8) to reserve 8 slots for elements, in the base collection, and only allocate on heap if the collection size grows past this. LLVM has a collection library doing so and it is the most useful thing in the world (see https://www.youtube.com/watch?v=vElZc6zSIXM for more details).I think this is a great idea.- Not expose much of its internals. This like iterators being raw pointers (or the result of using the in operator) for instance, impose constraint on the internal that almost always end up costing a ton of perf as new techniques are discovered (recent exemples include open addressing with SSE probing for instance). All of this needs to be wrapped.Feel free to copy any of my ideas from dcollections for this. See https://github.com/schveiguy/dcollections/blob/master/concepts.txt Also note that Dcollections were strictly reference types, but had struct-based implementations (which could actually be swapped out with a different implementation if anyone had ever developed any). It came in handy in some cases, like the hash bucket collision linked list was just the implementation for the standard linked list.I have a good idea how to solve all these problems. I have no idea how to solve the type qualifier one. I need help.Feel free to ping, maybe on slack. We can discuss the tail-const problem. -Steve
Aug 31 2021
On Tuesday, 31 August 2021 at 23:56:51 UTC, Steven Schveighoffer wrote:One problem I can forsee with this approach is that, from the compiler's point of view, there is not necessarily any predictable relationship between `qualifier(Template!T)` and `Template!(qualifier(T))`. For all it knows, you could have written code like the following: ```d struct Vector(T) { static if (is(T == const)) { // do one thing } else { // do something completely different } } ``` Maybe in some restricted cases it could examine the uninstantiated template, determine that there's no funny business going on, and conclude that the implicit conversion is valid--but this kind of heuristic-based approach tends to be very brittle and compose poorly (exhibit A: [issue 1807][1]). In general, if we want the compiler to know that a conversion from one template instantiation to another is valid, we are going to have to [prove it by construction][2]--which is to say, by writing a function (like a constructor or an `opCast` overload) that actually performs the conversion. [1]: https://issues.dlang.org/show_bug.cgi?id=1807 [2]: https://en.wikipedia.org/wiki/Constructive_proofTo put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out.I had an idea a long time ago, but ashamedly never brought it to fruition. But long story short, I think we need a way to specify "tail" for modifiers, and allow implicitly casting the head. Then you'd have `tailconst(Vector!int)`, and you are good. I actually had a decent syntax for it, but I'm not sure it would fly anyway.
Aug 31 2021
On Wednesday, 1 September 2021 at 00:26:33 UTC, Paul Backus wrote:[snip] One problem I can forsee with this approach is that, from the compiler's point of view, there is not necessarily any predictable relationship between `qualifier(Template!T)` and `Template!(qualifier(T))`. For all it knows, you could have written code like the following: ```d struct Vector(T) { static if (is(T == const)) { // do one thing } else { // do something completely different } } ``` Maybe in some restricted cases it could examine the uninstantiated template, determine that there's no funny business going on, and conclude that the implicit conversion is valid--but this kind of heuristic-based approach tends to be very brittle and compose poorly (exhibit A: [issue 1807][1]). In general, if we want the compiler to know that a conversion from one template instantiation to another is valid, we are going to have to [prove it by construction][2]--which is to say, by writing a function (like a constructor or an `opCast` overload) that actually performs the conversion. [1]: https://issues.dlang.org/show_bug.cgi?id=1807 [2]: https://en.wikipedia.org/wiki/Constructive_proofI think you make some good points here. My recollection is that opHeadMutable [1] was intended as a way to inform the compiler that the conversion is valid. On issue 1807, as much as I want that enhancement, I just hope that there is a general solution to resolve it and not just a heuristic-based one. [1] https://forum.dlang.org/post/cpxfgdmklgusodqouqdr forum.dlang.org
Aug 31 2021
On Wednesday, 1 September 2021 at 01:38:49 UTC, jmh530 wrote:My recollection is that opHeadMutable [1] was intended as a way to inform the compiler that the conversion is valid. [1] https://forum.dlang.org/post/cpxfgdmklgusodqouqdr forum.dlang.orgIt looks like `opHeadMutable` is basically a restricted form of "implicit `opCast`"--that is, rather than allowing the programmer to define *arbitrary* implicit conversions, it lets them define implicit conversions of the form `qual(Template!(Args))` -> `Template!(qual(Args))`. IMO it is a compromise that both sides would find unsatisfying. Those who want convenient implicit conversions will not be happy with it because there are important cases it does not cover (e.g. `qual(Template!(qual(T)))` -> `qual(Template!T)`), but it is still flexible enough that anyone opposed to user-defined implicit conversions in general will likely find it hard to stomach.
Aug 31 2021
On 8/31/21 8:26 PM, Paul Backus wrote:On Tuesday, 31 August 2021 at 23:56:51 UTC, Steven Schveighoffer wrote:It's possible to forward, i.e.: ```d template Vector(T) if (isConst!T) { alias Vector = tailconst(.Vector!(Unqual!T)); } ``` If you look at arrays, `const(T)[]` is the same as `tailconst(T[])`. This would be similar. The problem I have with any other approach is that the compiler doesn't understand the relationships, and you will run into weird corner cases. It can't correctly add or strip the qualifier to the correct members when calling compatible functions. -SteveOne problem I can forsee with this approach is that, from the compiler's point of view, there is not necessarily any predictable relationship between `qualifier(Template!T)` and `Template!(qualifier(T))`.To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out.I had an idea a long time ago, but ashamedly never brought it to fruition. But long story short, I think we need a way to specify "tail" for modifiers, and allow implicitly casting the head. Then you'd have `tailconst(Vector!int)`, and you are good. I actually had a decent syntax for it, but I'm not sure it would fly anyway.
Sep 01 2021
On Wednesday, 1 September 2021 at 11:04:41 UTC, Steven Schveighoffer wrote:It's possible to forward, i.e.: ```d template Vector(T) if (isConst!T) { alias Vector = tailconst(.Vector!(Unqual!T)); } ``` If you look at arrays, `const(T)[]` is the same as `tailconst(T[])`. This would be similar.This runs directly into the issue 1807 problem. The compiler has to recognize this specific pattern as meaningful, and if the programmer writes something *semantically* equivalent using a different pattern, their code will mysteriously break; e.g., ```d alias TailConstOf(T) = tailconst(T); alias Vector(T : const(U)) = TailConstOf!(.Vector!U); ```The problem I have with any other approach is that the compiler doesn't understand the relationships, and you will run into weird corner cases. It can't correctly add or strip the qualifier to the correct members when calling compatible functions.With the implicit `opCast`/constructor approach (or the `opHeadMutable` approach), the compiler doesn't have to understand the relationships. It just needs to know the original type and the desired type, and call the appropriate user-defined function to perform the conversion.
Sep 01 2021
On 9/1/21 7:57 AM, Paul Backus wrote:On Wednesday, 1 September 2021 at 11:04:41 UTC, Steven Schveighoffer wrote:Right, we have to fix that problem anyway. To further explain for readers, the issue is: ```d void foo(T)(Vector!(const(T)) arg); ``` This should accept a `Vector!(const int)` which is aliased as `tailconst(Vector!int)`, but the compiler can't see how to infer `T` as `int` from `tailconst(Vector!int)`. I think the implicit cast solution suffers from similar problems. It works fine if you use `const(T)[]`, which is the standard we should strive for. This is one of those longstanding issues that needs addressing.It's possible to forward, i.e.: ```d template Vector(T) if (isConst!T) { alias Vector = tailconst(.Vector!(Unqual!T)); } ``` If you look at arrays, `const(T)[]` is the same as `tailconst(T[])`. This would be similar.This runs directly into the issue 1807 problem. The compiler has to recognize this specific pattern as meaningful, and if the programmer writes something *semantically* equivalent using a different pattern, their code will mysteriously break; e.g., ```d alias TailConstOf(T) = tailconst(T); alias Vector(T : const(U)) = TailConstOf!(.Vector!U); ```With the implicit `opCast`/constructor approach (or the `opHeadMutable` approach), the compiler doesn't have to understand the relationships. It just needs to know the original type and the desired type, and call the appropriate user-defined function to perform the conversion.It just means you need to handle all the implicit conversions, which is not trivial. The compiler already has rules for implicit const conversions, some of which have subtle pitfalls if you get them wrong. I'd rather utilize those mechanisms rather than relying on each type author to get them all right. -Steve
Sep 01 2021
On Wednesday, 1 September 2021 at 12:32:36 UTC, Steven Schveighoffer wrote:As long as the type author does not use unsafe casts, the compiler will prevent them from getting the conversions wrong.With the implicit `opCast`/constructor approach (or the `opHeadMutable` approach), the compiler doesn't have to understand the relationships. It just needs to know the original type and the desired type, and call the appropriate user-defined function to perform the conversion.It just means you need to handle all the implicit conversions, which is not trivial. The compiler already has rules for implicit const conversions, some of which have subtle pitfalls if you get them wrong. I'd rather utilize those mechanisms rather than relying on each type author to get them all right.
Sep 01 2021
On Wednesday, 1 September 2021 at 12:32:36 UTC, Steven Schveighoffer wrote:To further explain for readers, the issue is: ```d void foo(T)(Vector!(const(T)) arg); ```This compiles: ```d struct Vector(T){} void f(T:const(U),U)(Vector!T arg); void f1() { Vector!(const int) a; f(a); } ```
Sep 02 2021
On Wednesday, 1 September 2021 at 11:57:01 UTC, Paul Backus wrote:With the implicit `opCast`/constructor approach (or the `opHeadMutable` approach), the compiler doesn't have to understand the relationships. It just needs to know the original type and the desired type, and call the appropriate user-defined function to perform the conversion.It still wouldn't work when manipulating things by reference. In fact, the litmus test for a solution to that problem should be: can we reproduce the behavior of builtin types such as T* or T[] with it?
Sep 01 2021
On Wednesday, 1 September 2021 at 12:52:47 UTC, deadalnix wrote:On Wednesday, 1 September 2021 at 11:57:01 UTC, Paul Backus wrote:It doesn't work when manipulating built-in types by reference either: ```d void manipulate(ref const(int)[] arr) {} void main() { const(int[]) arr; manipulate(arr); // Error: cannot pass argument `arr` of type `const(int[])` // to parameter `ref const(int)[] arr` } ```With the implicit `opCast`/constructor approach (or the `opHeadMutable` approach), the compiler doesn't have to understand the relationships. It just needs to know the original type and the desired type, and call the appropriate user-defined function to perform the conversion.It still wouldn't work when manipulating things by reference. In fact, the litmus test for a solution to that problem should be: can we reproduce the behavior of builtin types such as T* or T[] with it?
Sep 01 2021
On Wednesday, 1 September 2021 at 13:22:46 UTC, Paul Backus wrote:It doesn't work when manipulating built-in types by reference either: ```d void manipulate(ref const(int)[] arr) {} void main() { const(int[]) arr; manipulate(arr); // Error: cannot pass argument `arr` of type `const(int[])` // to parameter `ref const(int)[] arr` } ```No you right, this is not what I meant (but this is def what I wrote). I meant that int[][] or int[]* turtle the qualifier down.
Sep 01 2021
On Wednesday, 1 September 2021 at 18:23:06 UTC, deadalnix wrote:No you right, this is not what I meant (but this is def what I wrote). I meant that int[][] or int[]* turtle the qualifier down.Are you sure? ```d void manipulate(const(int)[][] arr) {} void main() { const(int[][]) arr; manipulate(arr); // Error: cannot pass argument `arr` of type `const(int[][])` // to parameter `const(int)[][] arr` } ```
Sep 01 2021
On Wednesday, 1 September 2021 at 18:32:29 UTC, Paul Backus wrote:On Wednesday, 1 September 2021 at 18:23:06 UTC, deadalnix wrote:Yes, const(int[][]) is effectively const(const(const(int)[])[]). In the same way, you want const(Vector!int[]) to be const(const(Vector!(const(int))[]). Which means that you should be able to convert implicitly a reference to a Vector!int into a reference to a Vector!(const(int)). In your example, you remove const qualifiers, so it obviously doesn't work. But it work in the other direction, when you add them. It won't for a collection even with an implicit cast operation, as what is really desired is some kind of reinterpret_cast.No you right, this is not what I meant (but this is def what I wrote). I meant that int[][] or int[]* turtle the qualifier down.Are you sure? ```d void manipulate(const(int)[][] arr) {} void main() { const(int[][]) arr; manipulate(arr); // Error: cannot pass argument `arr` of type `const(int[][])` // to parameter `const(int)[][] arr` } ```
Sep 01 2021
On Wednesday, 1 September 2021 at 11:04:41 UTC, Steven Schveighoffer wrote:It's possible to forward, i.e.: ```d template Vector(T) if (isConst!T) { alias Vector = tailconst(.Vector!(Unqual!T)); } ``` If you look at arrays, `const(T)[]` is the same as `tailconst(T[])`. This would be similar. The problem I have with any other approach is that the compiler doesn't understand the relationships, and you will run into weird corner cases. It can't correctly add or strip the qualifier to the correct members when calling compatible functions. -SteveIf the type is unrelated, then you are back to the heuristic problem.
Sep 01 2021
On Tuesday, 31 August 2021 at 23:56:51 UTC, Steven Schveighoffer wrote:I don't agree that COW is a good idea for generic collections. IIRC, C++ tried this with std::string and it was a disaster.You seem to be confused. The C++11 standard prohibits std::string from using COW. Existing C++ implementations were using COW, and getting out of it was what was a disaster (notably, there were breaking ABI changes, which the C++ ecosystem is not well suited to handle).
Aug 31 2021
On 8/31/21 8:52 PM, deadalnix wrote:On Tuesday, 31 August 2021 at 23:56:51 UTC, Steven Schveighoffer wrote:Isn't the reason they prohibited COW is because of the problems with it? Not saying COW is terrible in all cases, just that a generic container (non-specialized usage) should not be COW. If you need that feature, you should have to ask for it, not have it be an implementation detail. -SteveI don't agree that COW is a good idea for generic collections. IIRC, C++ tried this with std::string and it was a disaster.You seem to be confused. The C++11 standard prohibits std::string from using COW. Existing C++ implementations were using COW, and getting out of it was what was a disaster (notably, there were breaking ABI changes, which the C++ ecosystem is not well suited to handle).
Sep 01 2021
On Wednesday, 1 September 2021 at 11:13:27 UTC, Steven Schveighoffer wrote:Isn't the reason they prohibited COW is because of the problems with it? Not saying COW is terrible in all cases, just that a generic container (non-specialized usage) should not be COW. If you need that feature, you should have to ask for it, not have it be an implementation detail. -SteveWell I never experienced any problem with them, so did many other C++ users. As far as i am concerned, this was a net step backward. That being said, the reason were due to iterator invalidation. There are several function in the string API that do not invalidate iterators, as per the C++ standard, and these function might end up copying the content of the string in some specific situations, which would invalidate iterators. In practice, I think the main problem here is related to exposing the innards of the datastructure. An iterator that contains a reference to the container + index, rather than some kind of raw pointer for instance, could avoid that problem.
Sep 01 2021
On Wednesday, 1 September 2021 at 00:52:04 UTC, deadalnix wrote:You seem to be confused. The C++11 standard prohibits std::string from using COW. Existing C++ implementations were using COW, and getting out of it was what was a disaster (notably, there were breaking ABI changes, which the C++ ecosystem is not well suited to handle).In what ways was it a disaster? Because most strings are small enough to benefit from the small string optimization?
Sep 01 2021
On Wednesday, 1 September 2021 at 20:48:10 UTC, Per Nordlöw wrote:On Wednesday, 1 September 2021 at 00:52:04 UTC, deadalnix wrote:It was a disaster because gcc's standard lib used a COW string implementation, but had to change it when moving the C++11. Considering virutally everything uses string and that a lot of C++ software is distributed in binary format, and that the standard lib is often liked in live at launch time by the OS, you are faced with an ABI breakage nightmare.You seem to be confused. The C++11 standard prohibits std::string from using COW. Existing C++ implementations were using COW, and getting out of it was what was a disaster (notably, there were breaking ABI changes, which the C++ ecosystem is not well suited to handle).In what ways was it a disaster? Because most strings are small enough to benefit from the small string optimization?
Sep 01 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front. I'll describe here what are the design I intend to go for, and the rationales for it, but before, there is an unsolved problem that I don't know how to work around that is somewhat of a blocker. If one of you guys have an idea how to solve it, I'm taking. If I'm being honest, this is probably the most important problem that need to be solved for D right now. That problem is turtling down type qualifiers. For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers. To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out. Now onto the design: - Value types. It is very easy to turn a value into a reference type, not so much the other way around. reference type also come with a bag of edges cases, like the collection being null vs empty, that cause confusion and problems in practice. - COW. Value type collection can be very expensive to work with if the eagerly copy everything. On the other hand, COW ensures copies are cheap. There are also very common use case that are well served by COW collections, for instance, when one wants value types for collection that are infrequently changes - but you don't know or can't prove that they won't. - Small collection features. As every engineer, but no mathematician will tell you, is that most numbers are small. It is also true of collection sizes. In order to speed things up for collection that are expected to be small, being able to specify Vector!(T, 8) to reserve 8 slots for elements, in the base collection, and only allocate on heap if the collection size grows past this. LLVM has a collection library doing so and it is the most useful thing in the world (see https://www.youtube.com/watch?v=vElZc6zSIXM for more details). - Not expose much of its internals. This like iterators being raw pointers (or the result of using the in operator) for instance, impose constraint on the internal that almost always end up costing a ton of perf as new techniques are discovered (recent exemples include open addressing with SSE probing for instance). All of this needs to be wrapped. I have a good idea how to solve all these problems. I have no idea how to solve the type qualifier one. I need help.It's good that you are doing this Amaury, thanks. We need better containers to compete with other languages. On top of that, there are patterns that other languages cannot do. One thing that comes to mind is how D is one of not many languages (languages with macros or similar probably can), that can easily allow one to write containers which lie about their layout. The most profitable example of that I can think of is automatically switching arrays of POD types to be a struct of arrays internally. This transformation makes random access to elements slower, however (say) loops where access to a subset of the elements dominates then this is much faster due to improved cache efficiency.
Aug 31 2021
On Wednesday, 1 September 2021 at 02:08:40 UTC, max haughton wrote:layout. The most profitable example of that I can think of is automatically switching arrays of POD types to be a struct of arrays internally. This transformation makes random access to elements slower, however (say) loops where access to a subset of the elements dominates then this is much faster due to improved cache efficiency.Speaking of which, here's my D implementation of struct of arrays (SoA): https://github.com/nordlow/phobos-next/blob/master/src/nxt/soa.d Peter Kirov has an alternative implementation providing some benefits. I've copied it into https://github.com/nordlow/phobos-next/blob/master/src/nxt/soa_petar_kirov.d The two implementations should be united. Note that Petar's SoA-implementation has its `capacity`-property functionally deduced from the `length` property as ```d /** Returns the current capacity - the current length rounded up to a power of two. This assumption allows us to save on having a separate field. */ size_t capacity() const pure safe { return _length.roundUpToPowerOf2; } ``` thereby saving a word of (stack) space. That behaviour could be opt-in aswell via a template parameter.
Sep 01 2021
On Wednesday, 1 September 2021 at 02:08:40 UTC, max haughton wrote:We need better containers to compete with other languages. On top of that, there are patterns that other languages cannot do. One thing that comes to mind is how D is one of not many languages (languages with macros or similar probably can), that can easily allow one to write containers which lie about their layout. The most profitable example of that I can think of is automatically switching arrays of POD types to be a struct of arrays internally. This transformation makes random access to elements slower, however (say) loops where access to a subset of the elements dominates then this is much faster due to improved cache efficiency.Great point. Game engines extensively use such data layouts when storing vertex data. That's reason for Jai wanting a builtin SoA type. Jai has interestingly since made that a library feature.
Sep 01 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.You give Razvan and Andrei a ping aswell. They have done deep work on the topic of specifically trying to make RC-backed containers/collections safe pure nothrow nogc. One of Andrei's conclusion was that a run-time flag is needed to reveal whether a const parameter is fed an immutable argument or not.
Sep 01 2021
On Wednesday, 1 September 2021 at 10:32:41 UTC, Per Nordlöw wrote:On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I started looking into this after a short discussion with Andrei :)I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.You give Razvan and Andrei a ping aswell. They have done deep work on the topic of specifically trying to make RC-backed containers/collections safe pure nothrow nogc. One of Andrei's conclusion was that a run-time flag is needed to reveal whether a const parameter is fed an immutable argument or not.
Sep 01 2021
On Wednesday, 1 September 2021 at 10:33:47 UTC, deadalnix wrote:I started looking into this after a short discussion with Andrei :)Also highly related is Atila's work on doing `std.typecons.RefCounted` safe at https://github.com/dlang/phobos/pull/8101. Seems stalled, though.
Sep 01 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.Moreover, Rust collections provides makes rich of static members such as [1] `C.with_capacity(size_t)` instead of guessing what C(42) means which have introduced bugs in my code previously. Might be worthwhile considering making use of these patterns. My containers make use of such static members. [1] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.with_capacity.
Sep 01 2021
On Wednesday, 1 September 2021 at 10:44:46 UTC, Per Nordlöw wrote:On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I have adopted that style a lot lately and this is indeed much better than C(42).I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.Moreover, Rust collections provides makes rich of static members such as [1] `C.with_capacity(size_t)` instead of guessing what C(42) means which have introduced bugs in my code previously. Might be worthwhile considering making use of these patterns. My containers make use of such static members. [1] https://doc.rust-lang.org/std/vec/struct.Vec.html#method.with_capacity.
Sep 01 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:... For instance, if we have a Vector!T, then we want it to convert to a Vector!(const T) and we want const(Vector!T) == const(Vector!(const T)), all of it implicitly. Same goes for other qualifiers. To put it bluntly, I have no idea how to achieve this. And I don't think we'll have a good set of collection before this is sorted out. ...Try `this` template parameter: ```d import std.traits : CopyConstness, isMutable; void main(){ { Vector!int a; Vector!(const int) b = a; const Vector!int c = b; } { Vector!int a; Vector!(const int) b; b = a; } } struct Vector(T){ alias ElementType = T; this(U, this This)(ref Vector!U rhs) if(!is(U == T) && isCopyable!(This, typeof(rhs))){ //impl } this(U, this This)(Vector!U rhs) if(isCopyable!(typeof(rhs), This)){ //impl } ///TODO copy ctors... void opAssign(U, this This)(auto ref Vector!U rhs) if(isCopyable!(typeof(rhs), This) && isMutable!This){ //impl } } private enum bool isCopyable(From, To) = is( CopyConstness!(From, From.ElementType[]) : CopyConstness!(To, To.ElementType[]) ); ```
Sep 01 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:I want to start working on a container library. Interestingly enough, I don't think we have what we need on that front.Now onto the design: - Value types. It is very easy to turn a value into a reference type, not so much the other way around. reference type also come with a bag of edges cases, like the collection being null vs empty, that cause confusion and problems in practice.My collections library[1] safety inherited from the stored data types. Any data access methods that returns reference marked with ' system', so you can't use it easily in safe code.I have a good idea how to solve all these problems. I have no idea how to solve the type qualifier one. I need help.You didn't mention iterator stability. [1] https://github.com/ikod/ikod-containers
Sep 02 2021
On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:- Not expose much of its internals.Do you want to expose the COW detail in the interface?
Sep 02 2021
On Thursday, 2 September 2021 at 09:15:10 UTC, Kagamin wrote:On Tuesday, 31 August 2021 at 17:21:02 UTC, deadalnix wrote:It doesn't need to be explicit, but in many instances, this is going to be visible, for instance, if the type stored has a copy constructor, then copying the container will be a very visible operation.- Not expose much of its internals.Do you want to expose the COW detail in the interface?
Sep 02 2021