digitalmars.D - Copy-On-Write (COW) Managed Containers?
- Per Nordlöw (12/12) Oct 20 2020 What are your thoughts on the pros and cons with copy-on-write
- Ola Fosheim Grøstad (10/18) Oct 20 2020 Copy-on-write only makes sense if you intend to make copies
- IGotD- (22/33) Oct 20 2020 I can see several cases where you want to do operations on
- Ola Fosheim Grøstad (17/31) Oct 20 2020 Yes, C++ has std::span for that. Not really sure why they also
- Max Haughton (16/47) Oct 20 2020 I think Facebook's string library still has flag/s for small
- ikod (5/8) Oct 20 2020 I use COW to create copy for hashmap bucket array if user decided
- rikki cattermole (6/10) Oct 20 2020 I am very interested in concurrent data structures. They give the
What are your thoughts on the pros and cons of copy-on-write (COW) managed allocations in D, typically for containers such as `std.container.Array`? Has it been considered for use in a standard containers/collections library in D? Why not?

The only example I've found in C++ is std::string, which in some versions of the STL seems to make use of COW. Swift, on the other hand, uses it extensively to minimize implicit aliasing. At [1] Chris Lattner outlines the design choices behind this decision.

[1] https://youtu.be/nWTvXbQHwWs?t=1026
Oct 20 2020
On Tuesday, 20 October 2020 at 12:42:26 UTC, Per Nordlöw wrote:
> What are your thoughts on the pros and cons of copy-on-write (COW) managed allocations in D, typically for containers such as `std.container.Array`? Has it been considered for use in a standard containers/collections library in D? Why not? The only example I've found in C++ is std::string, which in some versions of the STL seems to make use of COW.

Copy-on-write only makes sense if you intend to make copies without modifying them. That is rather unlikely for arrays: when you create a copy, you usually do it with the intent of modifying the copy.

I think this behaviour for std::string came about because C++ didn't get std::string_view until recently. I think it is a flaw.

If you use reference counting throughout, like Swift/Objective-C, then I guess you could do it. But it isn't really suitable for low level programming. It is a high level feature.
Oct 20 2020
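For readers who want the mechanism spelled out, here is a minimal single-threaded sketch in C++, the language of the std::string precedent discussed in this thread. `CowArray` is a hypothetical name invented for illustration, not a type from Phobos or the C++ standard library, and the `use_count()` test is not safe if handles are shared across threads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical illustration type; not part of any library.
template <typename T>
class CowArray {
public:
    CowArray() : data_(std::make_shared<std::vector<T>>()) {}

    // The compiler-generated copy operations copy only the shared_ptr,
    // so copying a CowArray bumps a reference count and nothing else.

    std::size_t size() const { return data_->size(); }
    const T& operator[](std::size_t i) const { return (*data_)[i]; }

    // Every mutating operation first makes sure the payload is unshared.
    void push_back(const T& value) { detach(); data_->push_back(value); }
    void set(std::size_t i, const T& value) { detach(); (*data_)[i] = value; }

    // True while both handles still refer to one payload (for demonstration).
    bool shares_storage_with(const CowArray& other) const {
        return data_ == other.data_;
    }

private:
    // Clone the payload if another handle still refers to it.
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<T>>(*data_);
    }

    std::shared_ptr<std::vector<T>> data_;
};
```

A copy that is never written to costs one reference-count increment; a copy that is written to costs the same full clone a plain copy would have cost, plus the bookkeeping. That is the trade-off the post above is pointing at.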
On Tuesday, 20 October 2020 at 14:56:44 UTC, Ola Fosheim Grøstad wrote:
> Copy-on-write only makes sense if you intend to make copies without modifying them. That is rather unlikely for arrays. When you create a copy you usually do it with the intent of modifying the copy. I think this behaviour for std::string came about because C++ didn't get std::string_view until recently. I think it is a flaw. If you use reference counting throughout like Swift/Objective-C, then I guess you could do it. But it isn't really suitable for low level programming. It is a high level feature.

I can see several cases where you want to do operations on slices, regardless of whether the elements are string characters or some other type.

Today std::string uses SSO (short string optimization) in most libraries, which means that if the string is shorter than a certain size (usually around 16 bytes) it can be stored inside the class itself; otherwise it must allocate the array. If the string is copied, it actually copies the data and does not reuse anything. Previously, which must be several years ago now, std::string used COW, and the reason it was dropped, they claimed, was that it didn't scale in multiprocessor environments, but I've not seen the actual reasoning behind it.

The reason we have string_view is that around C++11 std::string added zero termination by default, which wasn't required before. string_view is now required because of this, and you can really debate whether that was a sane choice.

Also, reference counting might very well be suitable for low level programming. It's actually a GC method that is used often. The Linux kernel is full of it, in a manual fashion of course.

Aren't both the array in std.container.Array and the regular built-in array COW in D?
Oct 20 2020
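The difference between an owning copy and a non-owning view can be checked directly. The sketch below assumes a C++17 compiler and a standard library that follows the C++11 rules for std::string (no COW); `has_own_buffer` and `middle` are hypothetical helper names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// True when the two strings keep their characters in separate buffers.
inline bool has_own_buffer(const std::string& a, const std::string& b) {
    return a.data() != b.data();
}

// A non-owning slice of `s`: no allocation and no copy.
// The returned view must not outlive `s`.
inline std::string_view middle(const std::string& s,
                               std::size_t pos, std::size_t len) {
    return std::string_view(s).substr(pos, len);
}
```

Copying a std::string gives the copy its own characters whether they live in the inline SSO buffer or on the heap, while the view simply points into the original.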
On Tuesday, 20 October 2020 at 15:28:36 UTC, IGotD- wrote:
> I can see several cases where you want to do operations on slices, regardless of whether the elements are string characters or some other type.

Yes, C++ has std::span for that. Not really sure why they also wanted string_view.

> anything. Previously, which must be several years ago now, std::string used COW, and the reason it was dropped, they claimed, was that it didn't scale in multiprocessor environments, but I've not seen the actual reasoning behind it.

I don't know. You can use COW when designing a high level language for multiprocessor execution (HPC), but that is something different from what we are speaking of here? And it only makes sense if the compiler is able to reason about concurrency.

> The reason we have string_view is that around C++11 std::string added zero termination by default, which wasn't required before. string_view is now required because of this, and you can really debate whether that was a sane choice.

I doubt people use std::string for much more than paths and names... It is a very lacklustre design, but then again, no string representation can fit all use scenarios (in low level programming, that is).

> Also, reference counting might very well be suitable for low level programming.

Yes, but not as a homogeneous reference strategy.

> Aren't both the array in std.container.Array and the regular built-in array COW in D?

COW would require all mutable operations to test a flag in the object before mutation. That is a performance killer. Maybe you are talking about optimizations? But that would not be COW...
Oct 20 2020
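The per-mutation test mentioned above, and one common way to pay for it only once per loop, can be sketched as follows. `CowInts`, its public `checks` counter, `fill_checked`, and `fill_hoisted` are hypothetical names for illustration; the counter exists only to make the number of sharing tests visible and says nothing about real-world cost.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical COW buffer of ints; `checks` counts how often the sharing test runs.
class CowInts {
public:
    explicit CowInts(std::size_t n)
        : data_(std::make_shared<std::vector<int>>(n)) {}

    std::size_t size() const { return data_->size(); }
    int get(std::size_t i) const { return (*data_)[i]; }

    // Checked write: tests for sharing on every call.
    void set(std::size_t i, int v) { detach(); (*data_)[i] = v; }

    // One test up front, then the caller writes through a raw pointer.
    // The pointer is only safe to use until this handle is copied or resized.
    int* mutable_data() { detach(); return data_->data(); }

    std::size_t checks = 0;

private:
    void detach() {
        ++checks;
        if (data_.use_count() > 1)  // single-threaded sketch
            data_ = std::make_shared<std::vector<int>>(*data_);
    }

    std::shared_ptr<std::vector<int>> data_;
};

// Pays for the sharing test once per element.
inline void fill_checked(CowInts& a, int v) {
    for (std::size_t i = 0; i < a.size(); ++i) a.set(i, v);
}

// Pays for the sharing test once per loop.
inline void fill_hoisted(CowInts& a, int v) {
    int* p = a.mutable_data();
    for (std::size_t i = 0, n = a.size(); i < n; ++i) p[i] = v;
}
```

The hoisted form gives up some safety: nothing stops the caller from copying the handle while still holding the raw pointer, at which point writes through the pointer would be visible to both copies.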
On Tuesday, 20 October 2020 at 15:52:53 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 20 October 2020 at 15:28:36 UTC, IGotD- wrote:
>> I can see several cases where you want to do operations on slices, regardless of whether the elements are string characters or some other type.
>
> Yes, C++ has std::span for that. Not really sure why they also wanted string_view.
>
>> anything. Previously, which must be several years ago now, std::string used COW, and the reason it was dropped, they claimed, was that it didn't scale in multiprocessor environments, but I've not seen the actual reasoning behind it.
>
> I don't know. You can use COW when designing a high level language for multiprocessor execution (HPC), but that is something different from what we are speaking of here? And it only makes sense if the compiler is able to reason about concurrency.
>
>> The reason we have string_view is that around C++11 std::string added zero termination by default, which wasn't required before. string_view is now required because of this, and you can really debate whether that was a sane choice.
>
> I doubt people use std::string for much more than paths and names... It is a very lacklustre design, but then again, no string representation can fit all use scenarios (in low level programming, that is).
>
>> Also, reference counting might very well be suitable for low level programming.
>
> Yes, but not as a homogeneous reference strategy.
>
>> Aren't both the array in std.container.Array and the regular built-in array COW in D?
>
> COW would require all mutable operations to test a flag in the object before mutation. That is a performance killer. Maybe you are talking about optimizations? But that would not be COW...

I think Facebook's string library still has flags for small string, dynamic, and COW. The container will have flags anyway, so the performance hit could be mitigated (I am writing a library to help measure this). For example, with some trickery you can turn a branch into a conditional move or bitops. The amortized performance benefit may make it worth doing too, so keep that in mind (i.e. a smaller container with slower flag checking may be faster than the opposite due to cache performance).

D has an advantage here, because the metaprogramming makes choosing (say) internal buffer sizes easier, and we can choose not to enable COW for shared types if needed.

Phobos could really use some @nogc containers using std.experimental.allocator.
Oct 20 2020
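Two of the ideas in the post above, picking an inline buffer size at compile time and replacing a flag branch with bit operations, might look roughly like this in C++ templates (D's metaprogramming would express the first part differently). `InlineCapacity` and `select_ptr` are hypothetical names, and the sketch assumes the platform provides `std::uintptr_t`. Whether the bit-mask form beats a plain ternary is something to measure, since optimizers frequently emit a conditional move for the ternary already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many T fit in a byte budget, computed at compile time.
template <typename T, std::size_t BudgetBytes>
struct InlineCapacity {
    static constexpr std::size_t value = BudgetBytes / sizeof(T);
    static_assert(value > 0, "budget too small for even one element");
};

// Branch-free choice between the inline buffer and the heap buffer.
// The mask is all ones when `use_heap` is true and all zeros otherwise.
inline const void* select_ptr(bool use_heap,
                              const void* inline_buf,
                              const void* heap_buf) {
    const std::uintptr_t mask =
        static_cast<std::uintptr_t>(0) - static_cast<std::uintptr_t>(use_heap);
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(inline_buf);
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(heap_buf);
    return reinterpret_cast<const void*>((b & mask) | (a & ~mask));
}
```

A container built on this could take the byte budget as a template parameter and derive its inline capacity per element type, which is the kind of tuning knob the post describes.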
On Tuesday, 20 October 2020 at 12:42:26 UTC, Per Nordlöw wrote:
> What are your thoughts on the pros and cons of copy-on-write (COW) managed allocations in D, typically for containers such as `std.container.Array`?

I use COW to create a copy of the hashmap bucket array if the user decides to mutate the container during byKey/byPair iteration. The only downside I see is temporarily doubled memory usage. The benefit is clear: you can provide stable iterators.
Oct 20 2020
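The stable-iteration idea can be sketched in a few lines. This is not ikod's implementation: `SnapshotTable` is a hypothetical, single-threaded stand-in in which a flat vector of ints plays the role of the bucket array, and `iterate()` stands in for byKey/byPair.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical container: a flat "bucket array" shared with live iterations.
class SnapshotTable {
public:
    using Buckets = std::vector<int>;

    SnapshotTable() : buckets_(std::make_shared<Buckets>()) {}

    // An iteration keeps the bucket array alive by holding a shared_ptr to it.
    std::shared_ptr<const Buckets> iterate() const { return buckets_; }

    // Mutation clones the array only if some iteration still references it.
    void insert(int key) {
        if (buckets_.use_count() > 1)
            buckets_ = std::make_shared<Buckets>(*buckets_);
        buckets_->push_back(key);
    }

    std::size_t size() const { return buckets_->size(); }

private:
    std::shared_ptr<Buckets> buckets_;
};
```

While a snapshot is alive, the first insert copies the array, so the old and new arrays coexist until the iteration finishes. That is the temporary doubling of memory mentioned in the post.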
On 21/10/2020 4:40 AM, ikod wrote:
> I use COW to create a copy of the hashmap bucket array if the user decides to mutate the container during byKey/byPair iteration. The only downside I see is temporarily doubled memory usage. The benefit is clear: you can provide stable iterators.

I am very interested in concurrent data structures. They guarantee that they will still work under mutation during iteration and won't lock. Given a concurrent data structure as an alternative, COW is probably less desirable, since it requires an allocation at a minimum.
Oct 20 2020