digitalmars.D - RFC: moving forward with @nogc Phobos
- Andrei Alexandrescu (71/71) Sep 29 2014 Back when I've first introduced RCString I hinted that we have a larger
- Daniel Kozak via Digitalmars-d (9/106) Sep 29 2014 V Mon, 29 Sep 2014 03:49:52 -0700
- Andrei Alexandrescu (13/19) Sep 29 2014 (please don't overquote!)
- Daniel N (19/34) Sep 29 2014 How about having something like ResourceManagementPolicy.infer,
- eles (3/4) Sep 29 2014 Finally!
- eles (15/19) Sep 29 2014 Sorry, enthusiasm. I really think this is the key for doing the
- Vladimir Panteleev (7/17) Sep 29 2014 Is this practically feasible without blowing up Phobos several
- Andrei Alexandrescu (6/22) Sep 29 2014 I believe so. For the most part implementations will be identical - just...
- Dicebot (12/12) Sep 29 2014 Any assumption that library code can go away with some set of
- Andrei Alexandrescu (8/18) Sep 29 2014 That's making exactly the confusion I was - that memory allocation
- Dicebot (9/36) Sep 29 2014 Yes but neither decision belongs to library code except for very
- Andrei Alexandrescu (13/40) Sep 29 2014 You just assert it, so all I can say is "I understand you believe this"....
- Dicebot (20/28) Sep 29 2014 I probably have missed the part with arguments :) Your reasoning
- Andrei Alexandrescu (14/45) Sep 29 2014 =================
- Dicebot (10/13) Sep 29 2014 Resisting to go on meaningless argument on other points, this
- Andrei Alexandrescu (2/13) Sep 29 2014 I trust you'll be. -- Andrei
- Chris Williams (12/18) Sep 29 2014 I think the key to this sort of issue is to try and get as much
- Paulo Pinto (8/18) Sep 29 2014 Personally, I would go just for (b) with compiler support for
- Andrei Alexandrescu (3/6) Sep 29 2014 Compiler already knows (after inlining) that ++i and --i cancel each
- Marco Leise (10/17) Sep 30 2014 That helps with very small, inlined functions until Marc
- Manu via Digitalmars-d (6/13) Sep 30 2014 The compiler doesn't know that MyLibrary_AddRef(Thing *t); and
- deadalnix (4/23) Sep 30 2014 Even with simply i++ and i--, the information that they always go
- Jacob Carlborg (5/36) Sep 29 2014 How does allocators fit in this? Will it be an additional argument to
- Andrei Alexandrescu (9/11) Sep 29 2014 There would be one allocator per thread (changeable) deferring to a
- Johannes Pfau (11/31) Sep 30 2014 So you propose RC + global/thread local allocators as the solution for
- Peter Alexander (10/13) Sep 30 2014 Agreed. This is the common case we need to solve for, but this is
- Andrei Alexandrescu (8/17) Sep 30 2014 There would be no possibility to do that. I mean it's not there but it
- Jacob Carlborg (4/5) Sep 30 2014 Weren't all methods in Object supposed to be lifted out from Object anyw...
- Jonathan M Davis via Digitalmars-d (6/9) Oct 28 2014 Yes, but not much work has been done on it, and the little work that has...
- Johannes Pfau (8/15) Sep 30 2014 Passing buffers or sink delegates (like we already do for toString) is
- Vladimir Panteleev (5/13) Sep 30 2014 I don't understand, why wouldn't you be able to temporarily set
- Johannes Pfau (21/36) Sep 30 2014 That's possible but insanely dangerous in case you forget to reset the
- "Ola Fosheim Grøstad" (29/41) Sep 30 2014 Yes, I agree. One option would be to have thread-local region
- Paulo Pinto (9/18) Sep 30 2014 It works when two big ifs come together.
- "Ola Fosheim Grøstad" (4/10) Sep 30 2014 But Objective-C has thread safe ref-counting?!
- Paulo Pinto (3/16) Sep 30 2014 Did you read my second bullet?
- Ola Fosheim Grostad (4/25) Sep 30 2014 Yes? I dont want builtin rc default for single threaded use
- Mike (3/5) Sep 30 2014 I agree.
- Andrei Alexandrescu (3/12) Sep 30 2014 That's doable, but you don't get to place the string at a _specific_
- Andrei Alexandrescu (5/14) Sep 30 2014 Correct. The output of toStringz would be either a GC string or an RC
- Johannes Pfau (21/39) Sep 30 2014 The sarcasm is supposed to be here: '_all_ memory related problems' ;-)
- Sean Kelly (5/10) Sep 30 2014 Yes, I'm hoping this is an adjunct to changes in Phobos to reduce
- Andrei Alexandrescu (16/43) Oct 01 2014 Agreed.
- Johannes Pfau (4/66) Oct 06 2014 OK then I got you wrong and I agree with everything you wrote above.
- Chris Williams (23/30) Sep 29 2014 Forcing someone (or rather, a team of someones) to call into the
- Shammah Chancellor (14/109) Sep 29 2014 I don't like the idea of having to pass in template parameters
- Andrei Alexandrescu (5/11) Sep 29 2014 Don't confuse memory allocation with memory management. There's no such
- Shammah Chancellor (7/24) Sep 29 2014 Sure, but combining the two could be very useful -- as we have noticed
- Daniel N (7/11) Sep 29 2014 There was a solution earlier in this thread which avoids that
- Andrei Alexandrescu (2/14) Oct 01 2014 I'm not sure whether we can do this within D's type system. -- Andrei
- Uranuz (64/71) Sep 29 2014 I'll try to destroy ;) Before thinking out some answers to this
- Mike (7/19) Sep 29 2014 This really hits the nail on the head, and I think your other
- Andrei Alexandrescu (25/87) Oct 01 2014 Sadly this is the way things are going (not only in D, but other
- Freddy (23/102) Sep 29 2014 Internally we should have something like:
- Andrei Alexandrescu (2/23) Sep 29 2014 That's correct. -- Andrei
- Andrei Alexandrescu (4/26) Oct 01 2014 Good idea, and it seems Sean's is even better because it groups
- Foo (20/20) Sep 30 2014 I hate the fact that this will produce template bloat for each
- Foo (3/23) Sep 30 2014 Of course each method/function in Phobos should use the global
- Andrei Alexandrescu (3/23) Sep 30 2014 This won't work because the type of "string" is different for RC vs. GC....
- Foo (3/31) Sep 30 2014 But it would work for phobos functions without template bloat.
- "Marc Schütz" <schuetzm gmx.net> (3/9) Sep 30 2014 Only for internal allocations. If the functions want to return
- Andrei Alexandrescu (2/12) Sep 30 2014 Ah, now I understand the point. Thanks. -- Andrei
- Andrei Alexandrescu (5/36) Sep 30 2014 How is the fact there's less bloat relevant for code that doesn't work?
- John Colvin (19/98) Sep 30 2014 Instead of adding a new template parameter to every function
- Andrei Alexandrescu (4/7) Oct 01 2014 Nice idea, but let's try and explore possibilities within the existing
- Sean Kelly (53/66) Sep 30 2014 Is this for exposition purposes or actually how you expect it to
- H. S. Teoh via Digitalmars-d (42/75) Sep 30 2014 Yeah, this echoes my concern. This looks not that much different, from a
- Andrei Alexandrescu (16/55) Oct 01 2014 The parallel with STL allocators is interesting, but I'm not worried
- Andrei Alexandrescu (33/97) Oct 01 2014 That's pretty much what it would take. The key here is that RCString is
- Sean Kelly (4/17) Oct 01 2014 I'm confused. Is this a general-purpose solution or just one
- Andrei Alexandrescu (2/22) Oct 01 2014 General purpose since your suggested change. -- Andrei
- Sean Kelly (13/47) Oct 01 2014 Both, I suppose? A static if block at the top of each function
- Andrei Alexandrescu (2/5) Oct 01 2014 Correct. -- Andrei
- Dmitry Olshansky (5/14) Sep 30 2014 Incredible code bloat? Boilerplate in each function for the win?
- Andrei Alexandrescu (3/16) Oct 01 2014 Sean's idea to make string an alias of the policy takes care of this
- H. S. Teoh via Digitalmars-d (11/29) Oct 01 2014 But Sean's idea only takes strings into account. Strings aren't the only
- Kiith-Sa (9/45) Oct 01 2014 MMP.Ref!redBlackTreeNode ?
- Sean Kelly (13/25) Oct 01 2014 Assuming you're willing to take the memoryModel type as a
- Cliff (5/34) Oct 01 2014 If you were to forget D restrictions for a moment, and consider
- Andrei Alexandrescu (3/10) Oct 01 2014 There's management for T[], pointers to structs, pointers to class
- "Marc Schütz" <schuetzm gmx.net> (58/136) Sep 30 2014 Ok, here are my few cents:
- "Marc Schütz" <schuetzm gmx.net> (42/52) Sep 30 2014 Ok. What we need for it:
- Andrei Alexandrescu (3/9) Oct 01 2014 I'm not very sure. A GC might need to interoperate closely with the
- "Marc Schütz" <schuetzm gmx.net> (8/21) Oct 01 2014 It needs to know what to scan (ideally with type info), and which
- Oren Tirosh (22/54) Oct 01 2014 Bingo. Have some way to mark the function return type as a unique
- bearophile (5/8) Oct 01 2014 Let's have full-fledged memory zones tracking in the D type
- Andrei Alexandrescu (9/21) Oct 01 2014 I'm skeptical about this approach (though clearly we need to explore it
- Oren T (10/40) Oct 01 2014 The idea is that the unique property is very short-lived: the
- Andrei Alexandrescu (3/10) Oct 01 2014 This all... looks arcane. I'm not sure how it can even made to work if
- Oren T (5/21) Oct 01 2014 At the moment, @nogc code can't call any function returning a
- Oren T (9/25) Oct 01 2014 At the moment, @nogc code can't call any function returning a
- Jacob Carlborg (8/15) Oct 01 2014 Can't we do something like this, or it might be what you're proposing:
- "Ola Fosheim Grøstad" (42/46) Oct 02 2014 That would be better, but how do you deal with "bar(foo())" ?
- Jacob Carlborg (8/14) Oct 02 2014 I haven't really thought how it could be implemented but I was hoping
- "Ola Fosheim Grøstad" (4/8) Oct 02 2014 I haven't looked at Rust in detail, but doesn't the Rust compiler
- Paulo Pinto (8/17) Oct 02 2014 Rust makes use of the type system and the borrow checker to
- "Ola Fosheim Grøstad" (7/11) Oct 02 2014 They constrain usage so that you cannot share mutable objects. It
- "Ola Fosheim Grøstad" (22/25) Oct 02 2014 Some Rust details. «sendable» means that a reference can be
- Paulo Pinto (4/29) Oct 02 2014 The Gc type is gone as of this week.
- "Ola Fosheim Grøstad" (4/6) Oct 02 2014 Thanks, apparently they do it because they want to make a proper
- "Marc Schütz" <schuetzm gmx.net> (7/19) Oct 01 2014 Sure? I already showed in an example how it is possible to chain
- Andrei Alexandrescu (2/20) Oct 01 2014 I'd think so. -- Andrei
- "Marc Schütz" <schuetzm gmx.net> (36/83) Oct 01 2014 I don't have all answers to these questions. Still, I'm convinced
- "Marc Schütz" <schuetzm gmx.net> (37/42) Oct 02 2014 Here's an example implementation of what I have in mind (totally
- Manu via Digitalmars-d (24/27) Sep 30 2014 I generally like the idea, but my immediate concern is that it implies
- Andrei Alexandrescu (10/32) Oct 01 2014 If a lib chooses one specific memory management policy, it can of course...
- "Nordlöw" (4/6) Sep 30 2014 Slightly related :)
- Andrei Alexandrescu (2/7) Oct 01 2014 Nice, thanks! -- Andrei
- Dmitry Olshansky (14/16) Oct 03 2014 [snip]
- Andrei Alexandrescu (9/23) Oct 03 2014 Awesome. I just started
- Dmitry Olshansky (8/24) Oct 03 2014 Glad you liked it.
- Andrei Alexandrescu (2/26) Oct 03 2014 D script that generates wikitable from that -> awesomeness. -- Andrei
- Dmitry Olshansky (4/35) Oct 03 2014 I'm on it. With GitHub source links. D's regex rocks ;)
- Dmitry Olshansky (12/27) Oct 03 2014 Forgot my wiki credentials. Anyhow I got passable Markdown page fairly
- Dmitry Olshansky (5/27) Oct 03 2014 Ehm, rather (without '!' at the end):
- Dmitry Olshansky (10/26) Oct 03 2014 Got it:
- Andrei Alexandrescu (3/10) Oct 03 2014 Tried to insert it, looks weird. Probably it would be most effective if
Back when I first introduced RCString I hinted that we have a larger strategy in mind. Here it is.

The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.

That said, allocators are nice to have and use, and I will definitely follow up with std.allocator. However, std.allocator is not the key to a @nogc Phobos.

Nor are ranges. There is an attitude that either output ranges, or input ranges in conjunction with lazy computation, would solve the issue of creating garbage. https://github.com/D-Programming-Language/phobos/pull/2423 is a good illustration of the latter approach: a range would be lazily created by chaining stuff together.

A range-based approach would take us further than the allocators, but I see the following issues with it:

(a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy);

(b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite;

(c) would make D/@nogc a poor cousin of C++. This is quite out of character; technically, I have long gotten used to seeing most elaborate C++ code like poor emulation of simple D idioms. But C++ has spent years and decades taking to perfection an approach without a tracing garbage collector. A departure from that would need to be superior, and that doesn't seem to be the case with range-based approaches.

===========

Now that we clarified that these existing attempts are not going to work well, the question remains what does. For Phobos I'm thinking of defining and using three policies:

    enum MemoryManagementPolicy { gc, rc, mrc }
    immutable
        gc = MemoryManagementPolicy.gc,
        rc = MemoryManagementPolicy.rc,
        mrc = MemoryManagementPolicy.mrc;

The three policies are:

(a) gc is the classic garbage-collected style of management;

(b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks.

(c) mrc is a reference-counted style backed by malloc.

(It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.)

The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:

    auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
    if (...)
    {
        static if (mmp == gc) alias S = string;
        else alias S = RCString;
        S result;
        ...
        return result;
    }

On the caller side:

    auto p1 = setExtension("hello", ".txt"); // fine, use gc
    auto p2 = setExtension!gc("hello", ".txt"); // same
    auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc

So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.

Destroy!

Andrei
Sep 29 2014
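A minimal, compilable sketch of the policy parameter described in the post above may help when following the rest of the thread. Everything here is an assumption made for illustration: `RCString` is a hypothetical stand-in that merely wraps a GC string (it does no reference counting), the range parameters and the elided constraint are replaced by plain `string` arguments, and the body is a naive extension swap rather than the real std.path logic.

```d
import std.stdio : writeln;

/// The three policies named in the post.
enum MemoryManagementPolicy { gc, rc, mrc }

// Short names so callers can write setExtension!rc(...).
enum gc  = MemoryManagementPolicy.gc;
enum rc  = MemoryManagementPolicy.rc;
enum mrc = MemoryManagementPolicy.mrc;

/// Hypothetical stand-in for RCString: it only wraps a GC string so the
/// example compiles; a real implementation would own a reference count.
struct RCString
{
    private string payload;
    string toString() const { return payload; }
}

/// Simplified illustration, not the Phobos function: replaces everything
/// from the last '.' in `path` (if any) with `ext`.
auto setExtension(MemoryManagementPolicy mmp = gc)(string path, string ext)
{
    // The policy selects the result type at compile time.
    static if (mmp == gc)
        alias S = string;
    else
        alias S = RCString;

    size_t end = path.length;
    foreach_reverse (i, c; path)
    {
        if (c == '.') { end = i; break; }
    }
    string joined = path[0 .. end] ~ ext;

    static if (is(S == string))
        return joined;
    else
        return RCString(joined);
}

void main()
{
    auto p1 = setExtension("hello", ".txt");      // default policy: gc
    auto p2 = setExtension!gc("hello", ".txt");   // same, spelled out
    auto p3 = setExtension!rc("hello.d", ".txt"); // rc policy

    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p2) == string));
    static assert(is(typeof(p3) == RCString));

    writeln(p1, " ", p3.toString());
}
```

The point of the sketch is only that existing call sites keep their `string` result, while the policy argument changes the return type without changing the call shape.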
On Mon, 29 Sep 2014 03:49:52 -0700, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> Back when I first introduced RCString I hinted that we have a larger strategy in mind. Here it is.
> [snip]

I would add something like this:

    @DefaultMemoryManagementPolicy(rc)
    module A;

    void main() {
        auto p1 = setExtension("hello", ".txt"); // use rc
    }
Sep 29 2014
On 9/29/14, 4:03 AM, Daniel Kozak via Digitalmars-d wrote:
> I would add something like this:
>
>     @DefaultMemoryManagementPolicy(rc)
>     module A;
>
>     void main() {
>         auto p1 = setExtension("hello", ".txt"); // use rc
>     }

(please don't overquote!)

Yah, I realized I forgot to mention this: if we play our cards right, a lot of code will build in both approaches to memory management by just flipping a switch. In particular, the switch can be defaulted to something else. I was thinking of leaving it to the user:

    module A;
    immutable myMMP = rc;

    void main() {
        auto p1 = setExtension!myMMP("hello", ".txt");
    }

Andrei
Sep 29 2014
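The user-side switch from the reply above can be sketched as follows. This is an illustration under assumptions, not the proposed API: `RCString` is again a hypothetical stand-in without real reference counting, `setExtension` is reduced to a plain concatenation, and the switch is declared with `enum` rather than `immutable` so that it is unambiguously a compile-time constant.

```d
import std.stdio : writeln;

enum MemoryManagementPolicy { gc, rc, mrc }
enum gc = MemoryManagementPolicy.gc;
enum rc = MemoryManagementPolicy.rc;

// Hypothetical stand-in for RCString (no real reference counting here).
struct RCString { string payload; }

// Simplified stand-in for a policy-aware library function.
auto setExtension(MemoryManagementPolicy mmp = gc)(string path, string ext)
{
    static if (mmp == gc)
        return path ~ ext;
    else
        return RCString(path ~ ext);
}

// The one line a project would flip to change its memory management style.
enum myMMP = rc;

void main()
{
    auto p1 = setExtension!myMMP("hello", ".txt");
    static assert(is(typeof(p1) == RCString));
    writeln(p1.payload);
}
```

Changing `myMMP` to `gc` makes `p1` a plain `string`; code that only passes the result along with `auto` would keep compiling, while code that names the type or touches `payload` (as the last two lines of `main` do here) would need adjusting.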
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
> Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.
>
> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
>
>     auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>     if (...)
>     {
>         static if (mmp == gc) alias S = string;
>         else alias S = RCString;
>         S result;
>         ...
>         return result;
>     }

How about having something like ResourceManagementPolicy.infer, which under the hood could work something like below... you could combine it with your original suggestion, with an overridable MemoryManagementPolicy (just removed it to make the example shorter)

    auto setExtension(R1, R2)(R1 path, R2 ext)
    if (...)
    {
        static if (functionAttributes!(__traits(parent, setExtension)) & FunctionAttribute.nogc)
            alias S = RCString;
        else
            alias S = string;
        ...
        return result;
    }

Daniel N
Sep 29 2014
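The attribute test that the suggestion above relies on can be tried in isolation. The sketch below uses `functionAttributes` and `FunctionAttribute.nogc` from std.traits; the helper names `isNoGC` and `StringFor`, and the `RCString` stand-in, are hypothetical. One caveat worth stating as an assumption of this sketch: `__traits(parent, ...)` names the scope where a symbol is declared, not the call site, so here the calling function is passed in explicitly instead of being inferred.

```d
import std.traits : functionAttributes, FunctionAttribute;

// Hypothetical stand-in for RCString.
struct RCString { string payload; }

/// True when `fn` carries the @nogc attribute.
enum bool isNoGC(alias fn) =
    (functionAttributes!fn & FunctionAttribute.nogc) != 0;

/// Picks the string flavour from an explicitly named caller.
template StringFor(alias caller)
{
    static if (isNoGC!caller)
        alias StringFor = RCString;
    else
        alias StringFor = string;
}

void usesGC() { auto a = new int[](4); }
void avoidsGC() @nogc { int[4] a; }

// A caller that allocates with the GC gets string; a @nogc caller gets RCString.
static assert(is(StringFor!usesGC == string));
static assert(is(StringFor!avoidsGC == RCString));

void main() {}
```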
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
> entirely distinct topic

Finally!
Sep 29 2014
On Monday, 29 September 2014 at 11:37:00 UTC, eles wrote:
> On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
>> entirely distinct topic
> Finally!

Sorry, enthusiasm. I really think this is the key for doing the management of all resources in the right way. For me, memory should be seen as a resource that simply happens to be manageable in a more flexible way and with specific constraints. For example, compared with other kinds of resources, you could use a lazy approach to deallocate memory, because unlike many other resources memory is like money: it is fungible [1]. Other resources are not.

OTOH, memory comes with some quirks of its own, such as cycles (these are, in theory, possible for other kinds of resources, but are exceptions). Memory management is not necessarily deterministic either. Other resources might require determinism, however.

[1] http://en.wikipedia.org/wiki/Fungibility
Sep 29 2014
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
>     auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>     if (...)
>     {
>         static if (mmp == gc) alias S = string;
>         else alias S = RCString;
>         S result;
>         ...
>         return result;
>     }

Is this practically feasible without blowing up Phobos several times in size and complexity?

And I'm not sure adding a template parameter to every function is going to work well, what with all the existing template parameters - especially the optional ones.
Sep 29 2014
On 9/29/14, 5:06 AM, Vladimir Panteleev wrote:
> On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
>>     auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>>     if (...)
>>     {
>>         static if (mmp == gc) alias S = string;
>>         else alias S = RCString;
>>         S result;
>>         ...
>>         return result;
>>     }
> Is this practically feasible without blowing up Phobos several times in size and complexity?

I believe so. For the most part implementations will be identical - just look at the RCString primitives, which are virtually the same as string's.

> And I'm not sure adding a template parameter to every function is going to work well, what with all the existing template parameters - especially the optional ones.

Not all functions, just those that allocate. I agree there will be a few decisions to be made there.

Andrei
Sep 29 2014
Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that; Don presented examples of how we do that with plain arrays in his DConf 2014 talk).

In that regard allocators + ranges are still the way to go in my opinion. Yes, sometimes those result in a very hard to use API - providing GC-heavy but friendly alternatives for those shouldn't do any harm. But in general full decoupling of algorithms from allocations is necessary. If that makes D a poor cousin of C++ we may have to learn a few tricks from C++.
Sep 29 2014
On 9/29/14, 5:29 AM, Dicebot wrote:
> Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that; Don presented examples of how we do that with plain arrays in his DConf 2014 talk).

That's making exactly the confusion I was - that memory allocation strategy is the same as memory management strategy.

> In that regard allocators + ranges are still the way to go in my opinion. Yes, sometimes those result in a very hard to use API - providing GC-heavy but friendly alternatives for those shouldn't do any harm. But in general full decoupling of algorithms from allocations is necessary. If that makes D a poor cousin of C++ we may have to learn a few tricks from C++.

As long as things are trivial they can be done with relative ease, albeit with more pain. But consider e.g. the recent JSON library by Sönke. It needs to create a lookup data structure and return things like strings from it. What primitives do you think it could define?

Andrei
Sep 29 2014
On Monday, 29 September 2014 at 15:18:40 UTC, Andrei Alexandrescu wrote:
> On 9/29/14, 5:29 AM, Dicebot wrote:
>> Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that; Don presented examples of how we do that with plain arrays in his DConf 2014 talk).
> That's making exactly the confusion I was - that memory allocation strategy is the same as memory management strategy.

Yes but neither decision belongs to library code except for very rare cases.

>> In that regard allocators + ranges are still the way to go in my opinion. Yes, sometimes those result in a very hard to use API - providing GC-heavy but friendly alternatives for those shouldn't do any harm. But in general full decoupling of algorithms from allocations is necessary. If that makes D a poor cousin of C++ we may have to learn a few tricks from C++.
> As long as things are trivial they can be done with relative ease, albeit with more pain. But consider e.g. the recent JSON library by Sönke. It needs to create a lookup data structure and return things like strings from it. What primitives do you think it could define?

Sounds like it may have to define its own kind of allocator with certain implementation restrictions (and implement it in terms of GC by default). I have not actually read the code for that proposal, so it is hard to guess. Will need to do it if it really matters.
Sep 29 2014
On 9/29/14, 8:53 AM, Dicebot wrote:
> On Monday, 29 September 2014 at 15:18:40 UTC, Andrei Alexandrescu wrote:
>> On 9/29/14, 5:29 AM, Dicebot wrote:
>>> Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that; Don presented examples of how we do that with plain arrays in his DConf 2014 talk).
>> That's making exactly the confusion I was - that memory allocation strategy is the same as memory management strategy.
> Yes but neither decision belongs to library code except for very rare cases.

You just assert it, so all I can say is "I understand you believe this". I've motivated my argument. You may want to do the same for yours.

>>> In that regard allocators + ranges are still the way to go in my opinion. Yes, sometimes those result in a very hard to use API - providing GC-heavy but friendly alternatives for those shouldn't do any harm. But in general full decoupling of algorithms from allocations is necessary. If that makes D a poor cousin of C++ we may have to learn a few tricks from C++.
>> As long as things are trivial they can be done with relative ease, albeit with more pain. But consider e.g. the recent JSON library by Sönke. It needs to create a lookup data structure and return things like strings from it. What primitives do you think it could define?
> Sounds like it may have to define its own kind of allocator with certain implementation restrictions (and implement it in terms of GC by default). I have not actually read the code for that proposal, so it is hard to guess. Will need to do it if it really matters.

So you don't have an answer. And again you are confusing memory allocation with memory management.

I have sketched an approach that works and will take us to Phobos being most transparently usable with tracing collection or with reference counting. Part of that is RCString (and generally reference counted slices and hashtables), and another part is the @refcounted attribute for classes.

I will push it through. If you have any objections, it would be great if you argued them properly.

Thanks,

Andrei
Sep 29 2014
On Monday, 29 September 2014 at 17:04:54 UTC, Andrei Alexandrescu wrote:
>> Yes but neither decision belongs to library code except for very rare cases.
> You just assert it, so all I can say is "I understand you believe this". I've motivated my argument. You may want to do the same for yours.

I probably have missed the part with arguments :) Your reasoning is not fundamentally different from "GC should be enough", just extended from a single option to several. My argument is simple - one can't foresee everything. I remember reading a book by one guy who has been advocating a thing called "policy-based design", you may know him ;) I was quite impressed with the simple but practical basic idea - decoupling parts of the implementation that are not inherently related.

> So you don't have an answer. And again you are confusing memory allocation with memory management.

Yes, sorry, I don't have an answer. Or time to dive deeply into the code unless it is really important or my direct responsibility.

Unfortunately, I don't see how your proposal fits our code either. Most of Sociomantic's code relies on using arrays as ref arguments to avoid creating new GC roots (no, we don't need/want to switch to ARC). This has been cited several times as the reason why Phobos in its current shape is largely unusable for our needs even when the D2 switch is finished. I don't see how the proposal in the original post changes that.
Sep 29 2014
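For readers unfamiliar with the "arrays as ref arguments" style mentioned in the post above, a rough sketch follows. It is not Sociomantic's code; the function name `setExtensionInto` and its behaviour (plain concatenation into a caller-owned buffer) are assumptions chosen only to show the shape of such an API.

```d
import std.stdio : writeln;

/// Writes `path` followed by `ext` into the caller-owned `buf`, growing it
/// only when it is too small, and returns the slice holding the result.
char[] setExtensionInto(ref char[] buf, const(char)[] path, const(char)[] ext)
{
    immutable needed = path.length + ext.length;
    if (buf.length < needed)
        buf.length = needed;          // the only place that may allocate
    buf[0 .. path.length] = path[];
    buf[path.length .. needed] = ext[];
    return buf[0 .. needed];
}

void main()
{
    char[] buf = new char[](64);      // allocated once, up front
    auto before = buf.ptr;

    foreach (name; ["a", "bb", "ccc"])
    {
        auto result = setExtensionInto(buf, name, ".txt");
        writeln(result);
    }

    assert(buf.ptr is before);        // the buffer was reused, not reallocated
}
```

The caller decides where the memory comes from and how long it lives; the library function allocates only if the supplied buffer is too small. The returned slice aliases `buf`, so it is overwritten by the next call.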
On 9/29/14, 10:19 AM, Dicebot wrote:
> On Monday, 29 September 2014 at 17:04:54 UTC, Andrei Alexandrescu wrote:
>>> Yes but neither decision belongs to library code except for very rare cases.
>> You just assert it, so all I can say is "I understand you believe this". I've motivated my argument. You may want to do the same for yours.
> I probably have missed the part with arguments :)

No problem, let me paste it again:

> The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.
>
> That said, allocators are nice to have and use, and I will definitely follow up with std.allocator. However, std.allocator is not the key to a @nogc Phobos.
>
> Nor are ranges. There is an attitude that either output ranges, or input ranges in conjunction with lazy computation, would solve the issue of creating garbage. https://github.com/D-Programming-Language/phobos/pull/2423 is a good illustration of the latter approach: a range would be lazily created by chaining stuff together.
>
> A range-based approach would take us further than the allocators, but I see the following issues with it:
>
> (a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy);
>
> (b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite;
>
> (c) would make D/@nogc a poor cousin of C++. This is quite out of character; technically, I have long gotten used to seeing most elaborate C++ code like poor emulation of simple D idioms. But C++ has spent years and decades taking to perfection an approach without a tracing garbage collector. A departure from that would need to be superior, and that doesn't seem to be the case with range-based approaches.

=================

> Your reasoning is not fundamentally different from "GC should be enough", just extended from a single option to several.

Where's RC in the "GC should be enough"?

> My argument is simple - one can't foresee everything. I remember reading a book by one guy who has been advocating a thing called "policy-based design", you may know him ;) I was quite impressed with the simple but practical basic idea - decoupling parts of the implementation that are not inherently related.

Totally. Then it would be great if you trusted the guy when he makes a judgment call in which reasonable people may disagree. There are many memory /allocation/ policies but precious few memory /management/ policies. I only know "manual", "scoped", "reference counted", and "tracing" based on... the last 50 years of software development.

>> So you don't have an answer. And again you are confusing memory allocation with memory management.
> Yes, sorry, I don't have an answer. Or time to dive deeply into the code unless it is really important or my direct responsibility.
>
> Unfortunately, I don't see how your proposal fits our code either. Most of Sociomantic's code relies on using arrays as ref arguments to avoid creating new GC roots (no, we don't need/want to switch to ARC). This has been cited several times as the reason why Phobos in its current shape is largely unusable for our needs even when the D2 switch is finished. I don't see how the proposal in the original post changes that.

Passing arrays by reference is plenty adequate with all memory management strategies. You'll need to wait and see how the proposal changes that, but if you naysay, back it up.

Andrei
Sep 29 2014
On Monday, 29 September 2014 at 22:18:38 UTC, Andrei Alexandrescu wrote:
> Passing arrays by reference is plenty adequate with all memory management strategies. You'll need to wait and see how the proposal changes that, but if you naysay, back it up.

Resisting the urge to go into a meaningless argument on the other points, this pretty much says that the focus on things that are important for me is abandoned in favor of something that mostly doesn't matter. Am I supposed to be happy? :) Am I supposed to be twice as happy when you propose to close pull requests that do help because of this proposal?

I am waiting for what comes next, but right now "not impressed" is the most optimistic way to put this. Sorry :(
Sep 29 2014
On 9/29/14, 3:43 PM, Dicebot wrote:
> On Monday, 29 September 2014 at 22:18:38 UTC, Andrei Alexandrescu wrote:
>> Passing arrays by reference is plenty adequate with all memory management strategies. You'll need to wait and see how the proposal changes that, but if you naysay, back it up.
> Resisting to go on meaningless argument on other points, this pretty much says that focus on things that are important for me is abandoned in favor of something that mostly doesn't matter. Am I supposed to be happy? :) Am I supposed to be twice as happy when you propose to close pull requests that do help because of this proposal?
>
> I am waiting for what comes next but right now "not impressed" is most optimistic way to put this. Sorry :(

I trust you'll be. -- Andrei
Sep 29 2014
On Monday, 29 September 2014 at 12:29:33 UTC, Dicebot wrote:
> Any assumption that library code can get away with some set of pre-defined allocation strategies is crap. This whole discussion was about how important it is to move allocation decisions to user code (ranges are just one tool to achieve that, Don has been presenting examples of how we do that with plain arrays in DConf 2014 talk).

I think the key to this sort of issue is to try and get as much functionality in Phobos marked @nogc as possible. After that, building new library-like functionality into a DUB package that assumes @nogc and only uses the @nogc code in Phobos would be the next step. Should that get to a state where it's popular and supported, pulling it in as std.nogc.* might make sense, but trying to redo Phobos as a manual memory collection library is infeasible.

Were I your company, I'd start working on leading such an effort. Unlike Tango, I don't think a development like this would split the community or the community's resources in a useless fashion.
Sep 29 2014
On 29.09.2014 12:49, Andrei Alexandrescu wrote:
> [...]
> The three policies are:
>
> (a) gc is the classic garbage-collected style of management;
>
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks.
>
> (c) mrc is a reference-counted style backed by malloc.
>
> (It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.)
> ...

Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.

Anyway, that was just my 0.02€. Stepping out the thread as I just toy around with D and cannot add much more to the discussion.

-- Paulo
Sep 29 2014
On 9/29/14, 10:16 AM, Paulo Pinto wrote:
> Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.

Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei
Sep 29 2014
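For readers following the increment/decrement discussion, here is a minimal sketch of where those paired operations come from. It assumes a malloc-backed, non-thread-safe counted box; the names `RC`, `refs`, `get` and `peek` are hypothetical and are not part of Phobos or of the proposal.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical minimal reference-counted box; not a Phobos type and not thread-safe.
struct RC(T)
{
    private static struct Box { T value; size_t count; }
    private Box* box;

    this(T v)
    {
        box = cast(Box*) malloc(Box.sizeof);
        assert(box !is null, "out of memory");
        box.value = v;   // adequate for plain data such as int
        box.count = 1;
    }

    // Postblit: every copy adds one reference.
    this(this) { if (box) ++box.count; }

    // Destructor: every destroyed copy drops one reference.
    ~this()
    {
        if (box && --box.count == 0) free(box);
    }

    size_t refs() const { return box ? box.count : 0; }
    T get() const { return box.value; }
}

// Passing by value copies in (++count) and destroys on return (--count).
int peek(RC!int r) { return r.get; }

void main()
{
    auto a = RC!int(42);
    assert(a.refs == 1);
    {
        auto b = a;              // postblit runs
        assert(a.refs == 2);
    }                            // b destroyed
    assert(a.refs == 1);
    assert(peek(a) == 42);       // one ++/-- pair around the call
    assert(a.refs == 1);
}
```

The call to `peek` is the kind of site where the increment and decrement are adjacent once the function is inlined; whether a given compiler actually removes the pair is not something this sketch demonstrates.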
On Mon, 29 Sep 2014 15:04:03 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 9/29/14, 10:16 AM, Paulo Pinto wrote:
>> Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.
> Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei

That helps with very small, inlined functions until Marc Schütz's work on borrowed pointers makes it redundant by unifying scoped copies of GC, RC and stack pointers. In any case inc/dec elision is an optimization and not an enabling feature. It sure is on the radar and can be improved later on.

-- Marco
Sep 30 2014
On 30 September 2014 08:04, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 9/29/14, 10:16 AM, Paulo Pinto wrote:
>> Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.
> Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei

The compiler doesn't know that MyLibrary_AddRef(Thing *t); and MyLibrary_DecRef(Thing *t); cancel each other out though... rc needs primitives that the compiler understands implicitly, so that rc logic can be more complex than ++i/--i;
Sep 30 2014
On Wednesday, 1 October 2014 at 01:26:45 UTC, Manu via Digitalmars-d wrote:
> On 30 September 2014 08:04, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> On 9/29/14, 10:16 AM, Paulo Pinto wrote:
>>> Personally, I would go just for (b) with compiler support for increment/decrement removal, as I think it will be too complex having to support everything and this will complicate all libraries.
>> Compiler already knows (after inlining) that ++i and --i cancel each other, so we should be in good shape there. -- Andrei
> The compiler doesn't know that MyLibrary_AddRef(Thing *t); and MyLibrary_DecRef(Thing *t); cancel each other out though... rc needs primitives that the compiler understands implicitly, so that rc logic can be more complex than ++i/--i;

Even with simply i++ and i--, the information that they always go in pairs is lost on the compiler in many cases.
Sep 30 2014
On 2014-09-29 12:49, Andrei Alexandrescu wrote:
> Now that we clarified that these existing attempts are not going to work well, the question remains what does. For Phobos I'm thinking of defining and using three policies:
>
> enum MemoryManagementPolicy { gc, rc, mrc }
> immutable
>     gc = MemoryManagementPolicy.gc,
>     rc = MemoryManagementPolicy.rc,
>     mrc = MemoryManagementPolicy.mrc;
>
> The three policies are:
>
> (a) gc is the classic garbage-collected style of management;
>
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks.
>
> (c) mrc is a reference-counted style backed by malloc.
>
> (It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.)
>
> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
>
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     static if (mmp == gc) alias S = string;
>     else alias S = RCString;
>     S result;
>     ...
>     return result;
> }
>
> On the caller side:
>
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc

How do allocators fit in this? Will it be an additional argument to the function? Or a separate stack that one can push and pop allocators to?

-- /Jacob Carlborg
Sep 29 2014
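The policy-as-template-parameter idea quoted above can be tried out with a small self-contained sketch. Everything here is an assumption made for illustration: `RCString` is a hypothetical stand-in struct (the proposed type is not in Phobos), the parameters are fixed to `string` instead of the generic `R1`/`R2`, and the body still allocates with the GC internally. It only shows how the return type is selected at compile time.

```d
enum MemoryManagementPolicy { gc, rc, mrc }
enum gc  = MemoryManagementPolicy.gc;
enum rc  = MemoryManagementPolicy.rc;
enum mrc = MemoryManagementPolicy.mrc;

// Hypothetical stand-in for the proposed RCString; a real one would own a counted buffer.
struct RCString { string payload; }

auto setExtension(MemoryManagementPolicy mmp = gc)(string path, string ext)
{
    // The policy picks the result type at compile time.
    static if (mmp == gc) alias S = string;
    else alias S = RCString;

    // Drop an existing extension, if any (simplified: ignores directory separators).
    string stem = path;
    foreach_reverse (i, c; path)
    {
        if (c == '.') { stem = path[0 .. i]; break; }
    }
    string joined = stem ~ ext;   // this sketch still allocates with the GC

    static if (is(S == string)) return joined;
    else return S(joined);
}

void main()
{
    auto p1 = setExtension("hello", ".txt");        // default policy: gc
    auto p2 = setExtension!gc("hello.d", ".txt");   // explicit gc
    auto p3 = setExtension!rc("hello", ".txt");     // reference-counted result type

    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p3) == RCString));
    assert(p1 == "hello.txt" && p2 == "hello.txt");
    assert(p3.payload == "hello.txt");
}
```

Existing callers that pass no policy keep getting a `string`, which is the "business as usual" default the proposal describes.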
On 9/29/14, 10:25 AM, Jacob Carlborg wrote:
> How does allocators fit in this? Will it be an additional argument to the function. Or a separate stack that one can push and pop allocators to?

There would be one allocator per thread (changeable) deferring to a global interlocked allocator. Most algorithms would just use whatever allocator is installed.

I know the notion of a thread-local and then global allocator is liable to cause some an apoplexy attack. But it's time to model things as they are - memory is a global resource and it ought to be treated as such. No need to pass allocators around except for special cases.

Andrei
Sep 29 2014
On Mon, 29 Sep 2014 15:11:26 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 9/29/14, 10:25 AM, Jacob Carlborg wrote:
>> How does allocators fit in this? Will it be an additional argument to the function. Or a separate stack that one can push and pop allocators to?
> There would be one allocator per thread (changeable) deferring to a global interlocked allocator. Most algorithms would just use whatever allocator is installed.
>
> I know the notion of a thread-local and then global allocator is liable to cause some an apoplexy attack. But it's time to model things as they are - memory is a global resource and it ought to be treated as such. No need to pass allocators around except for special cases.
>
> Andrei

So you propose RC + global/thread local allocators as the solution for all memory related problems as 'memory management is not allocation'. And you claim that using output ranges / providing buffers / allocators is not an option because it only works in some special cases?

What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?

If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
Sep 30 2014
On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau wrote:
> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?

Agreed. This is the common case we need to solve for, but this is memory allocation, not management. I'm not sure where manual management fits into Andrei's scheme.

Andrei, could you give an example of, e.g. how toStringz would work with a stack buffer in your proposed scheme?

Another thought: if we use a template parameter, what's the story for virtual functions (e.g. Object.toString)? They can't be templated.
Sep 30 2014
On 9/30/14, 3:41 AM, Peter Alexander wrote:
> On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau wrote:
>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
> Agreed. This is the common case we need to solve for, but this is memory allocation, not management. I'm not sure where manual management fits into Andrei's scheme.
>
> Andrei, could you give an example of, e.g. how toStringz would work with a stack buffer in your proposed scheme?

There would be no possibility to do that. I mean it's not there but it can be added e.g. as a "manual" option of performing memory management. The "manual" overloads for functions would require an output range parameter. Not all functions might support a "manual" option - that'd be rejected statically.

> Another thought: if we use a template parameter, what's the story for virtual functions (e.g. Object.toString)? They can't be templated.

Good point. We need to think about that.

Andrei
Sep 30 2014
On 30/09/14 14:29, Andrei Alexandrescu wrote:
> Good point. We need to think about that.

Weren't all methods in Object supposed to be lifted out from Object anyway?

-- /Jacob Carlborg
Sep 30 2014
On Tuesday, September 30, 2014 15:18:17 Jacob Carlborg via Digitalmars-d wrote:
> On 30/09/14 14:29, Andrei Alexandrescu wrote:
>> Good point. We need to think about that.
> Weren't all methods in Object supposed to be lifted out from Object anyway?

Yes, but not much work has been done on it, and the little work that has been done is blocked by at least one compiler bug:

https://issues.dlang.org/show_bug.cgi?id=12537

- Jonathan M Davis
Oct 28 2014
On Tue, 30 Sep 2014 05:29:55 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
>> Another thought: if we use a template parameter, what's the story for virtual functions (e.g. Object.toString)? They can't be templated.
> Good point. We need to think about that.

Passing buffers or sink delegates (like we already do for toString) is possible for some functions. For toString it works fine. Then implement to!RCString(object) using the toString(sink delegate) overload.

For all other functions RC is indeed difficult, probably only possible with different manually written overloads (and a dummy parameter as we can't overload on return type)?
Sep 30 2014
On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau wrote:
> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>
> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.

I don't understand, why wouldn't you be able to temporarily set the thread-local allocator to use the stack buffer, and restore it once done?
Sep 30 2014
On Tue, 30 Sep 2014 10:47:54 +0000, "Vladimir Panteleev" <vladimir thecybershadow.net> wrote:
> On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau wrote:
>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>>
>> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
> I don't understand, why wouldn't you be able to temporarily set the thread-local allocator to use the stack buffer, and restore it once done?

That's possible but insanely dangerous in case you forget to reset the thread allocator. Also storing stack pointers in global state (even thread-local) is dangerous, for example interaction with fibers could lead to bugs, etc. (What if I set the allocator to a stack allocator and call a function which yields from a Fiber?). You also lose all possibilities to use 'scope' or a similar mechanism to prevent escaping a stack pointer.

Also a stack buffer is not a complete allocator, but in some cases like toStringz it works even better than allocators (less overhead as you know the required buffer size before calling toStringz and there's only one allocation).

And it is a hack. Of course you can provide a wrapper which does

    oldAlloc = threadLocalAllocator;
    threadLocalAllocator = stackbuf;
    scope(exit) threadLocalAllocator = oldAlloc;
    func();

But how could anybody think this is good API design? I think I'd rather fork the required Phobos functions instead of using such a wrapper.
Sep 30 2014
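The save/install/restore wrapper described in the post above can be written out as a runnable sketch. The `Allocator` interface, the `threadLocalAllocator` variable, `BufferAllocator` and `withAllocator` are all hypothetical names invented here; nothing like them is claimed to exist in Phobos. Note also that the sketch itself uses `new`, so it is not GC-free.

```d
// Hypothetical allocator interface and thread-local slot; neither exists in Phobos.
interface Allocator { void[] allocate(size_t n); }

Allocator threadLocalAllocator;   // module-level variables are thread-local in D

// Bump allocator over a caller-supplied buffer (for example a stack array).
final class BufferAllocator : Allocator
{
    private void[] buf;
    private size_t used;

    this(void[] b) { buf = b; }

    void[] allocate(size_t n)
    {
        if (n > buf.length - used) return null;   // buffer exhausted
        auto r = buf[used .. used + n];
        used += n;
        return r;
    }
}

// Installs `temp`, runs `fun`, and restores the previous allocator even if `fun` throws.
auto withAllocator(alias fun)(Allocator temp)
{
    auto old = threadLocalAllocator;
    threadLocalAllocator = temp;
    scope(exit) threadLocalAllocator = old;   // registered before fun() runs
    return fun();
}

void main()
{
    ubyte[64] stackBuf;
    auto before = threadLocalAllocator;

    auto got = withAllocator!(() => threadLocalAllocator.allocate(16).length)(
        new BufferAllocator(stackBuf[]));
    assert(got == 16);
    assert(threadLocalAllocator is before);

    // The previous allocator comes back even when the callee throws.
    try
    {
        withAllocator!(() { throw new Exception("boom"); })(
            new BufferAllocator(stackBuf[]));
    }
    catch (Exception) {}
    assert(threadLocalAllocator is before);
}
```

Registering the scope guard before the call is what covers the exception path. The other hazards raised in the post are untouched by this: a fiber that yields inside `fun` still leaves the temporary allocator installed for whatever runs next on that thread, and nothing prevents a slice of the stack buffer from escaping.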
On Tuesday, 30 September 2014 at 12:02:10 UTC, Johannes Pfau wrote:
> That's possible but insanely dangerous in case you forget to reset the thread allocator. Also storing stack pointers in global state (even thread-local) is dangerous, for example interaction with fibers could lead to bugs, etc. (What if I set the allocator to a stack allocator and call a function which yields from a Fiber?). You also lose all possibilities to use 'scope' or a similar mechanism to prevent escaping a stack pointer.

Yes, I agree. One option would be to have thread-local region allocator that can only be used for "scoped" allocation. That is, only for allocations that are not assigned to globals or can get stuck in fibers and that are returned to the calling function. That way the context can free the region when done and you can get away with little allocation overhead if used prudently.

I also don't agree with the sentiment that allocation/management can be kept fully separate. If you have a region allocator that is refcounted it most certainly is interrelated with a fairly tight coupling.

Also the idea exposed in this thread that release()/retain() is purely arithmetic and can be optimized as such is quite wrong. retain() is conceptually a locking construct on a memory region that prevents reuse. I've made a case for TSX, but one can probably come up with other multi-threaded examples.

These hacks are not making D more attractive to people who find C++ lacking in elegance.

Actually, creating a phobos light with nothrow, @nogc, a light runtime and basic building blocks such as intrinsics to build your own RC with compiler support sounds like a more interesting option.

I am really not interested in library provided allocators or RC. If I am not going to use malloc/GC then I want to write my own and have dedicated allocators for the most common objects. I think it is quite reasonable that people who want to take the difficult road of not using GC at all also have to do some extra work, but provide a clean slate to work from!
Sep 30 2014
On Tuesday, 30 September 2014 at 12:32:08 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 30 September 2014 at 12:02:10 UTC, Johannes Pfau wrote:
> ...
> Also the idea exposed in this thread that release()/retain() is purely arithmetic and can be optimized as such is quite wrong. retain() is conceptually a locking construct on a memory region that prevents reuse. I've made a case for TSX, but one can probably come up with other multi-threaded examples.

It works when two big ifs come together.

- inside the same scope (e.g. function level)

- when the reference is not shared between threads.

While it is of limited applicability, Objective-C (and eventually Swift) codebases prove it helps in most real life use cases.

-- Paulo
Sep 30 2014
On Tuesday, 30 September 2014 at 12:51:25 UTC, Paulo Pinto wrote:
> It works when two big ifs come together.
>
> - inside the same scope (e.g. function level)
>
> - when the reference is not shared between threads.
>
> While it is of limited applicability, Objective-C (and eventually Swift) codebases prove it helps in most real life use cases.

But Objective-C has thread safe ref-counting?!

If it isn't thread safe it is of very limited utility, you can usually get away with unique_ptr in single threaded scenarios.
Sep 30 2014
On 30.09.2014 14:55, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com> wrote:
> On Tuesday, 30 September 2014 at 12:51:25 UTC, Paulo Pinto wrote:
>> It works when two big ifs come together.
>>
>> - inside the same scope (e.g. function level)
>>
>> - when the reference is not shared between threads.
>>
>> While it is of limited applicability, Objective-C (and eventually Swift) codebases prove it helps in most real life use cases.
> But Objective-C has thread safe ref-counting?!
>
> If it isn't thread safe it is of very limited utility, you can usually get away with unique_ptr in single threaded scenarios.

Did you read my second bullet?
Sep 30 2014
On Tuesday, 30 September 2014 at 20:13:38 UTC, Paulo Pinto wrote:
> On 30.09.2014 14:55, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com> wrote:
>> On Tuesday, 30 September 2014 at 12:51:25 UTC, Paulo Pinto wrote:
>>> It works when two big ifs come together.
>>>
>>> - inside the same scope (e.g. function level)
>>>
>>> - when the reference is not shared between threads.
>>>
>>> While it is of limited applicability, Objective-C (and eventually Swift) codebases prove it helps in most real life use cases.
>> But Objective-C has thread safe ref-counting?!
>>
>> If it isn't thread safe it is of very limited utility, you can usually get away with unique_ptr in single threaded scenarios.
> Did you read my second bullet?

Yes? I don't want builtin rc default for single threaded use cases. I do want it when references are shared between threads, e.g. for cache objects.
Sep 30 2014
On Tuesday, 30 September 2014 at 12:32:08 UTC, Ola Fosheim Grøstad wrote:
> ...basic building blocks such as intrinsics to build your own RC with compiler support sounds like a more interesting option.

I agree.
Sep 30 2014
On 9/30/14, 3:47 AM, Vladimir Panteleev wrote:
> On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau wrote:
>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>>
>> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
> I don't understand, why wouldn't you be able to temporarily set the thread-local allocator to use the stack buffer, and restore it once done?

That's doable, but you don't get to place the string at a _specific_ buffer. -- Andrei
Sep 30 2014
On 9/30/14, 1:34 AM, Johannes Pfau wrote:
> So you propose RC + global/thread local allocators as the solution for all memory related problems as 'memory management is not allocation'. And you claim that using output ranges / providing buffers / allocators is not an option because it only works in some special cases?

Correct. I assume you meant an irony/sarcasm somewhere :o).

> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>
> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.

Correct. The output of toStringz would be either a GC string or an RC string.

Andrei
Sep 30 2014
On Tue, 30 Sep 2014 05:23:29 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 9/30/14, 1:34 AM, Johannes Pfau wrote:
>> So you propose RC + global/thread local allocators as the solution for all memory related problems as 'memory management is not allocation'. And you claim that using output ranges / providing buffers / allocators is not an option because it only works in some special cases?
> Correct. I assume you meant an irony/sarcasm somewhere :o).

The sarcasm is supposed to be here: '_all_ memory related problems' ;-)

I guess my point is that although RC is useful in some cases output ranges / sink delegates / pre-allocated buffers are still necessary in other cases and RC is not the solution for _everything_.

As Manu often pointed out sometimes you do not want any dynamic allocation (toStringz in games is a good example) and here RC doesn't help.

Another example is format which can already write to output ranges and uses sink delegates internally. That's a much better abstraction than simply returning a reference counted string (allocated with a thread local allocator). Using sink delegates internally is also more efficient than creating temporary RCStrings. And sometimes there's no allocation at all this way (directly writing to a socket/file).

>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>>
>> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
> Correct. The output of toStringz would be either a GC string or an RC string.

But why not provide 3 overloads then?

    toStringz(OutputRange)
    string toStringz(Policy) // char*, actually
    RCString toStringz(Policy)

The notion I got from some of your posts is that you're opposed to such overloads, or did I misinterpret that?
Sep 30 2014
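As an illustration of the "caller provides the buffer" option argued for above, here is a minimal sketch of a toStringz-like helper that writes into a stack array and performs no allocation. The name `toStringzInto` and its null-on-overflow convention are assumptions made for this example; it is not a Phobos function.

```d
import core.stdc.string : strlen;

// Hypothetical buffer-taking variant; not a Phobos function.
// Returns null if `buf` cannot hold `s` plus the terminating '\0'.
char* toStringzInto(const(char)[] s, char[] buf) @nogc nothrow pure
{
    if (buf.length < s.length + 1) return null;
    buf[0 .. s.length] = s[];   // copy the characters
    buf[s.length] = '\0';       // append the terminator
    return buf.ptr;
}

void main() @nogc nothrow
{
    char[32] stackBuf = void;               // lives on the stack, no allocation
    char* z = toStringzInto("hello", stackBuf[]);
    assert(z !is null && strlen(z) == 5);

    char[4] tiny = void;                    // too small for "hello" + '\0'
    assert(toStringzInto("hello", tiny[]) is null);
}
```

Because both the helper and `main` carry `@nogc`, the compiler checks that nothing here allocates with the GC. The returned pointer is only valid while the buffer is, which is the lifetime concern raised elsewhere in the thread.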
On Tuesday, 30 September 2014 at 16:49:48 UTC, Johannes Pfau wrote:
> I guess my point is that although RC is useful in some cases output ranges / sink delegates / pre-allocated buffers are still necessary in other cases and RC is not the solution for _everything_.

Yes, I'm hoping this is an adjunct to changes in Phobos to reduce the frequency of implicit allocation in general. The less garbage that's generated, the less GC vs. RC actually matters.
Sep 30 2014
On 9/30/14, 9:49 AM, Johannes Pfau wrote:
> I guess my point is that although RC is useful in some cases output ranges / sink delegates / pre-allocated buffers are still necessary in other cases and RC is not the solution for _everything_.

Agreed.

> As Manu often pointed out sometimes you do not want any dynamic allocation (toStringz in games is a good example) and here RC doesn't help.
>
> Another example is format which can already write to output ranges and uses sink delegates internally. That's a much better abstraction than simply returning a reference counted string (allocated with a thread local allocator). Using sink delegates internally is also more efficient than creating temporary RCStrings. And sometimes there's no allocation at all this way (directly writing to a socket/file).

Agreed.

>>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>>>
>>> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
>> Correct. The output of toStringz would be either a GC string or an RC string.
> But why not provide 3 overloads then?
>
>     toStringz(OutputRange)
>     string toStringz(Policy) // char*, actually
>     RCString toStringz(Policy)
>
> The notion I got from some of your posts is that you're opposed to such overloads, or did I misinterpret that?

I'm not opposed. Here's what I think.

As an approach to using Phobos without a GC, it's been suggested that we supplement garbage-creating functions with new functions that use output ranges everywhere, or lazy ranges everywhere.

I think a better approach is to make memory management a policy that makes convenient use of reference counting possible. So instead of garbage there'd be reference counted stuff.

Of course, to the extent using lazy computation and/or output ranges is a good thing to have for various reasons, they remain valid techniques that are and will continue being used in Phobos.

My point is that acknowledging and systematically using reference counted types is an essential part of the entire approach.

Andrei
Oct 01 2014
On Wed, 01 Oct 2014 02:21:44 -0700, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
> On 9/30/14, 9:49 AM, Johannes Pfau wrote:
>> I guess my point is that although RC is useful in some cases output ranges / sink delegates / pre-allocated buffers are still necessary in other cases and RC is not the solution for _everything_.
> Agreed.
>> As Manu often pointed out sometimes you do not want any dynamic allocation (toStringz in games is a good example) and here RC doesn't help.
>>
>> Another example is format which can already write to output ranges and uses sink delegates internally. That's a much better abstraction than simply returning a reference counted string (allocated with a thread local allocator). Using sink delegates internally is also more efficient than creating temporary RCStrings. And sometimes there's no allocation at all this way (directly writing to a socket/file).
> Agreed.
>>>> What if I don't want automated memory _management_? What if I want a function to use a stack buffer? Or if I want to free manually?
>>>>
>>>> If I want std.string.toStringz to put the result into a temporary stack buffer your solution doesn't help at all. Passing an output range, allocator or buffer would all solve this.
>>> Correct. The output of toStringz would be either a GC string or an RC string.
>> But why not provide 3 overloads then?
>>
>>     toStringz(OutputRange)
>>     string toStringz(Policy) // char*, actually
>>     RCString toStringz(Policy)
>>
>> The notion I got from some of your posts is that you're opposed to such overloads, or did I misinterpret that?
> I'm not opposed. Here's what I think.
>
> As an approach to using Phobos without a GC, it's been suggested that we supplement garbage-creating functions with new functions that use output ranges everywhere, or lazy ranges everywhere.
>
> I think a better approach is to make memory management a policy that makes convenient use of reference counting possible. So instead of garbage there'd be reference counted stuff.
>
> Of course, to the extent using lazy computation and/or output ranges is a good thing to have for various reasons, they remain valid techniques that are and will continue being used in Phobos.
>
> My point is that acknowledging and systematically using reference counted types is an essential part of the entire approach.
>
> Andrei

OK then I got you wrong and I agree with everything you wrote above. Thanks for clarifying.
Oct 06 2014
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
> On the caller side:
>
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
>
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.

Forcing someone (or rather, a team of someones) to call into the library in a consistent fashion like this seems like a rather risky venture. I suppose that you could add some special compiler checks to make sure that people are being consistent, but I'd probably rather see some way of templating modules so that the chances for human error are reduced.

    --- foo.d ---
    module std.foo(GC = gc);

    void bar() {
        static if (gc) {
            ...
        }
    }

    --- usercode.d ---
    import std.foo!rc;

    void fooCaller() {
        bar();
    }

Though truthfully, I'd rather it be a compiler flag. But I presume that there's an issue with that, which it is too early for my brain to think of.
Sep 29 2014
On 2014-09-29 10:49:52 +0000, Andrei Alexandrescu said:
> Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.
>
> The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.
>
> That said allocators are nice to have and use, and I will definitely follow up with std.allocator. However, std.allocator is not the key to a @nogc Phobos.
>
> Nor are ranges. There is an attitude that either output ranges, or input ranges in conjunction with lazy computation, would solve the issue of creating garbage. https://github.com/D-Programming-Language/phobos/pull/2423 is a good illustration of the latter approach: a range would be lazily created by chaining stuff together. A range-based approach would take us further than the allocators, but I see the following issues with it:
>
> (a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy);
>
> (b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite;
>
> (c) would make D/@nogc a poor cousin of C++. This is quite out of character; technically, I have long gotten used to seeing most elaborate C++ code like poor emulation of simple D idioms. But C++ has spent years and decades taking to perfection an approach without a tracing garbage collector. A departure from that would need to be superior, and that doesn't seem to be the case with range-based approaches.
>
> ===========
>
> Now that we clarified that these existing attempts are not going to work well, the question remains what does. For Phobos I'm thinking of defining and using three policies:
>
> enum MemoryManagementPolicy { gc, rc, mrc }
> immutable
>     gc = MemoryManagementPolicy.gc,
>     rc = MemoryManagementPolicy.rc,
>     mrc = MemoryManagementPolicy.mrc;
>
> The three policies are:
>
> (a) gc is the classic garbage-collected style of management;
>
> (b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks.
>
> (c) mrc is a reference-counted style backed by malloc.
>
> (It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.)
>
> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
>
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     static if (mmp == gc) alias S = string;
>     else alias S = RCString;
>     S result;
>     ...
>     return result;
> }
>
> On the caller side:
>
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
>
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.
>
> Destroy!
>
> Andrei

I don't like the idea of having to pass in template parameters everywhere -- even for allocators.

Is there some way we could have "allocator contexts"? E.g.

    with( auto allocator = ReferencedCounted() ) {
        auto foo = setExtension("hello", "txt");
    }

ReferenceCounted() could replace a thread-local "new" delegate with something it has, and when it goes out of scope, it would reset it to whatever it was before. This would create some runtime overhead -- but I'm not sure how much more than already exists.

-Shammah
Sep 29 2014
On 9/29/14, 11:44 AM, Shammah Chancellor wrote:

> I don't like the idea of having to pass in template parameters everywhere -- even for allocators.

I agree.

> Is there some way we could have "allocator contexts"? E.G. with( auto allocator = ReferenceCounted() )

Don't confuse memory allocation with memory management. There's no such thing as a "reference counted allocator".

Andrei
Sep 29 2014
On 2014-09-29 22:15:33 +0000, Andrei Alexandrescu said:

> On 9/29/14, 11:44 AM, Shammah Chancellor wrote:
>> I don't like the idea of having to pass in template parameters everywhere -- even for allocators.
>
> I agree.
>
>> Is there some way we could have "allocator contexts"? E.G. with( auto allocator = ReferenceCounted() )
>
> Don't confuse memory allocation with memory management. There's no such thing as a "reference counted allocator".
>
> Andrei

Sure, but combining the two could be very useful -- as we have noticed with allocators that work off of a garbage collector. With regards to reference counting, you could implement one that automatically wraps the type in an RC struct and proxies it. Being able to redefine aliases during different sections of compilation would be required though.
Sep 29 2014
On Monday, 29 September 2014 at 22:15:32 UTC, Andrei Alexandrescu wrote:

> On 9/29/14, 11:44 AM, Shammah Chancellor wrote:
>> I don't like the idea of having to pass in template parameters everywhere -- even for allocators.
>
> I agree.

There was a solution earlier in this thread which avoids that problem. When a function is annotated with @nogc there's sufficient info to choose the correct implementation without any parameters; it's already known whether we are instantiated from a @nogc block or not.
Sep 29 2014
On 9/29/14, 11:44 AM, Shammah Chancellor wrote:

> I don't like the idea of having to pass in template parameters everywhere -- even for allocators. Is there some way we could have "allocator contexts"? E.G.
>
> with( auto allocator = ReferenceCounted() ) { auto foo = setExtension("hello", "txt"); }
>
> ReferenceCounted() could replace a thread-local "new" delegate with something it has, and when it goes out of scope, it would reset it to whatever it was before. This would create some runtime overhead -- but I'm not sure how much more than already exists.

I'm not sure whether we can do this within D's type system. -- Andrei
Oct 01 2014
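A minimal sketch of the "allocator context" idea from the posts above, assuming a thread-local hook that a scope guard swaps in and restores. The names `AllocFn`, `currentAlloc`, `gcAlloc`, `mallocAlloc` and `AllocatorContext` are hypothetical and appear nowhere in Phobos. The sketch covers allocation only: the hook can change where bytes come from, but not the static type a function such as setExtension returns, which is the limitation Andrei points at.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical hook type: a function that hands out raw memory.
alias AllocFn = void[] function(size_t);

void[] gcAlloc(size_t n) { return new ubyte[n]; }

void[] mallocAlloc(size_t n)
{
    auto p = malloc(n);
    return p is null ? null : p[0 .. n];
}

// Module-level variables are thread-local by default in D.
AllocFn currentAlloc = &gcAlloc;

// RAII guard: installs a hook, restores the previous one at scope exit.
struct AllocatorContext
{
    private AllocFn saved;

    @disable this();
    @disable this(this);

    this(AllocFn replacement)
    {
        saved = currentAlloc;
        currentAlloc = replacement;
    }

    ~this() { currentAlloc = saved; }
}

unittest
{
    assert(currentAlloc is &gcAlloc);
    {
        auto ctx = AllocatorContext(&mallocAlloc);
        assert(currentAlloc is &mallocAlloc);
        void[] buf = currentAlloc(16); // ownership is not tracked: caller frees
        assert(buf.length == 16);
        free(buf.ptr);
    }
    assert(currentAlloc is &gcAlloc); // previous hook restored
}
```

Nothing here decides who releases the memory, which is the memory management half of the problem the thread keeps returning to.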
> auto p1 = setExtension("hello", ".txt"); // fine, use gc
> auto p2 = setExtension!gc("hello", ".txt"); // same
> auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
>
> So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management. Destroy!

I'll try to destroy ;) Before working out answers to this problem, let me ask a few more questions.

1. As far as I understand, allocation and memory management of entities like classes (Object), dynamic arrays and associative arrays are part of the language/runtime. What is proposed here is a *fix* to the standard library. But the fact that allocation and MM happen via the GC is not the *fault* of the standard library; it is the predefined behaviour of the D language itself and its runtime. The standard library becomes a `hostage` of the runtime library in this situation. Are you really sure that we should "fix" the standard library in that way? To me it looks like implementing struts for the standard lib (which is not broken yet ;) ) in order to compensate for the behaviour of the runtime lib.

2. The second question is slightly off-topic, but I still want to put it here. What I dislike about ranges and the standard library is that it's hard to understand what the returned value of a library function is. I have some *pedals* (front, popFront) to push to do some magic. Of course this was done for the purpose of making universal algorithms. But the more I use ranges and *auto*, the less I believe that I am using a statically typed language. What is wanted to make code clear is a distinct variable declaration with a specification of its type. With all of these autos the logic of the programme becomes unclear, because the data structures are unclear. So I come to the question: is the memory management or allocation policy syntactically part of the declaration, or is it an inner implementation detail that should not be shown in the declaration?

Should rc and gc strings look similar or not?

string str1 = makeGCString("test");
string str2 = makeRCString("test");
// --- vs ---
GCString str1 = "test";
RCString str2 = "test";
// --- or ---
String!GC str1 = "test";
String!RC str2 = "test";
// --- or even ---
@gc string str1 = "test";
@rc string str2 = "test";

As far as I understand, currently we will have:

string str1 = "test";
RCString str2 = "test";

So another question is why the same object "string" is implemented as different types. Array and struct (class)?

3. Should algorithms based on the range interface care about allocation? A range is about iteration and access to elements, not about allocation and memory management. I would like to have attributes @rc, @gc (or the like) to switch the MM policy instead of *String!RC* or *RCString*, but we cannot apply attributes to a literal. Passing something like this to an algorithm:

find( @rc "test", @rc "t" )

is syntactically incorrect. But we can use this form:

find( RCString("test"), RCString("t") )

But the above form is more verbose. As a continuation of this question I have the next question.

4. How to deal with literals? How to make them ref-counted? I ask this because even when writing RCString("test"), syntactically the expression "test" is still a GC-managed literal. I pass a GC-managed literal into a struct to make it RC-managed. Why not just make it RC from the start? Adding some additional template parameter to an algorithm will not fix this. It is a problem of D itself and its runtime library.

So I assume that the std lib is not broken this way and we should not try to fix it this way. Thanks for your attention.
Sep 29 2014
On Monday, 29 September 2014 at 20:07:41 UTC, Uranuz wrote:

> 1. As far as I understand, allocation and memory management of entities like classes (Object), dynamic arrays and associative arrays are part of the language/runtime. What is proposed here is a *fix* to the standard library. But the fact that allocation and MM happen via the GC is not the *fault* of the standard library; it is the predefined behaviour of the D language itself and its runtime. The standard library becomes a `hostage` of the runtime library in this situation. Are you really sure that we should "fix" the standard library in that way? To me it looks like implementing struts for the standard lib (which is not broken yet ;) ) in order to compensate for the behaviour of the runtime lib.

This really hits the nail on the head, and I think your other comments and questions are also quite insightful. IMO the proposal that started this thread, @nogc, and -vgc are all beating around the bush rather than addressing the fundamental problem.

Mike
Sep 29 2014
On 9/29/14, 1:07 PM, Uranuz wrote:

> 1. As far as I understand, allocation and memory management of entities like classes (Object), dynamic arrays and associative arrays are part of the language/runtime. What is proposed here is a *fix* to the standard library. But the fact that allocation and MM happen via the GC is not the *fault* of the standard library; it is the predefined behaviour of the D language itself and its runtime. The standard library becomes a `hostage` of the runtime library in this situation. Are you really sure that we should "fix" the standard library in that way? To me it looks like implementing struts for the standard lib (which is not broken yet ;) ) in order to compensate for the behaviour of the runtime lib.

The change will be to both the runtime and the standard library.

> 2. The second question is slightly off-topic, but I still want to put it here. What I dislike about ranges and the standard library is that it's hard to understand what the returned value of a library function is. I have some *pedals* (front, popFront) to push to do some magic. Of course this was done for the purpose of making universal algorithms. But the more I use ranges and *auto*, the less I believe that I am using a statically typed language. What is wanted to make code clear is a distinct variable declaration with a specification of its type. With all of these autos the logic of the programme becomes unclear, because the data structures are unclear. So I come to the question: is the memory management or allocation policy syntactically part of the declaration, or is it an inner implementation detail that should not be shown in the declaration?

Sadly this is the way things are going (not only in D, but other languages such as C++, Haskell, Scala, etc). Type proliferation has costs, but also a ton of benefits. Most often the memory management policy will be part of function signatures because it affects data type definitions.

> Should rc and gc strings look similar or not?
>
> string str1 = makeGCString("test");
> string str2 = makeRCString("test");
> // --- vs ---
> GCString str1 = "test";
> RCString str2 = "test";
> // --- or ---
> String!GC str1 = "test";
> String!RC str2 = "test";
> // --- or even ---
> @gc string str1 = "test";
> @rc string str2 = "test";
>
> As far as I understand, currently we will have:
>
> string str1 = "test";
> RCString str2 = "test";

Per Sean's idea things would go GC.string vs. RC.string, where GC and RC are two memory management policies (simple structs defining aliases and probably a few primitives).

> So another question is why the same object "string" is implemented as different types. Array and struct (class)?

A reference counted string has a different layout than immutable(char)[].

> 3. Should algorithms based on the range interface care about allocation? A range is about iteration and access to elements, not about allocation and memory management.

Most don't.

> I would like to have attributes @rc, @gc (or the like) to switch the MM policy instead of *String!RC* or *RCString*, but we cannot apply attributes to a literal. Passing something like this to an algorithm:
>
> find( @rc "test", @rc "t" )
>
> is syntactically incorrect. But we can use this form:
>
> find( RCString("test"), RCString("t") )
>
> But the above form is more verbose. As a continuation of this question I have the next question.

If language changes are necessary, we will make language changes. I'm trying first to explore solutions within the language.

> 4. How to deal with literals? How to make them ref-counted?

I don't know yet.

> I ask this because even when writing RCString("test"), syntactically the expression "test" is still a GC-managed literal. I pass a GC-managed literal into a struct to make it RC-managed. Why not just make it RC from the start? Adding some additional template parameter to an algorithm will not fix this. It is a problem of D itself and its runtime library.

I understand. The problem is actually worse with array literals, which are silently dynamically allocated on the garbage-collected heap:

auto s = "hello"; // at least there's no allocation
auto a = [1, 2, 3]; // dynamic allocation

A language-based solution would change array literal syntax. A library-based solution would leave array literals with today's syntax and semantics and offer a controlled alternative a la:

auto a = MyMemPolicy.array(1, 2, 3); // cool

> So I assume that the std lib is not broken this way and we should not try to fix it this way. Thanks for your attention.

And thanks for your great points.

Andrei
Oct 01 2014
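The library-based alternative mentioned above, `MyMemPolicy.array(1, 2, 3)`, could look roughly like the sketch below. It assumes that a policy is a plain struct exposing a static `array` primitive. `GCPolicy`, `MallocPolicy` and `MallocArray` are illustrative names only, the malloc-backed array is a unique owner rather than a reference-counted one, and it is meant for plain value types.

```d
import core.stdc.stdlib : free, malloc;
import std.traits : CommonType;

// Hypothetical malloc-backed, non-copyable array used by MallocPolicy.
struct MallocArray(T)
{
    private T* ptr;
    private size_t len;

    @disable this(this);    // unique owner, no implicit copies
    ~this() { free(ptr); }  // free(null) is a no-op

    inout(T)[] opSlice() inout { return ptr[0 .. len]; }
}

struct GCPolicy
{
    // Same result as the literal [a, b, c], but spelled as a call.
    static auto array(Args...)(Args args) if (Args.length > 0)
    {
        alias T = CommonType!Args;
        auto result = new T[Args.length];
        foreach (i, arg; args) result[i] = arg;
        return result;
    }
}

struct MallocPolicy
{
    static auto array(Args...)(Args args) if (Args.length > 0)
    {
        alias T = CommonType!Args;
        MallocArray!T result;
        result.ptr = cast(T*) malloc(T.sizeof * Args.length);
        assert(result.ptr !is null, "out of memory");
        result.len = Args.length;
        foreach (i, arg; args) result.ptr[i] = arg;
        return result;
    }
}

unittest
{
    auto a = GCPolicy.array(1, 2, 3);
    static assert(is(typeof(a) == int[]));
    assert(a == [1, 2, 3]);

    auto b = MallocPolicy.array(1, 2, 3);
    static assert(is(typeof(b) == MallocArray!int));
    assert(b[] == [1, 2, 3]);
}
```

The call site is identical under both policies while the result type differs, which is why the policy ends up visible in signatures.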
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:

> [...]

Internally we should have something like:

---
template String(MemoryManagementPolicy mmp=gc){
    /++ ... +/
}
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
    auto result=String!mmp();
    /++ +/
}
----

or maybe even allowing user types in the template argument (the original purpose of templates)

---
auto setExtension(String = string, R1, R2)(R1 path, R2 ext){
    /++ +/
}
----
Sep 29 2014
On 9/29/14, 3:11 PM, Freddy wrote:

> Internally we should have something like:
>
> ---
> template String(MemoryManagementPolicy mmp=gc){
>     /++ ... +/
> }
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     auto result=String!mmp();
>     /++ +/
> }
> ----
>
> or maybe even allowing user types in the template argument (the original purpose of templates)
>
> ---
> auto setExtension(String = string, R1, R2)(R1 path, R2 ext){
>     /++ +/
> }

That's correct. -- Andrei
Sep 29 2014
On 9/29/14, 3:11 PM, Freddy wrote:

> Internally we should have something like:
>
> ---
> template String(MemoryManagementPolicy mmp=gc){
>     /++ ... +/
> }
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     auto result=String!mmp();
>     /++ +/
> }
> ----
>
> or maybe even allowing user types in the template argument (the original purpose of templates)
>
> ---
> auto setExtension(String = string, R1, R2)(R1 path, R2 ext){
>     /++ +/
> }
> ----

Good idea, and it seems Sean's is even better because it groups everything related to memory management where it belongs - in the memory management policy. -- Andrei
Oct 01 2014
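The `String!mmp` mapping suggested above can be made concrete along these lines. This is only a sketch under loud assumptions: `RCString` is a placeholder struct with no reference counting at all, the extension logic is deliberately naive, and the parameters are plain character slices rather than the range types R1 and R2 of the original signature.

```d
enum MemoryManagementPolicy { gc, rc }
enum gc = MemoryManagementPolicy.gc;
enum rc = MemoryManagementPolicy.rc;

// Placeholder only: a real RCString would manage a reference count.
struct RCString
{
    string payload;
}

// Maps a policy to the string type it implies.
template String(MemoryManagementPolicy mmp = gc)
{
    static if (mmp == gc)
        alias String = string;
    else
        alias String = RCString;
}

String!mmp setExtension(MemoryManagementPolicy mmp = gc)
                       (const(char)[] path, const(char)[] ext)
{
    // Naive: cut at the last '.', ignoring directory separators.
    size_t cut = path.length;
    foreach_reverse (i, c; path)
    {
        if (c == '.') { cut = i; break; }
    }
    string text = (path[0 .. cut] ~ ext).idup;

    static if (mmp == gc)
        return text;
    else
        return String!mmp(text);
}

unittest
{
    auto p1 = setExtension("hello.d", ".txt");    // default policy: gc
    auto p2 = setExtension!gc("hello", ".txt");   // same type as p1
    auto p3 = setExtension!rc("hello", ".txt");   // RCString
    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p2) == string));
    static assert(is(typeof(p3) == RCString));
    assert(p1 == "hello.txt" && p2 == "hello.txt");
    assert(p3.payload == "hello.txt");
}
```

The only branch on the policy sits where the result is produced; everything before it is shared.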
I hate the fact that this will produce template bloat for each function/method. I'm also in favor of "let the user pick", but I would use a global variable:

----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
    gc = MemoryManagementPolicy.gc,
    rc = MemoryManagementPolicy.rc,
    mrc = MemoryManagementPolicy.mrc;
MemoryManagementPolicy RMP = gc;
----

and in my code:

----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
Sep 30 2014
On Tuesday, 30 September 2014 at 13:38:43 UTC, Foo wrote:

> I hate the fact that this will produce template bloat for each function/method. I'm also in favor of "let the user pick", but I would use a global variable:
> [...]

Of course each method/function in Phobos should use the global RMP.
Sep 30 2014
On 9/30/14, 6:38 AM, Foo wrote:

> I hate the fact that this will produce template bloat for each function/method. I'm also in favor of "let the user pick", but I would use a global variable:
>
> ----
> enum MemoryManagementPolicy { gc, rc, mrc }
> immutable
>     gc = MemoryManagementPolicy.gc,
>     rc = MemoryManagementPolicy.rc,
>     mrc = MemoryManagementPolicy.mrc;
> MemoryManagementPolicy RMP = gc;
> ----
>
> and in my code:
>
> ----
> RMP = rc;
> string str = "foo"; // compiler knows -> ref counted
> // ...
> RMP = gc;
> string str2 = "bar"; // normal behaviour restored
> ----

This won't work because the type of "string" is different for RC vs. GC. -- Andrei
Sep 30 2014
On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei Alexandrescu wrote:

> On 9/30/14, 6:38 AM, Foo wrote:
>> I hate the fact that this will produce template bloat for each function/method. I'm also in favor of "let the user pick", but I would use a global variable:
>> [...]
>
> This won't work because the type of "string" is different for RC vs. GC. -- Andrei

But it would work for Phobos functions without template bloat.
Sep 30 2014
On Tuesday, 30 September 2014 at 14:05:43 UTC, Foo wrote:

> On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei Alexandrescu wrote:
>> On 9/30/14, 6:38 AM, Foo wrote:
>>> [...]
>>
>> This won't work because the type of "string" is different for RC vs. GC. -- Andrei
>
> But it would work for Phobos functions without template bloat.

Only for internal allocations. If the functions want to return something, the type must be known.
Sep 30 2014
On 9/30/14, 7:13 AM, "Marc Schütz" <schuetzm gmx.net> wrote:

> On Tuesday, 30 September 2014 at 14:05:43 UTC, Foo wrote:
>> On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei Alexandrescu wrote:
>>> On 9/30/14, 6:38 AM, Foo wrote:
>>>> [...]
>>>
>>> This won't work because the type of "string" is different for RC vs. GC. -- Andrei
>>
>> But it would work for Phobos functions without template bloat.
>
> Only for internal allocations. If the functions want to return something, the type must be known.

Ah, now I understand the point. Thanks. -- Andrei
Sep 30 2014
On 9/30/14, 7:05 AM, Foo wrote:

> On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei Alexandrescu wrote:
>> On 9/30/14, 6:38 AM, Foo wrote:
>>> I hate the fact that this will produce template bloat for each function/method. I'm also in favor of "let the user pick", but I would use a global variable:
>>> [...]
>>
>> This won't work because the type of "string" is different for RC vs. GC. -- Andrei
>
> But it would work for Phobos functions without template bloat.

How is the fact there's less bloat relevant for code that doesn't work? I.e. it doesn't compile. It needs to return string for GC and RCString for RC.

Andrei
Sep 30 2014
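The distinction drawn in this exchange can be shown with a small sketch: a runtime global can steer allocations that stay inside a function, but the declared return type is fixed at compile time and cannot follow it. The global `RMP` mirrors the proposal above, while `countDistinctChars` and its scratch table are invented purely for illustration.

```d
import core.stdc.stdlib : calloc, free;

enum MemoryManagementPolicy { gc, mrc }

// Runtime, thread-local policy switch in the spirit of the proposed RMP.
MemoryManagementPolicy RMP = MemoryManagementPolicy.gc;

// The return type is fixed at compile time; only the scratch allocation
// inside the function can follow the runtime value of RMP.
size_t countDistinctChars(const(char)[] s)
{
    const useMalloc = RMP == MemoryManagementPolicy.mrc;
    bool[] seen;
    if (useMalloc)
    {
        auto p = cast(bool*) calloc(256, bool.sizeof);
        assert(p !is null, "out of memory");
        seen = p[0 .. 256];
    }
    else
    {
        seen = new bool[256];
    }
    scope (exit) if (useMalloc) free(seen.ptr);

    size_t distinct;
    foreach (char c; s)
    {
        if (!seen[c]) { seen[c] = true; ++distinct; }
    }
    return distinct;
}

unittest
{
    RMP = MemoryManagementPolicy.mrc;
    auto a = countDistinctChars("hello");
    RMP = MemoryManagementPolicy.gc;
    auto b = countDistinctChars("hello");
    assert(a == 4 && b == 4);
    static assert(is(typeof(a) == typeof(b))); // same type under either policy
}
```

A function that had to hand back `string` under one policy and `RCString` under the other could not be written this way, since `typeof` of the result is decided before `RMP` has any value.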
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:

> [...]

Instead of adding a new template parameter to every function (which won't necessarily play nicely with existing IFTI and variadic templates), why not allow template modules?

import stringRC = std.string!rc;
import stringGC = std.string!gc;

// in std/string.d
module std.string(MemoryManagementPolicy mmp)

pure @trusted S capitalize(S)(S s) if (isSomeString!S)
{
    //...
    static if(mmp == MemoryManagementPolicy.gc)
    {
        //...
    }
    else static if .......
}
Sep 30 2014
On 9/30/14, 7:07 AM, John Colvin wrote:

> Instead of adding a new template parameter to every function (which won't necessarily play nicely with existing IFTI and variadic templates), why not allow template modules?

Nice idea, but let's try and explore possibilities within the existing rich language. If a need for new language features arises, I trust we'll see it. -- Andrei
Oct 01 2014
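Within the existing language, an ordinary template can already act as a parameterized namespace, which gets close to the call-site shape of the template-module idea. The sketch below assumes nothing beyond current D. `StringFuncs`, `capitalizeAscii` and the placeholder `RCString` are invented names, and the function handles ASCII only.

```d
import std.ascii : toUpper;

enum MemoryManagementPolicy { gc, rc }

// Placeholder only; a real RCString would carry a reference count.
struct RCString
{
    string payload;
}

// A template used as a parameterized namespace, roughly what a
// "template module" would provide.
template StringFuncs(MemoryManagementPolicy mmp)
{
    static if (mmp == MemoryManagementPolicy.gc)
        alias Str = string;
    else
        alias Str = RCString;

    // ASCII-only capitalization of the first character.
    Str capitalizeAscii(const(char)[] s)
    {
        char[] buf = s.dup;
        if (buf.length) buf[0] = cast(char) toUpper(buf[0]);
        string text = buf.idup;
        static if (mmp == MemoryManagementPolicy.gc)
            return text;
        else
            return RCString(text);
    }
}

alias stringGC = StringFuncs!(MemoryManagementPolicy.gc);
alias stringRC = StringFuncs!(MemoryManagementPolicy.rc);

unittest
{
    assert(stringGC.capitalizeAscii("hello") == "Hello");
    assert(stringRC.capitalizeAscii("hello").payload == "Hello");
    static assert(is(typeof(stringRC.capitalizeAscii("x")) == RCString));
}
```

The policy is chosen once, at the alias, so individual calls carry no extra template argument; the cost is that every function has to live inside the wrapping template.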
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:

> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
>
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     static if (mmp == gc) alias S = string;
>     else alias S = RCString;
>     S result;
>     ...
>     return result;
> }

Is this for exposition purposes or actually how you expect it to work? Quite honestly, I can't imagine how I could write a template function in D that needs to work with this approach.

As much as I hate to say it, this is pretty much exactly what C++ allocators were designed for. They handle allocation, sure, but they also hold aliases for all relevant types for the data being allocated. If the MemoryManagementPolicy enum were replaced with an alias to a type that I could use to at least obtain relevant aliases, that would be something. But even that approach dramatically complicates code that uses it.

Having written standards-compliant containers in C++, I honestly can't imagine the average user writing code that works this way. Once you assert that the reference type may be a pointer or it may be some complex proxy to data stored elsewhere, a lot of composability pretty much flies right out the window.

For example, I have an implementation of C++ unordered_map/set/etc designed to be a customizable cache, so one of its template arguments is a policy type that allows eviction behavior to be chosen at declaration time. Maybe the cache is size-limited, maybe it's age-limited, maybe it's a combination of the two or something even more complicated. The problem is that the container defines all the aliases relating to the underlying data, but the policy, which needs to be aware of these, is passed as a template argument to this container.

To make something that's fully aware of C++ allocators then, I'd have to define a small type that takes the container template arguments (the contained type and the allocator type) and generates the aliases and pass this to the policy, which in turn passes the type through to the underlying container so it can declare its public aliases and whatever else in true standards-compliant fashion (or let the container derive this itself, but then you run into the potential for disagreement). And while this is possible, doing so would complicate the creation of the cache policies to the point where it subverts their intent, which was to make it easy for the user to tune the behavior of the cache to their own particular needs by defining a simple type which implements a few functions. Ultimately, I decided against this approach for the cache container and decided to restrict the allocators to those which defined a pointer to T as T* so the policies could be coded with basically no knowledge of the underlying storage.

So... while I support the goal you're aiming at, I want to see a much more comprehensive example of how this will work and how it will affect code written by D *users*. Because it isn't enough for Phobos to be written this way. Basically all D code will have to take this into account for the strategy to be truly viable. Simply outlining one of the most basic functions in Phobos, which already looks like it will have a static conditional at the beginning and *need to be aware of the fact that an RCString type exists* makes me terrified of what a realistic example will look like.
Sep 30 2014
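One way to read the suggestion above, replacing the enum with a type that carries the relevant aliases, is sketched here. It is an assumption-laden illustration rather than a proposed API: `GCPolicy`, `RCPolicy`, `makeString` and `repeated` are hypothetical, and `RCString` is a placeholder without any counting.

```d
// Placeholder only; a real RCString would carry a reference count.
struct RCString
{
    string payload;
}

// Hypothetical policy types: each bundles the aliases and primitives
// that a generic function needs.
struct GCPolicy
{
    alias String = string;
    static String makeString(const(char)[] s) { return s.idup; }
}

struct RCPolicy
{
    alias String = RCString;
    static String makeString(const(char)[] s) { return RCString(s.idup); }
}

// No static if and no mention of RCString: the function only talks
// to whatever the policy exposes.
Policy.String repeated(Policy = GCPolicy)(const(char)[] s, size_t times)
{
    char[] buf;
    foreach (_; 0 .. times) buf ~= s;
    return Policy.makeString(buf);
}

unittest
{
    auto a = repeated("ab", 3);
    auto b = repeated!RCPolicy("ab", 3);
    static assert(is(typeof(a) == string));
    static assert(is(typeof(b) == RCString));
    assert(a == "ababab");
    assert(b.payload == "ababab");
}
```

User code written against the policy stays free of branches, but the intermediate buffer above still comes from the GC, so a policy would need more primitives before such a function could be @nogc.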
On Tue, Sep 30, 2014 at 04:10:43PM +0000, Sean Kelly via Digitalmars-d wrote:

> On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:
>> The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider:
>>
>> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>> if (...)
>> {
>>     static if (mmp == gc) alias S = string;
>>     else alias S = RCString;
>>     S result;
>>     ...
>>     return result;
>> }
>
> Is this for exposition purposes or actually how you expect it to work? Quite honestly, I can't imagine how I could write a template function in D that needs to work with this approach.
>
> As much as I hate to say it, this is pretty much exactly what C++ allocators were designed for. They handle allocation, sure, but they also hold aliases for all relevant types for the data being allocated.
[...]
> So... while I support the goal you're aiming at, I want to see a much more comprehensive example of how this will work and how it will affect code written by D *users*. Because it isn't enough for Phobos to be written this way. Basically all D code will have to take this into account for the strategy to be truly viable. Simply outlining one of the most basic functions in Phobos, which already looks like it will have a static conditional at the beginning and *need to be aware of the fact that an RCString type exists* makes me terrified of what a realistic example will look like.

Yeah, this echoes my concern. This looks not that much different, from a user's POV, from C++ containers' allocator template parameters. Yes I know we're not talking about *allocators* per se but about *memory management*, but I'm talking about the need to explicitly pass mmp to *every* *single* *function* if you desire anything but the default. How many people actually *use* the allocator parameter in STL? Certainly, many people do... but the code is anything but readable / maintainable.

Not only that, but every single function will have to handle this parameter somehow, and if static if's at the top of the function is what we're starting with, I fear seeing what we end up with.

Furthermore, in order for this to actually work, it has to be percolated throughout the entire codebase -- any D library that even remotely uses Phobos for anything will have to percolate this parameter throughout its API -- at least, any part of the API that might potentially use a Phobos function. Otherwise, you still have the situation where a given D library doesn't allow the user to select a memory management scheme, and internally calls Phobos functions with the default settings. So this still doesn't solve the problem that today, people who need to use @nogc can't use a lot of existing libraries because the library depends on the GC, even if it doesn't assume anything about the MM scheme, but just happens to call some obscure Phobos function with the default MM parameter. The only way this could work was if *every* D library author voluntarily rewrites a lot of code in order to percolate this MM parameter through to the API, on the off-chance that some obscure user somewhere might have need to use it. I don't see much likelihood of this actually happening.

Then there's the matter of functions like parseJSON() that need to allocate nodes and return a tree (or whatever) of these nodes. Note that they need to *allocate*, not just know what kind of memory management model is to be used. So how do you propose to address this? Via another parameter (compile-time or otherwise) to specify which allocator to use? So how does the memory management parameter solve anything then? And how would such a thing be implemented? Using a 3-way static-if branch in every single point in parseJSON where it needs to allocate nodes? We could just as well write it in C++, if that's the case.

This proposal has many glaring holes that need to be fixed before it can be viable.
T -- EMACS = Extremely Massive And Cumbersome System
Sep 30 2014
On 9/30/14, 10:33 AM, H. S. Teoh via Digitalmars-d wrote:Yeah, this echoes my concern. This looks not that much different, from a user's POV, from C++ containers' allocator template parameters. Yes I know we're not talking about*allocators* per se but about *memory management*, but I'm talking about the need to explicitly pass mmp to *every* *single* *function* if you desire anything but the default. How many people actually*use* the allocator parameter in STL? Certainly, many people do... but the code is anything but readable / maintainable.The parallel with STL allocators is interesting, but I'm not worried about it that much. I don't want to go off on a tangent but I'm fairly certain std::allocator is hard to use for entirely different reasons than the intended use patterns of MemoryManagementPolicy.Not only that, but every single function will have to handle this parameter somehow, and if static if's at the top of the function is what we're starting with, I fear seeing what we end up with.Apparently Sean's idea would take care of that.Furthermore, in order for this to actually work, it has to be percolated throughout the entire codebase -- any D library that even remotely uses Phobos for anything will have to percolate this parameter throughout its API -- at least, any part of the API that might potentially use a Phobos function.Yes, but that's entirely expected. We're adding genuinely new functionality to Phobos.Otherwise, you still have the situation where a given D library doesn't allow the user to select a memory management scheme, and internally calls Phobos functions with the default settings.Correct.So this still doesn't solve the problem that today, people who need to use nogc can't use a lot of existing libraries because the library depends on the GC, even if it doesn't assume anything about the MM scheme, but just happens to call some obscure Phobos function with the default MM parameter. 
The only way this could work was if*every* D library author voluntarily rewrites a lot of code in order to percolate this MM parameter through to the API, on the off-chance that some obscure user somewhere might have need to use it. I don't see much likelihood of this actually happening.A simple way to put this is Libraries that use the GC will continue to use the GC. There's no way around that unless we choose to break them all.Then there's the matter of functions like parseJSON() that needs to allocate nodes and return a tree (or whatever) of these nodes. Note that they need to*allocate*, not just know what kind of memory management model is to be used. So how do you propose to address this? Via another parameter (compile-time or otherwise) to specify which allocator to use? So how does the memory management parameter solve anything then? And how would such a thing be implemented? Using a 3-way static-if branch in every single point in parseJSON where it needs to allocate nodes? We could just as well write it in C++, if that's the case.parseJSON() would get a memory management policy parameter, and will use the currently installed memory allocator for allocation.This proposal has many glaring holes that need to be fixed before it can be viable.Affirmative. That's why it's an RFC, very far from a proposal. I'm glad I got a bunch of good ideas. Andrei
Oct 01 2014
On 9/30/14, 9:10 AM, Sean Kelly wrote:On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:That's pretty much what it would take. The key here is that RCString is almost a drop-in replacement for string, so the code using it is almost identical. There will be places where code needs to be replaced, e.g. auto s = "literal"; would need to become S s = "literal"; So creation of strings will change a bit, but overall there's not a lot of churn.The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider: auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext) if (...) { static if (mmp == gc) alias S = string; else alias S = RCString; S result; ... return result; }Is this for exposition purposes or actually how you expect it to work?Quite honestly, I can't imagine how I could write a template function in D that needs to work with this approach.You mean write a function that accepts a memory management policy, or a function that uses one?As much as I hate to say it, this is pretty much exactly what C++ allocators were designed for. They handle allocation, sure, but they also hold aliases for all relevant types for the data being allocated. If the MemoryManagementPolicy enum were replaced with an alias to a type that I could use to at least obtain relevant aliases, that would be something. But even that approach dramatically complicates code that uses it.I think making MemoryManagementPolicy a meaningful type is a great idea. It would e.g. define the string type, so the code becomes: auto setExtension(alias MemoryManagementPolicy = gc, R1, R2)(R1 path, R2 ext) if (...) { MemoryManagementPolicy.string result; ... return result; } This is a lot more general and extensible. Thanks! Why do you think there'd be dramatic complication of code? 
(Granted, at some point we must acknowledge that some egg breaking is necessary for the proverbial omelette.)Having written standards-compliant containers in C++, I honestly can't imagine the average user writing code that works this way. Once you assert that the reference type may be a pointer or it may be some complex proxy to data stored elsewhere, a lot of composability pretty much flies right out the window.The thing is, again, we must make some changes if we want D to be usable without a GC. One of them is e.g. to not allocate built-in slices all over the place.For example, I have an implementation of C++ unordered_map/set/etc designed to be a customizable cache, so one of its template arguments is a policy type that allows eviction behavior to be chosen at declaration time. Maybe the cache is size-limited, maybe it's age-limited, maybe it's a combination of the two or something even more complicated. The problem is that the container defines all the aliases relating to the underlying data, but the policy, which needs to be aware of these, is passed as a template argument to this container. To make something that's fully aware of C++ allocators then, I'd have to define a small type that takes the container template arguments (the contained type and the allocator type) and generates the aliases and pass this to the policy, which in turn passes the type through to the underlying container so it can declare its public aliases and whatever else is true standards-compliant fashion (or let the container derive this itself, but then you run into the potential for disagreement). And while this is possible, doing so would complicate the creation of the cache policies to the point where it subverts their intent, which was to make it easy for the user to tune the behavior of the cache to their own particular needs by defining a simple type which implements a few functions. 
Ultimately, I decided against this approach for the cache container and decided to restrict the allocators to those which defined a pointer to T as T* so the policies could be coded with basically no knowledge of the underlying storage.That sounds like a rather involved artifact. Hopefully we can leverage D's better expressiveness to make building such complex libraries easier.So... while I support the goal you're aiming at, I want to see a much more comprehensive example of how this will work and how it will affect code written by D *users*.Agreed.Because it isn't enough for Phobos to be written this way. Basically all D code will have to take this into account for the strategy to be truly viable. Simply outlining one of the most basic functions in Phobos, which already looks like it will have a static conditional at the beginning and *need to be aware of the fact that an RCString type exists* makes me terrified of what a realistic example will look like.That would be overreacting :o). Andrei
Oct 01 2014
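A minimal sketch of the "policy as a type that supplies aliases" idea discussed above. Everything named here is an assumption for illustration: `ToyRCString` is a hypothetical stand-in for RCString (it still uses GC memory), `GCPolicy`/`RCPolicy` and the alias name `String` are invented, and `setExtensionSketch` only appends the extension rather than reproducing the real `std.path.setExtension` logic.

```d
// Hypothetical stand-in for RCString. This toy version just wraps a
// GC-allocated string; a real RCString would own a reference-counted buffer.
struct ToyRCString
{
    private string payload;

    this(string s) { payload = s; }

    // Concatenation returns the wrapper type, mirroring string ~ string.
    ToyRCString opBinary(string op : "~")(string rhs)
    {
        return ToyRCString(payload ~ rhs);
    }

    string get() { return payload; }
}

// Hypothetical policy types: each one names the string type to use.
struct GCPolicy { alias String = string; }
struct RCPolicy { alias String = ToyRCString; }

// Simplified: appends ext without stripping an existing extension.
auto setExtensionSketch(Policy = GCPolicy)(string path, string ext)
{
    Policy.String result = path; // "S s = value;" works for both policies
    result = result ~ ext;
    return result;
}

unittest
{
    auto p1 = setExtensionSketch("hello", ".txt");          // GC default
    auto p2 = setExtensionSketch!RCPolicy("hello", ".txt"); // RC policy
    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p2) == ToyRCString));
    assert(p1 == "hello.txt");
    assert(p2.get == "hello.txt");
}
```

The function body contains no `static if`; the only policy-dependent part is the alias lookup, which is the point made in the reply above.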
On Wednesday, 1 October 2014 at 08:55:55 UTC, Andrei Alexandrescu wrote:
> On 9/30/14, 9:10 AM, Sean Kelly wrote:
>> Is this for exposition purposes or actually how you expect it to work?
>
> That's pretty much what it would take. The key here is that RCString is almost a drop-in replacement for string, so the code using it is almost identical. There will be places where code needs to be replaced, e.g. auto s = "literal"; would need to become S s = "literal"; So creation of strings will change a bit, but overall there's not a lot of churn.

I'm confused. Is this a general-purpose solution or just one that switches between string and RCString?
Oct 01 2014
On 10/1/14, 6:52 AM, Sean Kelly wrote:
> On Wednesday, 1 October 2014 at 08:55:55 UTC, Andrei Alexandrescu wrote:
>> On 9/30/14, 9:10 AM, Sean Kelly wrote:
>>> Is this for exposition purposes or actually how you expect it to work?
>>
>> That's pretty much what it would take. The key here is that RCString is almost a drop-in replacement for string, so the code using it is almost identical. There will be places where code needs to be replaced, e.g. auto s = "literal"; would need to become S s = "literal"; So creation of strings will change a bit, but overall there's not a lot of churn.
>
> I'm confused. Is this a general-purpose solution or just one that switches between string and RCString?

General purpose since your suggested change. -- Andrei
Oct 01 2014
On Wednesday, 1 October 2014 at 08:55:55 UTC, Andrei Alexandrescu wrote:
> On 9/30/14, 9:10 AM, Sean Kelly wrote:
>> Quite honestly, I can't imagine how I could write a template function in D that needs to work with this approach.
>
> You mean write a function that accepts a memory management policy, or a function that uses one?

Both, I suppose? A static if block at the top of each function that must be aware of every RC type the user may expect? What if it's a user-defined RC type and this function is in Phobos?

>> As much as I hate to say it, this is pretty much exactly what C++ allocators were designed for. They handle allocation, sure, but they also hold aliases for all relevant types for the data being allocated. If the MemoryManagementPolicy enum were replaced with an alias to a type that I could use to at least obtain relevant aliases, that would be something. But even that approach dramatically complicates code that uses it.
>
> I think making MemoryManagementPolicy a meaningful type is a great idea. It would e.g. define the string type, so the code becomes:
>
>     auto setExtension(alias MemoryManagementPolicy = gc, R1, R2)(R1 path, R2 ext)
>     if (...)
>     {
>         MemoryManagementPolicy.string result;
>         ...
>         return result;
>     }
>
> This is a lot more general and extensible. Thanks! Why do you think there'd be dramatic complication of code? (Granted, at some point we must acknowledge that some egg breaking is necessary for the proverbial omelette.)

From my experience with C++ containers. Having an alias for a type is okay, but a bank of aliases where one is a pointer to the type, one is a const pointer to the type, etc., makes writing the involved code feel really unnatural.

> The thing is, again, we must make some changes if we want D to be usable without a GC. One of them is e.g. to not allocate built-in slices all over the place.

So let the user supply a scratch buffer that will hold the result? With the RC approach we're still allocating, they just aren't built-in slices, correct?

> That would be overreacting :o).

I hope it is :-)
Oct 01 2014
On 10/1/14, 7:03 AM, Sean Kelly wrote:
> So let the user supply a scratch buffer that will hold the result? With the RC approach we're still allocating, they just aren't built-in slices, correct?

Correct. -- Andrei
Oct 01 2014
29-Sep-2014 14:49, Andrei Alexandrescu wrote:
> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
> if (...)
> {
>     static if (mmp == gc) alias S = string;
>     else alias S = RCString;
>     S result;
>     ...
>     return result;
> }

Incredible code bloat? Boilerplate in each function for the win? I'm at a loss as to how it would make things better.

-- Dmitry Olshansky
Sep 30 2014
On 9/30/14, 11:06 AM, Dmitry Olshansky wrote:
> 29-Sep-2014 14:49, Andrei Alexandrescu wrote:
>> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>> if (...)
>> {
>>     static if (mmp == gc) alias S = string;
>>     else alias S = RCString;
>>     S result;
>>     ...
>>     return result;
>> }
>
> Incredible code bloat? Boilerplate in each function for the win? I'm at a loss as to how it would make things better.

Sean's idea to make string an alias of the policy takes care of this concern. -- Andrei
Oct 01 2014
On Wed, Oct 01, 2014 at 02:51:08AM -0700, Andrei Alexandrescu via Digitalmars-d wrote:
> On 9/30/14, 11:06 AM, Dmitry Olshansky wrote:
>> 29-Sep-2014 14:49, Andrei Alexandrescu wrote:
>>> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>>> if (...)
>>> {
>>>     static if (mmp == gc) alias S = string;
>>>     else alias S = RCString;
>>>     S result;
>>>     ...
>>>     return result;
>>> }
>>
>> Incredible code bloat? Boilerplate in each function for the win? I'm at a loss as to how it would make things better.
>
> Sean's idea to make string an alias of the policy takes care of this concern. -- Andrei

But Sean's idea only takes strings into account. Strings aren't the only allocated resource Phobos needs to deal with. So extrapolating from that idea, each memory management struct (or whatever other aggregate we end up using), say call it MMP, will have to define MMP.string, MMP.jsonNode (since parseJSON() needs to allocate not only strings but JSON nodes), MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...

Nope, still don't see how this could work. Please clarify, kthx.

T

-- Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
Oct 01 2014
On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via Digitalmars-d wrote:
> On Wed, Oct 01, 2014 at 02:51:08AM -0700, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 9/30/14, 11:06 AM, Dmitry Olshansky wrote:
>>> 29-Sep-2014 14:49, Andrei Alexandrescu wrote:
>>>> auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
>>>> if (...)
>>>> {
>>>>     static if (mmp == gc) alias S = string;
>>>>     else alias S = RCString;
>>>>     S result;
>>>>     ...
>>>>     return result;
>>>> }
>>>
>>> Incredible code bloat? Boilerplate in each function for the win? I'm at a loss as to how it would make things better.
>>
>> Sean's idea to make string an alias of the policy takes care of this concern. -- Andrei
>
> But Sean's idea only takes strings into account. Strings aren't the only allocated resource Phobos needs to deal with. So extrapolating from that idea, each memory management struct (or whatever other aggregate we end up using), say call it MMP, will have to define MMP.string, MMP.jsonNode (since parseJSON() needs to allocate not only strings but JSON nodes), MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
>
> Nope, still don't see how this could work. Please clarify, kthx.
>
> T

MMP.Ref!redBlackTreeNode ? (where Ref is e.g. a ref-counted pointer type (like RefCounted but with class support) for RC MMP but a plain GC reference for GC MMP, etc.)

I kinda like this idea, since it might possibly allow user-defined memory management policies (which wouldn't get the special compiler treatment that e.g. RC may need, though).
Oct 01 2014
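A rough sketch of the `MMP.Ref!T` suggestion above, under stated assumptions: the policy names, the `make` helper, and `TreeNode` are all hypothetical. The RC policy uses `std.typecons.RefCounted`, which handles structs only, so this does not cover the "with class support" part of the suggestion.

```d
import std.typecons : RefCounted;

// Hypothetical node type standing in for redBlackTreeNode, jsonNode, etc.
struct TreeNode { int key; }

// GC policy: a plain pointer to a GC-allocated struct.
struct GCPolicy
{
    alias Ref(T) = T*;
    static auto make(T, Args...)(Args args) { return new T(args); }
}

// RC policy: Phobos' RefCounted wrapper (structs only).
struct RCPolicy
{
    alias Ref(T) = RefCounted!T;
    static auto make(T, Args...)(Args args) { return RefCounted!T(args); }
}

// Policy-generic code names one reference type, whatever the node type is.
int keyOfFreshNode(Policy)(int key)
{
    Policy.Ref!TreeNode node = Policy.make!TreeNode(key);
    return node.key; // pointer auto-dereference or RefCounted's alias this
}

unittest
{
    assert(keyOfFreshNode!GCPolicy(7) == 7);
    assert(keyOfFreshNode!RCPolicy(7) == 7);
}
```

With a single `Ref` template the policy does not need one alias per node type, which is the objection raised against `MMP.jsonNode`, `MMP.listNode`, and so on.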
On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via Digitalmars-d wrote:
> But Sean's idea only takes strings into account. Strings aren't the only allocated resource Phobos needs to deal with. So extrapolating from that idea, each memory management struct (or whatever other aggregate we end up using), say call it MMP, will have to define MMP.string, MMP.jsonNode (since parseJSON() needs to allocate not only strings but JSON nodes), MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
>
> Nope, still don't see how this could work. Please clarify, kthx.

Assuming you're willing to take the memoryModel type as a template argument, I imagine we could do something where the user can specialize the memoryModel for their own types, a bit like how information is derived for iterators in C++. The problem is that this still means passing the memoryModel in as a template argument. What I'd really want is for it to be a global, except that templated virtuals are logically impossible. I guess something could maybe be sorted out via a factory design, but that's not terribly D-like. I'm at a loss for how to make this memoryModel thing work the way I'd actually want it to if I were to use it.
Oct 01 2014
On Wednesday, 1 October 2014 at 18:37:50 UTC, Sean Kelly wrote:
> On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via Digitalmars-d wrote:
>> But Sean's idea only takes strings into account. Strings aren't the only allocated resource Phobos needs to deal with. So extrapolating from that idea, each memory management struct (or whatever other aggregate we end up using), say call it MMP, will have to define MMP.string, MMP.jsonNode (since parseJSON() needs to allocate not only strings but JSON nodes), MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
>>
>> Nope, still don't see how this could work. Please clarify, kthx.
>
> Assuming you're willing to take the memoryModel type as a template argument, I imagine we could do something where the user can specialize the memoryModel for their own types, a bit like how information is derived for iterators in C++. The problem is that this still means passing the memoryModel in as a template argument. What I'd really want is for it to be a global, except that templated virtuals are logically impossible. I guess something could maybe be sorted out via a factory design, but that's not terribly D-like. I'm at a loss for how to make this memoryModel thing work the way I'd actually want it to if I were to use it.

If you were to forget D restrictions for a moment, and consider an idealized language, how would you express this? Maybe providing that will trigger some ideas from people beyond what we have seen so far by removing implied restrictions.
Oct 01 2014
On 10/1/14, 10:51 AM, H. S. Teoh via Digitalmars-d wrote:
> But Sean's idea only takes strings into account. Strings aren't the only allocated resource Phobos needs to deal with. So extrapolating from that idea, each memory management struct (or whatever other aggregate we end up using), say call it MMP, will have to define MMP.string, MMP.jsonNode (since parseJSON() needs to allocate not only strings but JSON nodes), MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
>
> Nope, still don't see how this could work. Please clarify, kthx.

There's management for T[], pointers to structs, pointers to class objects, associative arrays, and that covers everything. -- Andrei
Oct 01 2014
Ok, here are my few cents: On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is. The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.I would argue that GC is at its core _only_ a memory management strategy. It just so happens that the one in D's runtime also comes with an allocator, with which it is tightly integrated. In theory, a GC can work with any (and multiple) allocators, and you could of course also call GC.free() manually, because, as you say, management and allocation are entirely distinct topics.That said allocators are nice to have and use, and I will definitely follow up with std.allocator. However, std.allocator is not the key to a nogc Phobos.Agreed.Nor are ranges. There is an attitude that either output ranges, or input ranges in conjunction with lazy computation, would solve the issue of creating garbage. https://github.com/D-Programming-Language/phobos/pull/2423 is a good illustration of the latter approach: a range would be lazily created by chaining stuff together. A range-based approach would take us further than the allocators, but I see the following issues with it: (a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. 
outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy); (b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite; (c) would make D/ nogc a poor cousin of C++. This is quite out of character; technically, I have long gotten used to seeing most elaborate C++ code like poor emulation of simple D idioms. But C++ has spent years and decades taking to perfection an approach without a tracing garbage collector. A departure from that would need to be superior, and that doesn't seem to be the case with range-based approaches.I agree with this, too.=========== Now that we clarified that these existing attempts are not going to work well, the question remains what does. For Phobos I'm thinking of defining and using three policies: enum MemoryManagementPolicy { gc, rc, mrc } immutable gc = ResourceManagementPolicy.gc, rc = ResourceManagementPolicy.rc, mrc = ResourceManagementPolicy.mrc; The three policies are: (a) gc is the classic garbage-collected style of management; (b) rc is a reference-counted style still backed by the GC, i.e. the GC will still be able to pick up cycles and other kinds of leaks. (c) mrc is a reference-counted style backed by malloc. (It should be possible to collapse rc and mrc together and make the distinction dynamically, at runtime. I'm distinguishing them statically here for expository purposes.) The policy is a template parameter to functions in Phobos (and elsewhere), and informs the functions e.g. what types to return. Consider: auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext) if (...) { static if (mmp == gc) alias S = string; else alias S = RCString; S result; ... 
return result; } On the caller side: auto p1 = setExtension("hello", ".txt"); // fine, use gc auto p2 = setExtension!gc("hello", ".txt"); // same auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc So by default it's going to continue being business as usual, but certain functions will allow passing in a (defaulted) policy for memory management.This, however, I disagree with strongly. For one thing - this has already been noted by others - it would make the functions' implementation extremely ugly (`static if` hell), it would make them harder to unit test, and from a user's point of view, it's very tedious and might interfere badly with UFCS. But more importantly, IMO, it's the wrong thing to do. These functions shouldn't know anything about memory management policy at all. They allocate, which means they need to know about _allocation_ policy, but memory _management_ policy needs to be decided by the user. Now, your suggestion in a way still leaves that decision to the user, but does so in a very intrusive way, by passing a template flag. This is clearly a violation of the separation of concerns. Contrary to the typical case, implementation details of the user's code leak into the library code, and not the other way round, but that's just as bad. I'm convinced this isn't necessary. Let's take `setExtension()` as an example, standing in for any of a class of similar functions. This function allocates memory, returns it, and abandons it; it gives up ownership of the memory. The fact that the memory has been freshly allocated means that it is (head) unique, and therefore the caller (= library user) can take over the ownership. This, in turn, means that the caller can decide how she wants to manage it. (I'll try to make a sketch on how this can be implemented in another post.) As a conclusion, I would say that APIs should strive for the following principles, in this order: 1. Avoid allocation altogether, for example by laziness (ranges), or by accepting sinks. 
2. If allocations are necessary (or desirable, to make the API more easily usable), try hard to return a unique value (this of course needs to be expressed in the return type). 3. If both of the above fails, only then return a GCed pointer, or alternatively provide several variants of the function (though this shouldn't be necessary often). An interesting alternative: Instead of passing a flag directly describing the policy, pass the function a type that it should wrap it's return value in. As for the _allocation_ strategy: It indeed needs to be configurable, but here, the same objections against a template parameter apply. As the allocator doesn't necessarily need to be part of the type, a (thread) global variable can be used to specify it. This lends itself well to idioms like with(MyAllocator alloc) { // ... }Destroy!Done :-)
Sep 30 2014
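The suggestion above, selecting the allocator through a thread-global variable rather than a template parameter, could be sketched roughly as follows. This is an illustration only: the `Allocator` interface, `currentAllocator`, and `ScopedAllocator` are hypothetical names, not the std.allocator API, and a scope guard struct stands in for the `with(MyAllocator alloc) { ... }` idiom.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical minimal allocator interface (not std.allocator's API).
interface Allocator
{
    void[] allocate(size_t n);
    void deallocate(void[] b);
    string name();
}

final class GCAllocator : Allocator
{
    void[] allocate(size_t n) { return new ubyte[n]; }
    void deallocate(void[] b) { } // left to the collector
    string name() { return "gc"; }
}

final class MallocAllocator : Allocator
{
    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }
    void deallocate(void[] b) { free(b.ptr); }
    string name() { return "malloc"; }
}

// Module-level variables are thread-local by default in D.
Allocator currentAllocator;

// A module constructor without `shared` runs once per thread.
static this() { currentAllocator = new GCAllocator; }

// Swaps the thread's allocator for one scope and restores it afterwards.
struct ScopedAllocator
{
    private Allocator saved;
    this(Allocator a) { saved = currentAllocator; currentAllocator = a; }
    ~this() { if (saved !is null) currentAllocator = saved; }
    @disable this(this);
}

unittest
{
    assert(currentAllocator.name() == "gc");
    {
        auto guard = ScopedAllocator(new MallocAllocator);
        auto buf = currentAllocator.allocate(16);
        scope (exit) currentAllocator.deallocate(buf); // runs before guard's destructor
        assert(currentAllocator.name() == "malloc");
    }
    assert(currentAllocator.name() == "gc");
}
```

Library functions would call `currentAllocator.allocate` without taking any extra parameter; the trade-off is that the choice is dynamic and invisible in the function's type.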
On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote:
> I'm convinced this isn't necessary. Let's take `setExtension()` as an example, standing in for any of a class of similar functions. This function allocates memory, returns it, and abandons it; it gives up ownership of the memory. The fact that the memory has been freshly allocated means that it is (head) unique, and therefore the caller (= library user) can take over the ownership. This, in turn, means that the caller can decide how she wants to manage it.
>
> (I'll try to make a sketch on how this can be implemented in another post.)

Ok. What we need for it:

1) unique, or a way to expressly specify uniqueness on a function's return type, as well as restrict function params by it (and preferably overloading on uniqueness). DMD already has this concept internally, it just needs to be formalized.
2) A few modifications to RefCounted to be constructible from unique values.
3) A wrapper type similar to std.typecons.Unique, that also supports moving. Let's call it Owned(T).
4) Borrowing.

setExtension() can then look like this:

    Owned!string setExtension(in char[] path, in char[] ext);

To be used:

    void saveFileAs(in char[] name) {
        import std.path: setExtension;
        import std.file: write;
        name.                     // scope const(char[])
            setExtension("txt").  // Owned!string
            write(data);
    }

The Owned(T) value implicitly converts to `scope!this(T)` via alias this; it can therefore be conveniently passed to std.file.write() (which already takes the filename as `in`) without copying or moving. The value then is released automatically at the end of the statement, because it is only a temporary and is not assigned to a variable.

For transferring ownership:

    RefCounted!string[] filenames;
    // ...
    filenames ~= name.setExtension("txt").release;

`Owned!T.release()` returns the payload as a unique value, and resets the payload to its init value (in this case `null`). RefCounted's constructor then accepts this unique value and takes ownership of it. When the Owned value's destructor is called, it finds the payload to be null and doesn't free the memory. Inlining and subsequent optimization can turn the destructor into a no-op in this case.

Optionally, Owned!T can provide an `alias this` to its release method; in this case, the method doesn't need to be called explicitly. It is however debatable whether being explicit with moving isn't the better choice.
Sep 30 2014
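To make the `release` semantics described above concrete, here is a toy library-only `Owned(T)`. It is a sketch under clear assumptions: there is no compiler-checked uniqueness, no `scope` borrowing, and no destructor that frees anything, all of which the post treats as language-level work. `setExtensionSketch` and `isEmpty` are hypothetical names, and the function only appends the extension.

```d
// Toy non-copyable owner; real uniqueness checking would need language support.
struct Owned(T)
{
    private T payload;

    this(T value) { payload = value; }

    @disable this(this); // no copies: exactly one owner at a time

    // Hands the payload to the caller and leaves this wrapper empty.
    T release()
    {
        auto result = payload;
        payload = T.init;
        return result;
    }

    bool isEmpty() { return payload is T.init; }
}

// Hypothetical function in the style of the post; it only appends ext.
Owned!string setExtensionSketch(string path, string ext)
{
    return Owned!string(path ~ ext);
}

unittest
{
    string[] filenames;
    auto tmp = setExtensionSketch("report", ".txt");
    assert(!tmp.isEmpty);
    filenames ~= tmp.release(); // the caller now decides how to manage the value
    assert(filenames == ["report.txt"]);
    assert(tmp.isEmpty);        // payload was reset to its init value
}
```

Here the receiving container is a plain GC array; in the post's design the same `release` result would feed a `RefCounted` constructor instead.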
On 9/30/14, 12:10 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
> I would argue that GC is at its core _only_ a memory management strategy. It just so happens that the one in D's runtime also comes with an allocator, with which it is tightly integrated. In theory, a GC can work with any (and multiple) allocators, and you could of course also call GC.free() manually, because, as you say, management and allocation are entirely distinct topics.

I'm not very sure. A GC might need to interoperate closely with the allocator. -- Andrei
Oct 01 2014
On Wednesday, 1 October 2014 at 09:52:46 UTC, Andrei Alexandrescu wrote:
> On 9/30/14, 12:10 PM, "Marc Schütz" <schuetzm gmx.net> wrote:
>> I would argue that GC is at its core _only_ a memory management strategy. It just so happens that the one in D's runtime also comes with an allocator, with which it is tightly integrated. In theory, a GC can work with any (and multiple) allocators, and you could of course also call GC.free() manually, because, as you say, management and allocation are entirely distinct topics.
>
> I'm not very sure. A GC might need to interoperate closely with the allocator. -- Andrei

It needs to know what to scan (ideally with type info), and which allocator to release memory with, but it doesn't need to be an allocator itself. It certainly helps with the implementation, but ideally there would be a well defined interface between allocators and GCs, so that both can be plugged in as desired, even with multiple GCs in parallel.
Oct 01 2014
On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote:
> [...] I'm convinced this isn't necessary. Let's take `setExtension()` as an example, standing in for any of a class of similar functions. This function allocates memory, returns it, and abandons it; it gives up ownership of the memory. The fact that the memory has been freshly allocated means that it is (head) unique, and therefore the caller (= library user) can take over the ownership. This, in turn, means that the caller can decide how she wants to manage it.

Bingo. Have some way to mark the function return type as a unique pointer. This does not imply full-fledged unique pointer type support in the language - just enough to have the caller ensure continuity of memory management policy from there.

One problem with actually implementing this is that using reference counting as a memory management policy requires extra space for the reference counter in the object, just as garbage collection requires support for scanning and identification of interior object memory range. While allocation and memory management may be quite independent in theory, in practical high-performance implementations they tend to be intimately related.

> (I'll try to make a sketch on how this can be implemented in another post.)

Do elaborate!

> As a conclusion, I would say that APIs should strive for the following principles, in this order:
>
> 1. Avoid allocation altogether, for example by laziness (ranges), or by accepting sinks.
> 2. If allocations are necessary (or desirable, to make the API more easily usable), try hard to return a unique value (this of course needs to be expressed in the return type).
> 3. If both of the above fail, only then return a GCed pointer, or alternatively provide several variants of the function (though this shouldn't be necessary often). An interesting alternative: Instead of passing a flag directly describing the policy, pass the function a type that it should wrap its return value in.
>
> As for the _allocation_ strategy: It indeed needs to be configurable, but here, the same objections against a template parameter apply. As the allocator doesn't necessarily need to be part of the type, a (thread) global variable can be used to specify it. This lends itself well to idioms like
>
>     with(MyAllocator alloc) { // ... }

Assuming there is some dependency between the allocator and the memory management policy, I guess this would be initialized on thread start and could not be modified later. All code running inside the thread would need to either match the configured policy, not handle any kind of pointers, or use a limited subset of unique pointers.

Another way to ensure that code can run on either RC or GC is to make certain objects (specifically, Exceptions) always allocate a reference counter, regardless of the currently configured policy.
Oct 01 2014
Oren Tirosh:Bingo. Have some way to mark the function return type as a unique pointer. This does not imply full-fledged unique pointer type support in the languageLet's have full-fledged memory zones tracking in the D type system :-) Bye, bearophile
Oct 01 2014
On 10/1/14, 8:48 AM, Oren Tirosh wrote:On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote:I'm skeptical about this approach (though clearly we need to explore it for e.g. passing ownership of data across threads). For strings and other "casual" objects I think we should focus on GC/RC strategies. This is because people do things like: auto s = setExtension(s1, s2); and then attempt to use s as a regular variable (copy it etc). Making s unique would make usage quite surprising and cumbersome. Andrei[...] I'm convinced this isn't necessary. Let's take `setExtension()` as an example, standing in for any of a class of similar functions. This function allocates memory, returns it, and abandons it; it gives up ownership of the memory. The fact that the memory has been freshly allocated means that it is (head) unique, and therefore the caller (= library user) can take over the ownership. This, in turn, means that the caller can decide how she wants to manage it.Bingo. Have some way to mark the function return type as a unique pointer.
Oct 01 2014
On Wednesday, 1 October 2014 at 17:13:38 UTC, Andrei Alexandrescu wrote:On 10/1/14, 8:48 AM, Oren Tirosh wrote:The idea is that the unique property is very short-lived: the caller immediately assigns it to a pointer of the appropriate policy: either RC or GC. This keeps the callee agnostic of the chosen policy and does not require templating multiple versions of the code. The allocator configured for the thread must match the generated code at the call site i.e. if the caller uses RC pointers the allocator must allocate space for the reference counter (at negative offset to keep compatibility).On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote:I'm skeptical about this approach (though clearly we need to explore it for e.g. passing ownership of data across threads). For strings and other "casual" objects I think we should focus on GC/RC strategies. This is because people do things like: auto s = setExtension(s1, s2); and then attempt to use s as a regular variable (copy it etc). Making s unique would make usage quite surprising and cumbersome.[...] I'm convinced this isn't necessary. Let's take `setExtension()` as an example, standing in for any of a class of similar functions. This function allocates memory, returns it, and abandons it; it gives up ownership of the memory. The fact that the memory has been freshly allocated means that it is (head) unique, and therefore the caller (= library user) can take over the ownership. This, in turn, means that the caller can decide how she wants to manage it.Bingo. Have some way to mark the function return type as a unique pointer.
Oct 01 2014
On 10/1/14, 10:25 AM, Oren T wrote:The idea is that the unique property is very short-lived: the caller immediately assigns it to a pointer of the appropriate policy: either RC or GC. This keeps the callee agnostic of the chosen policy and does not require templating multiple versions of the code. The allocator configured for the thread must match the generated code at the call site i.e. if the caller uses RC pointers the allocator must allocate space for the reference counter (at negative offset to keep compatibility).This all... looks arcane. I'm not sure how it can even be made to work if user code just uses "auto". -- Andrei
Oct 01 2014
On Wednesday, 1 October 2014 at 17:33:34 UTC, Andrei Alexandrescu wrote:On 10/1/14, 10:25 AM, Oren T wrote:At the moment, nogc code can't call any function returning a pointer. Under this scheme nogc is allowed to call either code that returns an explicitly RC type (Exception, RCString) or code returning an "agnostic" unique pointer that may be used from either gc or nogc code. I already see some holes and problems, but I wonder if something along these lines may be made to work.The idea is that the unique property is very short-lived: the caller immediately assigns it to a pointer of the appropriate policy: either RC or GC. This keeps the callee agnostic of the chosen policy and does not require templating multiple versions of the code. The allocator configured for the thread must match the generated code at the call site i.e. if the caller uses RC pointers the allocator must allocate space for the reference counter (at negative offset to keep compatibility).This all... looks arcane. I'm not sure how it can even made to work if user code just uses "auto". -- Andrei
Oct 01 2014
On 01/10/14 19:25, Oren T wrote:The idea is that the unique property is very short-lived: the caller immediately assigns it to a pointer of the appropriate policy: either RC or GC. This keeps the callee agnostic of the chosen policy and does not require templating multiple versions of the code. The allocator configured for the thread must match the generated code at the call site i.e. if the caller uses RC pointers the allocator must allocate space for the reference counter (at negative offset to keep compatibility).Can't we do something like this, or it might be what you're proposing:
    Foo foo () { return new Foo; }
    gc a = foo(); // a contains an instance of Foo allocated with the GC
    rc b = foo(); // b contains an instance of Foo allocated with the RC allocator
-- /Jacob Carlborg
Oct 01 2014
On Thursday, 2 October 2014 at 06:29:24 UTC, Jacob Carlborg wrote:
    gc a = foo(); // a contains an instance of Foo allocated with the GC
    rc b = foo(); // b contains an instance of Foo allocated with the RC allocator
That would be better, but how do you deal with "bar(foo())" ? Context dependent instantiation is a semantic challenge when you also have overloading, but I guess you can get somewhere if you make whole program optimization mandatory and use a state-of-the-art constraint solver to handle the type system. Could lead you to NP-complete type resolution? But still doable (in most cases). I think you basically have 2 realistic choices if you want easy-going syntax for the end user:
1. implement rc everywhere in standard libraries and make it possible to turn off rc in a call-chain by having compiler support (and whole program optimization). To support manual management you need some kind of protocol for traversing allocated data-structures to free them. e.g.:
    define memory strategy malloc = some…manual…allocation…strategy…description;
    auto a = bar(foo()); // use gc or rc based on compiler flag
    auto a = rc( bar(foo()) ); // use rc in a gc context
    auto a = malloc( bar(foo()) ); // manual management (requires a protocol for traversal of recursive datastructures)
2. provide allocation strategy as a parameter e.g.:
    auto a = foo(); // alloc with gc
    auto a = foo!rc(); // alloc with rc
    auto a = foo!malloc(); // alloc with malloc
But going the C++ way of having explicit allocators and non-embedded reference counters (double indirection) probably is the easier solution in terms of bringing D to completion. How many years are you going to spend on making D ref count by default in a flawless and performant manner? Sure having RC being as easy to use as GC is a nice idea, but if it turns out to be either slower or more bug ridden than GC, then what is the point? Note that:
1. A write to a ref-count means the 64-byte cache line is dirty and has to be written back to memory. So you don't write 4 bytes, you write 64 bytes. That's pretty expensive.
2. The memory bus is increasingly becoming the bottleneck of hardware architectures.
=> RC everywhere without heavy duty compiler/hardware support is a bad long term idea.
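Option 2 above (allocation strategy as a template parameter) can be sketched roughly as follows. This is only an illustration under assumptions: `Strategy` and `makeInts` are hypothetical names, not Phobos APIs, and only a GC and a C-heap variant are shown.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical policy flag, chosen by the caller at compile time.
enum Strategy { gc, cHeap }

/// Hypothetical helper: returns n zeroed ints, allocated per the chosen strategy.
int[] makeInts(Strategy s = Strategy.gc)(size_t n)
{
    static if (s == Strategy.gc)
    {
        return new int[](n);                 // GC-owned, zero-initialized
    }
    else
    {
        auto p = cast(int*) malloc(n * int.sizeof);
        assert(p !is null, "out of memory");
        p[0 .. n] = 0;                       // malloc does not zero memory
        return p[0 .. n];                    // caller must free(result.ptr)
    }
}

void main()
{
    auto a = makeInts(4);                    // default policy: GC
    auto b = makeInts!(Strategy.cHeap)(4);   // explicit policy: C heap
    scope (exit) free(b.ptr);
    assert(a.length == 4 && b.length == 4);
    assert(b[3] == 0);
}
```

Note that both variants return a plain `int[]`, so nothing in the type tells the caller which one must be freed manually; that is part of why the thread keeps coming back to wrapper types.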
Oct 02 2014
On 02/10/14 11:41, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang gmail.com>" wrote:That would be better, but how do you deal with "bar(foo())" ? Context dependent instantiation is a semantic challenge when you also have overloading, but I guess you can get somewhere if you make whole program optimization mandatory and use a state-of-the-art constraint solver to handle the type system. Could lead you to NP-complete type resolution? But still doable (in most cases).I haven't really thought how it could be implemented but I was hoping that the caller could magically decide the allocation strategy instead of the callee. It looks like Rust is doing something like that but I haven't looked at it in detail. -- /Jacob Carlborg
Oct 02 2014
On Thursday, 2 October 2014 at 11:41:14 UTC, Jacob Carlborg wrote:I haven't really thought how it could be implemented but I was hoping that the caller could magically decide the allocation strategy instead of the callee. It looks like Rust is doing something like that but I haven't looked at it in detail.I haven't looked at Rust in detail, but doesn't the Rust compiler take full control over memory management? I think that is a good idea, but it is at odds with D's general direction.
Oct 02 2014
On Thursday, 2 October 2014 at 13:29:58 UTC, Ola Fosheim Grøstad wrote:On Thursday, 2 October 2014 at 11:41:14 UTC, Jacob Carlborg wrote:Rust makes use of the type system and the borrow checker to validate how the pointers are being used. The usual errors when dealing with pointers are compile time errors in Rust. -- PauloI haven't really thought how it could be implemented but I was hoping that the caller could magically decide the allocation strategy instead of the callee. It looks like Rust is doing something like that but I haven't looked at it in detail.I haven't looked at Rust in detail, but doesn't the Rust compiler take full control over memory management? I think that is a good idea, but it is at odds with D's general direction.
Oct 02 2014
On Thursday, 2 October 2014 at 18:52:18 UTC, Paulo Pinto wrote:Rust makes use of the type system and the borrow checker to validate how the pointers are being used. The usual errors when dealing with pointers are compile time errors in Rust.They constrain usage so that you cannot share mutable objects. It is described at a reasonably high level here: http://doc.rust-lang.org/0.11.0/rust.html#memory-and-concurrency-models But it is sketchy on implementation details, the semantic restrictions that follow, and the consequences when interacting with foreign code etc.
Oct 02 2014
On Thursday, 2 October 2014 at 19:45:17 UTC, Ola Fosheim Grøstad wrote:But is sketchy on implementation details, semantic restrictions that follows and consequences when interacting with foreign code etc.Some Rust details. «sendable» means that a reference can be transferred to another thread (or task/fiber/whatever). From http://doc.rust-lang.org/std/gc/ : «The Gc type provides shared ownership of an immutable value. Destruction is not deterministic, and will occur some time between every Gc handle being gone and the end of the task. The garbage collector is task-local so Gc<T> is not sendable.» From http://doc.rust-lang.org/std/rc/index.html : «The Rc type provides shared ownership of an immutable value. Destruction is deterministic, and will occur as soon as the last owner is gone. It is marked as non-sendable because it avoids the overhead of atomic reference counting. The downgrade method can be used to create a non-owning Weak pointer to the box. A Weak pointer can be upgraded to an Rc pointer, but will return None if the value has already been freed.» So… they don't really solve the issues a nogc version of D should be able to deal with beyond having built-in unique_ptr style semantics? Or?
Oct 02 2014
On Thursday, 2 October 2014 at 20:10:42 UTC, Ola Fosheim Grøstad wrote:On Thursday, 2 October 2014 at 19:45:17 UTC, Ola Fosheim Grøstad wrote:The Gc type is gone as of this week. https://github.com/rust-lang/meeting-minutes/blob/master/weekly-meetings/2014-09-30.mdBut is sketchy on implementation details, semantic restrictions that follows and consequences when interacting with foreign code etc.Some Rust details. «sendable» means that a reference can be transferred to another thread (or task/fiber/whatever). From http://doc.rust-lang.org/std/gc/ : «The Gc type provides shared ownership of an immutable value. Destruction is not deterministic, and will occur some time between every Gc handle being gone and the end of the task. The garbage collector is task-local so Gc<T> is not sendable.» From http://doc.rust-lang.org/std/rc/index.html : «The Rc type provides shared ownership of an immutable value. Destruction is deterministic, and will occur as soon as the last owner is gone. It is marked as non-sendable because it avoids the overhead of atomic reference counting. The downgrade method can be used to create a non-owning Weak pointer to the box. A Weak pointer can be upgraded to an Rc pointer, but will return None if the value has already been freed.» So… they don't really solve the issues a nogc version of D should be able to deal with beyond having built-in unique_ptr style semantics? Or?
Oct 02 2014
On Thursday, 2 October 2014 at 20:42:16 UTC, Paulo Pinto wrote:The Gc type is gone as of this week. https://github.com/rust-lang/meeting-minutes/blob/master/weekly-meetings/2014-09-30.mdThanks, apparently they do it because they want to make a proper tracing gc available later: https://github.com/pnkfelix/rfcs/blob/fsk-remove-refcounting-gc-of-t/active/0000-remove-refcounting-gc-of-t.md
Oct 02 2014
On Wednesday, 1 October 2014 at 17:13:38 UTC, Andrei Alexandrescu wrote:On 10/1/14, 8:48 AM, Oren Tirosh wrote:Sure? I already showed in an example how it is possible to chain calls seamlessly that return unique objects. The users would only notice it when they are trying to make a real copy (i.e. not borrowing). Do you think this happens frequently enough to be of concern?Bingo. Have some way to mark the function return type as a unique pointer.I'm skeptical about this approach (though clearly we need to explore it for e.g. passing ownership of data across threads). For strings and other "casual" objects I think we should focus on GC/RC strategies. This is because people do things like: auto s = setExtension(s1, s2); and then attempt to use s as a regular variable (copy it etc). Making s unique would make usage quite surprising and cumbersome.
Oct 01 2014
On 10/1/14, 1:56 PM, "Marc Schütz" <schuetzm gmx.net>" wrote:On Wednesday, 1 October 2014 at 17:13:38 UTC, Andrei Alexandrescu wrote:I'd think so. -- AndreiOn 10/1/14, 8:48 AM, Oren Tirosh wrote:Sure? I already showed in an example how it is possible to chain calls seamlessly that return unique objects. The users would only notice it when they are trying to make a real copy (i.e. not borrowing). Do you think this happens frequently enough to be of concern?Bingo. Have some way to mark the function return type as a unique pointer.I'm skeptical about this approach (though clearly we need to explore it for e.g. passing ownership of data across threads). For strings and other "casual" objects I think we should focus on GC/RC strategies. This is because people do things like: auto s = setExtension(s1, s2); and then attempt to use s as a regular variable (copy it etc). Making s unique would make usage quite surprising and cumbersome.
Oct 01 2014
On Wednesday, 1 October 2014 at 15:48:39 UTC, Oren Tirosh wrote:On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote: One problem with actually implementing this is that using reference counting as a memory management policy requires extra space for the reference counter in the object, just as garbage collection requires support for scanning and identification of interior object memory range. While allocation and memory management may be quite independent in theory, practical high performance implementations tend to be intimately related.I don't have all answers to these questions. Still, I'm convinced this is doable. A straightforward and general way to convert a unique object to a ref-counted one is to allocate new memory for it plus the reference count, move the original object into it, and release the original memory. This is safe, because there can be no external pointers to the object, as it is unique. Of course, this can be optimized if the allocator supports extending an allocation. It could then preallocate a few extra bytes at the end to make the extend operation always succeed, similar to your suggestion to always allocate a reference counter. I think the most difficult part is to find an efficient and user-friendly way for the wrapper types to get at the allocator. Maybe the allocators should all implement an interface (a real one, not duck-typing). The wrappers (Owned, RC) can then include a pointer to the allocator (or for RC, embed it next to the reference count). This would make it possible to specify a (thread) global default allocator at runtime, which all library functions use by convention (for example let's call it `alloc`, then they would call `alloc.make!MyStruct()`). At the same time, it is safe to change the default allocator at any time, and to use different allocators in parallel in the same thread. The alternative is obviously a template parameter to the function that returns the unique object. 
But this unfortunately is then not restricted to just the function, but "infects" the return type, too. And from there, it needs to spread to the RC wrapper, or any containers. Thus we'd have incompatible RC types, which I would imagine would be very inconvenient and restrictive. Besides, it would probably be too tedious to specify the allocator everywhere. Therefore, I think the additional cost of an allocator interface pointer is worth it. For Owned!T (with T being a pointer or reference), it would just be two words, which we can return efficiently. We already have slices doing that, and AFAIK there's no significantly worse performance because of them.(I'll try to make a sketch on how this can be implemented in another post.)Do elaborate!As a conclusion, I would say that APIs should strive for the following principles, in this order: 1. Avoid allocation altogether, for example by laziness (ranges), or by accepting sinks. 2. If allocations are necessary (or desirable, to make the API more easily usable), try hard to return a unique value (this of course needs to be expressed in the return type). 3. If both of the above fail, only then return a GCed pointer, or alternatively provide several variants of the function (though this shouldn't be necessary often). An interesting alternative: Instead of passing a flag directly describing the policy, pass the function a type that it should wrap its return value in. As for the _allocation_ strategy: It indeed needs to be configurable, but here, the same objections against a template parameter apply. As the allocator doesn't necessarily need to be part of the type, a (thread) global variable can be used to specify it. This lends itself well to idioms like with(MyAllocator alloc) { // ... }Assuming there is some dependency between the allocator and the memory management policy I guess this would be initialized on thread start that cannot be modified later. 
All code running inside the thread would need to either match the configured policy, not handle any kind of pointers or use a limited subset of unique pointers. Another way to ensure that code can run on either RC or GC is to make certain objects (specifically, Exceptions) always allocate a reference counter, regardless of the currently configured policy.
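The "allocate new memory plus the reference count, move the object, release the original" conversion described in this post could look roughly like the sketch below. It is an assumption-laden illustration, not the proposed design: `RCBox` and `promoteToRC` are hypothetical names, plain `malloc`/`free` stand in for a real allocator, and the bitwise move relies on D structs being relocatable.

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

/// Hypothetical layout: the reference count sits directly in front of the payload.
struct RCBox(T)
{
    size_t count;
    T payload;
}

/// Moves a uniquely owned, malloc-allocated object into a fresh RCBox
/// and releases the original block. Safe only because nobody else can
/// hold a pointer to a unique object.
RCBox!T* promoteToRC(T)(ref T* unique)
{
    auto box = cast(RCBox!T*) malloc((RCBox!T).sizeof);
    assert(box !is null, "out of memory");
    box.count = 1;
    memcpy(&box.payload, unique, T.sizeof);  // bitwise move of the payload
    free(unique);                            // release the original memory
    unique = null;                           // the old pointer must not be reused
    return box;
}

struct Point { int x, y; }

void main()
{
    auto p = cast(Point*) malloc(Point.sizeof);
    assert(p !is null, "out of memory");
    *p = Point(3, 4);

    auto box = promoteToRC(p);
    assert(p is null);
    assert(box.count == 1 && box.payload.x == 3 && box.payload.y == 4);
    free(box);
}
```

An allocator that can extend an allocation in place would avoid the copy, as the post suggests; this sketch always pays for it.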
Oct 01 2014
On Wednesday, 1 October 2014 at 15:48:39 UTC, Oren Tirosh wrote:On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz wrote:Here's an example implementation of what I have in mind (totally untested and won't compile because of `scope`): http://wiki.dlang.org/User:Schuetzm/RC,_Owned_and_allocators This is just a sketch to explain the general idea. Some things probably won't work as implemented, especially the disable postblit and opAssign() of Owned!T. I think it needs to implement implicit moving, otherwise one would have to call `release()` everywhere. As in the other post, the function that produces the value returns Owned!T. The types don't require unique however (although integration with DMD's idea of unique would still be useful). Because of auto-borrowing via alias this, Owned!T and RC!T both can pass their payloads to functions that accept them by `scope`. The ref-count is not touched for borrowing. Usage examples:
    Owned!string setExtension(in char[] path, in char[] ext);
    void saveFileAs(in char[] name) {
        import std.path: setExtension;
        import std.file: write;
        name.                    // scope const(char[])
            setExtension("txt"). // Owned!string
            write(data);
    }
    RC!string[] stringList;
    void addToGlobalList(scope RC!string s) {
        stringList ~= s; // increments ref-count
    }
    RC!string foo;
    addToGlobalList(foo); // borrowing doesn't change ref-count
    auto newFileName = "hello-world".setExtension("txt");
    auto tmp1 = newFileName; // ERROR: cannot copy
    scope tmp2 = newFileName; // OK, borrowing
    foo = newFileName; // ERROR: cannot copy
    foo = newFileName.release(); // OK, move
    auto bar = newFileName.toRC(); // ditto
(I'll try to make a sketch on how this can be implemented in another post.)Do elaborate!
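For readers who want something that compiles today, the "cannot copy, must release" behaviour from the usage examples can be approximated with a disabled postblit. This is a minimal sketch and not the implementation on the wiki page: `Owned` here is a simplified stand-in, `withExtension` and `borrow` are hypothetical names, and there is no `scope`-based borrowing or allocator pointer.

```d
/// Simplified stand-in for the Owned!T idea: a value that can be moved but not copied.
struct Owned(T)
{
    private T payload;
    private bool valid;

    this(T value) { payload = value; valid = true; }

    @disable this(this);   // forbid implicit copies

    /// Read-only access for "borrowing"; ownership stays with the wrapper.
    const(T) borrow() const
    {
        assert(valid, "value was already released");
        return payload;
    }

    /// Gives up ownership; the wrapper must not be used afterwards.
    T release()
    {
        assert(valid, "value was already released");
        valid = false;
        return payload;
    }
}

/// Hypothetical allocating function that hands its fresh result to the caller.
Owned!string withExtension(string path, string ext)
{
    return Owned!string(path ~ "." ~ ext);
}

void main()
{
    auto name = withExtension("hello-world", "txt");   // moved, not copied
    assert(name.borrow == "hello-world.txt");

    // Copying an Owned value is rejected at compile time.
    static assert(!__traits(compiles, {
        auto a = withExtension("a", "b");
        auto copy = a;
    }));

    string s = name.release();   // explicit move into the caller's policy
    assert(s == "hello-world.txt");
}
```

The explicit `release()` call is exactly the inconvenience the post wants implicit moving to remove.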
Oct 02 2014
On 29 September 2014 20:49, Andrei Alexandrescu via Digitalmars-d <digitalmars-d puremagic.com> wrote:[...] Destroy! AndreiI generally like the idea, but my immediate concern is that it implies that every function that may deal with allocation is a template. This interferes with C/C++ compatibility in a pretty big way. Or more generally, the idea of a lib. Does this mean that a lib will be required to produce code for every permutation of functions according to memory management strategy? Usually libs don't contain code for uninstantiated templates. With this in place, I worry that traditional use of libs, separate compilation, external language linkage, etc, all become very problematic. Pervasive templates can only work well if all code is D code, and if all code is compiled together. Most non-OSS industry doesn't ship source, they ship libs. And if libs are to become impractical, then dependencies become a problem; instead of linking libphobos.so, you pretty much have to compile phobos together with your app (already basically true for phobos, but it's fairly unique). What if that were a much larger library? What if you have 10s of dependencies all distributed in this manner? Does it scale? I guess this doesn't matter if this is only a proposal for phobos... but I suspect the pattern will become pervasive if it works, and yeah, I'm not sure where that leads.
Sep 30 2014
On 9/30/14, 6:53 PM, Manu via Digitalmars-d wrote:I generally like the idea, but my immediate concern is that it implies that every function that may deal with allocation is a template. This interferes with C/C++ compatibility in a pretty big way. Or more generally, the idea of a lib. Does this mean that a lib will be required to produce code for every permutation of functions according to memory management strategy? Usually libs don't contain code for uninstantiated templates.If a lib chooses one specific memory management policy, it can of course be non-templated with regard to that. If it wants to offer its users the choice, it would probably have to offer some templates.With this in place, I worry that traditional use of libs, separate compilation, external language linkage, etc, all become very problematic. Pervasive templates can only work well if all code is D code, and if all code is compiled together. Most non-OSS industry doesn't ship source, they ship libs. And if libs are to become impractical, then dependencies become a problem; instead of linking libphobos.so, you pretty much have to compile phobos together with your app (already basically true for phobos, but it's fairly unique). What if that were a much larger library? What if you have 10s of dependencies all distributed in this manner? Does it scale? I guess this doesn't matter if this is only a proposal for phobos... but I suspect the pattern will become pervasive if it works, and yeah, I'm not sure where that leads.Thanks for the point. I submit that Phobos has and will be different from other D libraries; as the standard library, it has the role of supporting widely varying needs, and as such it makes a lot of sense to make it highly generic and configurable. Libraries that are for specific domains can avail themselves of a narrower design scope. Andrei
Oct 01 2014
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.Slightly related :) https://github.com/D-Programming-Language/phobos/pull/2573
Sep 30 2014
On 9/30/14, 10:46 PM, "Nordlöw" wrote:On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu wrote:Nice, thanks! -- AndreiBack when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.Slightly related :) https://github.com/D-Programming-Language/phobos/pull/2573
Oct 01 2014
29-Sep-2014 14:49, Andrei Alexandrescu wrote:Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.[snip] I think it would be well worth it to actually do a bit of research. Before we get into the fray and spill blood (or LOCs) everywhere. Can we: 1. Present a list of allocating functions. 2. What they (currently) allocate: string, T[], V[K] or something else. 3. See what alternatives they have (that do not allocate if any). 4. Plot a course for those that do not have any. (Just listing how the function signature would change is good enough). Thanks! P.S. If there are no takers I'd do it myself in a week or so. -- Dmitry Olshansky
Oct 03 2014
On 10/3/14, 11:27 AM, Dmitry Olshansky wrote:29-Sep-2014 14:49, Andrei Alexandrescu wrote:Awesome. I just started http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage and I encourage us all to add to it (sorted by module and then by artifact name).Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.[snip] I think it would be well worth it to actually do a bit of research. Before we get into the fry and spill blood (or LOCs) everywhere. Can we: 1. Present a list of allocating functions.2. What they (currently) allocate: string, T[], V[K] or something else.Mention that in the "Possible Fix(es)" column.3. See what alternatives they have (that do not allocate if any).Yah.4. Plot course for these that do not have. (Just listing how function signature would change is good enough).Yah.Thanks! P.S. If there are no takers I'd get do myself it in a week or so.Let's all get this rolling! Andrei
Oct 03 2014
03-Oct-2014 23:50, Andrei Alexandrescu wrote:On 10/3/14, 11:27 AM, Dmitry Olshansky wrote:Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it... -- Dmitry Olshansky29-Sep-2014 14:49, Andrei Alexandrescu wrote:Awesome. I just started http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage and I encourage us all to add to it (sorted by module and then by artifact name).Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.[snip] I think it would be well worth it to actually do a bit of research. Before we get into the fry and spill blood (or LOCs) everywhere. Can we: 1. Present a list of allocating functions.
Oct 03 2014
On 10/3/14, 1:18 PM, Dmitry Olshansky wrote:03-Oct-2014 23:50, Andrei Alexandrescu wrote:D script that generates wikitable from that -> awesomeness. -- AndreiOn 10/3/14, 11:27 AM, Dmitry Olshansky wrote:Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it...29-Sep-2014 14:49, Andrei Alexandrescu wrote:Awesome. I just started http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage and I encourage us all to add to it (sorted by module and then by artifact name).Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.[snip] I think it would be well worth it to actually do a bit of research. Before we get into the fry and spill blood (or LOCs) everywhere. Can we: 1. Present a list of allocating functions.
Oct 03 2014
04-Oct-2014 00:21, Andrei Alexandrescu wrote:On 10/3/14, 1:18 PM, Dmitry Olshansky wrote:I'm on it. With GitHub source links. D's regex rocks ;) -- Dmitry Olshansky03-Oct-2014 23:50, Andrei Alexandrescu wrote:D script that generates wikitable from that -> awesomeness. -- AndreiOn 10/3/14, 11:27 AM, Dmitry Olshansky wrote:Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it...29-Sep-2014 14:49, Andrei Alexandrescu wrote:Awesome. I just started http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage and I encourage us all to add to it (sorted by module and then by artifact name).Back when I've first introduced RCString I hinted that we have a larger strategy in mind. Here it is.[snip] I think it would be well worth it to actually do a bit of research. Before we get into the fry and spill blood (or LOCs) everywhere. Can we: 1. Present a list of allocating functions.
Oct 03 2014
04-Oct-2014 00:21, Dmitry Olshansky wrote:04-Oct-2014 00:21, Andrei Alexandrescu wrote:[snip]On 10/3/14, 1:18 PM, Dmitry Olshansky wrote:03-Oct-2014 23:50, Andrei Alexandrescu wrote:Forgot my wiki credentials. Anyhow I got passable Markdown page fairly quickly. Looks like this: https://github.com/DmitryOlshansky/phobos/wiki/Phobos-GC-happy-list! Tool to get it, anybody feel free to take over from here: https://gist.github.com/anonymous/dc0000d3b801a7bedff0 Takes DMD's output from stdin, so: make -f posix.mak | ./this_script (Needs -vgc flag obviously) -- Dmitry OlshanskyI'm on it. With GitHub source links. D's regex rocks ;)Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it...D script that generates wikitable from that -> awesomeness. -- Andrei
Oct 03 2014
04-Oct-2014 00:42, Dmitry Olshansky wrote:04-Oct-2014 00:21, Dmitry Olshansky wrote:Ehm, rather (without '!' at the end): https://github.com/DmitryOlshansky/phobos/wiki/Phobos-GC-happy-list -- Dmitry Olshansky04-Oct-2014 00:21, Andrei Alexandrescu wrote:[snip]On 10/3/14, 1:18 PM, Dmitry Olshansky wrote:03-Oct-2014 23:50, Andrei Alexandrescu wrote:Forgot my wiki credentials. Anyhow I got passable Markdown page fairly quickly. Looks like this: https://github.com/DmitryOlshansky/phobos/wiki/Phobos-GC-happy-list!I'm on it. With GitHub source links. D's regex rocks ;)Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it...D script that generates wikitable from that -> awesomeness. -- Andrei
Oct 03 2014
04-Oct-2014 00:56, Dmitry Olshansky wrote:04-Oct-2014 00:42, Dmitry Olshansky wrote:Got it: https://gist.github.com/DmitryOlshansky/d718be4ec12158cf2f02 Tries hard to detect class & function name (it's all on heuristics + regex... e-hm) and generates mediawiki table. DWiki won't let me edit it, but the output is here: https://gist.github.com/DmitryOlshansky/341aa7f6d6f0d53ffc59 Anybody with a proper D parser may do a way better job ;) -- Dmitry Olshansky04-Oct-2014 00:21, Dmitry Olshansky wrote:04-Oct-2014 00:21, Andrei Alexandrescu wrote:[snip]On 10/3/14, 1:18 PM, Dmitry Olshansky wrote:03-Oct-2014 23:50, Andrei Alexandrescu wrote:Glad you liked it. Being in favor of automation as a start I just toggled -vgc flag in Win64 makefile and built phobos. Raw data (CSV) is here: https://gist.github.com/anonymous/763adcd62ab60a66e9d8 Time to mine it...D script that generates wikitable from that -> awesomeness. -- Andrei
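The kind of filter described here (compiler output on stdin, MediaWiki table out) might look like the sketch below. It is not the script from the gists. It assumes each `-vgc` diagnostic has the shape `file.d(line): vgc: message`, ignores every other line, and makes no attempt to detect class or function names; `toWikiRow` is a hypothetical helper name.

```d
import std.format : format;
import std.regex : matchFirst, regex;
import std.stdio : stdin, writeln;

/// Turns one compiler message into a MediaWiki table row.
/// Returns null for lines that do not look like a -vgc diagnostic.
string toWikiRow(string line)
{
    // Assumed message shape: path/to/file.d(123): vgc: <description>
    auto m = matchFirst(line, regex(`^(.+)\((\d+)\): vgc: (.+)$`));
    if (m.empty)
        return null;
    return format("|-\n| %s || %s || %s", m[1], m[2], m[3]);
}

void main()
{
    writeln(`{| class="wikitable"`);
    writeln("! File !! Line !! Allocation");
    foreach (line; stdin.byLineCopy)
    {
        auto row = toWikiRow(line);
        if (row !is null)
            writeln(row);
    }
    writeln("|}");
}
```

It would be invoked the same way as the original tool, with the build output piped into it.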
Oct 03 2014
On 10/3/14, 3:59 PM, Dmitry Olshansky wrote:Got it: https://gist.github.com/DmitryOlshansky/d718be4ec12158cf2f02 Tries hard to detect class & function name (it's all on heuristics + regex... e-hm) and generates mediawiki table. DWiki won't let me edit it, but the output is here: https://gist.github.com/DmitryOlshansky/341aa7f6d6f0d53ffc59 Anybody with a proper D parser may do a way better job ;)Tried to insert it, looks weird. Probably it would be most effective if you fixed your wiki account. Thanks! -- Andrei
Oct 03 2014