
digitalmars.D - why allocators are not discussed here

reply "cybervadim" <vadim.goryunov gmail.com> writes:
I know Andrei mentioned he was going to work on Allocators a year 
ago. At DConf 2013 he described the problems he needs to solve 
with Allocators. But I wonder if I am missing the discussion 
around that - I tried searching this forum and found a few threads, 
but none of them was actually a brainstorm on Allocator design.

Please point me in the right direction
or
is there a reason it is not discussed
or
should we open the discussion?


The easiest approach to Allocator design I can imagine would be 
to let the user specify which Allocator operator new should get its 
memory from (introducing a new keyword, allocator). This gives 
total control, but assumes the user knows what he is doing.

Example:

CustomAllocator ca;
allocator(ca) {
   auto a = new A; // operator new will use ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user
   // responsibility to make sure custom Allocator can handle that
}

By default the allocator is druntime's GC, and free(a) does 
nothing for it.


If some library defines its own allocator (e.g. a specialized 
container), there should be the ability to:
1. override allocator
2. get access to the allocator used

I understand that I spent 5 mins thinking about the way 
Allocators may look.
My point is - if somebody is working on it, can you please share 
your ideas?
Jun 25 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)
It would be easier to just pass an allocator object that provides the necessary methods and not use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)

The allocator's create function could also return wrapped types, like RefCounted!T or NotNull!T, depending on what it does.

Though the devil is in the details here and I don't think I can say more without trying to actually do it.
Jun 25 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 12:50:36AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
(introducing a new keyword allocator)
It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
It's not too late to introduce a default allocator object that maps to built-in GC primitives. Maybe something like:

	struct DefaultAllocator
	{
		T* alloc(T, A...)(A args) {
			return new T(args);
		}
		void free(T)(T* ptr) {	// 'ref' is a keyword in D, hence 'ptr'
			// no-op
		}
	}

We can then change Phobos to always use allocator.alloc and allocator.free, which it gets from user code somehow, and in the default case it would do the Right Thing.
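A minimal sketch of how library code could take such an allocator from user code while staying invisible in the default case. Everything here is an assumption made for illustration: MallocAllocator, Point and manhattan are hypothetical names, and the overload pair is just one way to supply the default, not a proposed Phobos API.

```d
import core.stdc.stdlib : cMalloc = malloc, cFree = free;
import std.conv : emplace;

// Default: forwards to the GC, so free is a no-op.
struct DefaultAllocator
{
    T* alloc(T, A...)(A args) { return new T(args); }
    void free(T)(T* ptr) { /* the GC reclaims the memory */ }
}

// Hypothetical alternative backed by the C heap.
struct MallocAllocator
{
    T* alloc(T, A...)(A args)
    {
        auto mem = cast(T*) cMalloc(T.sizeof);
        assert(mem !is null, "out of memory");
        return emplace(mem, args);   // run the constructor in place
    }
    void free(T)(T* ptr)
    {
        destroy(*ptr);               // run the destructor, if any
        cFree(ptr);
    }
}

struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Ordinary users call this overload and never see an allocator.
int manhattan(int x, int y)
{
    return manhattan(x, y, DefaultAllocator());
}

// Users who care pass their own allocator object.
int manhattan(Alloc)(int x, int y, Alloc allocator)
{
    Point* p = allocator.alloc!Point(x, y);
    scope (exit) allocator.free(p);
    return p.x + p.y;
}

void main()
{
    assert(manhattan(1, 2) == 3);                      // GC path
    assert(manhattan(3, 4, MallocAllocator()) == 7);   // malloc/free path
}
```

The cost is the one raised later in the thread: every function that allocates needs the extra overload or parameter.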
 The allocator's create function could also return wrapped types,
 like RefCounted!T or NotNull!T depending on what it does.
So maybe something like:

	struct RefCountedAllocator
	{
		RefCounted!T alloc(T, A...)(A args) {
			return allocRefCounted(args);
		}
		void free(T)(RefCounted!T rc) {	// 'ref' is a keyword in D, hence 'rc'
			dotDotDotMagic(rc);
		}
	}

etc..
 Though the devil is in the details here and I don't think I can say
 more without trying to actually do it.
The main issue I see is how *not* to get stuck in C++'s situation where you have to specify allocator objects everywhere, which is highly inconvenient, so people tend to avoid using them, which defeats the purpose of having allocators.

It would be nice, IMO, if we can somehow let the user specify a custom allocator for, say, the whole of Phobos, so that people who care about this sorta thing can just replace the GC wholesale and then use Phobos to their hearts' content without having to manually specify allocator objects everywhere and risk forgetting a single case that eventually leads to memory leakage.


T

--
Computers shouldn't beep through the keyhole.
Jun 25 2013
prev sibling parent reply Robert Schadek <realburner gmx.de> writes:
On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)
It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
I did think about this as well, but then I came up with something that IMHO is even simpler.

Imagine we have two delegates:

    void* delegate(size_t);  // this one allocs
    void delegate(void*);    // this one frees

You pass both to a function that constructs your object. The first is used for allocating the memory; the second gets attached to the TypeInfo and is used by the GC to free the object.

This would be completely transparent to the user.

The use in a container is similar. Just use the alloc delegate to construct the objects and attach the free delegate to the TypeInfo. You could even mix allocator strategies in the middle of the lifetime of the container.
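A rough sketch of the two-delegate idea, under one loud assumption: attaching a delegate to TypeInfo is not something druntime offers, so this sketch stores the free delegate next to the pointer and releases it explicitly. Owned, construct and dispose are hypothetical names.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

alias AllocDg = void* delegate(size_t);  // this one allocs
alias FreeDg  = void delegate(void*);    // this one frees

// Assumption: the free delegate travels with the pointer instead of
// being attached to TypeInfo.
struct Owned(T)
{
    T* ptr;
    FreeDg release;
}

Owned!T construct(T, Args...)(AllocDg alloc, FreeDg release, Args args)
{
    void* mem = alloc(T.sizeof);
    assert(mem !is null, "allocation failed");
    return Owned!T(emplace(cast(T*) mem, args), release);
}

void dispose(T)(ref Owned!T o)
{
    destroy(*o.ptr);      // run the destructor, if any
    o.release(o.ptr);     // hand the memory back to whoever provided it
    o.ptr = null;
}

struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

void main()
{
    size_t live;  // counts outstanding allocations
    AllocDg alloc = (size_t n) { ++live; return malloc(n); };
    FreeDg release = (void* p) { --live; free(p); };

    auto p = construct!Point(alloc, release, 3, 4);
    assert(p.ptr.x == 3 && p.ptr.y == 4 && live == 1);

    dispose(p);
    assert(live == 0 && p.ptr is null);
}
```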
Jun 26 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 14:03, Robert Schadek пишет:
 On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)
It would be easier to just pass an allocator object that provides the necessary methods and don't use new at all. (I kinda wish new wasn't in the language. It'd make this a little more consistent.)
I did think about this as well, but then I came up with something that IMHO is even simpler.

Imagine we have two delegates:

    void* delegate(size_t);  // this one allocs
    void delegate(void*);    // this one frees

You pass both to a function that constructs your object. The first is used for allocating the memory; the second gets attached to the TypeInfo and is used by the GC to free the object.
Then it's just GC but with an extra complication.
 This would be completely transparent to the user.

 The use in a container is similar. Just use the alloc delegate to
 construct the objects and
 attach the free delegate to the typeinfo. You could even mix allocator
 strategies in the middle
 of the lifetime of the container.
-- Dmitry Olshansky
Jun 26 2013
parent reply Robert Schadek <realburner gmx.de> writes:
 Imagine we have two delegates:

 void* delegate(size_t);  // this one allocs
 void delegate(void*);    // this one frees

 you pass both to a function that constructs you object. The first is
 used for allocation the
 memory, the second gets attached to the TypeInfo and is used by the gc
 to free
 the object.
Then it's just GC but with an extra complication.
IMHO, not really, as the place you get the memory from is not managed by the GC, or at least not directly. The GC algorithm would see that there is a "free delegate" attached to the object and would use this to free the memory. The same should hold true for calling GC.free. Or are you talking about ref counting and such?
Jun 26 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 26 Jun 2013 16:30:50 +0200
schrieb Robert Schadek <realburner gmx.de>:

 
 Imagine we have two delegates:

 void* delegate(size_t);  // this one allocs
 void delegate(void*);    // this one frees

 you pass both to a function that constructs you object. The first is
 used for allocation the
 memory, the second gets attached to the TypeInfo and is used by the gc
 to free
 the object.
Does it mean 16 extra bytes for every allocation ? -- Marco
Jun 26 2013
parent Robert Schadek <realburner gmx.de> writes:
On 06/26/2013 10:06 PM, Marco Leise wrote:
 Does it mean 16 extra bytes for every allocation ?
Yes. Or wrap it, and you have 4 or 8 bytes. But yes, you would have to save it somewhere.
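For reference, a small sketch of where those numbers come from. A D delegate is a context pointer plus a function pointer, so storing one per allocation costs two machine words (16 bytes on a 64-bit target), while "wrapping it" costs one pointer to a shared record. InlineHeader, AllocatorRecord and PointerHeader are hypothetical names used only for this illustration.

```d
alias AllocDg = void* delegate(size_t);
alias FreeDg  = void delegate(void*);

// A delegate is a (context pointer, function pointer) pair.
static assert(FreeDg.sizeof == 2 * (void*).sizeof);

// Option 1: keep the free delegate in every allocation's header.
struct InlineHeader { FreeDg release; }

// Option 2: keep one shared record per allocator and store only a
// pointer to it in each allocation's header.
struct AllocatorRecord { AllocDg allocate; FreeDg release; }
struct PointerHeader { AllocatorRecord* owner; }

static assert(InlineHeader.sizeof == 2 * (void*).sizeof);
static assert(PointerHeader.sizeof == (void*).sizeof);

void main() {}
```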
Jun 26 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:
 I know Andrey mentioned he was going to work on Allocators a year
 ago. In DConf 2013 he described the problems he needs to solve with
 Allocators. But I wonder if I am missing the discussion around that
 - I tried searching this forum, found a few threads that was not
 actually a brain storm for Allocators design.
 
 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?
That would be nice to get things going. :) Ever since I found D and subscribed to this mailing list, I've been hearing rumors of allocators, but they seem to be rather lacking in the department of concrete evidence. They're like the Big Foot or Swamp Ape of D. Maybe it's time we got out into the field and produced some real evidence of these mythical beasts. :-P
 The easiest approach for Allocators design I can imagine would be to
 let user specify which Allocator operator new should get the memory
 from (introducing a new keyword allocator). This gives a total
 control, but assumes user knows what he is doing.
 
 Example:
 
 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use ScopeAllocator::malloc()
   auto b = new B;
 
   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user
 responsibility to make sure custom Allocator can handle that
 }
 
 By default allocator is the druntime using GC, free(a) does nothing
 for it.
I believe the current direction is to avoid needing new language features / syntax. So the above probably won't happen.
 if some library defines its allocator (e.g. specialized container),
 there should be ability to:
 1. override allocator
 2. get access to the allocator used
 
 I understand that I spent 5 mins thinking about the way Allocators
 may look.
 My point is - if somebody is working on it, can you please share
 your ideas?
Well, thanks for getting the ball rolling. Maybe Andrei can pipe up about any experimental designs he's currently considering.

But barring that, I'm thinking about how allocators would be used in user code. I think it's pretty much a given that the C++ way of sticking it to the end of template arguments doesn't really fly: it's just too much of a hassle to keep having to worry about passing allocators around template arguments, that people just don't bother. So coming back to square one, how would allocators be used?

1) Usually, the user would just be content with the GC, and not ever have to worry about allocators. So this means that whatever allocator design we adopt, it should be practically invisible to ordinary users unless they're specifically looking to change how memory is allocated.

2) Furthermore, it's unlikely that in the same piece of code, you'd want to use 3 or 4 different allocators for different objects; while such cases may exist, it seems to me to be more likely that you want either (a) a very specific object (say a class instance or container) to use a particular allocator, or (b) you want to transitively block off an entire section of code (which may be the entire program in some cases) to use a particular allocator.

As a first stab at it, I'd say (a) can be implemented by a static class member reference to an allocator, that can be set from user code.

And maybe (b) can be implemented by making gc_alloc / gc_free overridable function pointers? Then we can override their values and use scope guards to revert them back to the values they were before. This allows us to use the runtime stack to manage which allocator is currently active. This lets *all* memory allocations be rerouted through the custom allocator without needing to hand-edit every call to new down the call graph.

This is just a very crude first stab at the problem, though. In particular, (a) isn't very satisfactory. And also the interaction of allocated objects with the call stack: if any custom-allocated objects in (b) survive past the containing function which sets/resets the function pointers, there could be problems: if a member function of such an object needs to allocate memory, it will pick up the ambient allocator instead of the custom allocator in effect when the object was first created. Also, we may have the problem of the wrong allocator being used to free the object.

Anyone has better ideas?


T

--
All problems are easy in retrospect.
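A crude sketch of option (b), assuming the hooks are plain thread-local function pointers. The real gc_alloc / gc_free entry points are not overridable like this; currentAlloc, currentFree and withMalloc are hypothetical names standing in for them.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

alias AllocFn = void* function(size_t);
alias FreeFn  = void function(void*);

void* gcAlloc(size_t n) { return GC.malloc(n); }
void  gcFree(void* p)   { /* leave it to the collector */ }
void* cAlloc(size_t n)  { return malloc(n); }
void  cFree(void* p)    { free(p); }

// Module-level variables are thread-local by default in D.
AllocFn currentAlloc = &gcAlloc;
FreeFn  currentFree  = &gcFree;

// Reroutes allocation for the duration of dg, then restores the
// previous hooks even if dg throws.
void withMalloc(scope void delegate() dg)
{
    auto savedAlloc = currentAlloc;
    auto savedFree  = currentFree;
    currentAlloc = &cAlloc;
    currentFree  = &cFree;
    scope (exit)
    {
        currentAlloc = savedAlloc;
        currentFree  = savedFree;
    }
    dg();
}

void main()
{
    assert(currentAlloc is &gcAlloc);
    withMalloc({
        assert(currentAlloc is &cAlloc);
        void* p = currentAlloc(64);
        scope (exit) currentFree(p);  // freed while the same hooks are active
    });
    assert(currentAlloc is &gcAlloc);  // reverted by the scope guard
}
```

The sketch does nothing about the caveat above: a block that escapes withMalloc would later be handed to gcFree, which is the wrong allocator.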
Jun 25 2013
next sibling parent reply "cybervadim" <vadim.goryunov gmail.com> writes:
On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:

 That would be nice to get things going. :)

 Ever since I found D and subscribed to this mailing list, I've 
 been
 hearing rumors of allocators, but they seem to be rather 
 lacking in the
 department of concrete evidence. They're like the Big Foot or 
 Swamp Ape
 of D. Maybe it's time we got out into the field and produced 
 some real
 evidence of these mythical beasts. :-P

 Well, thanks for getting the ball rolling. Maybe Andrei can 
 pipe up
 about any experimental designs he's currently considering.

 But barring that, I'm thinking about how allocators would be 
 used in
 user code. I think it's pretty much a given that the C++ way of 
 sticking
 it to the end of template arguments doesn't really fly: it's 
 just too
 much of a hassle to keep having to worry about passing 
 allocators around
 template arguments, that people just don't bother. So coming 
 back to
 square one, how would allocators be used?

 1) Usually, the user would just be content with the GC, and not 
 ever
 have to worry about allocators. So this means that whatever 
 allocator
 design we adopt, it should be practically invisible to ordinary 
 users
 unless they're specifically looking to change how memory is 
 allocated.

 2) Furthermore, it's unlikely that in the same piece of code, 
 you'd want
 to use 3 or 4 different allocators for different objects; while 
 such
 cases may exist, it seems to me to be more likely that you want 
 either
 (a) a very specific object (say a class instance or container) 
 to use a
 particular allocator, or (b) you want to transitively block off 
 an
 entire section of code (which may be the entire program in some 
 cases)
 to use a particular allocator.

 As a first stab at it, I'd say (a) can be implemented by a 
 static class
 member reference to an allocator, that can be set from user 
 code.

 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their 
 values and use
 scope guards to revert them back to the values they were 
 before. This
 allows us to use the runtime stack to manage which allocator is
 currently active. This lets *all* memory allocations be 
 rerouted through
 the custom allocator without needing to hand-edit every call to 
 new down
 the call graph.

 This is just a very crude first stab at the problem, though. In
 particular, (a) isn't very satisfactory. And also the 
 interaction of
 allocated objects with the call stack: if any custom-allocated 
 objects
 in (b) survive past the containing function which sets/resets 
 the
 function pointers, there could be problems: if a member 
 function of such
 an object needs to allocate memory, it will pick up the ambient
 allocator instead of the custom allocator in effect when the 
 object was
 first created. Also, we may have the problem of the wrong 
 allocator
 being used to free the object.

 Anyone has better ideas?


 T
From my experience all objects may be divided into 2 categories:

1. Temporaries. A program usually has some kind of event loop. During one iteration of this loop some temporary objects are created and then discarded. This is the ideal case for a stack (or ranged or area) allocator: you define the allocator at the beginning of the loop cycle, use it for all temporaries, then free all the memory in one go at the end of the iteration.

2. Containers. The program receives an event from the outside and puts some data into a container OR updates the data if the record already exists. The important thing here is that when updating the data in a container, you may want to resize the existing area.

If you are working with a temporary which should be placed into a container, a copy can be made (with the corresponding memory allocation from the container's allocator).

Not sure if there is anything better than a stack/area allocator for the first class. For the second class the user should be able to choose the default GC or more precise memory handling (e.g. explicit malloc/free for resizing).

Anything I am missing in this categorization? Even if we only get allocators that let us deal with temporaries, that will be a huge benefit.
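To make the first category concrete, here is a minimal sketch of a region (area) allocator that is reset once per loop iteration. Region is a hypothetical type written for this example; it assumes the backing buffer itself is suitably aligned and it never runs destructors.

```d
struct Region
{
    ubyte[] buffer;  // backing store, owned by the caller
    size_t used;     // bytes handed out so far

    // Bump allocation: returns null when the region is exhausted.
    void[] allocate(size_t n)
    {
        enum size_t alignment = 16;
        // Round the offset up to the next multiple of alignment.
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start > buffer.length || n > buffer.length - start)
            return null;
        used = start + n;
        return buffer[start .. used];
    }

    // Frees every temporary in one go.
    void reset() { used = 0; }
}

void main()
{
    auto region = Region(new ubyte[4096]);

    foreach (event; 0 .. 3)              // stand-in for the event loop
    {
        scope (exit) region.reset();     // end of iteration: drop all temporaries

        auto numbers = cast(int[]) region.allocate(10 * int.sizeof);
        auto scratch = region.allocate(100);
        assert(numbers.length == 10 && scratch.length == 100);
        numbers[] = event;
    }

    assert(region.used == 0);
    assert(region.allocate(8192) is null);  // larger than the region
}
```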
Jun 25 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
cybervadim:

 From my experience all objects may be divided into 2 categories
 1. temporaries. Program usually have some kind of event loop. 
 During one iteration of this loop some temporary objects are 
 created and then discarded. The ideal case for stack (or ranged 
 or area) allocator, where you define allocator at the beginning 
 of the loop cycle, use it for all temporaries, then free all 
 the memory in one go at the end of iteration.
 2. containers. Program receives an event from the outside and 
 puts some data into container OR update the data if the record 
 already exists.
 The important thing here is - when updating the data in 
 container, you may want to resize the existing area.
Many garbage collectors use the same idea (and manage it automatically), with two or three different generations:

http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

Bye,
bearophile
Jun 25 2013
parent "cybervadim" <vadim.goryunov gmail.com> writes:
 Many garbage collectors use the same idea (and manage it 
 automatically), with two or three different generations:

 http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

 Bye,
 bearophile
The problem with the GC is that it doesn't know which objects are temporary and which are not, so it has to traverse the tree to determine that. Allocators, in my opinion, should let the user mark the temporaries explicitly.
Jun 25 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their 
 values and use scope guards to revert them back to the values 
 they were before.
Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

    with_allocator(my_alloc, {
        do whatever here
    });

or

    {
        ChangeAllocator!my_alloc dummy;

        do whatever here
    } // dummy's destructor ends the allocator scope

I think the former is a bit nicer, since the dummy variable is a bit silly. We'd hope that delegate can be inlined.

But, the template still has a big advantage: you can change the type. And I think that is potentially enormously useful.

Another question is how to tie into output ranges. Take std.conv.to.

    auto s = to!string(10); // currently, this hits the gc

What if I want it to go on a stack buffer? One option would be to rewrite it to use an output range, and then call it like:

    char[20] buffer;
    auto s = to!string(10, buffer); // it returns the slice of the buffer it actually used

(and we can do overloads so to!string(10, radix) still works, as well as to!string(10, radix, buffer). Hassle, I know...)

Naturally, the default argument is to use the 'global' allocator, whatever that is, which does nothing special.

The fun part is the output range works for that, and could also work for something like this:

    struct malloced_string {
        char* ptr;
        size_t length;
        size_t capacity;
        void put(char c) {
            if(length >= capacity) {
                // grow the buffer; also covers the initial capacity of 0
                capacity = capacity ? capacity * 2 : 16;
                ptr = cast(char*) realloc(ptr, capacity);
            }
            ptr[length++] = c;
        }

        char[] slice() { return ptr[0 .. length]; }
        alias slice this;
        mixin RefCounted!this; // pretend this works
    }

    {
        malloced_string str;
        auto got = to!string(10, str);
    } // str is out of scope, so it gets free()'d.

Unsafe though: if you stored a copy of got somewhere, it is now a pointer to freed memory. I'd kinda like language support of some sort to help mitigate that though, like being a borrowed pointer that isn't allowed to be stored, but that's another discussion.

And that should work.

So then what we might do is provide these little output range wrappers for various allocators, and use them on many functions. So we'd write:

    import std.allocators;
    import std.range;

    // mallocator is provided in std.allocators and offers the goods
    OutputRange!(char, mallocator) str;

    auto got = to!string(10, str);

What's nice here is the output range is useful for more than just allocators. You could also to!string(10, my_file) or a delegate, blah blah blah. So it isn't too much of a burden, it is something you might naturally use anyway.
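A small sketch of the output-range approach that compiles today. The to!string(10, buffer) overload described above is not assumed to exist in std.conv; writeInt is a hypothetical stand-in for it, and FixedSink is a hypothetical sink over a caller-supplied buffer.

```d
import std.array : appender;

// Hypothetical sink that writes into a caller-supplied buffer.
struct FixedSink
{
    char[] buffer;
    size_t length;

    void put(char c)
    {
        assert(length < buffer.length, "buffer too small");
        buffer[length++] = c;
    }

    // The slice of the buffer actually used.
    char[] data() { return buffer[0 .. length]; }
}

// Stand-in for a to!string overload that takes an output range:
// the caller decides where the characters end up.
void writeInt(Sink)(ref Sink sink, int value)
{
    char[10] digits;     // a 32-bit int has at most 10 decimal digits
    size_t n = 0;
    long v = value;      // widen so that int.min can be negated
    immutable negative = v < 0;
    if (negative) v = -v;
    do
    {
        digits[n++] = cast(char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (negative) sink.put('-');
    while (n > 0) sink.put(digits[--n]);  // digits were produced in reverse
}

void main()
{
    // Stack buffer: no GC allocation involved.
    char[20] storage;
    auto sink = FixedSink(storage[]);
    writeInt(sink, -1234);
    assert(sink.data == "-1234");

    // The same function still works with a GC-backed appender.
    auto app = appender!string();
    writeInt(app, 42);
    assert(app.data == "42");
}
```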
 Also, we may have the problem of the wrong allocator
 being used to free the object.
Another reason why encoding the allocator into the type is so nice. For the minimal D I've been playing with, the idea I'm running with is all allocated memory has some kind of special type, and then naked pointers are always assumed to be borrowed, so you should never store or free them.

    auto foo = HeapArray!char(capacity);

    void bar(char[] lol){}

    bar(foo); // allowed, foo has an alias this on slice

    // but....

    struct A {
        char[] lol; // not allowed, because you don't know when lol is going to be freed
    }

foo frees itself with refcounting.
Jun 25 2013
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
I was just quickly skimming some criticism of C++ allocators, 
since my thought here is similar to what they do. On one hand, 
maybe D can do it right by tweaking C++'s design rather than 
discarding it.

On the other hand, with all the C++ I've done, I have never 
actually used STL allocators, which could say something about me 
or could say something about them.


One thing I saw said making the differently allocated object a 
different type sucks. ...but must it? The complaint there was "so 
much for just doing a function that takes a std::string". But, 
the way I'd want to do it in D is the function would take a 
char[] instead, and our special allocated type provides that via 
opSlice and/or alias this.

So you'd only have to worry about the different type if you 
intend to take ownership of the container yourself. Which we 
already kinda think about in D: if you store a char[], someone 
else could overwrite it, so we prefer to store an 
immutable(char)[] aka string. If you're given a char[] and want 
to store it, you might idup. So I don't think doing a private 
copy with some other allocation scheme is any more of a hassle.

(BTW immutable objects IMO should *always* be garbage collected, 
because part of immutability is infinite lifetime. So we might 
want to be careful with implicit conversions to immutable based 
on allocation method, which I believe we can protect through 
member functions.)


Anyway, bottom line is I don't think that criticism necessarily 
applies to D. But there's surely many others and I'm more or less 
a n00b re c++'s allocators so idk yet.
Jun 25 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 05:24, Adam D. Ruppe пишет:
 I was just quickly skimming some criticism of C++ allocators, since my
 thought here is similar to what they do. On one hand, maybe D can do it
 right by tweaking C++'s design rather than discarding it.
Criticisms are:

A) Was defined to not have any state (as noted in the standard).
B) Parametrized on type (T), yet a container that is parametrized on it may need to allocate something else completely (a node with T).
C) Containers are parametrized on allocators, so say 2 lists with different allocators are incompatible in the sense that e.g. you can't splice pieces of them together.

From the above IMHO we can deduce that:

a) Should support stateful allocators, but we have to make sure we don't pay storage space for state-less ones (global ones, e.g. mallocator).
b) Should preferably be typeless and let containers define what they allocate.
c) Hardly solvable unless we require a way to reassign objects between allocators (at least of similar kinds).
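A sketch of points a) and b), under assumptions: the allocator interface (allocate/deallocate over untyped void[] blocks) and the names Mallocator, BumpAllocator, isStateless and Stack are all hypothetical. The container decides what it allocates (nodes, not T), and a stateless allocator takes no room in each container instance.

```d
import core.stdc.stdlib : malloc, free;
import std.traits : Fields;

// Stateless: everything goes through global functions.
struct Mallocator
{
    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }
    void deallocate(void[] b) { free(b.ptr); }
}

// Stateful: bumps an offset into a caller-supplied buffer.
struct BumpAllocator
{
    ubyte[] buffer;
    size_t used;

    void[] allocate(size_t n)
    {
        if (n > buffer.length - used) return null;
        auto b = buffer[used .. used + n];
        used += n;
        return b;
    }
    void deallocate(void[] b) { /* released together with the buffer */ }
}

enum bool isStateless(A) = Fields!A.length == 0;

// Intended for plain-data T only; nodes are filled by assignment.
struct Stack(T, Allocator)
{
    static struct Node { T value; Node* next; }

    static if (isStateless!Allocator)
        static Allocator allocator;  // shared: costs nothing per instance
    else
        Allocator allocator;         // per-instance state

    Node* head;

    bool push(T value)
    {
        auto mem = allocator.allocate(Node.sizeof);
        if (mem.ptr is null) return false;
        auto node = cast(Node*) mem.ptr;
        *node = Node(value, head);
        head = node;
        return true;
    }

    T pop()
    {
        assert(head !is null, "stack is empty");
        auto node = head;
        head = node.next;
        T value = node.value;
        allocator.deallocate((cast(void*) node)[0 .. Node.sizeof]);
        return value;
    }
}

void main()
{
    // The stateless allocator adds nothing to the container's size.
    static assert(Stack!(int, Mallocator).sizeof == (void*).sizeof);
    static assert(Stack!(int, BumpAllocator).sizeof > (void*).sizeof);

    Stack!(int, Mallocator) a;
    a.push(1);
    a.push(2);
    assert(a.pop() == 2 && a.pop() == 1);

    Stack!(int, BumpAllocator) b;
    b.allocator.buffer = new ubyte[256];
    b.push(7);
    assert(b.pop() == 7);
}
```

Point c) is untouched here: the two Stack instantiations are still distinct, incompatible types.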
 Anyway, bottom line is I don't think that criticism necessarily applies
 to D. But there's surely many others and I'm more or less a n00b re
 c++'s allocators so idk yet.
-- Dmitry Olshansky
Jun 26 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-06-26 01:16, Adam D. Ruppe wrote:

 You'd want it to be RAII or delegate based, so the scope is clear.

 with_allocator(my_alloc, {
       do whatever here
 });


 or

 {
     ChangeAllocator!my_alloc dummy;

     do whatever here
 } // dummy's destructor ends the allocator scope


 I think the former is a bit nicer, since the dummy variable is a bit
 silly. We'd hope that delegate can be inlined.
It won't be inlined. You would need to make it a template parameter to have it inlined.

--
/Jacob Carlborg
Jun 26 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 03:16, Adam D. Ruppe пишет:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their values and
 use scope guards to revert them back to the values they were before.
Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

    with_allocator(my_alloc, {
        do whatever here
    });

or

    {
        ChangeAllocator!my_alloc dummy;

        do whatever here
    } // dummy's destructor ends the allocator scope
Both suffer from:

a) being totally unsafe and in fact bug prone, since all references obtained in there are now dangling (and there is no indication where they came from);
b) imagine you need to use an allocator for a stateful object. Say a forward range of some other ranges (e.g. std.regex), both scoped/stacked, to allocate its internal stuff. The 2nd one may handle it but not the 1st one;
c) transfer of objects allocated differently up the call graph (scope graph?) is pretty much neglected, I see.

I'm kind of wondering how our knowledgeable community has come to this. (Must have been starving w/o allocators way too long.)
 {
     malloced_string str;
     auto got = to!string(10, str);
 } // str is out of scope, so it gets free()'d. unsafe though: if you
 stored a copy of got somewhere, it is now a pointer to freed memory. I'd
 kinda like language support of some sort to help mitigate that though,
 like being a borrowed pointer that isn't allowed to be stored, but
 that's another discussion.
In contrast 'container as an output range' works both safely and would be still customizable.

IMHO the only place for allocators is in containers; other kinds of code may just ignore allocators completely.

std.algorithm and friends should imho be customized on 2 things only:

a) containers to use (instead of array)
b) optionally a memory source (or allocator) if the container is temporary (scoped), to tie its life-time to something.

Want temporary stuff? Use temporary arrays, hashmaps and whatnot, i.e. types tailored for a particular use case (e.g. with a temporary/scoped allocator in mind). These would all be unsafe though. The alternative is ref-counting pointers to an allocator. With the word on the street about ARC it could be a nice direction to pursue.

Allocators (as Andrei points out in his video) have many kinds:

a) persistence: infinite, manual, scoped
b) size: unlimited vs fixed
c) block-size: any, fixed, or *any* up to some maximum size

Most of these ARE NOT interchangeable! Yet some are composable; however I'd argue that allocators are not composable but have some reusable parts that in turn are composable.

Code would still have to cater for specific flavors of allocators, so we'd better reduce this problem to the selection of containers.

--
Dmitry Olshansky
Jun 26 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 06:51:54PM +0400, Dmitry Olshansky wrote:
 26-Jun-2013 03:16, Adam D. Ruppe пишет:
On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and
use scope guards to revert them back to the values they were before.
Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

    with_allocator(my_alloc, {
        do whatever here
    });

or

    {
        ChangeAllocator!my_alloc dummy;

        do whatever here
    } // dummy's destructor ends the allocator scope
Both suffer from a) being totally unsafe and in fact bug prone since all references obtained in there are now dangling (and there is no indication where they came from)
How is this different from using malloc() and free() manually? You have no indication of where a void* came from either, and the danger of dangling references is very real, as any C/C++ coder knows. And I assume that *some* people will want to be defining custom allocators that wrap around malloc/free (e.g. the game engine guys who want total control).
 b) imagine you need to use an allocator for a stateful object. Say
 forward range of some other ranges (e.g. std.regex) both
 scoped/stacked to allocate its internal stuff. 2nd one may handle it
 but not the 1st one.
Yeah this is a complicated area. A container basically needs to know how to allocate its elements. So somehow that information has to be somewhere.
 c) transfer of objects allocated differently up the call graph
 (scope graph?), is pretty much neglected I see.
They're incompatible. You can't safely make a linked list that contains both GC-allocated nodes and malloc() nodes. That's just a bomb waiting to explode in your face. So in that sense, Adam's idea of using a different type for differently-allocated objects makes sense. A container has to declare what kind of allocation its members are using; any other way is asking for trouble.
 I kind of wondering how our knowledgeable community has come to this.
 (must have been starving w/o allocators way too long)
We're just trying to provoke Andrei into responding. ;-) [...]
 IMHO the only place for allocators is in containers other kinds of
 code may just ignore allocators completely.
But some people clamoring for allocators are doing so because they're bothered by Phobos using ~ for string concatenation, which implicitly uses the GC. I don't think we can just ignore that.
 std.algorithm and friends should imho be customized on 2 things only:
 
 a) containers to use (instead of array)
 b) optionally a memory source (or allocator) if the container is
 temporary (scoped), to tie its life-time to something.
 
 Want temporary stuff? Use temporary arrays, hashmaps and whatnot
 i.e. types tailored for a particular use case (e.g. with a
 temporary/scoped allocator in mind).
 These would all be unsafe though. Alternative is ref-counting
 pointers to an allocator. With word on street about ARC it could be
 nice direction to pursue.
Ref-counting is not fool-proof, though. There's always cycles to mess things up.
 Allocators (as Andrei points out in his video) have many kinds:
 a) persistence: infinite, manual, scoped
 b) size: unlimited vs fixed
 c) block-size: any, fixed, or *any* up to some maximum size
 
 Most of these ARE NOT interchangeable!
 Yet some are composable however I'd argue that allocators are not
 composable but have some reusable parts that in turn are composable.
I was listening to Andrei's talk this morning, but I didn't quite understand what he means by composable allocators. Is he talking about nesting, say, a GC inside a region allocated by a region allocator?
 Code would have to cater for specific flavors of allocators still
 so we'd better reduce this problem to the selection of containers.
[...]

Hmm. Sounds like we have two conflicting things going on here:

1) En masse replacement of gc_alloc/gc_free in a certain block of code (which may be the entire program), e.g., for the avoidance of GC in game engines, etc. Basically, the code is allocator-agnostic, but at some higher level we want to control which allocator is being used.

2) Specific customization of containers, etc., as to which allocator(s) should be used, with (hopefully) some kind of support from the type system to prevent mistakes like dangling pointers, escaping references, etc. Here, the code is NOT allocator-agnostic; it has to be written with the specific allocation model in mind. You can't just replace the allocator with another one without introducing bugs or problems.

These two may interact in complex ways... e.g., you might want to use malloc to allocate a pool, then use a custom gc_alloc/gc_free to allocate from this pool in order to support language built-ins like ~ and ~= without needing to rewrite every function that uses strings.

Maybe we should stop conflating these two things so that we stop confusing ourselves, and hopefully it will be easier to analyse afterwards.

T

-- 
You have to expect the unexpected. -- RL
Jun 26 2013
next sibling parent "Brian Rogoff" <brogoff gmail.com> writes:
On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
 I was listening to Andrei's talk this morning, but I didn't 
 quite
 understand what he means by composable allocators. Is he 
 talking about
 nesting, say, a GC inside a region allocated by a region 
 allocator?
Maybe he was talking about a freelist allocator over a reap, as described by the HeapLayers project http://heaplayers.org/ in the paper from 2001 titled 'Composing High-Performance Memory Allocators'. I'm pretty sure that web site was referenced in the talk. A few publications there are from Andrei. I agree that D should support programming without a GC, with different GCs than the default one, and custom allocators, and that features which demand a GC will be troublesome. -- Brian
Jun 26 2013
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
 malloc to allocate a pool, then use a custom gc_alloc/gc_free to
 allocate from this pool in order to support language built-ins 
 like ~ and ~= without needing to rewrite every function that 
 uses strings.
Blargh, I forgot about operator ~ on built-ins. For custom types it is easy enough to manage, just overload it. You can even do ~= on types that aren't allowed to allocate, if they have a certain capacity set up ahead of time (like a stack buffer).

But for built-ins, blargh, I don't even think we can hint on them to the gc. Maybe we should just go ahead and make the gc generational.

(If you aren't using gc, I say leave binary ~ unimplemented in all cases. Use ~= on a temporary instead whenever you would do that. It is easier to follow the lifetime if you explicitly declare your temporary.)
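
As a rough illustration of the "~= on a type that isn't allowed to allocate" idea, here is a minimal sketch. The name StackBuffer and its interface are hypothetical, invented only for this example; it assumes a fixed capacity chosen at compile time and simply asserts when that capacity would be exceeded.

```d
import std.stdio : writeln;

/// Hypothetical fixed-capacity buffer: ~= writes into inline storage
/// and never touches the GC or malloc.
struct StackBuffer(T, size_t capacity)
{
    private T[capacity] storage;
    private size_t used;

    /// Append a single element; fails loudly instead of reallocating.
    void opOpAssign(string op : "~")(T value)
    {
        assert(used < capacity, "StackBuffer capacity exceeded");
        storage[used++] = value;
    }

    /// Append a whole slice, element by element.
    void opOpAssign(string op : "~")(const(T)[] values)
    {
        foreach (v; values)
            this ~= v;
    }

    /// View of the filled part; only valid while the buffer is alive.
    const(T)[] opSlice() const
    {
        return storage[0 .. used];
    }
}

void main()
{
    StackBuffer!(char, 16) buf;
    buf ~= "hello";
    buf ~= ',';
    buf ~= " world";
    writeln(buf[]);
}
```

Binary ~ is deliberately left out, in line with the suggestion above: the caller declares the temporary and its lifetime explicitly.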
Jun 26 2013
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 21:23, H. S. Teoh wrote:

 Both suffer from
 a) being totally unsafe and in fact bug prone since all references
 obtained in there are now dangling (and there is no indication where
 they came from)
How is this different from using malloc() and free() manually? You have no indication of where a void* came from either, and the danger of dangling references is very real, as any C/C++ coder knows. And I assume that *some* people will want to be defining custom allocators that wrap around malloc/free (e.g. the game engine guys who want total control).
Why the heck do you people think I propose using malloc directly as an alternative to whatever hackish allocator stack is proposed? Use the darn container. For starters I'd make the allocation strategy a parameter of each container. At least they do OWN memory. Then refactor out common pieces into a framework of allocation helpers.

In the end I'd personally separate concerns into 3 entities:

1. Memory area objects - think of them as allocators but without the circuitry to do the allocation, e.g. a chunk of memory returned by malloc/alloca can be wrapped into a memory area object.
2. Allocators (Policies) - a potentially nested combination of such "circuitry" that makes use of memory areas. Free-lists, pools, stacks etc. Safe ones have ref-counting on memory areas, unsafe ones don't. (Though safety largely depends on the way you got that chunk of memory.)
3. Containers/Wrappers - as above, objects that handle the life-cycle of objects and make use of allocators.

In fact allocators are part of the type, but memory area objects are not.
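
One possible reading of that three-way split, as a minimal sketch. All names here (MemoryArea, Region, Array) are hypothetical and chosen only for illustration; the policy is a plain bump-pointer region with no ref-counting, so it falls on the "unsafe" side of the split described above.

```d
import core.stdc.stdlib : malloc, free;

/// 1. Memory area: just a chunk of bytes, with no allocation logic.
struct MemoryArea
{
    void[] bytes;
}

/// 2. Allocation policy: a bump-pointer region working on a memory area.
struct Region
{
    private MemoryArea area;
    private size_t used;

    this(MemoryArea area) { this.area = area; }

    void[] allocate(size_t size)
    {
        // Round the consumed size up to 16 bytes so later blocks stay
        // aligned (assumes the area itself starts suitably aligned).
        immutable rounded = (size + 15) & ~cast(size_t) 15;
        if (rounded > area.bytes.length - used)
            return null;
        auto block = area.bytes[used .. used + size];
        used += rounded;
        return block;
    }
}

/// 3. Container: owns its elements; the policy is part of its type,
///    the memory area is not.
struct Array(T, Allocator)
{
    private Allocator* alloc;
    private T[] items;
    private size_t count;

    this(Allocator* alloc, size_t capacity)
    {
        this.alloc = alloc;
        auto raw = alloc.allocate(T.sizeof * capacity);
        assert(raw !is null, "memory area exhausted");
        items = cast(T[]) raw;
    }

    void put(T value)
    {
        assert(count < items.length, "Array is full");
        items[count++] = value;
    }

    inout(T)[] opSlice() inout { return items[0 .. count]; }
}

void main()
{
    enum areaSize = 1024;
    void* p = malloc(areaSize);
    assert(p !is null);
    scope (exit) free(p);

    auto region = Region(MemoryArea(p[0 .. areaSize]));
    auto arr = Array!(int, Region)(&region, 4);
    arr.put(1);
    arr.put(2);
    arr.put(3);
    assert(arr[] == [1, 2, 3]);
}
```

Swapping malloc for a stack buffer changes only the MemoryArea passed in; the Array!(int, Region) type stays the same.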
 b) imagine you need to use an allocator for a stateful object. Say
 forward range of some other ranges (e.g. std.regex) both
 scoped/stacked to allocate its internal stuff. 2nd one may handle it
 but not the 1st one.
Yeah this is a complicated area. A container basically needs to know how to allocate its elements. So somehow that information has to be somewhere.
 c) transfer of objects allocated differently up the call graph
 (scope graph?), is pretty much neglected I see.
They're incompatible. You can't safely make a linked list that contains both GC-allocated nodes and malloc() nodes.
What I mean is that if the types are the same as built-ins it would be a horrible mistake. If not, then we are talking about containers anyway. And if these have a ref-counted pointer to their allocator then the whole thing is safe, albeit at the cost of performance. Sadly, alias this to some built-in (e.g. a slice) allows squirreling away the underlying reference too easily. As such I don't believe in either of the 2 *lies*:
a) built-ins can be refurbished to use custom allocators
b) we can add opSlice/alias this or whatever to our custom type to get access to the underlying built-ins safely and transparently
Both are just nuclear bombs waiting for a good time to explode.
 That's just a bomb waiting to explode in your face. So in that sense, Adam's idea of using a
 different type for differently-allocated objects makes sense.
Yes, but one should be careful here so as not to cause an exponential explosion in code size. So some allocators have to be compatible, and if there is a way to transfer ownership it'd be bonus points (and a large pot of these, mind you).
 A
 container has to declare what kind of allocation its members are using;
 any other way is asking for trouble.
Hence my thoughts to move this piece of "circuitry" to containers proper. The whole idea that by swapping malloc with myMalloc you can translate to a wildly different allocation scheme doesn't quite hold. I think it may be interesting to try and put a "wall" in a different place, namely between the allocation strategy and the memory areas it works on.
 I kind of wondering how our knowledgeable community has come to this.
 (must have been starving w/o allocators way too long)
We're just trying to provoke Andrei into responding. ;-)
Cool, then keep it coming but ... safety and other holes have to be taken care of.
 [...]
 IMHO the only place for allocators is in containers other kinds of
 code may just ignore allocators completely.
But some people clamoring for allocators are doing so because they're bothered by Phobos using ~ for string concatenation, which implicitly uses the GC. I don't think we can just ignore that.
~= would work with any sensible array-like container. ~ is sadly only a convenience for scripts and/or apps that are not performance (determinism) critical.
 std.algorithm and friends should imho be customized on 2 things only:

 a) containers to use (instead of array)
 b) optionally a memory source (or allocator) if container is
 temporary(scoped) to tie its life-time to smth.

 Want temporary stuff? Use temporary arrays, hashmaps and whatnot
 i.e. types tailored for a particular use case (e.g. with a
 temporary/scoped allocator in mind).
 These would all be unsafe though. Alternative is ref-counting
 pointers to an allocator. With word on street about ARC it could be
 nice direction to pursue.
Ref-counting is not fool-proof, though. There's always cycles to mess things up.
You surely shouldn't have allocators reference each other cyclically? Then I see this as a DAG with allocator at the bottom and objects referencing it.
 Allocators (as Andrei points out in his video) have many kinds:
 a) persistence: infinite, manual, scoped
 b) size: unlimited vs fixed
 c) block-size: any, fixed, or *any* up to some maximum size

 Most of these ARE NOT interchangeable!
 Yet some are composable however I'd argue that allocators are not
 composable but have some reusable parts that in turn are composable.
I was listening to Andrei's talk this morning, but I didn't quite understand what he means by composable allocators. Is he talking about nesting, say, a GC inside a region allocated by a region allocator?
I'd say something like: fixed size region allocator with GC as fallback. Or pool for small allocations + malloc/free with a free-list for bigger allocations etc. And the stuff should be as easily composable as I just listed.
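
A minimal sketch of the first combination mentioned above, a fixed-size region with a fallback, using malloc rather than the GC as the fallback to keep the example self-contained. FixedRegion, Mallocator and Fallback are hypothetical names for this illustration, and alignment is ignored for brevity.

```d
import core.stdc.stdlib : malloc, free;

/// Fixed-size region with inline storage (no alignment handling).
struct FixedRegion(size_t capacity)
{
    private ubyte[capacity] buffer;
    private size_t used;

    void[] allocate(size_t size)
    {
        if (size > capacity - used)
            return null;
        void[] block = buffer[used .. used + size];
        used += size;
        return block;
    }

    /// True if the block starts inside this region's buffer.
    bool owns(const void[] block) const
    {
        auto p = cast(const(ubyte)*) block.ptr;
        return p >= buffer.ptr && p < buffer.ptr + capacity;
    }

    /// A region releases everything at once, so this is a no-op.
    void deallocate(void[] block) {}
}

/// Thin wrapper over C malloc/free.
struct Mallocator
{
    void[] allocate(size_t size)
    {
        auto p = malloc(size);
        return p is null ? null : p[0 .. size];
    }

    void deallocate(void[] block) { free(block.ptr); }
}

/// Composition: try Primary first, fall back to Secondary.
struct Fallback(Primary, Secondary)
{
    Primary primary;
    Secondary secondary;

    void[] allocate(size_t size)
    {
        auto block = primary.allocate(size);
        return block !is null ? block : secondary.allocate(size);
    }

    void deallocate(void[] block)
    {
        if (primary.owns(block))
            primary.deallocate(block);
        else
            secondary.deallocate(block);
    }
}

void main()
{
    Fallback!(FixedRegion!64, Mallocator) alloc;
    auto small = alloc.allocate(32);  // fits in the region
    auto large = alloc.allocate(128); // too big, served by malloc
    assert(alloc.primary.owns(small));
    assert(!alloc.primary.owns(large));
    alloc.deallocate(large);
    alloc.deallocate(small);
}
```

The only thing Fallback requires of its primary is an owns() query, which is what lets deallocate route a block back to whichever part produced it.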
 Code would have to cater for specific flavors of allocators still
 so we'd better reduce this problem to the selection of containers.
[...] Hmm. Sounds like we have two conflicting things going on here: 1) En massé replacement of gc_alloc/gc_free in a certain block of code (which may be the entire program), e.g., for the avoidance of GC in game engines, etc.. Basically, the code is allocator-agnostic, but at some higher level we want to control which allocator is being used.
There is no allocator agnostic code that allocates. It either happens to call free/dispose/destroy manually (implicitly with ref-counts) or it does not. It either escapes references to who knows where or doesn't.
 2) Specific customization of containers, etc., as to which allocator(s)
 should be used, with (hopefully) some kind of support from the type
 system to prevent mistakes like dangling pointers, escaping references,
 etc.. Here, the code is NOT allocator-agnostic; it has to be written
 with the specific allocation model in mind. You can't just replace the
 allocator with another one without introducing bugs or problems.
With another one of the same _kind_ I'd say.
 These two may interact in complex ways... e.g., you might want to use
 malloc to allocate a pool, then use a custom gc_alloc/gc_free to
 allocate from this pool in order to support language built-ins like ~
 and ~= without needing to rewrite every function that uses strings.
I guess we have to re-write them. Or don't allocate in string functions.
 Maybe we should stop conflating these two things so that we stop
 confusing ourselves, and hopefully it will be easier to analyse
 afterwards.
-- Dmitry Olshansky
Jun 26 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
So to try some ideas, I started implementing a simple container 
with replaceable allocators: a singly linked list.

All was going kinda well until I realized the forward range it 
offers to iterate its contents makes it possible to escape a 
reference to a freed node.

auto range = list.range;
auto range2 = range;
range.removeFront();

range2 now refers to a freed node. Maybe the nodes could be 
refcounted, though a downside there is even the range won't be 
sharable, it would be a different type based on allocation 
method. (I was hoping to make the range be a sharable component, 
even as the list itself changed type with allocators.)

I guess we could @disable copy construction, and make it an 
input range instead of a forward one, but that takes some of the 
legitimate usefulness away.

Interestingly though, opApply would be ok here, since all it 
would expose is the payload.

(though if the payload is a reference type, does the container 
take ownership of it? How do we indicate that? Perhaps more 
interestingly, how do we indicate the /lack/ of ownership at the 
transfer point?)



This is all fairly easy if we just decide "we're going to do this 
with GC" or "we're going to do this C style" and do the whole 
program like that, libraries and all. But trying to mix and match 
just gets more complicated the more I think about it :( It makes 
the question of "allocators" look trivial.
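
For what it's worth, the opApply variant might look roughly like the sketch below. MallocList is a hypothetical name, the nodes come straight from malloc rather than a pluggable allocator, and the payload is assumed to be a plain value type such as int (no destructors or postblits are handled).

```d
import core.stdc.stdlib : malloc, free;

struct MallocList(T)
{
    private static struct Node
    {
        T payload;
        Node* next;
    }

    private Node* head;

    @disable this(this); // no copies, so every node has a single owner

    ~this()
    {
        while (head !is null)
            removeFront();
    }

    void insertFront(T value)
    {
        auto node = cast(Node*) malloc(Node.sizeof);
        assert(node !is null);
        node.payload = value; // fine for plain value types only
        node.next = head;
        head = node;
    }

    void removeFront()
    {
        assert(head !is null);
        auto old = head;
        head = head.next;
        free(old);
    }

    /// Internal iteration: the callback sees payloads, never nodes.
    int opApply(scope int delegate(ref T) dg)
    {
        for (auto n = head; n !is null; n = n.next)
        {
            if (auto result = dg(n.payload))
                return result;
        }
        return 0;
    }
}

void main()
{
    MallocList!int list;
    list.insertFront(3);
    list.insertFront(2);
    list.insertFront(1);

    int sum = 0;
    foreach (ref value; list)
        sum += value;
    assert(sum == 6);
}
```

Note that the ref parameter still lets a determined caller take the payload's address, so this narrows the escape route rather than closing it.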
Jun 26 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jun 27, 2013 at 12:43:54AM +0200, Adam D. Ruppe wrote:
 So to try some ideas, I started implementing a simple container with
 replaceable allocators: a singly linked list.
 
 All was going kinda well until I realized the forward range it
 offers to iterate its contents makes it possible to escape a
 reference to a freed node.
[...]
 (though if the payload is a reference type, does the container take
 ownership of it? How do we indicate that? Perhaps more interestingly,
 how do we indicate the /lack/ of ownership at the transfer point?)
Maybe a type distinction akin to C++'s auto_ptr might help? Say we introduce OwnedRef!T vs. plain old T*. So something returning OwnedRef!T will need to assume ownership of the object, whereas something returning T* would just be returning a reference, but the container continues to hold ownership over the object.
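
A minimal sketch of what such a distinction could look like. OwnedRef here is a hypothetical type written for this example (it is not an existing Phobos type), it assumes the pointee was allocated with malloc, and nothing in it stops a borrowed T* from outliving its owner.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical owning handle: whoever holds it must clean up.
struct OwnedRef(T)
{
    private T* ptr;

    @disable this(this); // ownership cannot be duplicated

    this(T* p) { ptr = p; }

    ~this()
    {
        if (ptr !is null)
            free(ptr);
    }

    /// Non-owning view; the caller must not free it or keep it too long.
    T* borrow() { return ptr; }

    /// Hands ownership to the caller and leaves this handle empty.
    T* release()
    {
        auto p = ptr;
        ptr = null;
        return p;
    }
}

/// Returning OwnedRef!int signals that the caller takes ownership.
OwnedRef!int makeInt(int value)
{
    auto p = cast(int*) malloc(int.sizeof);
    assert(p !is null);
    *p = value;
    return OwnedRef!int(p);
}

void main()
{
    auto owned = makeInt(42);   // the caller now owns the int
    int* view = owned.borrow(); // plain pointer: no ownership implied
    *view += 1;
    assert(*owned.borrow() == 43);
} // owned's destructor frees the int here
```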
 This is all fairly easy if we just decide "we're going to do this
 with GC" or "we're going to do this C style" and do the whole
 program like that, libraries and all. But trying to mix and match
 just gets more complicated the more I think about it :( It makes the
 question of "allocators" look trivial.
Heh. Yeah, I'm starting to wonder if it even makes sense to try to mix-n-match GC-based and non-GC-based allocators. It seems that maybe we just have to settle for the fact of life that a GC-based object is fundamentally incompatible with a pool-allocated object, and both are also fundamentally incompatible with malloc-allocated objects, 'cos you need the code to be aware in each instance of what needs to be done to clean up, etc.

T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge
Jun 26 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:
 Maybe a type distinction akin to C++'s auto_ptr might help?
Yeah, that's what I'm thinking, but I don't really like it. Perhaps I'm trying too hard to cover everything, and should be happier with just doing what C++ does. Full memory safety is prolly out the window anyway. In std.typecons, there's a Unique!T, but it doesn't look complete. A lot of the code is commented out, maybe it was started back in the days of bug city.
 Yeah, I'm started to wonder if it even makes sense to try to 
 mix-n-match GC-based and non-GC-based allocators.
It might not be so bad if we modified D to add a lent storage class, or something, similar to some discussions about scope in the past. These would be values you may work with, but never keep; assigning them to anything is not allowed and you may only pass them to a function or return them from a function if that is also marked lent. Any regular reference would be implicitly usable as lent.

int* ptr;

void bar(int* a) {
    foo(a); // ok
}

int* foo(lent int* a) {
    bar(a); // error, cannot call bar with lent pointer
    ptr = a; // error, cannot assign lent value to non-lent field
    foo2(a); // ok
    foo(foo2(a)); // ok
    return a; // error, cannot return a lent value
}

lent int* foo2(lent int* a) {
    return a; // ok
}

foo(ptr); // ok (if foo actually compiled)

And finally, if you take the address of a lent reference, that itself is lent; &(lent int*) == lent int**.

Then, if possible, it would be cool if:

lent int* a;
{
    int* b;
    a = b;
}

That was an error, because a outlived b. But since you can't store a anywhere, the only time this would happen would be something like here. And hell maybe we could hammer around that by making lent variables head const and say they must be initialized at declaration, so "lent int* a;" is illegal as well as "a = b;".

But we wouldn't want it transitively const, because then:

void fillBuffer(lent char[] buffer) {}

would be disallowed and that is something I would definitely want.

Part of me thinks pure might help with this too.... but eh maybe not because even a pure function could in theory escape a reference via its other parameters.

But with this kind of thing, we could do a nicer pointer type that does:

lent T getThis() { return _this; }
alias getThis this;

and thus implicitly convert our inner pointer to something we can use on the outside world with some confidence that they won't sneak away any references to it. If combined with disabling the address of operator on the container itself, we could really lock down ownership.
Jun 26 2013
next sibling parent "BLM768" <blm768 gmail.com> writes:
On Wednesday, 26 June 2013 at 23:59:01 UTC, Adam D. Ruppe wrote:
 On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:
 Maybe a type distinction akin to C++'s auto_ptr might help?
It might not be so bad if we modified D to add a lent storage class, or something, similar to some discussions about scope in the past. These would be values you may work with, but never keep; assigning them to anything is not allowed and you may only pass them to a function or return them from a function if that is also marked lent. Any regular reference would be implicitly usable as lent.
Something along those lines would probably be a good solution. It seems that we're working with three types of objects:

1. Objects that are "owned" by a scope (can be stack-allocated)
2. Objects that are "owned" by another object (C/C++-like memory management)
3. Objects that have no single "owner" (GC memory management)

The first two would probably operate under semantics like "lent" or "scope", although I'd like to propose an extension to the rules: it should be possible to store a weak reference to these.

The third type seems to be pretty much solved, seeing as we have a (mostly) working GC.

Something like this might be a nice way to implement it:

class Thing {} reference

void main() {
    scope Thing t1; //stack-allocated
    doSomething(t1);
    owned Thing t2 = new Thing; //heap-allocated but freed at end of scope
    doSomething(t2);
}
Jun 27 2013
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 27 Jun 2013 01:59:00 +0200,
"Adam D. Ruppe" <destructionator gmail.com> wrote:

 void fillBuffer(lent char[] buffer) {}
 
 would be disallowed and that is something I would definitely want.
Isn't that what scope is for? -- Marco
Jun 28 2013
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 On Thu, 27 Jun 2013 01:59:00 +0200,
 "Adam D. Ruppe" <destructionator gmail.com> wrote:

 void fillBuffer(lent char[] buffer) {}
 
 would be disallowed and that is something I would definitely 
 want.
Isn't that what scope is for?
Reading dlang.org would make you think so, but the official position is that 'scope' does not exist, so it is hard to say what it is really for.
Jun 28 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?
I don't really know. In practice, it does something else (usually nothing, but suppresses heap closure allocation on delegates). The DIPs relating to it all talk about returning refs from functions and I'm not sure if they relate to the built ins or not- I don't think it would quite work for what I have in mind.
Jun 28 2013
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 28 June 2013 at 11:55:46 UTC, Adam D. Ruppe wrote:
 On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?
I don't really know. In practice, it does something else (usually nothing, but suppresses heap closure allocation on delegates). The DIPs relating to it all talk about returning refs from functions and I'm not sure if they relate to the built ins or not- I don't think it would quite work for what I have in mind.
It is a no-op keyword in the current implementation for everything but delegates. DIP speculation was based on http://dlang.org/attribute.html#scope and "Parameter Storage Classes" in http://dlang.org/function.html but that info is obviously outdated.
Jun 28 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, June 28, 2013 13:55:45 Adam D. Ruppe wrote:
 On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?
I don't really know. In practice, it does something else (usually nothing, but suppresses heap closure allocation on delegates). The DIPs relating to it all talk about returning refs from functions and I'm not sure if they relate to the built ins or not- I don't think it would quite work for what I have in mind.
Per the spec, all scope is supposed to do is prevent references in a parameter from being escaped. To be specific, it says

-------
references in the parameter cannot be escaped (e.g. assigned to a global variable)
-------

So, in theory, if you had something like

auto foo(scope int[] i) {...}

it would prevent i or anything referring to it from being returned or assigned to any variable which will outlive the function call. However, scope currently does _nothing_ for anything other than delegates - which is why I think that using the in attribute is such an incredibly bad idea. Using either in or scope on anything other than delegates could result in all kinds of code breakage if/when scope is ever implemented for types other than delegates.

For delegates, it has the advantage of telling the compiler that it doesn't need to allocate a closure (since the delegate won't be used past the point where its calling scope exists, as could occur if the delegate escaped the function it was passed to), but I'm not sure that even that works 100% correctly right now.

We really should sort out exactly what we're going to do with scope one of these days soon. But the stuff that some of the DIPs do with scope (e.g. returning with scope - which is completely against the spec at this point) are suggestions and not at all how it currently works.

- Jonathan M Davis
Jun 28 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 28 June 2013 at 17:43:21 UTC, Jonathan M Davis wrote:
 it would prevent i or anything refering to it from being 
 returned or assigned to any variable which will outlive the
 function call. However,
That's fairly close to what I'd want. But there's two cases I'm not sure it would cover:

1:

struct Unique(T) {
    scope T borrow();
}

If the unique pointer decides to let its reference slip, it wouldn't want it going somewhere else and escaping, since that breaks the unique need. This is important for a few cases. Here's one:

int* foo;
{
    Unique!(int*) bar;
    foo = bar.borrow;
    int* ok = bar.borrow; // this should be ok, because this never exists outside the same scope as the Unique
}
// foo now talks to a freed *bar, so that shouldn't be allowed

Similarly, if bar were reassigned, this could cause trouble, but what we might do is just disallow such reassignments, but maybe it could work if it always goes down in scope. I'd have to think about that.

(I'm thinking my borrowed thing might have to be a type constructor rather than a storage class. Otherwise, you could get around it by:

int* bar(scope int* foo) {
    int* b = foo;
    return b;
}

Unless the compiler is very smart about following where it goes.)

But if scope works on the return value too, it might be ok. maybe

2:

void bar(scope int* foo, int** bar) {
    *bar = foo;
}

Actually, I'm reasonably clear the spec's scope words would work for this one. But we'd need to be sure - this is one case where pure wouldn't help (pure generally would help, since it disallows assignments to the outside world, but there's enough holes that you could leak a reference).

To be memory safe, these would all have to be guaranteed.
Jun 28 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, June 28, 2013 19:56:44 Adam D. Ruppe wrote:
 struct Unique(T) {
 scope T borrow();
 }
Per the current spec, this would not be a valid use of scope, as scope is specifically a parameter storage class and can only be used on function parameters (just like in, out, ref, and lazy). scope seems to be specifically intended for guaranteeing that an argument passed to a function does not escape that function. - Jonathan M Davis
Jun 28 2013
prev sibling next sibling parent reply "Jason House" <jason.james.house gmail.com> writes:
Bloomberg released an STL alternative called BSL which contains 
an alternate allocator model. In a nutshell, objects supporting 
custom allocators can optionally take an allocator pointer as an 
argument. Containers will save the pointer and use it for all 
their allocations. It seems simple enough and does not embed the 
allocator in the type.

https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
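
Translated loosely into D, the idea might look like the sketch below. This is not BDE's actual interface: IAllocator, MallocAllocator and IntStack are hypothetical names, and the point is only that the container stores an allocator reference passed at construction while its type stays the same for every allocator.

```d
import core.stdc.stdlib : malloc, free;

/// Runtime-polymorphic allocator interface (hypothetical).
interface IAllocator
{
    void[] allocate(size_t size);
    void deallocate(void[] block);
}

/// malloc-backed implementation that counts live blocks.
final class MallocAllocator : IAllocator
{
    size_t live;

    void[] allocate(size_t size)
    {
        auto p = malloc(size);
        if (p is null)
            return null;
        ++live;
        return p[0 .. size];
    }

    void deallocate(void[] block)
    {
        if (block.ptr !is null)
        {
            free(block.ptr);
            --live;
        }
    }
}

/// The container's type does not mention the allocator at all.
struct IntStack
{
    private IAllocator alloc;
    private int[] items;
    private size_t count;

    @disable this(this);

    this(IAllocator alloc, size_t capacity)
    {
        this.alloc = alloc;
        items = cast(int[]) alloc.allocate(capacity * int.sizeof);
        assert(items.length == capacity, "allocation failed");
    }

    ~this()
    {
        if (alloc !is null)
            alloc.deallocate(items);
    }

    void push(int v)
    {
        assert(count < items.length, "stack is full");
        items[count++] = v;
    }

    int pop()
    {
        assert(count > 0, "stack is empty");
        return items[--count];
    }
}

void main()
{
    auto mallocator = new MallocAllocator;
    {
        auto stack = IntStack(mallocator, 8);
        stack.push(1);
        stack.push(2);
        assert(stack.pop() == 2);
        assert(mallocator.live == 1);
    }
    assert(mallocator.live == 0); // released when the stack went out of scope
}
```

The cost of this style is a virtual call per allocation and one extra reference stored in every container.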

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 I know Andrey mentioned he was going to work on Allocators a 
 year ago. In DConf 2013 he described the problems he needs to 
 solve with Allocators. But I wonder if I am missing the 
 discussion around that - I tried searching this forum, found a 
 few threads that was not actually a brain storm for Allocators 
 design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would 
 be to let user specify which Allocator operator new should get 
 the memory from (introducing a new keyword allocator). This 
 gives a total control, but assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use 
 ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user 
 responsibility to make sure custom Allocator can handle that
 }

 By default allocator is the druntime using GC, free(a) does 
 nothing for it.


 if some library defines its allocator (e.g. specialized 
 container), there should be ability to:
 1. override allocator
 2. get access to the allocator used

 I understand that I spent 5 mins thinking about the way 
 Allocators may look.
 My point is - if somebody is working on it, can you please 
 share your ideas?
Jun 26 2013
next sibling parent reply "cybervadim" <vadim.goryunov gmail.com> writes:
On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
 Bloomberg released an STL alternative called BSL which contains 
 an alternate allocator model. In a nutshell object supporting 
 custom allocators can optionally take an allocator pointer as 
 an argument. Containers will save the pointer and use it for 
 all their allocations. It seems simple enough and does not 
 embed the allocator in the type.

 https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
I think the problem with such an approach is that you have to maniacally add support for custom allocators to every class if you want them to be on a custom allocator. If we were simply able to say "all memory allocated in this area {} should use my custom allocator", that would simplify the code and there would be no need to change the std lib. The next step is to notify the allocator when the memory should be released. But for a stack based allocator that is not required. Moreover, if we introduce access to different GCs (e.g. mark-n-sweep, semi-copy, ref counted), we should be able to say: this {} piece of code is my temporary, so use the semi-copy GC; the other code is long lived and not many objects are created, so use ref counting. That is, it is all runtime support with no need to change the library code.
Jun 26 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 04:10:49PM +0200, cybervadim wrote:
 On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
Bloomberg released an STL alternative called BSL which contains an
alternate allocator model. In a nutshell object supporting custom
allocators can optionally take an allocator pointer as an
argument. Containers will save the pointer and use it for all
their allocations. It seems simple enough and does not embed the
allocator in the type.

https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
I think the problem with such approach is that you have to maniacally add support for custom allocator to every class if you want them to be on a custom allocator.
Yeah, that's a major inconvenience with the C++ allocator model. There's no way to say "switch to allocator A within this block of code"; if you're given a binary-only library that doesn't support allocators, you're out of luck. And even if you have the source code, you have to manually modify every single line of code that performs allocation to take an additional parameter -- not a very feasible approach.
 If we simply able to say - all memory allocated in this area {}
 should use my custom allocator, that would simplify the code and no
 need to change std lib.
 The next step is to notify allocator when the memory should be
 released. But for the stack based allocator that is not required.
 More over, if we introduce access to different GCs (e.g.
 mark-n-sweep, semi-copy, ref counted), we should be able to say this
 {} piece of code is my temporary, so use semi-copy GC, the other
 code is long lived and not much objects created, so use ref counted.
 That is, it is all runtime support and no need changing the library
 code.
Yeah, I think the best approach would be one that doesn't require changing a whole mass of code to support. Also, one that doesn't require language changes would be far more likely to be accepted, as the core D devs are leery of adding yet more complications to the language. That's why I proposed that gc_alloc and gc_free be made into thread-global function pointers, that can be swapped with a custom allocator's version. This doesn't have to be visible to user code; it can just be an implementation detail in std.allocator, for example. It allows us to implement custom allocators across a block of code that doesn't know (and doesn't need to know) what allocator will be used. T -- Fact is stranger than fiction.
Jun 26 2013
parent reply "cybervadim" <vadim.goryunov gmail.com> writes:
On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
 Yeah, I think the best approach would be one that doesn't 
 require
 changing a whole mass of code to support. Also, one that 
 doesn't require
 language changes would be far more likely to be accepted, as 
 the core D
 devs are leery of adding yet more complications to the language.

 That's why I proposed that gc_alloc and gc_free be made into
 thread-global function pointers, that can be swapped with a 
 custom
 allocator's version. This doesn't have to be visible to user 
 code; it
 can just be an implementation detail in std.allocator, for 
 example. It
 allows us to implement custom allocators across a block of code 
 that
 doesn't know (and doesn't need to know) what allocator will be 
 used.
Yes, being able to change gc_alloc, gc_free would do the work. If runtime remembers the stack of gc_alloc/gc_free functions like pushd, popd, that would simplify its usage. I think this is a very nice and simple solution to the problem.
Jun 26 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 04:31:40PM +0200, cybervadim wrote:
 On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
Yeah, I think the best approach would be one that doesn't require
changing a whole mass of code to support. Also, one that doesn't
require language changes would be far more likely to be accepted, as
the core D devs are leery of adding yet more complications to the
language.

That's why I proposed that gc_alloc and gc_free be made into
thread-global function pointers, that can be swapped with a custom
allocator's version. This doesn't have to be visible to user code; it
can just be an implementation detail in std.allocator, for example.
It allows us to implement custom allocators across a block of code
that doesn't know (and doesn't need to know) what allocator will be
used.
Yes, being able to change gc_alloc, gc_free would do the work. If runtime remembers the stack of gc_alloc/gc_free functions like pushd, popd, that would simplify its usage. I think this is a very nice and simple solution to the problem.
Adam's idea does this: tie each replacement of gc_alloc/gc_free to some stack-based object, that automatically cleans up in the dtor. So something along these lines:

    struct CustomAlloc(A) {
        void* function(size_t size) old_alloc;
        void function(void* ptr) old_free;

        this(A alloc) {
            old_alloc = gc_alloc;
            old_free = gc_free;

            gc_alloc = &A.alloc;
            gc_free = &A.free;
        }

        ~this() {
            gc_alloc = old_alloc;
            gc_free = old_free;

            // Cleans up, e.g., region allocator deletes the
            // region
            A.cleanup();
        }
    }

    class C {}

    void main() {
        auto c = new C();   // allocates using default allocator (GC)
        {
            CustomAlloc!MyAllocator _;

            // Everything from here on until end of block
            // uses MyAllocator

            auto d = new C();   // allocates using MyAllocator

            {
                CustomAlloc!AnotherAllocator _;

                auto e = new C();   // allocates using AnotherAllocator

                // End of scope: auto cleanup, gc_alloc and
                // gc_free reverts back to MyAllocator
            }

            auto f = new C();   // allocates using MyAllocator

            // End of scope: auto cleanup, gc_alloc and
            // gc_free reverts back to default values
        }
        auto g = new C();   // allocates using default allocator
    }

So you effectively have an allocator stack, and user code never has to directly manipulate gc_alloc/gc_free (which would be dangerous).

T -- Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
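The save/swap/restore mechanics above can be tried out today without touching druntime. What follows is only a minimal sketch under stated assumptions: currentAlloc and currentFree are hypothetical stand-ins for the proposed gc_alloc/gc_free hooks (druntime does not expose such swappable pointers), and ScopedHooks, defaultAlloc and countingAlloc are names invented for the illustration. A D struct cannot have a default constructor, so the guard takes the replacement hooks as arguments instead of being declared bare as in the pseudocode.

```d
import core.stdc.stdlib : malloc, free;

alias AllocFn = void* function(size_t size);
alias FreeFn  = void function(void* ptr);

// Hypothetical stand-ins for the proposed gc_alloc / gc_free hooks.
// Module-level variables in D are thread-local by default.
AllocFn currentAlloc = &defaultAlloc;
FreeFn  currentFree  = &defaultFree;

// Call counters, only here to make the effect observable.
int defaultCalls, countingCalls;

void* defaultAlloc(size_t size)  { ++defaultCalls;  return malloc(size); }
void  defaultFree(void* ptr)     { free(ptr); }
void* countingAlloc(size_t size) { ++countingCalls; return malloc(size); }
void  countingFree(void* ptr)    { free(ptr); }

// RAII guard: installs new hooks on construction and restores the
// previous ones in the destructor, so nested guards behave like a stack.
struct ScopedHooks
{
    private AllocFn oldAlloc;
    private FreeFn  oldFree;

    @disable this();      // must be constructed with explicit hooks
    @disable this(this);  // a copy would restore the hooks twice

    this(AllocFn a, FreeFn f)
    {
        oldAlloc = currentAlloc;
        oldFree  = currentFree;
        currentAlloc = a;
        currentFree  = f;
    }

    ~this()
    {
        currentAlloc = oldAlloc;
        currentFree  = oldFree;
    }
}

// Returns [calls to the default hook, calls to the counting hook].
int[2] demo()
{
    defaultCalls = 0;
    countingCalls = 0;

    auto p = currentAlloc(16);          // default hooks
    {
        auto guard = ScopedHooks(&countingAlloc, &countingFree);
        auto q = currentAlloc(16);      // counting hooks
        currentFree(q);
    }                                   // guard's dtor restores the defaults
    auto r = currentAlloc(16);          // default hooks again
    currentFree(r);
    currentFree(p);
    return [defaultCalls, countingCalls];
}

void main()
{
    assert(demo() == [2, 1]);
}
```

Note that the sketch frees q while the guard is still active. Freeing it after the scope ends would hand the pointer to the wrong deallocator, which is exactly the hazard raised elsewhere in this thread.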
Jun 26 2013
parent "Dicebot" <public dicebot.lv> writes:
Some type system help is required to guarantee that references to 
such scope-allocated data won't escape.
Jun 26 2013
prev sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
 Bloomberg released an STL alternative called BSL which contains 
 an alternate allocator model. In a nutshell object supporting 
 custom allocators can optionally take an allocator pointer as 
 an argument. Containers will save the pointer and use it for 
 all their allocations. It seems simple enough and does not 
 embed the allocator in the type.

 https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
There is also EASTL's (Electronic Arts version of STL for gamedev) take on allocators. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html#eastl_allocator
Jun 28 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 02:22, cybervadim пишет:
 I know Andrey mentioned he was going to work on Allocators a year ago.
 In DConf 2013 he described the problems he needs to solve with
 Allocators. But I wonder if I am missing the discussion around that - I
 tried searching this forum, found a few threads that was not actually a
 brain storm for Allocators design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would be to let
 user specify which Allocator operator new should get the memory from
 (introducing a new keyword allocator). This gives a total control, but
 assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
    auto a = new A; // operator new will use ScopeAllocator::malloc()
    auto b = new B;

    free(a); // that should call ScopeAllocator::free()
    // if free() is missing for allocated area, it is a user
 responsibility to make sure custom Allocator can handle that
 }
Awful. What has that extra syntax bought you, except that now new is unsafe by design? Other questions: how does this allocation scope carry into functions, and what is the mechanism for passing it up and down the call stack? Last but not least, I fail to see how scoped allocators alone (as presented) solve even half of the problem. -- Dmitry Olshansky
Jun 26 2013
parent reply "cybervadim" <vadim.goryunov gmail.com> writes:
On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky 
wrote:
 Awful. What that extra syntax had brought you? Except that now 
 new is unsafe by design?
 Other questions involve how does this allocation scope goes 
 inside of functions, what is the mechanism of passing it up and 
 down of call-stack.

 Last but not least I fail to see how scoped allocators alone 
 (as presented) solve even half of the problem.
Extra syntax lets me avoid touching the existing code.
Imagine you have stateless event processing. That is, an event comes in, you do some calculation, prepare the answer and send it back. It will look like:

void onEvent(Event event)
{
    process();
}

Because it is stateless, you know all the memory allocated during processing will not be required afterwards. So the syntax I suggested requires very little change in code. process() may be implemented using std lib, doing several news and resizing.

With new syntax:

void onEvent(Event event)
{
    ScopedAllocator alloc;
    allocator(alloc) {
      process();
    }
}

So now you do not use GC for all that is created inside the process(). ScopedAllocator is a simple stack that will free all memory in one go.

It is up to the runtime implementation to make sure all memory that is allocated inside allocator{} scope is actually allocated using ScopedAllocator and not GC.

Does it make sense?
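For readers wondering what "a simple stack that will free all memory in one go" could look like, here is a rough sketch of a region (bump) allocator. It does not implement the proposed allocator(...) { } syntax, and it does not redirect new. Region, its methods and the body of onEvent are all assumptions made up for the illustration, and the handler asks the region for memory explicitly.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical region ("bump") allocator: hands out pieces of one big
// block and releases the whole block at once in its destructor.
struct Region
{
    private ubyte* base;
    private size_t capacity;
    private size_t used;

    @disable this();
    @disable this(this);   // a copy would free the block twice

    this(size_t bytes)
    {
        base = cast(ubyte*) malloc(bytes);
        capacity = base is null ? 0 : bytes;
    }

    ~this() { free(base); }   // one free() covers every allocation made

    // Returns null when the region is exhausted.
    void* allocate(size_t bytes)
    {
        enum size_t alignment = 16;
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start > capacity || bytes > capacity - start)
            return null;
        used = start + bytes;
        return base + start;
    }

    size_t bytesUsed() const { return used; }
}

struct Event { int id; }

// Stand-in for the stateless handler: all scratch memory comes from the
// region, and the result is copied out by value before the region dies.
int onEvent(Event event)
{
    auto region = Region(1024);
    auto scratch = cast(int*) region.allocate(4 * int.sizeof);
    if (scratch is null)
        return -1;

    foreach (i; 0 .. 4)
        scratch[i] = event.id + i;

    int sum = 0;
    foreach (i; 0 .. 4)
        sum += scratch[i];
    return sum;
}

void main()
{
    assert(onEvent(Event(10)) == 46);   // 10 + 11 + 12 + 13
}
```

The sketch only stays safe because nothing allocated from the region escapes onEvent. That is the assumption the rest of this sub-thread argues about.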
Jun 26 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 18:27, cybervadim пишет:
 On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky wrote:
 Awful. What that extra syntax had brought you? Except that now new is
 unsafe by design?
 Other questions involve how does this allocation scope goes inside of
 functions, what is the mechanism of passing it up and down of call-stack.

 Last but not least I fail to see how scoped allocators alone (as
 presented) solve even half of the problem.
Extra syntax allows me not touching the existing code. Imagine you have a stateless event processing. That is event comes, you do some calculation, prepare the answer and send it back. It will look like: void onEvent(Event event) { process(); } Because it is stateless, you know all the memory allocated during processing will not be required afterwards.
Here is a chief problem - the assumption that is required to make it magically work.

Now what I see is:

T arr[];//TLS

//somewhere down the line
arr = ... ;
else{
...
allocator(myAlloc){
	arr = array(filter!....);
}
...
}
return arr;

Having an unsafe magic wand that may transmogrify some code to switch allocation strategy I consider naive and dangerous.

Who ever told you process does return before allocating a few Gigs of RAM (and hoping on GC collection)? Right, nobody. Maybe it's an event loop that may run forever.

What is missing is that code up to date assumes new == GC and works _like that_.
 So the syntax I suggested
 requires a very little change in code. process() may be implemented
 using std lib, doing several news and resizing.

 With new syntax:


 void onEvent(Event event)
 {
     ScopedAllocator alloc;
     allocator(alloc) {
       process();
     }
 }

 So now you do not use GC for all that is created inside the process().
 ScopedAllocator is a simple stack that will free all memory in one go.

 It is up to the runtime implementation to make sure all memory that is
 allocated inside allocator{} scope is actually allocated using
 ScopedAllocator and not GC.

 Does it make sense?
Yes, but it's horribly broken. -- Dmitry Olshansky
Jun 26 2013
parent reply "cybervadim" <vadim.goryunov gmail.com> writes:
On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky 
wrote:
 Here is a chief problem - the assumption that is required to 
 make it magically work.

 Now what I see is:

 T arr[];//TLS

 //somewhere down the line
 arr = ... ;
 else{
 ...
 allocator(myAlloc){
 	arr = array(filter!....);
 }
 ...
 }
 return arr;

 Having an unsafe magic wand that may transmogrify some code to 
 switch allocation strategy I consider naive and dangerous.

 Who ever told you process does return before allocating a few 
 Gigs of RAM (and hoping on GC collection)? Right, nobody. Maybe 
 it's an event loop that may run forever.

 What is missing is that code up to date assumes new == GC and 
 works _like that_.
Not magic, but a tool that is quite powerful, and thus it may shoot you in the foot. This is unsafe, but if you want it safe, don't use allocators, stay with GC. In the example above, the first arr gets freed by the GC, and the second arr may point to nothing if myAlloc was implemented to free it before. Or you may get a proper arr reference if myAlloc used malloc and didn't free it. The fact that you may write bad code does not make the language (or concept) bad.
Jun 26 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 23:04, cybervadim пишет:
 On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky wrote:
 Having an unsafe magic wand that may transmogrify some code to switch
 allocation strategy I consider naive and dangerous.

 Who ever told you process does return before allocating a few Gigs of
 RAM (and hoping on GC collection)? Right, nobody. Maybe it's an event
 loop that may run forever.

 What is missing is that code up to date assumes new == GC and works
 _like that_.
Not magic, but the tool which is quite powerful and thus it may shoot your leg.
I know what kind of thing you are talking about. It isn't powerful, it's just a hack that doesn't quite do what is advertised.
 This is unsafe, but if you want it safe, don't use allocators, stay with
 GC.
BTW you were talking about changing the allocation of code you didn't write. There is not even a single fact that makes the thing safe. It's all working by chance, or because the thing was designed to work with a scoped allocator to begin with. I believe the 2nd case (designed to use scoped allocation) is the one where:
a) The behavior is guaranteed (determinism vs GC etc.)
b) Safety is assured by the designer, not by pure luck (and a reasonable assumption that may not hold)
 In the example above, you get first arr freed by GC, second arr may
 point to nothing if myAlloc was implemented to free it before. Or you
 may get a proper arr reference if myAlloc used malloc and didn't free
 it.
Yeah I know, hence I showed it. BTW forget about malloc, I'm not talking about explicit malloc being an alternative to your scheme.
 The fact that you may write bad code does not make the language (or
 concept) bad.
It does, because it introduces easy, unreliable and bug-prone usage. -- Dmitry Olshansky
Jun 26 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jun 26, 2013 at 01:16:31AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values
and use scope guards to revert them back to the values they were
before.
Yea, I was thinking this might be a way to go. You'd have a global (well, thread-local) allocator instance that can be set and reset through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
     do whatever here
});

or

{
   ChangeAllocator!my_alloc dummy;

   do whatever here
} // dummy's destructor ends the allocator scope

I think the former is a bit nicer, since the dummy variable is a bit silly. We'd hope that delegate can be inlined.
Actually, D's frontend leaves something to be desired when it comes to inlining delegates. It *is* done sometimes, but not as often as one may like. For example, opApply generally doesn't inline its delegate, even when it's just a thin wrapper around a foreach loop. But yeah, I think the former has nicer syntax. Maybe we can help the compiler with inlining by making the delegate a compile-time parameter? But it forces a switch of parameter order, which is Not Nice (hurts readability 'cos the allocator argument comes after the block instead of before).
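To make the parameter-order point concrete, here is a small sketch comparing the two forms. Everything in it is hypothetical: currentAllocator is just an integer tag standing in for a real thread-local allocator, and withAllocator / withAllocatorCT are names made up for the comparison. Whether the alias version actually gets inlined depends on the compiler and flags, so the sketch makes no claim about that.

```d
// Hypothetical thread-local "current allocator" tag; a real design would
// hold function pointers or an allocator object here.
int currentAllocator = 0;

// Runtime-delegate form: allocator first, block second.
void withAllocator(int alloc, scope void delegate() block)
{
    immutable saved = currentAllocator;
    currentAllocator = alloc;
    scope (exit) currentAllocator = saved;   // restored even if block throws
    block();
}

// Compile-time form: the block is an alias template parameter, so the
// callee is known statically, but the block now comes before the allocator.
void withAllocatorCT(alias block)(int alloc)
{
    immutable saved = currentAllocator;
    currentAllocator = alloc;
    scope (exit) currentAllocator = saved;
    block();
}

// Records which allocator tag is active at each point.
int[] trace()
{
    int[] seen;
    seen ~= currentAllocator;
    withAllocator(1, () { seen ~= currentAllocator; });
    withAllocatorCT!(() { seen ~= currentAllocator; })(2);
    seen ~= currentAllocator;
    return seen;
}

void main()
{
    assert(trace() == [0, 1, 2, 0]);
}
```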
 But, the template still has a big advantage: you can change the
 type. And I think that is potentially enormously useful.
True. It can use different types for different allocators that do (or don't) do cleanups at the end of the scope, depending on what the allocator needs to do.
 Another question is how to tie into output ranges. Take std.conv.to.
 
 auto s = to!string(10); // currently, this hits the gc
 
 What if I want it to go on a stack buffer? One option would be to
 rewrite it to use an output range, and then call it like:
 
 char[20] buffer;
 auto s = to!string(10, buffer); // it returns the slice of the
 buffer it actually used
 
 (and we can do overloads so to!string(10, radix) still works, as
 well as to!string(10, radix, buffer). Hassle, I know...)
I think supporting the multi-argument version of to!string() is a good thing, but what to do with library code that calls to!string()? It'd be nice if we could somehow redirect those GC calls without having to comb through the entire Phobos codebase for stray calls to to!string(). [...]
 The fun part is the output range works for that, and could also work
 for something like this:
 
 struct malloced_string {
     char* ptr;
     size_t length;
     size_t capacity;
     void put(char c) {
         if(length >= capacity)
            ptr = realloc(ptr, capacity*2);
         ptr[length++] = c;
     }
 
     char[] slice() { return ptr[0 .. length]; }
     alias slice this;
     mixin RefCounted!this; // pretend this works
 }
 
 
 {
    malloced_string str;
    auto got = to!string(10, str);
 } // str is out of scope, so it gets free()'d. unsafe though: if you
 stored a copy of got somewhere, it is now a pointer to freed memory.
 I'd kinda like language support of some sort to help mitigate that
 though, like being a borrowed pointer that isn't allowed to be
 stored, but that's another discussion.
Nice!
 And that should work. So then what we might do is provide these
 little output range wrappers for various allocators, and use them on
 many functions.
 
 So we'd write:
 
 import std.allocators;
 import std.range;
 
 // mallocator is provided in std.allocators and offers the goods
 OutputRange!(char, mallocator) str;
 
 auto got = to!string(10, str);
I like this. However, it still doesn't address how to override the default allocator in, say, Phobos functions.
 What's nice here is the output range is useful for more than just
 allocators. You could also to!string(10, my_file) or a delegate,
 blah blah blah. So it isn't too much of a burden, it is something
 you might naturally use anyway.
Now *that* is a very nice idea. I like having a way of bypassing using a string buffer, and just writing the output directly to where it's intended to go. I think to() with an output range parameter definitely should be implemented. It doesn't address all of the issues, but it's a very big first step IMO.
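Something close to this can already be approximated with std.format, which accepts output ranges. The sketch below is not the proposed to!string(10, buffer) overload (that signature is hypothetical and is not part of std.conv); FixedSink is an invented helper, while formattedWrite and sformat are existing std.format functions.

```d
import std.format : formattedWrite, sformat;

// Hypothetical fixed-capacity sink: an output range that writes into a
// caller-supplied buffer instead of allocating.
struct FixedSink
{
    char[] buffer;   // storage owned by the caller
    size_t length;   // how much of it has been written so far

    void put(char c)
    {
        assert(length < buffer.length, "FixedSink overflow");
        buffer[length++] = c;
    }

    void put(scope const(char)[] s)
    {
        foreach (c; s)
            put(c);
    }

    // The slice of the buffer that was actually used.
    inout(char)[] data() inout { return buffer[0 .. length]; }
}

void main()
{
    // Formatting through a user-defined output range.
    char[20] storage;
    auto sink = FixedSink(storage[]);
    formattedWrite(sink, "%d", 10);
    assert(sink.data == "10");

    // std.format.sformat does the same job for a plain char[] buffer and
    // returns the slice it filled.
    char[20] other;
    auto s = sformat(other[], "%d-%s", 10, "x");
    assert(s == "10-x");
}
```

As in the quoted example, the returned slices point into the stack buffers, so they must not outlive them.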
Also, we may have the problem of the wrong allocator
being used to free the object.
Another reason why encoding the allocator into the type is so nice. For the minimal D I've been playing with, the idea I'm running with is all allocated memory has some kind of special type, and then naked pointers are always assumed to be borrowed, so you should never store or free them.
Interesting idea. So basically you can tell which allocator was used to allocate an object just by looking at its type? That's not a bad idea, actually.
 auto foo = HeapArray!char(capacity);
 
 void bar(char[] lol){}
 
 bar(foo); // allowed, foo has an alias this on slice
This is nice. Hooray for alias this. :)
 // but....
 
 struct A {
    char[] lol; // not allowed, because you don't know when lol is
 going to be freed
 }
 
 
 foo frees itself with refcounting.
This is a bit inconvenient. So your member variables will have to know what allocation type is being used. Not the end of the world, of course, but not as pretty as one would like.

On Wed, Jun 26, 2013 at 03:24:57AM +0200, Adam D. Ruppe wrote:
 I was just quickly skimming some criticism of C++ allocators, since
 my thought here is similar to what they do. On one hand, maybe D can
 do it right by tweaking C++'s design rather than discarding it.
 
 On the other hand, with all the C++ I've done, I have never actually
 used STL allocators, which could say something about me or could say
 something about them.
 
 
 One thing I saw said making the differently allocated object a
 different type sucks. ...but must it? The complaint there was "so
 much for just doing a function that takes a std::string". But, the
 way I'd want to do it in D is the function would take a char[]
 instead, and our special allocated type provides that via opSlice
 and/or alias this.
Yeah I think alias this adds a whole new factor into the equation. The advantage of having a distinct type makes it much easier to implement, and allows you to mix differently-allocated objects without having to worry about things like calling the right version of gc_free to cleanup properly. You can even have the same underlying data type be allocated in two different ways, and the cleanup will happen correctly. Basically, when you allocate some object O of class C using allocator A, then it follows that no matter what you do with the gc_alloc/gc_free function pointers afterwards, O must be freed using A.free. So in a sense, O needs to carry around a function pointer to A.free in its dtor (or whoever frees it). So this actually argues for having a distinct type for an instance of C allocated using A, vs. an instance of C allocated using a different allocator B. You need to store that function pointer to A.free and B.free *somewhere*, otherwise things won't work properly. [...]
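One way to picture "O needs to carry around a function pointer to A.free" is a small owning handle that records its deallocator at creation time. This is only a sketch: Owned, make, freeA and freeB are hypothetical names, both "allocators" are thin wrappers over C malloc/free, and the counters exist only so the behaviour can be checked.

```d
import core.stdc.stdlib : malloc, free;

alias FreeFn = void function(void* ptr);

// Two pretend allocators that differ only in which counter they bump.
int freedByA, freedByB;
void freeA(void* p) { ++freedByA; free(p); }
void freeB(void* p) { ++freedByB; free(p); }

// Hypothetical owning handle: remembers which deallocator matches its
// payload, so the right one runs no matter what hooks are active later.
struct Owned(T)
{
    private T* payload;
    private FreeFn release;

    @disable this(this);   // single owner, no accidental double free

    this(T* p, FreeFn f)
    {
        payload = p;
        release = f;
    }

    ~this()
    {
        if (payload !is null)
            release(payload);
    }

    ref T get() { return *payload; }
    alias get this;        // borrow the payload wherever a T is expected
}

// Allocates a T with malloc and pairs it with the given deallocator.
Owned!T make(T)(FreeFn f, T value)
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "out of memory");
    *p = value;
    return Owned!T(p, f);
}

// Returns [objects freed via freeA, objects freed via freeB].
int[2] demo()
{
    freedByA = 0;
    freedByB = 0;
    {
        auto a = make(&freeA, 1);
        auto b = make(&freeB, 2);
        assert(a.get + b.get == 3);
    }   // each handle calls the deallocator it was created with
    return [freedByA, freedByB];
}

void main()
{
    assert(demo() == [1, 1]);
}
```

Storing the pointer per object costs a word per allocation. Encoding the allocator in the type, as discussed above, would move that information to compile time instead.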
 Anyway, bottom line is I don't think that criticism necessarily
 applies to D.
Agreed, in D, distinct types per allocator is, at the very least, not as bad as it is in C++.
 But there's surely many others and I'm more or less a
 n00b re c++'s allocators so idk yet.
Who *isn't* a n00b wrt C++'s allocators, since so few people actually use them? :-P

T -- He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
Jun 26 2013
parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 26 June 2013 at 16:40:20 UTC, H. S. Teoh wrote:
 I think supporting the multi-argument version of to!string() is 
 a good thing, but what to do with library code that calls 
 to!string()? It'd be nice if we could somehow redirect those GC 
 calls without having to comb through the entire Phobos codebase 
 for stray calls to to!string().
Let's consider what kinds of allocations we have. We can break them up into two broad groups: internal and visible.

Internal allocations, in theory, don't matter. These can be on the stack, the gc heap, malloc/free, whatever. The function itself is responsible for their entire lifetime. Changing these either optimizes, in the case of reusing a region, or leaks if you switch it to manual and the function doesn't know it.

Visible allocations are important because the caller is responsible for freeing them. Here, I really think we want the type system's help: either it should return something that we know we're responsible for, or take a buffer/output range from us to receive the data in the first place.

Either way, the function signature should reflect what's going on with visible allocations. It'd possibly return a wrapped type and it'd take an output range/buffer/allocator.

With internals though, the only reason I can see why you'd want to change them outside the function is to give them a region of some sort to work with, especially since you don't know for sure what it is doing - these are all local variables to the function/call stack. And here, I don't think we want to change the allocator wholesale. At most, we'd want to give it hints that what we're doing is short lived. (Or, better yet, have it figure this out on its own, like a generational gc.)

So I think this is more about tweaking the gc than replacing it, at most adding a couple new functions to it:

GC.hint_short_lived // returns a helper struct with a static refcount:

TempGcAllocator {
    static int tempCount = 0;
    static void* localRegion;
    this() { tempCount++; } // pretend this works
    ~this() {
        tempCount--;
        if(tempCount == 0)
            gc.tryToCollect(localRegion);
    }

    T create(T, Args...)(Args args) {
        return GC.new_short_lived T(args);
    }
}

and gc.tryToCollect() does a quick scan for any pointers into the local region. If there's nothing in there, it frees the whole thing. If there is, in the name of memory safety, it just reintegrates that local region into the regular memory and gc's its components normally.

The reason the count is static is that you don't have to pass this thing down the call stack. Any function that wants to adapt to this generational hint system just calls hint_short_lived. If you're a leaf function, that's ok, the static count means you'll inherit the region from the function above you.

You would NOT use this in main(), as that defeats the purpose.
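The static-count part of this can be sketched in compilable D, even though the GC side cannot. In the sketch below ShortLivedHint and its members are hypothetical, and no GC.hint_short_lived or gc.tryToCollect exists in druntime. The place where the region would be scanned and released is represented by a counter only.

```d
// Hypothetical sketch of the static-refcount idea: nested guards share one
// per-thread counter, and only the outermost guard "releases the region".
struct ShortLivedHint
{
    static int depth;            // static data is thread-local in D
    static int regionsReleased;  // stands in for gc.tryToCollect(localRegion)

    private bool active;

    @disable this(this);         // a copy would decrement the count twice

    // Factory used instead of a default constructor, which D structs
    // cannot have.
    static ShortLivedHint enter()
    {
        ++depth;
        ShortLivedHint h;
        h.active = true;
        return h;
    }

    ~this()
    {
        if (!active)
            return;
        if (--depth == 0)
            ++regionsReleased;   // outermost scope ended: region can go
    }
}

// A leaf function that opts in; it inherits the caller's region because
// the count is shared rather than passed down the call stack.
int leaf()
{
    auto hint = ShortLivedHint.enter();
    return ShortLivedHint.depth;
}

int[] demo()
{
    ShortLivedHint.depth = 0;
    ShortLivedHint.regionsReleased = 0;

    int[] log;
    {
        auto outer = ShortLivedHint.enter();
        log ~= leaf();                           // depth seen inside the leaf
        log ~= ShortLivedHint.regionsReleased;   // nothing released yet
    }
    log ~= ShortLivedHint.regionsReleased;       // released once, at the end
    return log;
}

void main()
{
    assert(demo() == [2, 0, 1]);
}
```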
 I think to() with an output range parameter definitely
 should be implemented.
No doubt about it, we should aim for most phobos functions not to allocate at all, if given an output range they can use.
 Interesting idea. So basically you can tell which allocator was 
 used to allocate an object just by looking at its type?
Right, then you'll know if you have to free() it. (Or it can free itself with its destructor.)
 This is a bit inconvenient. So your member variables will have 
 to know what allocation type is being used. Not the end of the
 world, of course, but not as pretty as one would like.
Yeah, you'd need to know if you own them or not too (are you responsible for freeing that string you just got passed? If no, are you sure it won't be freed while you're still using it?), but I just think that's a part of memory management you can't sidestep. There's two easy answers: 1) always make a private copy of anything you store (and perhaps write to) or 2) use a gc and trust it to always be the owner. In any other case, I think you *have* to think about it, and the type telling you can help you make that decision.
 and allows you to mix differently-allocated objects without 
 having to
Important to remember though that you are borrowing these references, not taking ownership. I think the rule that all pointers/slices are borrowed is fairly workable though. With the gc, that's ok, you don't own anything. The garbage collector is responsible for it all, so store away. (Though if it is mutable, you might want to idup it so you don't get overwritten by someone else. But that's a separate question from allocation method.... and already encoded in D's type system).

So never free() a naked pointer unless you know what you're doing, like interfacing with a C library; prefer to only free a ManuallyAllocated!(pointer).

Hell, a C library binding could change the type too, it'd still be binary compatible. RefCounted!T wouldn't be, but ManuallyAllocated!T would just be a wrapper around T*.

I think I'm starting to ramble!
Jun 26 2013
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
By the way, while this topic gets some attention, I want to 
note that there are actually two orthogonal entities that 
arise when speaking about configurable allocation - allocators 
themselves and global allocation policies. I think a good design 
should address both of those.

For example, swapping the global allocator for a custom one has 
limited usefulness - you are limited anyway by the language design, 
which makes only GC or ref-counting viable general options. However, 
some way to prohibit automatic allocations at runtime while still 
allowing manual ones may be useful - and it does not matter what 
allocator is actually used to get that memory. Once such an API is 
designed, tighter classification and control may be added with 
time.
Jun 26 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
26-Jun-2013 21:35, Dicebot пишет:
 By the way, while this topic gets some attention, I want to make a
 notice that there are actually two orthogonal entities that arise when
 speaking about configurable allocation - allocators itself and global
 allocation policies. I think good design should address both of those.
Sadly I believe that global allocators would still have to be compatible with GC (to not break code in hard to track ways) thus basically being a GC. Hence we can easily stop talking about them ;) -- Dmitry Olshansky
Jun 26 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky 
wrote:
 Sadly I believe that global allocators would still have to be 
 compatible with GC (to not break code in hard to track ways) 
 thus basically being a GC. Hence we can easily stop talking 
 about them ;)
Nice way to say "we don't really need that embedded, kernel and gamedev guys". GC as a safe and obvious approach should be the default, but druntime needs to provide means for tight and dangerous control upon explicit request.
Jun 26 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
27-Jun-2013 00:53, Dicebot пишет:
 On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky wrote:
 Sadly I believe that global allocators would still have to be
 compatible with GC (to not break code in hard to track ways) thus
 basically being a GC. Hence we can easily stop talking about them ;)
Nice way to say "we don't really need that embedded, kernel and gamedev guys". GC as a safe an obvious approach should be the default but druntime needs to provide means for tight and dangerous control upon explicit request.
Just don't use certain built-ins. Stub them out in run-time if you like. The only problematic point I see is closures allocated on heap. Frankly I see embedded, kernel and gamedev guys using ref-counting and custom data structures all the time. They all want that level of control and determinism anyway or are so resource constrained that GC is too much code space or run-time overhead anyway. Needless to say that custom run-time for the first 2 categories is required anyway so just hack the druntime. It would be nice to have hooks readily available (and documented?) to do so but hardly beyond that. -- Dmitry Olshansky
Jun 26 2013
next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
 Needless to say that custom run-time for the first 2 categories 
 is required anyway so just hack the druntime. It would be nice 
 to have hooks readily available (and documented?) to do so but 
 hardly beyond that.
It is an API issue. Hacking druntime is, unfortunately, inevitable, but keeping the ability to swap those two with no code changes simplifies the development process and makes it less tempting to forget about this use case when doing std lib / runtime stuff - it has been a second-class citizen for a rather long time.
Jun 26 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
 Just don't use certain built-ins. Stub them out in run-time if 
 you like. The only problematic point I see is closures 
 allocated on heap.
Actually, I was kinda sorta able to solve this in my minimal d.

// this would be used for automatic heap closures, but there's no way to free it...
///*
extern(C)
void* _d_allocmemory(size_t bytes) {
    auto ptr = manual_malloc(bytes);
    debug(allocations) {
        char[16] buffer;
        write("warning: automatic memory allocation ", intToString(cast(size_t) ptr, buffer));
    }
    return ptr;
}

struct HeapClosure(T) if(is(T == delegate)) {
    mixin SimpleRefCounting!(T, q{
        char[16] buffer;
        write("\nfreeing closure ", intToString(cast(size_t) payload.ptr, buffer),"\n");
        manual_free(payload.ptr);
    });
}

HeapClosure!T makeHeapClosure(T)(T t) { // if(__traits(isNested, T)) {
    return HeapClosure!T(t);
}

void closureTest2(HeapClosure!(void delegate()) test) {
    write("\nptr is ", cast(size_t) test.ptr, "\n");
    test();

    auto b = test;
}

void closureTest() {
    string a = "whoa";
    scope(exit) write("\n\nexit\n\n");
    //throw new Exception("test");
    closureTest2( makeHeapClosure({ write(a); }) );
}

It worked in my toy tests. The trick would be though to never store or use a non-scope builtin delegate. Using RTInfo, I believe I can statically verify you don't do this in the whole program, but haven't actually tried yet.

I also left built in append unimplemented, but did custom types with ~= that are pretty convenient. Binary ~ is a loss though, too easy to lose pointers with that.
Jun 26 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
27-Jun-2013 01:05, Adam D. Ruppe пишет:
 On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:
 Just don't use certain built-ins. Stub them out in run-time if you
 like. The only problematic point I see is closures allocated on heap.
Actually, I was kinda sorta able to solve this in my minimal d. // this would be used for automatic heap closures, but there's no way to free it...
[snip a cool hack]

Yeah, I suspected something like this might work. Basically defining your own ref-counted closure type and forgoing the delegate keyword in your codebase (except in the file that defines the heap closure). That still leaves chasing code like auto dg = (...){ ... } though. Maybe having it as a template Closure!(ret-type, arg types...) and an instantiator function called simply closure could be more aesthetically pleasing (this is IMHO).
 It worked in my toy tests. The trick would be though to never store or
 use a non-scope builtin delegate. Using RTInfo, I believe I can
 statically verify you don't do this in the whole program,  but haven't
 actually tried yet.


 I also left built in append unimplemented, but did custom types with ~=
 that are pretty convenient. Binary ~ is a loss though, too easy to lose
 pointers with that.
-- Dmitry Olshansky
Jun 26 2013
prev sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 I know Andrey mentioned he was going to work on Allocators a 
 year ago. In DConf 2013 he described the problems he needs to 
 solve with Allocators. But I wonder if I am missing the 
 discussion around that - I tried searching this forum, found a 
 few threads that was not actually a brain storm for Allocators 
 design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would 
 be to let user specify which Allocator operator new should get 
 the memory from (introducing a new keyword allocator). This 
 gives a total control, but assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use 
 ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user 
 responsibility to make sure custom Allocator can handle that
 }

 By default allocator is the druntime using GC, free(a) does 
 nothing for it.


 if some library defines its allocator (e.g. specialized 
 container), there should be ability to:
 1. override allocator
 2. get access to the allocator used

 I understand that I spent 5 mins thinking about the way 
 Allocators may look.
 My point is - if somebody is working on it, can you please 
 share your ideas?
Old but perhaps relevant? http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471 (It's an academic article about memory allocation from 2002)
Jun 27 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:
 Old but perhaps relevant?

 http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

 (It's an academic article about memory allocation from 2002)
Interesting paper. Still concurrency isn't really addressed, which is a problem to be future proof.
Jun 28 2013
parent "Brian Rogoff" <brogoff gmail.com> writes:
On Friday, 28 June 2013 at 10:57:45 UTC, deadalnix wrote:
 On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:
 Old but perhaps relevant?

 http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

 (It's an academic article about memory allocation from 2002)
Interesting paper. Still concurrency isn't really addressed, which is a problem to be future proof.
http://en.wikipedia.org/wiki/Hoard_memory_allocator
Jun 28 2013