
digitalmars.D - std.allocator needs your help

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


I am making good progress on the design of std.allocator, and I am 
optimistic about the way it turns out. D's introspection capabilities 
really shine through, and in places the design does feel really 
archetypal - e.g. "this is the essence of a freelist allocator". It's a 
very good feeling. The overall inspiration comes from Berger's 
HeapLayers, but D's introspection takes that pattern to a whole new level.

Several details are still in flux, but at the top level it seems most 
natural to divide things into two tiers:

1. Typed allocator, i.e. every request for allocation comes with the 
exact type requested;

2. Untyped allocator - traffics exclusively in ubyte[].

Initially I thought it was impossible to separate the two, but it turns 
out there are profound generic realities about untyped allocators that 
make them the perfect foundation for typed allocators, without 
considerable abstraction costs.

I need to prove that before I'm sure, because so far I only have an 
archetype of untyped allocators, with typed allocators to follow. 
There's always the possibility that some hitch shows itself only when 
actually trying to implement various typed allocators.

Anyhow, here's the general interface of an untyped allocator. In fact 
I'll showcase the simplest allocator - the NullAllocator, which has zero 
memory to offer. (In spite of its do-absolutely-nothing stance, the 
NullAllocator is surprisingly useful as a "terminator" for certain 
layered allocators.) Anyhow, here it is (explanations follow):

struct NullAllocator
{
     enum alignment = real.alignof;
     enum size_t available = 0;
     ubyte[] allocate(size_t s) shared { return null; }
     bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
     { assert(b is null); return false; }
     bool reallocate(ref ubyte[] b, size_t) shared
     { assert(b is null); return false; }
     void deallocate(ubyte[] b) shared { assert(b is null); }
     void collect() shared { }
     void deallocateAll() shared { }
     static shared NullAllocator it;
}
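To make the contract concrete, here is a quick sanity check of the primitives (a sketch only; it assumes the NullAllocator struct just shown is in scope):

```d
// Sketch: exercising NullAllocator's contract. Every request degenerates
// to "no memory", and the only block it accepts back is null.
unittest
{
    auto b = NullAllocator.it.allocate(64);
    assert(b is null);                          // never has memory to offer
    assert(!NullAllocator.it.expand(b, 16, 32));
    assert(!NullAllocator.it.reallocate(b, 128));
    NullAllocator.it.deallocate(b);             // fine: b is null
}
```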

Primitives:

* alignment is a compile-time constant yielding the alignment of 
allocated data. Many allocators offer the maximum alignment of the 
platform (for which I used real.alignof above). This constant is 
required for all allocators.

* available is a property that returns how many total (not necessarily 
contiguous) bytes are available for allocation. The NullAllocator knows 
statically it has 0 bytes available so it implements it as an enum. 
Generally allocators will implement it as a @property. This property is 
optional.

* allocate(s) returns a ubyte[] with length at least s, or null. (It 
does not throw on out-of-memory because that would hurt composability; 
it turns out many elemental allocators do naturally run out of memory.) 
This method is required for all allocators. In most allocators this 
method should be @safe.

* expand(b, minDelta, maxDelta) grows b's length by at least minDelta 
(and on a best-effort basis by at least maxDelta) and returns true, or 
does nothing and returns false. In most allocators this should be @safe. 
(One key insight is that expand() can be made @safe whereas shrink() or 
realloc() are most often not; such mini-epiphanies are very exciting 
because they put the design on a beam guide with few principles and many 
consequences.) The precondition is that b is null or has been previously 
returned by allocate(). This method is optional.
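To illustrate how an optional primitive like expand() composes, here is a hypothetical helper (the name `grow` and its shape are mine, not part of the proposal): it grows a block in place when the allocator supports expansion, and falls back to allocate-copy-deallocate otherwise.

```d
// Sketch: grow `b` by `delta` bytes using the allocator's optional
// expand(), else fall back to allocate + copy + (optional) deallocate.
bool grow(Allocator)(ref Allocator a, ref ubyte[] b, size_t delta)
{
    static if (is(typeof(a.expand(b, delta, delta))))
        if (a.expand(b, delta, delta)) return true;  // grew in place
    auto nb = a.allocate(b.length + delta);
    if (nb is null) return false;                    // out of memory
    nb[0 .. b.length] = b[];                         // copy old contents
    static if (is(typeof(a.deallocate(b))))
        a.deallocate(b);                             // optional primitive
    b = nb;
    return true;
}
```

The `static if (is(typeof(...)))` tests are the introspection idiom that lets one allocator interface have optional methods.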

* reallocate(b, s) reallocates (like C's realloc) b to at least s bytes 
and returns true, or does nothing and returns false. It is possible that 
the memory is moved. This method is optional, and usually unsafe. (This 
was mostly introduced to accommodate C's realloc() within the allocator 
design.)

* deallocate(b) deallocates the memory allocated for b. b must have been 
previously allocated by the same allocator. This method is usually 
unsafe (but there are notable implementations that may offer safety, 
such as unbounded freelists.) This method is optional.

* deallocateAll() deallocates in one shot all memory previously 
allocated by this allocator. This method is optional, and when present 
is almost always unsafe (I see no meaningful @safe implementation.) 
Region allocators are notable examples of allocators that define this 
method.

* collect() frees unused memory that had been allocated with this 
allocator. This optional method is implemented by tracing collectors and 
is usually @safe.

* "it" is an interesting artifact. Allocators may or may not hold 
per-instance state. Those that don't are required to define a global 
shared or thread-local singleton called "it" that will be used for all 
calls related to allocation. Of course, it is preferred for maximum 
flexibility that "it" is shared so as to clarify that the allocator is 
safe to use among different threads. In the case of the NullAllocator, 
this is obviously true, but non-trivial examples are the malloc-based 
allocator and the global garbage collector, both of which implement 
thread-safe allocation (at great effort no less).

There are quite a few more things to define more precisely, but this 
part of the design has become quite stable. To validate the design, I've 
defined some simple allocators (Mallocator, GCAllocator, Freelist, 
StackRegion, Region etc) but not the more involved ones, such as 
coalescing heaps, buddy system etc.

If you have an interest in memory allocators and would like to design 
(or have around) an allocator that fits the API above, please post it. 
For layering purposes, don't call malloc() or sbrk() for getting large 
blocks of memory from the system; instead, use another Allocator as your 
parent allocator in a pattern as follows:

struct MyAllocator(Allocator)
{
     static if (is(typeof(Allocator.it))) alias Allocator.it _parent;
     else Allocator _parent;
     ... use _parent for getting memory ...
}

This allows MyAllocator to use stateful and stateless allocators 
transparently by just referring to _parent. Of course, your allocator 
may take other generic parameters in addition to Allocator, or take none 
at all if it's a "root" allocator (sbrk() or Mallocator would be examples).
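As a concrete (if simplistic) illustration of that layering pattern, here is a toy region allocator built over a parent allocator. It is only a sketch of the idea under the API above (`ToyRegion` is my name, not eventual Phobos code), and alignment handling is omitted for brevity:

```d
// Toy region: carves allocations out of one chunk obtained from the
// parent allocator. Returns null on exhaustion, per the proposed API.
struct ToyRegion(Allocator, size_t chunkSize = 4096)
{
    enum alignment = Allocator.alignment;
    static if (is(typeof(Allocator.it))) alias _parent = Allocator.it;
    else Allocator _parent;
    private ubyte[] _chunk;
    private size_t _used;

    ubyte[] allocate(size_t s)
    {
        if (_chunk is null) _chunk = _parent.allocate(chunkSize);
        if (_chunk is null || _used + s > _chunk.length)
            return null;                     // out of memory: no throw
        auto result = _chunk[_used .. _used + s];
        _used += s;
        return result;
    }

    void deallocateAll() { _used = 0; }      // the region forgets everything
}
```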

Destroy!


Andrei
Sep 22 2013
"Nicholas Londey" <londey gmail.com> writes:
 * allocate(s) returns a ubyte[] with length at least s, or 
 null. (It does not throw on out-of-memory because that would 
 hurt composability; it turns out many elemental allocators do 
 naturally run out of memory.) This method is required for all 
 allocators. In most allocators this method should be @safe.
Would an 'expected' be a possible alternative to returning null?

Am a big fan of this talk of yours:
http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Andrei-Alexandrescu-Systematic-Error-Handling-in-C
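For reference, a bare-bones Expected-style wrapper along the lines of that talk might look like this; it is entirely hypothetical (both `Expected` and `allocateExpected` are illustrative names, not part of the proposal):

```d
// Minimal Expected-ish result type: carries either a value or an error.
struct Expected(T)
{
    private T _value;
    private string _error;
    static Expected ok(T v) { Expected e; e._value = v; return e; }
    static Expected fail(string msg) { Expected e; e._error = msg; return e; }
    bool hasValue() const { return _error is null; }
    ref T value() { assert(hasValue); return _value; }
    string error() const { return _error; }
}

// Wrap any allocator's allocate() in an Expected (hypothetical helper).
Expected!(ubyte[]) allocateExpected(Allocator)(ref Allocator a, size_t s)
{
    auto b = a.allocate(s);
    return b is null ? Expected!(ubyte[]).fail("out of memory")
                     : Expected!(ubyte[]).ok(b);
}
```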
Sep 22 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 6:34 PM, Nicholas Londey wrote:
 * allocate(s) returns a ubyte[] with length at least s, or null. (It
 does not throw on out-of-memory because that would hurt composability;
 it turns out many elemental allocators do naturally run out of
 memory.) This method is required for all allocators. In most
 allocators this method should be @safe.
Would an 'expected' be a possible alternative to returning null? Am a big fan of this talk of yours:
http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Andrei-Alexandrescu-Systematic-Error-Handling-in-C
Thanks! I think expected would be appropriate for the high-level allocator. Andrei
Sep 22 2013
Manu <turkeyman gmail.com> writes:
On 23 September 2013 09:49, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 [...]

 Destroy!
Well it looks like a good start, but the thing I'm left wondering after 
reading this is still... how is it actually used?

I think the greatest challenge is finding a simple, clean and correct way 
to actually specify which allocator should be used for making allocations 
throughout your code, and perhaps more troublesome, within generic code, 
and then layers of generic code.

Are you intending to be able to associate a particular allocator with a 
class declaration? What about a struct declaration? What about a region 
of code (ie, a call tree/branch)? What if the given allocator should be 
used for most of the tree, except for a couple of things beneath that 
always want to use their explicit allocator?

What if I want to associate an allocator instance, not just an allocator 
type (ie, I don't want multiple instances of the same type(/s) of 
allocators in my code)? How are they shared?

It wasn't clear to me from your demonstration, but 'collect()' implies 
that the GC becomes allocator-aware; how does that work?

deallocateAll() and collect() may each free a whole lot of memory, but it 
seems to me that they may not actually be aware of the individual 
allocations they are freeing; how do the appropriate destructors for the 
sub-allocations get called?

I have a suspicion you're going to answer most of these questions with 
the concept of allocator layering, but I just don't completely see it. 
It's quite an additional burden of resources and management to manage the 
individual allocations with a range allocator above what is supposed to 
be a performance-critical allocator to begin with.

C++'s design seems reasonable in some ways, but history has demonstrated 
that it's a total failure that is almost never actually used (I've 
certainly never seen anyone use it).
Some allocators that I use regularly to think about:

A ring-buffer:
  * Like a region allocator I guess, but the circular nature adds some 
minor details, and requires the user to mark the heap from time to time, 
freeing all allocations between the old mark and the new mark.

A pool:
  * Same as a free-list, but with a maximum size, ie, a finite pool of 
objects pre-allocated and pre-populating the freelist.

A pool-group:
  * Allocate from a group of pools allocating differently sized objects. 
(This is a good test for allocator layering, supporting a group of pools 
above, and fallback to the malloc heap for large objects.)
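The "pool" above can be sketched as a freelist threaded through a fixed, pre-allocated slab. This is merely an illustration of the shape such an allocator could take under the proposed API (`ToyPool` and its layout are mine, not anyone's actual code):

```d
// Toy fixed-size pool: a freelist threaded through a static slab.
// Returns null when exhausted or when a request doesn't fit a block.
struct ToyPool(size_t blockSize, size_t blocks)
{
    static assert(blockSize >= (void*).sizeof);
    enum alignment = (void*).alignof;
    private struct Node { Node* next; }
    private ubyte[blockSize * blocks] _slab;
    private Node* _head;
    private bool _initialized;

    ubyte[] allocate(size_t s)
    {
        if (s > blockSize) return null;      // doesn't fit a block
        if (!_initialized) initialize();
        if (_head is null) return null;      // pool exhausted
        auto n = _head;
        _head = n.next;
        return (cast(ubyte*) n)[0 .. s];
    }

    void deallocate(ubyte[] b)
    {
        auto n = cast(Node*) b.ptr;          // block back onto the freelist
        n.next = _head;
        _head = n;
    }

    private void initialize()
    {
        foreach_reverse (i; 0 .. blocks)     // thread all blocks together
        {
            auto n = cast(Node*)(_slab.ptr + i * blockSize);
            n.next = _head;
            _head = n;
        }
        _initialized = true;
    }
}
```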
Sep 22 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 6:35 PM, Manu wrote:
 Well it looks like a good start, but the thing I'm left wondering after
 reading this is still... how is it actually used?
 I think the greatest challenge is finding a simple, clean
Oxford comma please :o)
 and correct
 way to actually specify which allocator should be used for making
 allocations throughout your code, and perhaps more troublesome; within
 generic code, and then layers of generic code.
My design makes it very easy to experiment by allowing one to define complex allocators out of a few simple building blocks. It is not a general-purpose allocator, but it allows one to define any number of such.
 Are you intending to be able to associate a particular allocator with a
 class declaration?
No, or at least not at this level.
 What about a struct declaration?
Same answer.
 What about a region of code (ie, a call tree/branch).
 What if the given allocator should be used for most of the tree, except
 for a couple of things beneath that always want to use their explicit
 allocator?
The proposed design makes it easy to create allocator objects. How they are used and combined is left to the application.
 What if I want to associate an allocator instance, not just an allocator
 type (ie, I don't want multiple instances of the same type(/s) of
 allocators in my code, how are they shared?
An allocator instance is a variable like any other. So you use the classic techniques (shared globals, thread-local globals, passing around as parameter) for using the same allocator object from multiple places.
 It wasn't clear to me from your demonstration, but 'collect()' implies
 that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One could define a tracing allocator independent of the global GC.
 deallocateAll() and collect() may each free a whole lot of memory, but
 it seems to me that they may not actually be aware of the individual
 allocations they are freeing; how do the appropriate destructors for the
 sub-allocations get called?
No destructors are called at this level. Again, all these allocators understand is ubyte[].
 I have a suspicion you're going to answer most of these questions with
 the concept of allocator layering, but I just don't completely see it.
No, it's just that at this level some of these questions don't even have an appropriate answer - like we discuss atoms and molecules and you ask about building floors and beams and pillars.
 It's quite an additional burden of resources and management to manage
 the individual allocations with a range allocator above what is supposed
 to be a performance critical allocator to begin with.
I don't understand this.
 C++'s design seems reasonable in some ways, but history has demonstrated
 that it's a total failure, which is almost never actually used (I've
 certainly never seen anyone use it).
Agreed. I've seen some uses of it that quite fall within the notion of the proverbial exception that proves the rule.
 Some allocators that I use regularly to think about:

 A ring-buffer:
    * Like a region allocator I guess, but the circular nature adds some
 minor details, and requires the user to mark the heap from time to time,
 freeing all allocations between the old mark and the new mark.

 A pool:
    * Same as a free-list, but with a maximum size, ie, finite pool of
 objects pre-allocated and pre-populating the freelist.
I implemented the finite size for a freelist.
 A pool-group:
    * Allocate from a group of pools allocating differently sized
 objects. (this is a good test for allocator layering, supporting a group
 of pools above, and fallback to the malloc heap for large objects)
I implemented that as well; it's one of the best designs I've defined in my life.

Andrei
Sep 22 2013
Brad Roberts <braddr puremagic.com> writes:
On 9/22/13 7:28 PM, Andrei Alexandrescu wrote:
 On 9/22/13 6:35 PM, Manu wrote:

 Are you intending to be able to associate a particular allocator with a
 class declaration?
No, or at least not at this level.
 What about a struct declaration?
Same answer.
 What about a region of code (ie, a call tree/branch).
 What if the given allocator should be used for most of the tree, except
 for a couple of things beneath that always want to use their explicit
 allocator?
The proposed design makes it easy to create allocator objects. How they are used and combined is left to the application.
I think the question he's asking, which a lot of people are anxiously 
anticipating, is... how does this intersect with the parts of the 
language and core libraries that have been blocked (for a loose 
definition of blocked) waiting for the allocator design? Primarily, but 
far from exclusively, the container library.

Yes, the allocator design is clearly a lower level component, but it's 
also far easier than the integration with the language and libraries.
Sep 22 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 7:45 PM, Brad Roberts wrote:
 I think the question he's asking, which a lot of people are anxiously
 anticipating is... how does this intersect with the parts of the language
 and core libraries that have been blocked (for a loose definition of
 blocked) waiting for the allocator design.  Primarily, but far from
 exclusively, the container library.

 Yes, the allocator design is clearly a lower level component, but it's
 also far easier than the integration with the language and libraries.
Containers and other objects that want to allow customization of their 
allocation would be parameterized during compilation with an allocator 
type. Functions that need to allocate memory may similarly accept a 
parameter of allocator type.

One possibility I'm keeping in mind is that of defining a dynamic 
interface (i.e. in the OOP sense) for a global allocator. Then people can 
globally change what allocator must be used for operator new invocations.

The work at the level described so far is quite orthogonal to these 
high-level choices. Right now it's all about a statically-typed allocator 
that is good at allocating and deallocating octets.

Andrei
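The compile-time parameterization described above might be shaped roughly like this (an illustrative sketch; `SimpleBuffer` and its doubling growth policy are mine, not part of the proposal):

```d
// Sketch: a container that accepts its allocator as a compile-time
// parameter and resolves stateless allocators through `it`.
struct SimpleBuffer(Allocator)
{
    static if (is(typeof(Allocator.it))) alias _alloc = Allocator.it;
    else Allocator _alloc;
    private ubyte[] _data;
    private size_t _length;

    void put(ubyte x)
    {
        if (_length == _data.length)
        {
            // double capacity (16 bytes to start) via the allocator
            auto nd = _alloc.allocate(_data.length ? _data.length * 2 : 16);
            if (nd is null) assert(0, "out of memory");
            nd[0 .. _length] = _data[0 .. _length];
            _data = nd;
        }
        _data[_length++] = x;
    }

    const(ubyte)[] data() const { return _data[0 .. _length]; }
}
```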
Sep 22 2013
Manu <turkeyman gmail.com> writes:
On 23 September 2013 13:01, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/22/13 7:45 PM, Brad Roberts wrote:

 I think the question he's asking, which a lot of people are anxiously
  anticipating is... how does this intersect with the parts of the language
 and core libraries that have been blocked (for a loose definition of
 blocked) waiting for the allocator design.  Primarily, but far from
 exclusively, the container library.

 Yes, the allocator design is clearly a lower level component, but it's
 also far easier than the integration with the language and libraries.
 Containers and other objects that want to allow customization of their
 allocation would be parameterized during compilation with an allocator
 type. Functions that need to allocate memory may similarly accept a
 parameter of allocator type.

You've just described C++ verbatim. And you've previously agreed that 
C++'s approach totally failed. Can you explain how this is different from 
C++, and how it solves the issues that defeated that design?

 One possibility I'm keeping in mind is that of defining a dynamic
 interface (i.e. in the OOP sense) for a global allocator. Then people
 can globally change what allocator must be used for operator new
 invocations.

Right, this is getting warmer. It's a stark contrast to what you suggest 
above though, and when I try and think this through, it gets very 
complex, very fast. I can't imagine how such a system could ever be 
declared safe.

However, this is more or less what I want. I don't know how to reconcile 
the 2 requirements. How much have you thought on this? This is where I 
think some serious creativity will need to be applied...

Following this train of thought, I can imagine a really nice end goal 
would be that the existing GC is effectively plugged in as a library, and 
people can easily substitute it for their own GC if they want/need to.

 The work at the level described so far is quite orthogonal to these
 high-level choices. Right now it's all about a statically-typed
 allocator that is good at allocating and deallocating octets.

 Andrei
Sep 22 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 9/22/2013 9:19 PM, Manu wrote:
 Following this train of thought, I can imagine a really nice end goal would be
 that the existing GC is effectively plugged in as a library, and people can
 easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses a different GC for VisualD. dmd knows naught about the GC.
Sep 22 2013
Manu <turkeyman gmail.com> writes:
On 23 September 2013 15:37, Walter Bright <newshound2 digitalmars.com>wrote:

 On 9/22/2013 9:19 PM, Manu wrote:

 Following this train of thought, I can imagine a really nice end goal
 would be
 that the existing GC is effectively plugged in as a library, and people
 can
 easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses a different GC for VisualD. dmd knows naught about the GC.
Sure, true. But little things like separate it into its own lib, so you 
can easily link a different one, or not link one at all.

Also, what about language support? How much language support is there 
currently?

I don't think I could currently write the ref-counting GC that I'd like 
without extensive language support. inc/dec ref calls would need to be 
inserted all over the place by the compiler, and the optimiser would need 
to know how to remove redundant inc/dec sequences.
Sep 23 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 9/23/2013 3:17 AM, Manu wrote:
 Sure, true. But little things like separate it into its own lib, so you can
 easily link a different one, or not link one at all.
To link in a different one, put the different one on the link command before libphobos2.a. The different one will override the one in the library.
 Also, what about language support?
How it's done is deliberately not part of the language.
 How much language support is there currently?
Zero.
 I don't think I could currently write the ref-counting GC that I'd like without
 extensive language support.
  inc/dec ref calls would need to be inserted all over the place by the compiler,
 and optimiser would need to know how to remove redundant inc/dec sequences.
A ref counting system is something entirely different than just plugging in another GC.
Sep 23 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 11:24 AM, Walter Bright wrote:
 On 9/23/2013 3:17 AM, Manu wrote:
 Sure, true. But little things like separate it into its own lib, so
 you can
 easily link a different one, or not link one at all.
To link in a different one, put the different one on the link command before libphobos2.a. The different one will override the one in the library.
 Also, what about language support?
How it's done is deliberately not part of the language.
 How much language support is there currently?
Zero.
I think there's at least indirect support in e.g. generating typeinfo objects appropriately. Many of the components of those typeinfos are expressly defined for helping a tracing collector. Andrei
Sep 23 2013
Walter Bright <newshound2 digitalmars.com> writes:
On 9/23/2013 12:09 PM, Andrei Alexandrescu wrote:
 I think there's at least indirect support in e.g. generating typeinfo objects
 appropriately. Many of the components of those typeinfos are expressly defined
 for helping a tracing collector.
Those are not language features. Also, the latest support for that is done with a library-defined template.
Sep 23 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 10:37 PM, Walter Bright wrote:
 On 9/22/2013 9:19 PM, Manu wrote:
 Following this train of thought, I can imagine a really nice end goal
 would be
 that the existing GC is effectively plugged in as a library, and
 people can
 easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses a different GC for VisualD.
Correct.
 dmd knows naught about the GC.
Well except for plugging a library call for calls to new. But that's expected. (I'm already pretending delete doesn't exist :o).) Andrei
Sep 23 2013
Manu <turkeyman gmail.com> writes:
On 24 September 2013 00:05, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/22/13 10:37 PM, Walter Bright wrote:

 On 9/22/2013 9:19 PM, Manu wrote:

 Following this train of thought, I can imagine a really nice end goal
 would be
 that the existing GC is effectively plugged in as a library, and
 people can
 easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses a different GC for VisualD.
Correct. dmd knows naught about the GC.

 Well except for plugging a library call for calls to new. But that's
 expected.

 (I'm already pretending delete doesn't exist :o).)
delete is important if your class is being allocated by a pool or 
something...

But you said before, people won't use 'new' if they are using an 
allocator. I'm really not sure that's a good idea. Most people (library 
authors!) don't actually care about their memory allocation, they will 
continue to use 'new', just because it's a keyword.

It also screws with generic code; X should be allocated with 'new', but Y 
should be allocated with yAllocator.alloc()? What if you decide that Z, 
which allocates with 'new', becomes a problem and you want to switch it 
into a pool? You now need to track down every instance of 'new Z', and 
change it.
Sep 23 2013
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 7:50 AM, Manu wrote:
 delete is important if your class is being allocated by a pool or
 something...
It's important as an API function, but not as a language primitive. "new" should also have been a function.
 But you said before, people won't use 'new' if they are using an
 allocator. I'm really not sure that's a good idea.
I don't see another way if people are to define and use multiple allocators.
 Most people (library authors!) don't actually care about their memory
 allocation, they will continue to use 'new', just because it's a keyword.
"new considered harmful" etc.
 It also screws with generic code; X should be allocated with 'new', but
 Y should be allocated with yAllocator.alloc()?
 What if you decide that Z, which allocates with 'new', becomes a problem
 and you want to switch it into a pool? You now need to track down every
 instance of 'new Z', and change it.
We can only improve that situation by allowing people to replace the 
global allocator with their own allocators.

Again, there's a disconnect here - I'm discussing "how to make it easy 
for people to define allocators" and you discuss "how to make it possible 
for people to plug allocators, once defined, as the global allocator". 
These are two distinct endeavors. At the level I'm at, I'm concerned with 
making good allocators easy to implement. You may say you don't care, and 
that's good feedback, but it's what I have for the time being.

Andrei
Sep 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 24 September 2013 01:02, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/13 7:50 AM, Manu wrote:

 delete is important if your class is being allocated by a pool or
 something...
It's important as an API function, but not as a language primitive. "new" should also have been a function.
I like operator new. It goes blue, and the statement isn't riddled with exclamation marks. Imagine if new was a template (it would have to be a template). Allocating a template would require nested template syntax every time, which is pretty ugly:

    X x = new!(X!arg(args...)); // ewoo, paren spam...

 But you said before, people won't use 'new' if they are using an
 allocator. I'm really not sure that's a good idea.
I don't see another way if people are to define and use multiple allocators.
Well if an allocator is somehow associated with a class/struct, then 'new MyClass' would invoke that allocator. If you:

    pushThreadAllocator(myAllocator);
    X x = new X;
    popThreadAllocator();

Then new will use your thread-local allocator. Continue for global, fiber, etc. Maybe there's opportunity for 'scope' to offer some nice convenience here?

 Most people (library authors!) don't actually care about their memory
 allocation, they will continue to use 'new', just because it's a keyword.
"new considered harmful" etc.

 It also screws with generic code; X should be allocated with 'new', but
 Y should be allocated with yAllocator.alloc()?
 What if you decide that Z, which allocates with 'new', becomes a problem
 and you want to switch it into a pool? You now need to track down every
 instance of 'new Z', and change it.
We can only improve that situation by allowing people to replace the global allocator with their own allocators. Again, there's a disconnect here - I'm discussing "how to make it easy for people to define allocators" and you discuss "how to make it possible for people to plug allocators, once defined, as the global allocator". These are two distinct endeavors. At the level I'm at, I'm concerned with making good allocators easy to implement. You may say you don't care, and that's good feedback, but it's what I have for the time being.
Well, I'm discussing "how do people affect/override 'new' in various circumstances?" But sure, I said before, if we're limiting this discussion to the API of an allocator then it looks fine, I see no obvious issues. But I think the question requires consideration of the intended goal to make informed decisions about the design, even at this level.
Sep 23 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-23 17:47, Manu wrote:

 I like operator new. It goes blue, and the statement isn't riddled with
 exclamation marks.
For Ruby, many editors and IDEs highlight standard built-in functions as keywords: "new", "require", "attr_accessor" (and friends), to mention a few.
 Imagine if new was a template (it would have to be a template).
 Allocating a template would require nested template syntax every time,
 which is pretty ugly.

    X x = new!(X!arg(args...)); // ewoo, paren spam...
Why not:

    X x = new!X(args);

-- 
/Jacob Carlborg
Sep 23 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
23-Sep-2013 22:52, Jacob Carlborg writes:
 On 2013-09-23 17:47, Manu wrote:
 Imagine if new was a template (it would have to be a template).
 Allocating a template would require nested template syntax every time,
 which is pretty ugly.

    X x = new!(X!arg(args...)); // ewoo, paren spam...
Why not: X x = new!X(args);
+1

Or call it make for the time being so as to not collide with the keyword.

-- 
Dmitry Olshansky
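Dmitry's `make` suggestion is implementable today as a library template. A minimal sketch, assuming GC-backed allocation and an invented signature (this is illustrative, not an agreed design):

```d
import std.conv : emplace;
import core.memory : GC;

// Sketch of a library 'make' standing in for the 'new' keyword.
// Allocates from the GC heap; a real version would take an allocator.
T make(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void[] chunk = GC.malloc(size)[0 .. size];
    return emplace!T(chunk, args); // construct T in place in the chunk
}

class Vec3(T)
{
    T x, y, z;
    this(T x, T y, T z) { this.x = x; this.y = y; this.z = z; }
}

void main()
{
    // One set of parens for the template argument, one for the call:
    auto v = make!(Vec3!float)(0, 0, 0);
    assert(v.x == 0);
}
```

Using a non-keyword name like this sidesteps the `new` collision entirely.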
Sep 23 2013
prev sibling parent reply kraybit <stdin kraybit.com> writes:
On 9/23/13 20:52 , Jacob Carlborg wrote:
 On 2013-09-23 17:47, Manu wrote:
    X x = new!(X!arg(args...)); // ewoo, paren spam...
Why not: X x = new!X(args);
I believe Manu's example was about X having template parameters as well. As in

    auto v = new Vec3!float(0,0,0);

Either way, what would it be then? a) or b)?

    auto v = new!(Vec3!float)(0,0,0); // a)
    auto v = new!(Vec3!float(0,0,0)); // b)

And which would be the same as:

    auto v = new!Vec3!float(0,0,0);

I'd assume it's the same as a), no?
Sep 25 2013
next sibling parent kraybit <stdin kraybit.com> writes:
I mean, if one could get away with
	
	auto v = new! Vec3!float(0,0,0);
	
it'd be pretty nice?
Or perhaps patently confusing..
*more coffee*
Sep 25 2013
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/25/13 9:13 AM, kraybit wrote:
 On 9/23/13 20:52 , Jacob Carlborg wrote:
 On 2013-09-23 17:47, Manu wrote:
    X x = new!(X!arg(args...)); // ewoo, paren spam...
Why not: X x = new!X(args);
 I believe Manu's example was about X having template parameters as well.
 As in
     auto v = new Vec3!float(0,0,0);
 Either way, what would it be then? a) or b)?
     auto v = new!(Vec3!float)(0,0,0); // a)
     auto v = new!(Vec3!float(0,0,0)); // b)
 And which would be the same as:
     auto v = new!Vec3!float(0,0,0);
 I'd assume it's the same as a), no?
I think there's no need to change the syntax of new, as long as it's lowered into a template. Right now it's special cased inside the compiler. Andrei
Sep 25 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-09-25 18:13, kraybit wrote:

 I believe Manu's example was about X having template parameters as well.
 As in
      auto v = new Vec3!float(0,0,0);


 Either way, what would it be then? a) or b)?
      auto v = new!(Vec3!float)(0,0,0); // a)
      auto v = new!(Vec3!float(0,0,0)); // b)
 ?


 And which would be the same as:
      auto v = new!Vec3!float(0,0,0);
 ?

 I'd assume it's the same as a), no?
Right. -- /Jacob Carlborg
Sep 26 2013
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Tue, 24 Sep 2013 00:50:02 +1000
schrieb Manu <turkeyman gmail.com>:

 [...] It also screws with generic code; X should be allocated with
 'new', but Y should be allocated with yAllocator.alloc()?
 What if you decide that Z, which allocates with 'new', becomes a
 problem and you want to switch it into a pool? You now need to track
 down every instance of 'new Z', and change it.
That's not a problem with the proposed GC allocator. We should add proper overloads to emplace and a "create" function, and then all code should use create!(Allocator, MyClass)(args) or create!MyClass(allocatorInstance, args). New should then be considered deprecated and replaced by create!(GCAllocator, Class)().
Sep 23 2013
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/23/13, Johannes Pfau <nospam example.com> wrote:
 New should then be considered as deprecated and replaced by
 create!(GCAllocator, Class)().
What? That's never gonna happen. For one thing, it looks ugly as hell. And for another, it's going to break everything written in D.
Sep 23 2013
parent Johannes Pfau <nospam example.com> writes:
Am Mon, 23 Sep 2013 18:36:57 +0200
schrieb Andrej Mitrovic <andrej.mitrovich gmail.com>:

 On 9/23/13, Johannes Pfau <nospam example.com> wrote:
 New should then be considered as deprecated and replaced by
 create!(GCAllocator, Class)().
What? That's never gonna happen. For one thing, it looks ugly as hell. And for another, it's going to break everything written in D.
That's why I say "considered" and not really deprecated. But if you want to / have to write allocator-aware code which can use the GC it's a nice solution:

    auto list(Payload, Allocator = GCAllocator)()
    {
        create!(Allocator) ...
    }

Of course the API should be improved. For example, create could default to the GC allocator. Then it's "auto a = new Class(...)" vs "auto a = create!Class(...)". But IIRC when emplace was first implemented and class allocators were removed, it was quite clear that new could easily be replaced by a template function and has no real place as a builtin anymore. It's just too late to remove it.
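A `create` along these lines can be sketched against the untyped allocator shape discussed in this thread (allocate returning ubyte[]). The GCAllocator shim and the exact signature below are assumptions for illustration, not the actual proposal:

```d
import std.conv : emplace;
import core.memory : GC;

// Hypothetical GC-backed allocator with the thread's untyped API shape.
struct GCAllocator
{
    ubyte[] allocate(size_t n)
    {
        auto p = cast(ubyte*) GC.malloc(n);
        return p is null ? null : p[0 .. n];
    }
}

// Sketch of create!MyClass(allocatorInstance, args).
T create(T, Allocator, Args...)(ref Allocator a, Args args)
    if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    ubyte[] mem = a.allocate(size);
    return emplace!T(cast(void[]) mem, args); // construct in allocator memory
}

class Node
{
    int value;
    this(int v) { value = v; }
}

void main()
{
    GCAllocator gc;
    auto n = create!Node(gc, 42);
    assert(n.value == 42);
}
```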
Sep 23 2013
prev sibling next sibling parent "QAston" <qaston gmail.com> writes:
On Monday, 23 September 2013 at 04:19:34 UTC, Manu wrote:
 You've just described C++ verbatim.
 And you've previously agreed that C++'s approach totally 
 failed. Can you
 explain how this is different from C++, and how it solves the 
 issues that
 defeated that design?

 One possibility I'm keeping in mind is that of defining a 
 dynamic interface
Once you have a static allocator you can go dynamic easily:

    interface DynamicAllocator // isAllocator==true
    {
        // whatever the interface is
    }

    struct StaticAllocator // isAllocator==true
    {
    }

    struct Container(ALLOCATOR = DynamicAllocator) if (isAllocator!ALLOCATOR)
    {
        this(ALLOCATOR a){}
    }

    class DynamicAllocatorWrapper(T) : DynamicAllocator if (is(T == struct))
    {
        // wraps any static allocator to be callable using dynamic interface
    }

With that boilerplate you can easily create allocator-agnostic containers:

    auto c = Container(new DynamicAllocatorWrapper!(StaticAllocator)());

and you can create a specialized version if you need speed very much:

    auto c = Container!(StaticAllocator)(StaticAllocator());
Sep 23 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 9:19 PM, Manu wrote:
 On 23 September 2013 13:01, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:
     Containers and other objects that want to allow customization of
     their allocation would be parameterized during compilation with an
     allocator type. Functions that need to allocate memory may similarly
     accept a parameter of allocator type.

 You've just described C++ verbatim.
 And you've previously agreed that C++'s approach totally failed. Can you
 explain how this is different from C++, and how it solves the issues
 that defeated that design?
As I just wrote in my previous post, the two mistakes in C++'s allocator design are:

1. Allocators are parameterized by type.

2. Allocators cannot have state.

I fix both.
     One possibility I'm keeping in mind is that of defining a dynamic
     interface (i.e. in the OOP sense) for a global allocator. Then
     people can globally change what allocator must be used for operator
     new invocations.

 Right, this is getting warmer. It's a stark contrast to what you suggest
 above though, and when I try and think this through, it gets very
 complex, very fast.
 I can't imagine how such a system could ever be declared safe. However,
 this is more or less what I want. I don't know how to reconcile the 2
 requirements.
There are possibilities, which I hope to get to soon.
 How much have you thought on this? This is where I think some serious
 creativity will need to be applied...

 Following this train of thought, I can imagine a really nice end goal
 would be that the existing GC is effectively plugged in as a library,
 and people can easily substitute it for their own GC if they want/need to.
Sean has already done that. The current GC is entirely replaceable (and in particular extricable). Andrei
Sep 23 2013
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 23 September 2013 12:28, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/22/13 6:35 PM, Manu wrote:

 Well it looks like a good start, but the thing I'm left wondering after
 reading this is still... how is it actually used?
 I think the greatest challenge is finding a simple, clean
Oxford comma please :o)

 and correct
 way to actually specify which allocator should be used for making
 allocations throughout your code, and perhaps more troublesome; within
 generic code, and then layers of generic code.
My design makes it very easy to experiment by allowing one to define complex allocators out of a few simple building blocks. It is not a general-purpose allocator, but it allows one to define any number of such.
Oh okay, so this isn't really intended as a system then, so much as a suggested API? That makes almost all my questions redundant. I'm interested in the system, not the API of a single allocator (although your API looks fine to me).

I already have allocators I use in my own code. Naturally, they don't inter-operate with anything, and that's what I thought std.allocator was meant to address.

 Are you intending to be able to associate a particular allocator with a
 class declaration?
No, or at least not at this level.

 What about a struct declaration?

 Same answer.


  What about a region of code (ie, a call tree/branch).
 What if the given allocator should be used for most of the tree, except
 for a couple of things beneath that always want to use their explicit
 allocator?
The proposed design makes it easy to create allocator objects. How they are used and combined is left to the application.
Is that the intended limit of std.allocator's responsibility, or will patterns come later? Leaving the usage up to the application means we've gained nothing. I already have more than enough allocators which I use throughout my code. The problem is that they don't inter-operate, and certainly not with foreign code/libraries. This is what I hoped std.allocator would address.

 What if I want to associate an allocator instance, not just an allocator
 type (ie, I don't want multiple instances of the same type(/s) of
 allocators in my code, how are they shared?
An allocator instance is a variable like any other. So you use the classic techniques (shared globals, thread-local globals, passing around as parameter) for using the same allocator object from multiple places.
Okay, that's fine... but this sort of manual management implies that I'm using it explicitly. That's where it all falls down for me.

Eg, I want to use a library, its allocation patterns are incompatible with my application; I need to provide it with an allocator. What now? Is every library responsible for presenting the user with a mechanism for providing allocators? What if the author forgets? (a problem I've frequently had to chase up in the past when dealing with 3rd party libraries)

Once a library is designed to expect a user to supply an allocator, what happens if the user doesn't? Fall-back logic/boilerplate exists in every library I guess...

And does that mean that applications+libraries are required to ALWAYS allocate through given allocator objects? That effectively makes the new keyword redundant. And what about the GC?

I can't really consider std.allocator until it presents some usage patterns.

 It wasn't clear to me from your demonstration, but 'collect()' implies
 that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One could define a tracing allocator independent of the global GC.
I'm not sure what this means. Other than I gather that the GC and allocators are fundamentally separate?

Is it possible to create a tracing allocator without language support? Does the current language insert any runtime calls to support the GC?

I want a ref-counting GC for instance to replace the existing GC, but it's impossible to implement one of them nicely without support from the language, to insert implicit inc/dec ref calls all over the place, and to optimise away redundant inc/dec sequences.

 deallocateAll() and collect() may each free a whole lot of memory, but
 it seems to me that they may not actually be aware of the individual
 allocations they are freeing; how do the appropriate destructors for the
 sub-allocations get called?
No destructors are called at this level. Again, all these allocators understand is ubyte[].
Fair enough. That's a problem for another time then.

 I have a suspicion you're going to answer most of these questions with
 the concept of allocator layering, but I just don't completely see it.
No, it's just that at this level some of these questions don't even have an appropriate answer - like we discuss atoms and molecules and you ask about building floors and beams and pillars.
Yep, fair enough. I just saw "I am making good progress on the design of std.allocator" and presumed you had some thoughts about actual usage semantics of the system.

I can easily define an allocator to use in my own code if it's entirely up to me how I use it, but that completely defeats the purpose of this exercise. Until there are standard usage patterns, practises, conventions that ALL code follows, we have nothing. I was hoping to hear your thoughts about those details.

 It's quite an additional burden of resources and management to manage
 the individual allocations with a range allocator above what is supposed
 to be a performance critical allocator to begin with.
I don't understand this.
It's irrelevant here. But fwiw, in relation to the prior point about block-freeing a range allocation: there will be many *typed* allocations within these ranges, but a typical range allocator doesn't keep track of the allocations within. This seems like a common problem that may or may not want to be addressed in std.allocator.

If the answer is simply "your range allocator should keep track of the offsets of allocations, and their types", then fine. But that seems like boilerplate that could be automated, or maybe there is a different/separate system for such tracking?

 C++'s design seems reasonable in some ways, but history has demonstrated
 that it's a total failure, which is almost never actually used (I've
 certainly never seen anyone use it).
Agreed. I've seen some uses of it that quite fall within the notion of the proverbial exception that proves the rule.
I think the main failure of C++'s design is that it mangles the type. I don't think a type should be defined by the way its memory is allocated, especially since that could change from application to application, or even call to call. For my money, that's the fundamental flaw in C++'s design.

 Some allocators that I use regularly to think about:
 A ring-buffer:
    * Like a region allocator I guess, but the circular nature adds some
 minor details, and requires the user to mark the heap from time to time,
 freeing all allocations between the old mark and the new mark.

 A pool:
    * Same as a free-list, but with a maximum size, ie, finite pool of
 objects pre-allocated and pre-populating the freelist.
I implemented the finite size for a freelist.

 A pool-group:
    * Allocate from a group of pools allocating differently sized
 objects. (this is a good test for allocator layering, supporting a group
 of pools above, and fallback to the malloc heap for large objects)
I implemented that as well, it's one of the best designs I've defined in my life.
Well as an atom, as you say, it seems like a good first step. I can't see any obvious issues, although I don't think I quite understand the collect() function if it has no relation to the GC. What is its purpose? If the idea is that you might implement some sort of tracking heap which is able to perform a collect, how is that actually practical without language support?

I had imagined going into this that, like the range interface which the _language_ understands and interacts with, the allocator interface would be the same, ie, the language would understand this API and integrate it with 'new', and the GC... somehow. If allocators are just an object like in C++ that people may or may not use, I don't think it'll succeed as a system. I reckon it needs deep language integration to be truly useful.

The key problem to solve is the friction between different libraries, and different moments within a single application itself. I feel almost like the 'current' allocator needs to be managed as some sort of state-machine. Passing them manually down the callstack is no good. And 'hard' binding objects to their allocators like C++ is no good either.
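Manu's "current allocator as a state machine" idea (and the pushThreadAllocator sketch earlier in the thread) could be modelled as a thread-local stack. Every name here is hypothetical, none is part of the proposed std.allocator:

```d
// Hypothetical thread-local allocator stack; a sketch of the push/pop
// idea, not anything from the std.allocator draft.
interface Allocator
{
    ubyte[] allocate(size_t n);
    void deallocate(ubyte[] b);
}

Allocator[] allocStack; // module-level variables are thread-local in D

void pushThreadAllocator(Allocator a) { allocStack ~= a; }
void popThreadAllocator() { allocStack = allocStack[0 .. $ - 1]; }

// Top of the stack, or null meaning "fall back to the default GC heap".
Allocator currentAllocator()
{
    return allocStack.length ? allocStack[$ - 1] : null;
}

// RAII helper giving the 'scope' convenience Manu mentions:
struct AllocatorScope
{
    this(Allocator a) { pushThreadAllocator(a); }
    ~this() { popThreadAllocator(); }
}

void useLibrary(Allocator myPool)
{
    auto s = AllocatorScope(myPool);
    // ... allocations in here would consult currentAllocator() ...
} // popped automatically on scope exit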
Sep 22 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 9:03 PM, Manu wrote:
 On 23 September 2013 12:28, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:
     My design makes it very easy to experiment by allowing one to define
     complex allocators out of a few simple building blocks. It is not a
     general-purpose allocator, but it allows one to define any number of
     such.

 Oh okay, so this isn't really intended as a system then, so much a
 suggested API?
For some definition of "system" and "API", yes :o).
 That makes almost all my questions redundant. I'm interested in the
 system, not the API of a single allocator (although your API looks fine
 to me).
 I already have allocators I use in my own code. Naturally, they don't
 inter-operate with anything, and that's what I thought std.allocator was
 meant to address.
Great. Do you have a couple of nontrivial allocators (heap, buddy system etc) that could be adapted to the described API?
     The proposed design makes it easy to create allocator objects. How
     they are used and combined is left to the application.

 Is that the intended limit of std.allocator's responsibility, or will
 patterns come later?
Some higher level design will come later. I'm not sure whether or not you'll find it satisfying, for reasons I'll expand on below.
 Leaving the usage up to the application means we've gained nothing.
 I already have more than enough allocators which I use throughout my
 code. The problem is that they don't inter-operate, and certainly not
 with foreign code/libraries.
 This is what I hoped std.allocator would address.
Again, if you already have many allocators, please let me know if you can share some. std.allocator will prescribe a standard for defining allocators, with which the rest of std will work, same as std.range prescribes a standard for defining ranges, with which std.algorithm, std.format, and other modules work. Clearly one could come back with "but I already have my own ranges that use first/done/next instead of front/empty/popFront, so I'm not sure what we're gaining here".
     An allocator instance is a variable like any other. So you use the
     classic techniques (shared globals, thread-local globals, passing
     around as parameter) for using the same allocator object from
     multiple places.


 Okay, that's fine... but this sort of manual management implies that I'm
 using it explicitly. That's where it all falls down for me.
I think a disconnect here is that you think "it" where I think "them". It's natural for an application to use one allocator that's not provided by the standard library, and it's often the case that an application defines and uses _several_ allocators for different parts of it. Then the natural question arises, how to deal with these allocators, pass them around, etc. etc.
 Eg, I want to use a library, it's allocation patterns are incompatible
 with my application; I need to provide it with an allocator.
 What now? Is every library responsible for presenting the user with a
 mechanism for providing allocators? What if the author forgets? (a
 problem I've frequently had to chase up in the past when dealing with
 3rd party libraries)
If the author forgets and hardcodes a library to use malloc(), I have no way around that.
 Once a library is designed to expect a user to supply an allocator, what
 happens if the user doesn't? Fall-back logic/boilerplate exists in every
 library I guess...
The library wouldn't need to worry as there would be the notion of a default allocator (probably backed by the existing GC).
 And does that mean that applications+libraries are required to ALWAYS
 allocate through given allocator objects?
Yes, they should.
 That effectively makes the new keyword redundant.
new will still be used to tap into the global shared GC. std.allocator will provide other means of allocating memory.
 And what about the GC?
The current global GC is unaffected for the time being.
 I can't really consider std.allocator intil it presents some usage patterns.
Then you'd need to wait a little bit.
         It wasn't clear to me from your demonstration, but 'collect()'
         implies
         that GC becomes allocator-aware; how does that work?


     No, each allocator has its own means of dealing with memory. One
     could define a tracing allocator independent of the global GC.


 I'm not sure what this means. Other than I gather that the GC and
 allocators are fundamentally separate?
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from the GC as NO_SCAN memory, and then does its own management inside that block. User-level code allocates and frees e.g. strings or whatever from that block, without the global GC intervening.
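The 4 MB NO_SCAN arrangement described above can be sketched as a bump-the-pointer region carved out of a single GC block. The names are illustrative, not part of std.allocator:

```d
import core.memory : GC;

// Sketch: user-level management inside one NO_SCAN block from the GC.
struct GCRegion
{
    ubyte[] store; // e.g. a 4 MB block the global GC never scans
    size_t used;

    static GCRegion create(size_t capacity)
    {
        auto p = cast(ubyte*) GC.malloc(capacity, GC.BlkAttr.NO_SCAN);
        return GCRegion(p[0 .. capacity], 0);
    }

    // Bump-pointer allocation; the global GC does not intervene here.
    ubyte[] allocate(size_t n)
    {
        if (used + n > store.length) return null;
        auto b = store[used .. used + n];
        used += n;
        return b;
    }

    void deallocateAll() { used = 0; } // release everything at once
}

void main()
{
    auto r = GCRegion.create(4 * 1024 * 1024);
    auto s = r.allocate(16); // e.g. a small string buffer
    assert(s.length == 16);
    r.deallocateAll();
}
```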
 Is it possible to create a tracing allocator without language support?
I think it is possible.
 Does the current language insert any runtime calls to support the GC?
Aside from operator new, I don't think so.
 I want a ref-counting GC for instance to replace the existing GC, but
 it's impossible to implement one of them nicely without support from the
 language, to insert implicit inc/dec ref calls all over the place, and
 to optimise away redundant inc/dec sequences.
Unfortunately that's a chimera I had to abandon, at least at this level. The problem is that installing an allocator does not get to define what a pointer is and what a reference is. These are notions hardwired into the language, so the notion of turning a switch and replacing the global GC with a reference counting scheme is impossible at the level of a library API. (As an aside, you still need tracing for collecting cycles in a transparent reference counting scheme, so it's not all roses.)

What I do hope to get to is to have allocators define their own pointers and reference types. User code that uses those will be guaranteed certain allocation behaviors.
 I can easily define an allocator to use in my own code if it's entirely
 up to me how I use it, but that completely defeats the purpose of this
 exercise.
It doesn't. As long as the standard prescribes ONE specific API for defining untyped allocators, if you define your own to satisfy that API, then you'll be able to use your allocator with e.g. std.container, just the same as defining your own range the way std.range requires allows you to tap into std.algorithm.
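To make "satisfy that API" concrete, here is a malloc-backed sketch with the same member shape as the NullAllocator at the top of the thread. The Mallocator name and the deallocate member are assumptions for illustration:

```d
import core.stdc.stdlib : free, malloc, realloc;

// Sketch of a C-heap allocator conforming to the draft untyped API.
struct Mallocator
{
    enum alignment = real.alignof;

    ubyte[] allocate(size_t s) shared
    {
        auto p = cast(ubyte*) malloc(s);
        return p is null ? null : p[0 .. s];
    }

    bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
    {
        return false; // C's malloc cannot grow a block in place portably
    }

    bool reallocate(ref ubyte[] b, size_t s) shared
    {
        auto p = cast(ubyte*) realloc(b.ptr, s);
        if (p is null) return false;
        b = p[0 .. s];
        return true;
    }

    void deallocate(ubyte[] b) shared
    {
        free(b.ptr);
    }
}
```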
 Until there are standard usage patterns, practises, conventions that
 ALL code follows, we have nothing. I was hoping to hear your
 thoughts about those details.
         It's quite an additional burden of resources and management to
         manage
         the individual allocations with a range allocator above what is
         supposed
         to be a performance critical allocator to begin with.


     I don't understand this.


 It's irrelevant here.
 But fwiw, in relation to the prior point about block-freeing a range
 allocation;
What is a "range allocation"?
 there will be many *typed* allocations within these ranges,
 but a typical range allocator doesn't keep track of the allocations within.
Do you mean s/range/region/?
 This seems like a common problem that may or may not want to be
 addressed in std.allocator.
 If the answer is simply "your range allocator should keep track of the
 offsets of allocations, and their types", then fine. But that seems like
 boilerplate that could be automated, or maybe there is a
 different/separate system for such tracking?
If you meant region, then yes that's boilerplate that hopefully will be reasonably automated by std.allocator. (What I discussed so far predates that stage of the design.)
         C++'s design seems reasonable in some ways, but history has
         demonstrated
         that it's a total failure, which is almost never actually used (I've
         certainly never seen anyone use it).


     Agreed. I've seen some uses of it that quite fall within the notion
 of the proverbial exception that proves the rule.


 I think the main failure of C++'s design is that it mangles the type.
 I don't think a type should be defined by the way its memory is
 allocated, especially since that could change from application to
 application, or even call to call. For my money, that's the fundamental
 flaw in C++'s design.
This is not a flaw as much as an engineering choice with advantages and disadvantages on the relative merits of which reasonable people may disagree. There are two _fundamental_ flaws of the C++ allocator design, in the sense that they are very difficult to argue in favor of and relatively easy to argue against:

1. Allocators are parameterized by type; instead, individual allocations should be parameterized by type.

2. There is no appropriate handling for allocators with state.

The proposed std.allocator design deals with (2) with care, and will deal with (1) when it gets to typed allocators.
 Well as an atom, as you say, it seems like a good first step.
 I can't see any obvious issues, although I don't think I quite
 understand the collect() function if it has no relation to the GC. What
 is it's purpose?
At this point collect() is only implemented by the global GC. It is possible I'll drop it from the final design. However, it's also possible that collect() will be properly defined as "collect all objects allocated within this particular allocator that are not referred from any objects also allocated within this allocator". I think that's a useful definition.
 If the idea is that you might implement some sort of tracking heap which
 is able to perform a collect, how is that actually practical without
 language support?
Language support would be needed for things like scanning the stack and the globals. But one can gainfully use a heap with semantics as described just above, which requires no language support.
 I had imagined going into this that, like the range interface which the
 _language_ understands and interacts with, the allocator interface would
 be the same, ie, the language would understand this API and integrate it
 with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely defined in std.range.
 If allocators are just an object like in C++ that people may or may not
 use, I don't think it'll succeed as a system. I reckon it needs deep
 language integration to be truly useful.
I guess that's to be seen.
 The key problem to solve is the friction between different libraries,
 and different moments within a single application itself.
 I feel almost like the 'current' allocator needs to be managed as some
 sort of state-machine. Passing them manually down the callstack is no
 good. And 'hard' binding objects to their allocators like C++ is no good
 either.
I think it's understood that if a library chooses its own ways to allocate memory, there's no way around that. The point of std.allocator is that it defines a common interface that user code can work with. Andrei
Sep 23 2013
next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2013-09-23, 15:58, Andrei Alexandrescu wrote:

 I had imagined going into this that, like the range interface which the
 _language_ understands and interacts with, the allocator interface would
 be the same, ie, the language would understand this API and integrate it
 with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would not work.

-- 
Simen
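Simen's point in code: foreach over a non-array user type is lowered by the compiler to calls of the range primitives, so the compiler does know at least their names (the Counter type is invented for illustration):

```d
// foreach over a non-array aggregate is lowered by the compiler to
// empty/front/popFront calls.
struct Counter
{
    int front; // current element
    int limit;
    bool empty() const { return front >= limit; }
    void popFront() { ++front; }
}

void main()
{
    int sum;
    foreach (e; Counter(0, 5)) // visits 0, 1, 2, 3, 4
        sum += e;
    assert(sum == 10);
}
```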
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdan.org> writes:
"Simen Kjaeraas" <simen.kjaras gmail.com> wrote:
 On 2013-09-23, 15:58, Andrei Alexandrescu wrote:
 
 I had imagined going into this that, like the range interface which the
 _language_ understands and interacts with, the allocator interface would
 be the same, ie, the language would understand this API and integrate it
 with 'new', and the GC... somehow.
 The D language has no idea what a range is. The notion is completely
 defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would not work.
Great point. I amend my claim. Andrei
Sep 23 2013
parent Manu <turkeyman gmail.com> writes:
On 24 September 2013 00:49, Andrei Alexandrescu <
SeeWebsiteForEmail erdan.org> wrote:

 "Simen Kjaeraas" <simen.kjaras gmail.com> wrote:
 On 2013-09-23, 15:58, Andrei Alexandrescu wrote:

 I had imagined going into this that, like the range interface which the
 _language_ understands and interacts with, the allocator interface
would
 be the same, ie, the language would understand this API and integrate
it
 with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would not work.
Great point. I amend my claim.
Mmm, precisely what I was talking about.
Sep 23 2013
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 23 September 2013 23:58, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/22/13 9:03 PM, Manu wrote:

 On 23 September 2013 12:28, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org
<mailto:SeeWebsiteForEmail erdani.org>>

wrote:

 My design makes it very easy to experiment by allowing one to define
 complex allocators out of a few simple building blocks. It is not a
 general-purpose allocator, but it allows one to define any number of such.

Oh okay, so this isn't really intended as a system then, so much as a suggested API?

For some definition of "system" and "API", yes :o).

That makes almost all my questions redundant. I'm interested in the
 system, not the API of a single allocator (although your API looks fine
 to me).
 I already have allocators I use in my own code. Naturally, they don't
 inter-operate with anything, and that's what I thought std.allocator was
 meant to address.
Great. Do you have a couple of nontrivial allocators (heap, buddy system etc) that could be adapted to the described API?
Err, not really actually. When I use custom allocators, it's for performance, which basically implies that it IS a trivial allocator :) The common ones I use are: stack-based mark&release, circular buffers, pools, pool groups (collection of different sized pools)... that might be it actually. Very simple tools for different purposes.

     The proposed design makes it easy to create allocator objects. How
     they are used and combined is left to the application.

 Is that the intended limit of std.allocator's responsibility, or will
 patterns come later?
Some higher level design will come later. I'm not sure whether or not you'll find it satisfying, for reasons I'll expand on below.

 Leaving the usage up to the application means we've gained nothing.
 I already have more than enough allocators which I use throughout my
 code. The problem is that they don't inter-operate, and certainly not
 with foreign code/libraries.
 This is what I hoped std.allocator would address.
Again, if you already have many allocators, please let me know if you can share some. std.allocator will prescribe a standard for defining allocators, with which the rest of std will work, same as std.range prescribes a standard for defining ranges, with which std.algorithm, std.format, and other modules work. Clearly one could come back with "but I already have my own ranges that use first/done/next instead of front/empty/popFront, so I'm not sure what we're gaining here".
No, it's just that I'm saying std.allocator needs to do a lot more than define a contract before I can start to consider whether it solves my problems. This is a good first step though, and I'm happy to discuss it, but I think discussion about the practical application may also reveal design details at this level. It's like you say: I can rename my allocator's methods to suit an agreed standard, and that'll take me 2 minutes, but it's how the rest of the universe interacts with that API that matters, and whether it effectively solves my problems.

     An allocator instance is a variable like any other. So you use the
     classic techniques (shared globals, thread-local globals, passing
     around as parameter) for using the same allocator object from
     multiple places.


 Okay, that's fine... but this sort of manual management implies that I'm
 using it explicitly. That's where it all falls down for me.
I think a disconnect here is that you think "it" where I think "them". It's natural for an application to use one allocator that's not provided by the standard library, and it's often the case that an application defines and uses _several_ allocators for different parts of it. Then the natural question arises, how to deal with these allocators, pass them around, etc. etc.
No, I certainly understand you mean 'them', but you lead to what I'm asking: how do these things get carried/passed around? Are they discreet, or will they invade argument lists everywhere? Are they free to flow in/out of libraries in a natural way? These patterns are what will define the system as I see it. Perhaps more importantly, where do these allocators get their memory themselves (if they're not a bottom-level allocator)? Global override perhaps, or should a memory source always be explicitly provided to a non-bottom-level allocator?

 Eg, I want to use a library, its allocation patterns are incompatible
 with my application; I need to provide it with an allocator.
 What now? Is every library responsible for presenting the user with a
 mechanism for providing allocators? What if the author forgets? (a
 problem I've frequently had to chase up in the past when dealing with
 3rd party libraries)
If the author forgets and hardcodes a library to use malloc(), I have no way around that.
Sure, but the common case is that the author will almost certainly use keyword 'new'. How can I affect that as a 3rd party? This would require me overriding the global allocator somehow... which you touched on earlier.

 Once a library is designed to expect a user to supply an allocator, what
 happens if the user doesn't? Fall-back logic/boilerplate exists in every
 library I guess...
The library wouldn't need to worry as there would be the notion of a default allocator (probably backed by the existing GC).
Right. So it's looking like the ability to override the global allocator is a critical requirement.

 And does that mean that applications+libraries are required to ALWAYS
 allocate through given allocator objects?
Yes, they should.
Then we make keyword 'new' redundant?

 That effectively makes the new keyword redundant.

 new will still be used to tap into the global shared GC. std.allocator
 will provide other means of allocating memory.
I think the system will fail here. People will use 'new', simply because it's a keyword. Once that's boxed in a library, I will no longer be able to affect that inconsiderate behaviour from my application. Again, I think this signals that a global override is necessary.

 And what about the GC?

 The current global GC is unaffected for the time being.


 I can't really consider std.allocator until it presents some usage
 patterns.
Then you'd need to wait a little bit.
Okay.

         It wasn't clear to me from your demonstration, but 'collect()'
         implies
         that GC becomes allocator-aware; how does that work?


     No, each allocator has its own means of dealing with memory. One
     could define a tracing allocator independent of the global GC.


 I'm not sure what this means. Other than I gather that the GC and
 allocators are fundamentally separate?
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from the GC as NO_SCAN memory, and then does its own management inside that block. User-level code allocates and frees e.g. strings or whatever from that block, without the global GC intervening.
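The scenario described here, grabbing one big block from a parent and managing inside it with everything released at once, is essentially a region allocator. A C++ sketch of the idea (all names are illustrative, and malloc stands in for the parent; in D the parent could be the GC handing out NO_SCAN memory):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Sketch of a region allocator: acquire one large block from a parent
// allocator up front, bump-allocate inside it, and release everything
// in one shot when the region dies.
struct Region {
    char* base;
    std::size_t capacity;
    std::size_t used = 0;

    explicit Region(std::size_t cap)
        : base(static_cast<char*>(std::malloc(cap))), capacity(cap) {}
    ~Region() { std::free(base); }  // one bulk release for all allocations

    void* allocate(std::size_t n) {
        std::size_t aligned = (used + 15) & ~std::size_t(15);  // 16-byte alignment
        if (aligned + n > capacity) return nullptr;            // region exhausted
        used = aligned + n;
        return base + aligned;
    }
    // Per-object deallocation is deliberately a no-op: the whole region
    // is freed at once, with no per-object involvement from the parent.
};
```

User-level code allocates from the region without the parent ever seeing individual allocations, matching the "4 MB from the GC, then self-managed" description.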
Yup, that's fine. But what if the GC isn't the bottom level? There's just another allocator underneath. What I'm saying is, the GC should *be* an allocator, not a separate entity. I want to eliminate the GC from my application. Ideally, in the future, it can be replaced with ARC, which I have become convinced is the right choice for my work.

 Is it possible to create a tracing allocator without language support?

 I think it is possible.


  Does the current language insert any runtime calls to support the GC?

 Aside from operator new, I don't think so.
Okay, so a flexible lowering of 'new' is all we need for now? It will certainly need substantially more language support for ARC.

 I want a ref-counting GC for instance to replace the existing GC, but
 it's impossible to implement one of them nicely without support from the
 language, to insert implicit inc/dec ref calls all over the place, and
 to optimise away redundant inc/dec sequences.
Unfortunately that's a chimera I had to abandon, at least at this level.
And there's the part you said I'm not going to like? ;)

 The problem is that installing an allocator does not get to define what a
 pointer is and what a reference is.
Why not? A pointer has a type, like anything else. An ARC pointer can theoretically have the compiler insert ARC magic. That does imply though that the allocator affects the type, which I don't like... I'll think on it.

 These are notions hardwired into the language, so the notion of turning a
 switch and replacing the global GC with a reference counting scheme is
 impossible at the level of a library API.
Indeed it is. So is this API being built upon an incomplete foundation? Is there something missing, and can it be added later, or will this design cement some details that might need changing in the future? (We all know potentially breaking changes like that will never actually happen.)

 (As an aside, you still need tracing for collecting cycles in a transparent
 reference counting scheme, so it's not all roses.)
It's true, but it's possible to explicitly control all those factors. It remains deterministic.

 What I do hope to get to is to have allocators define their own pointers
 and reference types. User code that uses those will be guaranteed certain
 allocation behaviors.
Interesting, will this mangle the pointer type, or the object type being pointed to? The latter is obviously not desirable. Does the former actually work in theory? I can easily define an allocator to use in my own code if it's entirely
 up to me how I use it, but that completely defeats the purpose of this
 exercise.
It doesn't. As long as the standard prescribes ONE specific API for defining untyped allocators, if you define your own to satisfy that API, then you'll be able to use your allocator with e.g. std.container, just the same as defining your own range to satisfy std.range's requirements allows you to tap into std.algorithm.
I realise this. That's all fine.

 Until there are standard usage patterns, practices, and conventions
 that ALL code follows, we have nothing. I was hoping to hear your
 thoughts about those details.
It's quite an additional burden of resources and management to
         manage
         the individual allocations with a range allocator above what is
         supposed
         to be a performance critical allocator to begin with.


     I don't understand this.


 It's irrelevant here.
 But fwiw, in relation to the prior point about block-freeing a range
 allocation;
What is a "range allocation"?

 there will be many *typed* allocations within these ranges,
 but a typical range allocator doesn't keep track of the allocations
 within.
Do you mean s/range/region/?
Yes.

 This seems like a common problem that may or may not want to be
 addressed in std.allocator.
 If the answer is simply "your range allocator should keep track of the
 offsets of allocations, and their types", then fine. But that seems like
 boilerplate that could be automated, or maybe there is a
 different/separate system for such tracking?
If you meant region, then yes, that's boilerplate that hopefully will be reasonably automated by std.allocator. (What I discussed so far predates that stage of the design.)

         C++'s design seems reasonable in some ways, but history has
         demonstrated
         that it's a total failure, which is almost never actually used
 (I've
         certainly never seen anyone use it).


     Agreed. I've seen some uses of it that quite fall within the notion
     of the proverbial exception that proves the rule.


 I think the main fail of C++'s design is that it mangles the type.
 I don't think a type should be defined by the way its memory is
 allocated, especially since that could change from application to
 application, or even call to call. For my money, that's the fundamental
 flaw in C++'s design.
This is not a flaw as much as an engineering choice with advantages and disadvantages on the relative merits of which reasonable people may disagree. There are two _fundamental_ flaws of the C++ allocator design, in the sense that they are very difficult to argue in favor of and relatively easy to argue against:

1. Allocators are parameterized by type; instead, individual allocations should be parameterized by type.

2. There is no appropriate handling for allocators with state.

The proposed std.allocator design deals with (2) with care, and will deal with (1) when it gets to typed allocators.
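Flaw (1) can be made concrete with a small C++ sketch. MyAlloc below is a hypothetical minimal allocator, not a real library type; the point is that because the allocator *type* carries the element type, a node-based container given MyAlloc<int> never actually allocates ints, it rebinds to an allocator for its node type, which is why the converting constructor is mandatory:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical minimal C++11-style allocator, parameterized by type:
// the element type is baked into the allocator's own type.
template <class T>
struct MyAlloc {
    using value_type = T;
    MyAlloc() = default;
    template <class U>
    MyAlloc(const MyAlloc<U>&) {}  // required so containers can rebind

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};
```

With per-allocation typing instead, one allocator object could serve requests for any T, and the rebinding machinery disappears.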
Fair enough. These are certainly more critical mistakes than the one I raised. I'm trying to remember the details of the practical failures I ran into trying to use C++ allocators years ago. Eventually, experience proved to us (myself and colleagues) that it wasn't worth the mess, and we simply pursued a more direct solution. I've heard similar stories from friends in other companies... I need to try and recall the specific scenarios though, they might be interesting :/ .. (going back the better part of a decade >_<)

 Well, as an atom, as you say, it seems like a good first step.
 I can't see any obvious issues, although I don't think I quite
 understand the collect() function if it has no relation to the GC. What
 is its purpose?
At this point collect() is only implemented by the global GC. It is possible I'll drop it from the final design. However, it's also possible that collect() will be properly defined as "collect all objects allocated within this particular allocator that are not referred from any objects also allocated within this allocator". I think that's a useful definition.
Perhaps. I'm not sure how this situation arises though, unless you've managed to implement your own GC inside an allocator.

 If the idea is that you might implement some sort of tracking heap which
 is able to perform a collect, how is that actually practical without
 language support?
Language support would be needed for things like scanning the stack and the globals. But one can gainfully use a heap with semantics as described just above, which requires no language support.

 I had imagined going into this that, like the range interface which the
 _language_ understands and interacts with, the allocator interface would
 be the same, ie, the language would understand this API and integrate it
 with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely defined in std.range. If allocators are just an object like in C++ that people may or may not
 use, I don't think it'll succeed as a system. I reckon it needs deep
 language integration to be truly useful.
I guess that's to be seen.
I think a critical detail to keep in mind is that (I suspect) people simply won't use it if it doesn't interface with keyword 'new'. It also complicates generic code, and makes it more difficult to retrofit an allocator where 'new' is already in use.

 The key problem to solve is the friction between different libraries,
 and different moments within a single application itself.
 I feel almost like the 'current' allocator needs to be managed as some
 sort of state-machine. Passing them manually down the callstack is no
 good. And 'hard' binding objects to their allocators like C++ is no good
 either.
I think it's understood that if a library chooses its own ways to allocate memory, there's no way around that.
Are we talking here about explicit choice for sourcing memory, or just that the library allocates through the default/GC? This is the case where I like to distinguish a bottom-level allocator from a high-level allocator. A library probably wants to use some patterns for allocation of its objects; these are high-level allocators. But where it sources its memory from still needs to be overridable.

It's extremely common that I want to enforce that a library exist entirely within a designated heap. It can't fall back to the global GC. I work on platforms where memory is not unified. Different resources need to go into different heaps.

It has happened on numerous occasions that we have been denied a useful library simply because the author did not provide allocation hooks, and the author was not responsive to requests... leading to my favourite scenario of re-inventing yet another wheel (the story of my career). It shouldn't be the case that the author has to manually account for the possibility that someone might want to provide a heap for the library's resources. This is equally true for the filesystem (a gripe I haven't really raised in D yet).

 The point of std.allocator is that it defines a common interface that user
 code can work with.


 Andrei
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 8:32 AM, Manu wrote:
 On 23 September 2013 23:58, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:
 This is a good first step though, I'm happy to discuss this, but I think
 discussion about the practical application may also reveal design
 details at this level.
Absolutely. This is refreshing since I've gone through the cycle of "types must be integrated into allocators and with the global GC" ... -> ... "untyped allocators can actually be extricated" once. It is now great to revisit the assumptions again.

One matter I'm rather surprised didn't come up (I assumed everyone would be quite curious about it) is that the allocator does not store the allocated size, or at least does not allow it to be discovered easily. This is a definitely appropriate topic for the design at hand.

The malloc-style APIs go like:

void* malloc(size_t size);
void free(void*);

Note that the user doesn't pass the size to free(), which means the allocator is burdened with inferring the size of the block from the pointer efficiently. Given that most allocators make crucial strategic choices depending on the size requested, this API shortcoming is a bane of allocator writers everywhere, and a source of inefficiency in virtually all contemporary malloc-style allocators. This is most ironic since there is next to nothing that user code can do with a pointer without knowing the extent of memory addressed by it. (A notable exception is zero-terminated strings, another questionable design decision.) It all harkens back to Walter's claim of "C's biggest mistake" (with which I agree) of not defining a type for a bounded memory region.

Upon studying a few extant allocators and D's lore, I decided that in D things have evolved sufficiently to have the user pass the size upon deallocation as well:

void[] allocate(size_t size);
void deallocate(void[] buffer);

This is because the size of D objects is naturally known: classes have it in the classinfo, slices store it, and the few cases of using bald pointers for allocation are irrelevant and unrecommended. This all makes memory allocation in D a problem fundamentally simpler than in C, which is quite an interesting turn :o).
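The difference the sized-deallocation signature makes can be sketched in a hypothetical C++ analogue (illustrative names only, not the proposed D API): the caller hands the length back, so deallocate needs no hidden header or size lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Pointer-plus-length, standing in for D's void[] slice.
struct Block {
    void* ptr;
    std::size_t length;
};

struct Mallocator {
    static Block allocate(std::size_t n) {
        void* p = std::malloc(n);
        return Block{p, p ? n : 0};
    }
    static void deallocate(Block b) {
        // b.length is available "for free" here; a malloc-style
        // allocator would have to recover it from per-block metadata.
        std::free(b.ptr);
    }
};
```

Since the extent travels with the pointer, user code can also always tell how much memory it is holding, unlike with a bald void*.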
 No, I certainly understand you mean 'them', but you lead to what I'm
 asking, how do these things get carried/passed around. Are they
 discreet, or will they invade argument lists everywhere?
Since there's a notion of "the default" allocator, there will be ways to push/pop user-defined allocators that temporarily (or permanently) replace the default allocator. This stage of the design is concerned with allowing users to define such user-defined allocators without having to implement them from scratch.
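The push/pop idea could look roughly like this C++ sketch (every name here is invented for illustration, not part of std.allocator): a thread-local stack holds the "current" allocator, and an RAII guard temporarily replaces the default for a scope.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal dynamic allocator interface for the sketch.
struct Allocator {
    virtual void* allocate(std::size_t n) = 0;
    virtual void deallocate(void* p, std::size_t n) = 0;
    virtual ~Allocator() = default;
};

struct Mallocator : Allocator {
    void* allocate(std::size_t n) override { return std::malloc(n); }
    void deallocate(void* p, std::size_t) override { std::free(p); }
};

inline Mallocator defaultAllocator;
inline thread_local std::vector<Allocator*> allocatorStack{&defaultAllocator};

inline Allocator& currentAllocator() { return *allocatorStack.back(); }

// Scope guard: installs `a` as the current allocator, restores on exit.
struct PushAllocator {
    explicit PushAllocator(Allocator& a) { allocatorStack.push_back(&a); }
    ~PushAllocator() { allocatorStack.pop_back(); }
};
```

Code that routes its allocations through currentAllocator() then picks up whatever the enclosing scope installed, without allocators invading every argument list.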
 Are they free
 to flow in/out of libraries in a natural way?
 These patterns are what will define the system as I see it.
 Perhaps more importantly, where do these allocators get their memory
 themselves (if they're not a bottom-level allocator)? Global override
 perhaps, or should a memory source always be explicitly provided to a
 non-bottom-level allocator?
There are a few obvious bottom allocators, such as malloc, new, sbrk, or Windows heap. All other allocators will compose by parameterizing on their parent allocator.
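Composition by parameterizing on a parent, in the HeapLayers style the design draws on, can be sketched in C++ (all names illustrative, and the Block/Mallocator pair is re-declared here to keep the sketch self-contained): a one-size freelist layered over a malloc-backed bottom allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Block {
    void* ptr;
    std::size_t length;
};

struct Mallocator {  // a "bottom" allocator
    Block allocate(std::size_t n) {
        void* p = std::malloc(n);
        return Block{p, p ? n : 0};
    }
    void deallocate(Block b) { std::free(b.ptr); }
};

// A one-size freelist: recycles blocks of exactly Size bytes and
// forwards every other request to its parent layer.
template <class Parent, std::size_t Size>
struct FreeList {
    static_assert(Size >= sizeof(void*), "need room for the chain link");
    Parent parent;
    struct Node { Node* next; };
    Node* head = nullptr;

    Block allocate(std::size_t n) {
        if (n == Size && head) {  // serve from the cache
            Node* p = head;
            head = head->next;
            return Block{p, Size};
        }
        return parent.allocate(n);
    }
    void deallocate(Block b) {
        if (b.length == Size) {  // cache instead of freeing
            Node* n = static_cast<Node*>(b.ptr);
            n->next = head;
            head = n;
        } else {
            parent.deallocate(b);
        }
    }
};
```

Swapping the parent, say a region or a Windows-heap wrapper instead of Mallocator, is a one-line change, which is the point of the layered scheme.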
 Sure, but the common case is that the author will almost certainly use
 keyword 'new'. How can I affect that as a 3rd party?
 This would require me overriding the global allocator somehow... which
 you touched on earlier.
The way I'm thinking of this is to install/uninstall user-defined allocators that will satisfy calls to new. Since not all allocators support tracing, some would require the user to deallocate stuff manually.
     The library wouldn't need to worry as there would be the notion of a
     default allocator (probably backed by the existing GC).

 Right. So it's looking like the ability to override the global
 allocator is a critical requirement.
Again, this is difficult to handle in the same conversation as "should we pass size upon deallocation". Yes, we need road laws, but it's hard to talk about those and engine lubrication in the same breath. For me, "critical" right now is to assess whether the untyped API misses an important category of allocators, what safety level it has (that's why "expand" is so different from "realloc"!) etc.
         And does that mean that applications+libraries are required to
         ALWAYS
         allocate through given allocator objects?


     Yes, they should.


 Then we make keyword 'new' redundant?
Probably not. Typed allocators will need to integrate with (and occasionally replace) the global shared GC.
     The problem is that installing an allocator does not get to define
     what a pointer is and what a reference is.

 Why not?
Because that requires a language change. I'm not sure you realize but you move the goalposts all the time. We were talking within the context of libraries and installing allocators dynamically and all of a sudden you get to change what the compiler thinks a pointer is.
 A pointer has a type, like anything else. An ARC pointer can
 theoretically have the compiler insert ARC magic.
 That does imply though that the allocator affects the type, which I
 don't like... I'll think on it.
I talked Walter's ear off a while ago at an ACCU conference about the notion that reference counting could be a switch passed to the compiler. Recently he's authored a DIP about the compiler inserting refcounting primitives in the generated code. Unfortunately I think the DIP is awfully incomplete and superficial (to the extent it would bring more damage to the language if any implementation effort would be invested in it at this point). Can it be done? Probably. But it would be an absolutely major effort.
     These are notions hardwired into the language, so the notion of
     turning a switch and replacing the global GC with a reference
     counting scheme is impossible at the level of a library API.


 Indeed it is. So is this API being built upon an incomplete foundation?
The API is quite good at what it purports to do.
     What I do hope to get to is to have allocators define their own
     pointers and reference types. User code that uses those will be
     guaranteed certain allocation behaviors.


 Interesting, will this mangle the pointer type, or the object type being
 pointed to? The latter is obviously not desirable. Does the former
 actually work in theory?
I don't think I understand what you mean. Honest, it seems to me you're confused about what you want, how it can be done, and what moving pieces are involved.

One example is Rust, which defines several pointer types to implement its scoping and borrowing semantics. I think it's not easy to achieve its purported semantics with fewer types, and time will tell whether programmers will put up with the complexity for the sake of the corresponding benefits.
     At this point collect() is only implemented by the global GC. It is
     possible I'll drop it from the final design. However, it's also
     possible that collect() will be properly defined as "collect all
     objects allocated within this particular allocator that are not
     referred from any objects also allocated within this allocator". I
     think that's a useful definition.

 Perhaps. I'm not sure how this situation arises though. Unless you've
 managed to implement your own GC inside an allocator.
That should be doable within D today.
 I think a critical detail to keep in mind, is that (I suspect) people
 simply won't use it if it doesn't interface with keyword 'new'.
I think this is debatable. For one, languages such as Java and C++ still have built-in "new" but quite ubiquitously unrecommend their usage in user code. Far as I can tell that's been a successful campaign. For two, there are allocations that don't even entail calls to new, such as array concatenation.
 It's extremely common that I want to enforce that a library exist
 entirely within a designated heap. It can't fall back to the global GC.

 I work on platforms where memory is not unified. Different resources
 need to go into different heaps.
The proposed API would make that entirely possible. You seem to care only about the appendix "and mind you I only want to use new for all of those!" which is a bit out there. But nevertheless good feedback.

Andrei
Sep 23 2013
next sibling parent Manu <turkeyman gmail.com> writes:
On 24 September 2013 03:53, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/13 8:32 AM, Manu wrote:

 On 23 September 2013 23:58, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org
<mailto:SeeWebsiteForEmail erdani.org>>

wrote:

 This is a good first step though, I'm happy to discuss this, but I
 think discussion about the practical application may also reveal
 design details at this level.
Absolutely. This is refreshing since I've gone through the cycle of "types must be integrated into allocators and with the global GC" ... -> ... "untyped allocators can actually be extricated" once. It is now great to revisit the assumptions again.

One matter I'm rather surprised didn't come up (I assumed everyone would be quite curious about it) is that the allocator does not store the allocated size, or at least does not allow it to be discovered easily. This is a definitely appropriate topic for the design at hand.

The malloc-style APIs go like:

void* malloc(size_t size);
void free(void*);

Note that the user doesn't pass the size to free(), which means the allocator is burdened with inferring the size of the block from the pointer efficiently. Given that most allocators make crucial strategic choices depending on the size requested, this API shortcoming is a bane of allocator writers everywhere, and a source of inefficiency in virtually all contemporary malloc-style allocators. This is most ironic since there is next to nothing that user code can do with a pointer without knowing the extent of memory addressed by it. (A notable exception is zero-terminated strings, another questionable design decision.) It all harkens back to Walter's claim of "C's biggest mistake" (with which I agree) of not defining a type for a bounded memory region.

Upon studying a few extant allocators and D's lore, I decided that in D things have evolved sufficiently to have the user pass the size upon deallocation as well:

void[] allocate(size_t size);
void deallocate(void[] buffer);

This is because the size of D objects is naturally known: classes have it in the classinfo, slices store it, and the few cases of using bald pointers for allocation are irrelevant and unrecommended. This all makes memory allocation in D a problem fundamentally simpler than in C, which is quite an interesting turn :o).
Yeah, that's definitely cool. I had previously just presumed the allocator would stash the size somewhere along with its allocation record (as usual), but this does potentially simplify some allocators.

 No, I certainly understand you mean 'them', but you lead to what I'm
 asking, how do these things get carried/passed around. Are they
 discreet, or will they invade argument lists everywhere?
Since there's a notion of "the default" allocator, there will be ways to push/pop user-defined allocators that temporarily (or permanently) replace the default allocator. This stage of the design is concerned with allowing users to define such user-defined allocators without having to implement them from scratch.
I get that about this stage of design. But my question was about the next stage, that is, usage/management of many instances of different allocators, how they are carried around, and how they are supplied to things that want to use them. Prior comments lead me to suspect you were in favour of exclusive usage of this new API, and the 'old' new keyword would continue to exist just to allocate GC memory. At least that's the impression I got. This leads to a presumption that parameter lists will become clogged with allocators if they must be micro-managed, and references to allocators will end up all over the place. I'm not into that. I want to know how it will look in practice. I appreciate you think that's off-topic, but I find it pretty relevant.

 Sure, but the common case is that the author will almost certainly use
 keyword 'new'. How can I affect that as a 3rd party?
 This would require me overriding the global allocator somehow... which
 you touched on earlier.
The way I'm thinking of this is to install/uninstall user-defined allocators that will satisfy calls to new. Since not all allocators support tracing, some would require the user to deallocate stuff manually.
So we need to keep delete then? I also think it may remain useful for allocators that don't clean up automatically.

     The library wouldn't need to worry as there would be the notion of a
     default allocator (probably backed by the existing GC).

 Right. So it's looking like like the ability to override the global
 allocator is a critical requirement.
Again, this is difficult to handle in the same conversation as "should we pass size upon deallocation". Yes, we need road laws, but it's hard to talk about those and engine lubrication in the same breath. For me, "critical" right now is to assess whether the untyped API misses an important category of allocators, what safety level it has (that's why "expand" is so different from "realloc"!) etc.
Then we should fork the thread? Or I'll just wait for the next one? I'm happy to do that, but as I said before, I suspect 'that' discussion will have impact on this one, so however awkward, I think it's relevant.

         And does that mean that applications+libraries are required to
         ALWAYS
         allocate through given allocator objects?


     Yes, they should.


 Then we make keyword 'new' redundant?
Probably not. Typed allocators will need to integrate with (and occasionally replace) the global shared GC.
'Typed allocators'? Are they somehow fundamentally separate from this discussion? They seem like just a layer above that handles construct/destruct/casts. Do you anticipate the typed allocation interface to be significantly more than just a light wrapper?

     The problem is that installing an allocator does not get to define
     what a pointer is and what a reference is.

 Why not?
Because that requires a language change. I'm not sure you realize but you move the goalposts all the time. We were talking within the context of libraries and installing allocators dynamically and all of a sudden you get to change what the compiler thinks a pointer is.
Deprecating new is definitely a language change. You said (down lower) "have allocators define their own pointers and reference types"... so _you_ talked about changing what the compiler thinks a pointer is; that wasn't my idea. I am just trying to roll with that train of thought, and consider possibilities.

I don't think it's fair to accuse me of moving goalposts when it's absolutely not clear where they are to begin with. I made the first reply in this thread, at which point there was no conversation to go from. I haven't declared any goalposts I'm aware of. I'm just humoring ideas, trying to adapt my thinking to your responses, and considering how it affects my requirements. I'm also trying to read into vague comments about practical usage that I can't properly visualise without examples. This is a very important topic to me long-term. It is the source of innumerable headaches and mess throughout my career. D needs to get this right.

I'm equally confused by your changing position on whether new is important or should be deprecated, whether global 'new' overrides will be supported, whether delete is assumed not to exist (which rules out non-collecting allocators in conjunction with the keyword allocators), whether allocation through an allocator object should be encouraged as standard in user code, and whether typed allocators will need to integrate with (and occasionally replace) the global GC, or not. Your position hasn't been static; I'm just trying to work out what it is, and then consider if I'm happy with it. You're implying I have a fixed position; I don't (I do have a few implementation requirements I think are important though, however they end up being expressed). I had no idea at all what you had in mind at the start of the thread; I'm just trying to get clarification, and generally throwing ideas in the pot.

     What I do hope to get to is to have allocators define their own
     pointers and reference types. User code that uses those will be
     guaranteed certain allocation behaviors.


 Interesting, will this mangle the pointer type, or the object type being
 pointed to? The latter is obviously not desirable. Does the former
 actually work in theory?
I don't think I understand what you mean. Honest, it seems to me you're confused about what you want, how it can be done, and what moving pieces are involved.
Apparently I don't understand what you mean. What does "have allocators define their own pointers and reference types" mean then? I presumed you mean that an allocator would have some influence on the type of the pointers they allocate, then it can be known how to handle them throughout their life. One example is Rust, which defines several pointer types to implement its
 scoping and borrowing semantics. I think it's not easy to achieve its
 purported semantics with fewer types, and time will tell whether
 programmers will put up with the complexity for the sake of the
 corresponding benefits.
I don't think Rust allows user-defined allocators to declare new pointer types, does it? I think a critical detail to keep in mind, is that (I suspect) people
 simply won't use it if it doesn't interface with keyword 'new'.
I think this is debatable. For one, languages such as Java and C++ still have built-in "new" but quite ubiquitously unrecommend their usage in user code. Far as I can tell that's been a successful campaign.
I'm not sure what you mean... if by success you mean people don't use new in C++, then yes, I have observed that pattern, and it's an absolute catastrophe in my experience. Every company/code base I've ever worked on has had some stupid new macro they roll themselves. And that always leads to problems when interacting with 3rd party libraries, and generic code in C++ can't work when there's lots of different ways to allocate memory. For two, there are allocations that don't even entail calls to new, such as
 array concatenation.
This is a very important area of discussion; how will user-defined allocators fit into implicit allocations? Shall we split off a few threads so they can be considered 'on topic'? There are a lot of things that need discussion. It's extremely common that I want to enforce that a library exist
 entirely within a designated heap. It can't fall back to the global GC.

 I work on platforms where memory is not unified. Different resources
 need to go into different heaps.
The proposed API would make that entirely possible. You seem to care only with the appendix "and mind you I only want to use new for all of those!" which is a bit out there. But nevertheless good feedback.
You often "quote" people, but totally paraphrase them such that it's nothing like what they said. I'm sure I've made the point a whole bunch of times that I care only that "libraries don't hard-code an allocator which can not be overridden by the application that uses them".

That may sound like "I want to use new for all those", because libraries invariably use new, and I can't control that. The reason for that is very likely nothing more than 'because it's a keyword'; it's also convenient, and perceived as the standard/proper way to allocate memory. I think that's water well under the bridge, probably in the ocean by now; new is out there, I don't think it can be revoked. So as far as I see it, unless you can present a practical solution to deprecate it, and a path to migrate existing 'new' calls, then new must be a part of the solution here; it must be overridable.

That seems to be the direction this thread is going anyway, so I'll drop this thread now. But you can't accuse me of changing my story when I'm only trying to clarify your story in the first place, which also seems to have changed (in favour of keeping new judging by the direction of the thread, and raising DIP46).
Sep 23 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-23 19:53, Andrei Alexandrescu wrote:

 I talked Walter's ear off a while ago at an ACCU conference about the
 notion that reference counting could be a switch passed to the compiler.
 Recently he's authored a DIP about the compiler inserting refcounting
 primitives in the generated code. Unfortunately I think the DIP is
 awfully incomplete and superficial (to the extent it would bring more
 damage to the language if any implementation effort would be invested in
 it at this point).

 Can it be done? Probably. But it would be an absolutely major effort.
I don't know if we're talking about the same things but a couple of us had a private email conversation about implementing ARC (Automatic Reference Counting) in the compiler as a result of my suggestion for adding support for Objective-C in D. A DIP was never created, at least I cannot find it at the wiki. If I recall correctly the conversation was never put in the public, which I think it should be.
 I think this is debatable. For one, languages such as Java and C++ still
 have built-in "new" but quite ubiquitously unrecommend their usage in
 user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java? -- /Jacob Carlborg
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 11:32 PM, Jacob Carlborg wrote:
 On 2013-09-23 19:53, Andrei Alexandrescu wrote:

 I talked Walter's ear off a while ago at an ACCU conference about the
 notion that reference counting could be a switch passed to the compiler.
 Recently he's authored a DIP about the compiler inserting refcounting
 primitives in the generated code. Unfortunately I think the DIP is
 awfully incomplete and superficial (to the extent it would bring more
 damage to the language if any implementation effort would be invested in
 it at this point).

 Can it be done? Probably. But it would be an absolutely major effort.
I don't know if we're talking about the same things but a couple of us had a private email conversation about implementing ARC (Automatic Reference Counting) in the compiler as a result of my suggestion for adding support for Objective-C in D. A DIP was never created, at least I cannot find it at the wiki. If I recall correctly the conversation was never put in the public, which I think it should be.
I know of this idea and discussed it with Walter. Our provisional conclusion was that the matter is quite a bit more involved than we initially thought, and ensuring safety would be extremely difficult.
 I think this is debatable. For one, languages such as Java and C++ still
 have built-in "new" but quite ubiquitously unrecommend their usage in
 user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom Andrei
Sep 24 2013
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 24.09.2013 17:29, schrieb Andrei Alexandrescu:
 On 9/23/13 11:32 PM, Jacob Carlborg wrote:
 On 2013-09-23 19:53, Andrei Alexandrescu wrote:

 I think this is debatable. For one, languages such as Java and C++ still
 have built-in "new" but quite ubiquitously unrecommend their usage in
 user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom Andrei
The top results are from around 2002, when most JVMs still had basic GC implementations and did not perform escape analysis. -- Paulo
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 9:12 AM, Paulo Pinto wrote:
 Am 24.09.2013 17:29, schrieb Andrei Alexandrescu:
 On 9/23/13 11:32 PM, Jacob Carlborg wrote:
 On 2013-09-23 19:53, Andrei Alexandrescu wrote:

 I think this is debatable. For one, languages such as Java and C++
 still
 have built-in "new" but quite ubiquitously unrecommend their usage in
 user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom Andrei
The top results are from around 2002, when most JVMs still had basic GC implementations and did not perform escape analysis.
I'm having in mind the polymorphism/flexibility argument. I'd forgotten about the efficiency argument. Andrei
Sep 24 2013
parent Paulo Pinto <pjmlp progtools.org> writes:
Am 24.09.2013 18:14, schrieb Andrei Alexandrescu:
 On 9/24/13 9:12 AM, Paulo Pinto wrote:
 Am 24.09.2013 17:29, schrieb Andrei Alexandrescu:
 On 9/23/13 11:32 PM, Jacob Carlborg wrote:
 On 2013-09-23 19:53, Andrei Alexandrescu wrote:

 I think this is debatable. For one, languages such as Java and C++
 still
 have built-in "new" but quite ubiquitously unrecommend their usage in
 user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom Andrei
The top results are from around 2002, when most JVMs still had basic GC implementations and did not perform escape analysis.
I'm having in mind the polymorphism/flexibility argument. I'd forgotten about the efficiency argument. Andrei
Ah ok. There I agree with you. The common trend is to favour interfaces for behaviour and make use of dependency injection instead of new. -- Paulo
Sep 24 2013
prev sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Tue, 24 Sep 2013 08:29:39 -0700
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/13 11:32 PM, Jacob Carlborg wrote:
 On 2013-09-23 19:53, Andrei Alexandrescu wrote:
 I think this is debatable. For one, languages such as Java and C++
 still have built-in "new" but quite ubiquitously unrecommend their
 usage in user code. Far as I can tell that's been a successful
 campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom
Uhh, that's JavaScript, not Java. From what I can tell, the arguments seem to be very specific to JavaScript, ie, centering around prototype-inheritance and implicit declarations. Paulo's comment about preferring dependency injection makes sense (Although they'd still get new'ed anyway, just at some other point, right?). Were there other reasons?
Sep 24 2013
prev sibling parent reply Robert <jfanatiker gmx.at> writes:
On Mon, 2013-09-23 at 10:53 -0700, Andrei Alexandrescu wrote:
 void deallocate(void[] buffer);
 
 This is because the size of D objects is naturally known: classes
 have 
 it in the classinfo, slices store it, and the few cases of using bald 
 pointers for allocation are irrelevant and unrecommended.
One concern though: void[] is just a slice, who guarantees that the size matches the one that was allocated? Is the allocator assumed to handle this correctly:

auto arr = allocate(100);
auto arr2 = arr[50..100];
arr = arr[0..50];
deallocate(arr);
deallocate(arr2);

? Or is the user simply obliged to use the originally returned range? Best regards, Robert
Sep 24 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 2:49 AM, Robert wrote:
 On Mon, 2013-09-23 at 10:53 -0700, Andrei Alexandrescu wrote:
 void deallocate(void[] buffer);

 This is because the size of D objects is naturally known: classes
 have
 it in the classinfo, slices store it, and the few cases of using bald
 pointers for allocation are irrelevant and unrecommended.
One concern though: void[] is just a slice, who guarantees that the size matches the one that was allocated? Is the allocator assumed to handle this correctly:

auto arr = allocate(100);
auto arr2 = arr[50..100];
arr = arr[0..50];
deallocate(arr);
deallocate(arr2);

? Or is the user simply obliged to use the originally returned range?
Initially I thought the user must pass the exact slice returned by allocate(). Then I thought we can relax the requirement to any length between the _requested_ size in allocate and the _received_ length of the slice. For example: auto arr = a.allocate(11); Most allocators will allocate more than 11 bytes. Say the returned arr has length 64. Then either a.deallocate(arr) or a.deallocate(arr[0 .. 11]) would work. In fact a.deallocate(arr[0 .. n]) would work for any n between 11 and 64 inclusive. This is because the allocator will apply whatever rounding mechanism it has applied to the size when allocating, to the deallocated size. If the size passed for deallocation is smaller than the size requested, most allocators won't handle that properly. Interior pointers are also not handled by most allocators. Andrei
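The rounding rule described above can be sketched with a hypothetical size-class allocator. This is only an illustration of the semantics, not part of the proposed API; the type name and the 32-byte class size below are mine:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical sketch: an allocator that rounds every request up to a
// 32-byte size class.  Any slice whose length rounds to the same class
// as the original request deallocates correctly, because deallocate()
// applies the same rounding that allocate() did.
struct SizeClassAllocator
{
    enum blockSize = 32;

    void[] allocate(size_t s)
    {
        immutable rounded = (s + blockSize - 1) / blockSize * blockSize;
        auto p = malloc(rounded);
        if (p is null) return null;
        return p[0 .. rounded];
    }

    void deallocate(void[] b)
    {
        // Any length in [requested, received] rounds to the same class,
        // so freeing via b.ptr is correct for all of them.
        free(b.ptr);
    }
}

void main()
{
    SizeClassAllocator a;
    auto arr = a.allocate(11);
    assert(arr.length == 32);    // received more than requested
    a.deallocate(arr[0 .. 11]);  // passing back the requested size also works
}
```

Passing a slice shorter than the requested size, or one starting at an interior pointer, would round to the wrong class, which is why those cases are excluded.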
Sep 24 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 23 September 2013 at 13:58:42 UTC, Andrei Alexandrescu 
wrote:
 If the author forgets and hardcodes a library to use malloc(), 
 I have no way around that.
We should consider that the lib writer will use new and/or automatic allocation (like for closures). In practice, this is what will happen.
 new will still be used to tap into the global shared GC. 
 std.allocator will provide other means of allocating memory.
That utterly sucks, and we can do better using type information.
 At this point collect() is only implemented by the global GC. 
 It is possible I'll drop it from the final design. However, 
 it's also possible that collect() will be properly defined as 
 "collect all objects allocated within this particular allocator 
 that are not referred from any objects also allocated within 
 this allocator". I think that's a useful definition.
I don't think collect is useful with no type information (not even if the data may contain pointers).
 The D language has no idea what a range is. The notion is 
 completely defined in std.range.
foreach has some idea of what a range is.
Sep 23 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/22/2013 4:49 PM, Andrei Alexandrescu wrote:
 Destroy!
It should be compatible with: http://wiki.dlang.org/DIP46 or the DIP should be adaptable to work with std.allocator.
Sep 22 2013
parent "Chad Joan" <chadjoan gmail.com> writes:
On Monday, 23 September 2013 at 03:58:51 UTC, Walter Bright wrote:
 On 9/22/2013 4:49 PM, Andrei Alexandrescu wrote:
 Destroy!
It should be compatible with: http://wiki.dlang.org/DIP46 or the DIP should be adaptable to work with std.allocator.
It seems to me like DIP46 attempts to address Manu's concerns, and is (mostly) orthogonal to Andrei's designs.

I would wish the current GC to be defined according to Andrei's new API. It should be the one that handles all allocations by default at program start.

I have a desire for region based allocation, and in this context I want to be able to push the thread's default allocator onto a "thread allocator stack" and thus make all D allocations (new's, string concats, etc) use the region allocator. So I pretty much want DIP46, except I want "pushAllocator(...)" and "popAllocator()" instead of "gc_push()" and "gc_pop()". The region allocator would also be defined according to Andrei's API. By default, the region allocator should request giant chunks of memory from the GC (or whatever is past it in the thread's allocator stack). This should be manually adjustable so that I can point it at a malloc-forwarding allocator (Mallocator).

The allocator stack at program start should probably look like this:

[GCAllocator] <- current
[Mallocator]  <system>

My server might look like this:

// ----------------------
module main;

import core.thread;
import thread_main;

void main(string[] args)
{
    ... initialization/finalization (thanks scope(exit)!) ...

    while ( true )
    {
        if ( !handshake(connection, ...) )
            continue;

        auto worker = new Thread(&threadMain);
        worker.start();
    }
}

// ----------------------
module thread_main;

import std.allocator;

void threadMain()
{
    // Construct a region allocator that uses the nearest
    // Mallocator to allocate regions.
    auto regionAllocator = new RegionAllocator(getMallocator());

    // Use it by default for all of this thread's allocations.
    pushAllocator( regionAllocator );

    scope(exit)
    {
        // Free 'foo' and everything still allocated by
        // the call to doStuff().
        regionAllocator.deallocateAll();

        // Return control to the GC.
        popAllocator();
    }

    auto foo = new ubyte[40]; // Region allocated.

    doStuff();
}

/// Gets the nearest Mallocator in the allocator stack, if it's
/// available.  Returns the allocator stack's front if there is
/// none.
Mallocator getMallocator()
{
    auto allocatorStackCopy = allocatorStack.save;
    auto result = allocatorStackCopy.front;
    while ( !allocatorStackCopy.empty )
    {
        // I just realized that allocators should be a class
        // type and not a struct type, otherwise this function
        // doesn't work!
        // <failure>
        // if (is(typeof(allocatorStackCopy.front) == Mallocator))
        // </failure>
        auto mallocator = cast(Mallocator)allocatorStackCopy.front;
        if ( mallocator !is null )
            result = mallocator;
        allocatorStackCopy.popFront();
    }
    return result;
}

// ----------------------

Now, once inside doStuff(), the allocator stack looks like so:

[RegionAllocator] -. <- Current
[GCAllocator]      |
[Mallocator] <-----'  <system>

// ----------------------

I think I might have managed to demonstrate that Manu's concerns (and DIP46) have possible implications for Andrei's design: allocators need dynamic dispatch to be reassignable at runtime, and classes with inheritance seem like a natural choice for this. I like my no-cost abstractions though, so I hate to admit the dynamic nature of this. Is there a better way? I definitely don't want 3rd parties to, at compile-time (link-time?), avoid my custom allocators by default.

Small aside: I don't like that statelessOp has to be nothrow. Is there a reason that we can't pre-allocate a fixed amount of memory for exception objects, including at least one "can't allocate proper exception" fatal exception, and then throw the pre-allocated NoMemoryForException exception if we don't have enough memory to handle the throws that may occur?
Sep 23 2013
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 23.09.2013 01:49, schrieb Andrei Alexandrescu:
 Hello,


 2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"? -- Kind Regards Benjamin Thaut
Sep 22 2013
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/23/13, Benjamin Thaut <code benjamin-thaut.de> wrote:
 Why "ubyte[]" and not "void[]"?
Probably to make the GC avoid scanning into the array.
Sep 23 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 4:39 AM, Andrej Mitrovic wrote:
 On 9/23/13, Benjamin Thaut <code benjamin-thaut.de> wrote:
 Why "ubyte[]" and not "void[]"?
Probably to make the GC avoid scanning into the array.
No, that's determined at runtime by how the blocks are allocated. Andrei
Sep 23 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/22/13 10:20 PM, Benjamin Thaut wrote:
 Am 23.09.2013 01:49, schrieb Andrei Alexandrescu:
 Hello,


 2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level. ubyte[] == "these are octets" Andrei
Sep 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 24 September 2013 00:04, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/22/13 10:20 PM, Benjamin Thaut wrote:

 Am 23.09.2013 01:49, schrieb Andrei Alexandrescu:

 Hello,


 2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level. ubyte[] == "these are octets"
Isn't that what void[] also means? Except it says "these are un-typed octets, ie, not a sequence of typed integers in the range 0-255".
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 7:07 AM, Manu wrote:
 On 24 September 2013 00:04, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 9/22/13 10:20 PM, Benjamin Thaut wrote:

         Am 23.09.2013 01:49, schrieb Andrei Alexandrescu:

             Hello,


             2. Untyped allocator - traffics exclusively in ubyte[].


         Why "ubyte[]" and not "void[]"?


     It's the logical choice at this level.

     ubyte[] == "these are octets"


 Isn't that what void[] also means?
 Except it says "these are un-typed octets, ie, not a sequence of typed
 integers in the range 0-255".
I think void[] means "objects of unknown type". Andrei
Sep 23 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 23.09.2013 16:16, schrieb Andrei Alexandrescu:
 On 9/23/13 7:07 AM, Manu wrote:
 On 24 September 2013 00:04, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 9/22/13 10:20 PM, Benjamin Thaut wrote:

         Am 23.09.2013 01:49, schrieb Andrei Alexandrescu:

             Hello,


             2. Untyped allocator - traffics exclusively in ubyte[].


         Why "ubyte[]" and not "void[]"?


     It's the logical choice at this level.

     ubyte[] == "these are octets"


 Isn't that what void[] also means?
 Except it says "these are un-typed octets, ie, not a sequence of typed
 integers in the range 0-255".
I think void[] means "objects of unknown type". Andrei
I always understood void[] as a block of unknown data, which an allocator should return in my opinion. What's the point of "void" having a size in D if we still do it the C way? In my opinion ubyte[] is an array of values in the range of 0-255, like Manu says. Also if you get a ubyte[] you might get the impression that it is initialized to all zeros or something, which might not be true for all allocators (performance ...). If you get a void[] you know all bets are off, and you have to check if the allocator preinitialized it or not. Kind Regards Benjamin Thaut
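For reference, the language-level difference between the two candidates: any slice converts implicitly to void[] (whose length still counts bytes), while converting to ubyte[] requires an explicit cast. A small sketch:

```d
void main()
{
    int[] ints = [1, 2, 3];

    // Implicit: void[] means "untyped data", so any T[] converts to it.
    void[] raw = ints;
    assert(raw.length == 3 * int.sizeof); // length is in bytes: 12

    // ubyte[] b = ints;               // error: no implicit conversion
    ubyte[] b = cast(ubyte[]) ints;    // reinterpret as octets, needs a cast
    assert(b.length == 12);
}
```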
Sep 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 9:47 AM, Benjamin Thaut wrote:
 I always understood void[] as a block of unknown data, which an allocator
 should return in my opinion. What's the point of "void" having a size in
 D if we still do it the C way? In my opinion ubyte[] is an array of
 values in the range of 0-255 like manu says. Also if you get a ubyte[]
 you might get the opinion that it is initialized to all zeros or
 something. Which might not be true for all allocators (performance ...)
 If you get a void[] you know, all bets are off, and you have to check if
 the allocator preinitialized it or not.
You might be right. For example, ubyte[] allows arithmetic on its elements, which is something one shouldn't ever care to do in an allocation library. I'm unclear on what void[] means starting from its semantics. That said, I replaced ubyte[] with void[] throughout my existing code and it works. Andrei
Sep 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/23/2013 10:08 AM, Andrei Alexandrescu wrote:
 I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a block of untyped data. void[] should be ideal for a primitive allocator.
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 1:04 PM, Walter Bright wrote:
 On 9/23/2013 10:08 AM, Andrei Alexandrescu wrote:
 I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a block of untyped data. void[] should be ideal for a primitive allocator.
Great, made that change, it all works. Does void[] make any promise about alignment? Andrei
Sep 23 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 9/23/2013 1:18 PM, Andrei Alexandrescu wrote:
 Does void[] make any promise about alignment?
No.
Sep 23 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 23 September 2013 at 20:18:56 UTC, Andrei Alexandrescu 
wrote:
 On 9/23/13 1:04 PM, Walter Bright wrote:
 On 9/23/2013 10:08 AM, Andrei Alexandrescu wrote:
 I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a block of untyped data. void[] should be ideal for a primitive allocator.
Great, made that change, it all works. Does void[] make any promise about alignment? Andrei
void[] dont make any promise about anything, that is the whole point.
Sep 23 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 9/23/2013 9:47 AM, Benjamin Thaut wrote:
 I always understood void[] as a block of unknown data.
Untyped data, not unknown data.
Sep 23 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-23 01:49, Andrei Alexandrescu wrote:

 I am making good progress on the design of std.allocator, and I am
 optimistic about the way it turns out. D's introspection capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a whole new level.
I agree with Manu here. I thought the whole point was to come up with a design and API for how the allocators are supposed to be used, and how they integrate with user code. Here's a quick idea:

I can think of at least four different usage patterns for how an allocator can be set. Hopefully this will be backwards compatible as well.

1. Globally - The same allocator is used in the whole application by all threads

2. Thread local - The allocator is set per thread. Different threads can use different allocators

3. Function local - This is the most intrusive. Sets the allocator for a given function call

4. Block local - The allocator is used during a given block

This will require some interaction with the runtime and library and probably use OOP. We define two module variables in some module in druntime, say rt.allocator:

module rt.allocator:

shared Allocator globalAllocator;
Allocator allocator;

shared this ()
{
    globalAllocator = GCAllocator.allocator;
}

The global allocator is, by default, a GC allocator (the GC we're currently using). For each thread we set the thread local allocator to be the same as the global allocator:

allocator = globalAllocator;

By default all code will use "new" to allocate memory. druntime will have three functions which the "new" keyword is lowered to. They could look something like this:

extern (C) ubyte[] _d_tl_new (size_t size)
{
    return _d_new(size, rt.allocator.allocator);
}

extern (C) ubyte[] _d_global_new (size_t size)
{
    return _d_new(size, rt.allocator.globalAllocator);
}

extern (C) ubyte[] _d_new (size_t size, Allocator allocator)
{
    if (auto memory = allocator.allocate(size))
        return memory;

    onOutOfMemoryError();
}

By default "new" is lowered to a call to "_d_tl_new", which will allocate using the thread local allocator, which is by default the same as the global allocator, that is, the GC. In this way we maintain the current method of allocating memory.

When using "new shared", it's lowered to a function call to "_d_global_new", which uses the global allocator.

For block local allocator an ideal syntax would be:

allocator (GCAllocator.allocator)
{
    // explicitly use the GC allocator within this block.
}

If that's not possible we can define a function looking like this:

void useAllocator (alias allocator, alias block) ()
{
    auto currentAllocator = core.allocator.allocator;
    scope (exit)
        core.allocator.allocator = currentAllocator;

    block();
}

Which is used, something like this:

useAllocator!(GCAllocator.allocator, {
    // explicitly use the GC allocator within this block.
});

Or alternatively, using a struct:

struct UseAllocator
{
    private Allocator currentAlloctor;

    this (Allocator allocator)
    {
        currentAlloctor = core.allocator.allocator;
    }

    ~this ()
    {
        rt.allocator.allocator = currentAlloctor;
    }
}

UseAllocator useAllocator (Allocator allocator)
{
    return UseAllocator(allocator);
}

{
    auto _ = useAllocator(GCAllocator.allocator);
    // explicitly use the GC allocator within this block.
}

Of course it's also possible to explicitly set the thread local or global allocator for more fine grained control. The above functions are just for convenience and to make sure the allocator is restored.

Function local allocator would just be a function taking an allocator as a parameter, example:

void process (Allocator) (int a, Allocator allocator = core.allocator.allocator)
{
    auto _ = useAllocator(allocator);
    // use the given allocator throughout the rest of this function scope
}

Some bikeshedding:
 struct NullAllocator
 {
      enum alignment = real.alignof;
      enum size_t available = 0;
      ubyte[] allocate(size_t s) shared { return null; }
      bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
      { assert(b is null); return false; }
      bool reallocate(ref ubyte[] b, size_t) shared
      { assert(b is null); return false; }
      void deallocate(ubyte[] b) shared { assert(b is null); }
      void collect() shared { }
      void deallocateAll() shared { }
      static shared NullAllocator it;
 }
I would put the asserts in an "in" block.
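The suggestion would move the asserts out of the function bodies and into contracts, e.g. for expand (a sketch using the 2013-era in/body syntax; later D spells the body keyword `do`):

```d
struct NullAllocator
{
    enum alignment = real.alignof;
    enum size_t available = 0;

    // The precondition documents the API requirement (a NullAllocator
    // never hands out memory, so b must be null) instead of burying it
    // in the body.
    bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
    in
    {
        assert(b is null);
    }
    body
    {
        return false;
    }
}

void main()
{
    shared NullAllocator n;
    ubyte[] b;
    assert(!n.expand(b, 0, 10)); // precondition holds, expansion still fails
}
```

A side effect of this placement is that the checks disappear entirely when contracts are compiled out (-release), while asserts in the body are governed by the same switch but don't read as part of the interface.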
 * "it" is an interesting artifact. Allocators may or may not hold
 per-instance state. Those that don't are required to define a global
 shared or thread-local singleton called "it" that will be used for all
 calls related to allocation. Of course, it is preferred for maximum
 flexibility that "it" is shared so as to clarify that the allocator is
 safe to use among different threads. In the case of the NullAllocator,
 this is obviously true, but non-trivial examples are the malloc-based
 allocator and the global garbage collector, both of which implement
 thread-safe allocation (at great effort no less).
You have to come up with a better name than "it". If it's some kind of singleton instance then the standard name is usually "instance". Another suggestion would be "allocator". -- /Jacob Carlborg
Sep 23 2013
next sibling parent reply "qznc" <qznc web.de> writes:
On Monday, 23 September 2013 at 07:31:46 UTC, Jacob Carlborg 
wrote:
 On 2013-09-23 01:49, Andrei Alexandrescu wrote:

 I am making good progress on the design of std.allocator, and 
 I am
 optimistic about the way it turns out. D's introspection 
 capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist 
 allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a 
 whole new level.
I agree with Manu here. I thought the whole point was to come up with a design and API how the allocators are supposed to be used. How they integrate with user code. Here's a quick idea: I can think of at least four different usage patterns how an allocator can be set. Hopefully this will be backwards compatible as well. 1. Globally - The same allocator is used in the whole application by all threads 2. Thread local - The allocator is set per thread. Different threads can used different allocators 3. Function local - This is the most intrusive. Sets the allocator for a given function call 4. Block local - The allocator is used during a given block
5. Class local - The allocator is used for specific types (e.g. ASTNode in a compiler)

6. Class-hierarchy - The allocator is used for a specific type hierarchy (e.g. ASTNode might have sub classes Statement, BinOp, etc)

7. Container local - The C++ way which binds allocation to a wrapping container

8. Task local - like Thread local but for std.parallelism.Task

That's it for now. This is quite a long list, which means it is probably exhaustive. There should be a generic approach which encompasses at least those cases.
Sep 23 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-23 11:31, qznc wrote:

 5. Class local - The allocator is used for specific types (e.g. ASTNode
 in a compiler)

 6. Class-hierarchy - The allocator is used for a specific type hierarchy
 (e.g. ASTNode might have sub classes Statement,BinOp,etc)

 7. Container local - The C++ way which binds allocation to a wrapping
 container

 8. Task local - like Thread local but for std.parallelism.Task

 That's it for now.

 This is quite a long list, which means it is probably exhaustive. There
 should be a generic approach which encompasses at least those cases.
That's a good addition to the list. -- /Jacob Carlborg
Sep 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 23 September 2013 23:38, Jacob Carlborg <doob me.com> wrote:

 On 2013-09-23 11:31, qznc wrote:

  5. Class local - The allocator is used for specific types (e.g. ASTNode
 in a compiler)

 6. Class-hierarchy - The allocator is used for a specific type hierarchy
 (e.g. ASTNode might have sub classes Statement,BinOp,etc)

 7. Container local - The C++ way which binds allocation to a wrapping
 container

 8. Task local - like Thread local but for std.parallelism.Task

 That's it for now.

 This is quite a long list, which means it is probably exhaustive. There
 should be a generic approach which encompasses at least those cases.
That's a good addition to the list.
Another situation that I've encountered on a few occasions... Imagine I declare a particular allocator for my application; when dealing with 3rd party libs, I expect them to allocate within my application's heap. Seems like a situation where I might set a global allocator...

Mr 3rd party library also has its own allocators though, things like pools or groups which it uses explicitly within its code. I don't actually want to override these allocators, since they're effectively a higher-level construct, but I *do* want to override these allocators' memory source. Ie, they should allocate their pools from my application's heap, rather than the default heap.

So in this way, it's important to be able to set overrides at multiple levels.
Sep 23 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-09-23 16:05, Manu wrote:

 Another situation that I've encountered on a few occasions...
 Imagine I declare a particular allocator for my application, when
 dealing with 3rd party libs, I expect it to allocate within my
 application's heap. Seems like a situation where I might set a global
 allocator...
 Mr 3rd party library also has its own allocators though, things like
 pools or groups which it uses explicitly within it's code.
 I don't actually want to override these allocators, since they're
 effectively a higher-level construct, but I *do* want to override these
 allocator's memory source. Ie, they should allocate their pools from my
 application's heap, rather than the default heap.
 So in this way, it's important to be able to set overrides at multiple
 levels.
You might be able to combine two different allocators. Basically create a new allocator that somehow extends the third party allocator but allocates in your own heap instead. -- /Jacob Carlborg
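This combination can be sketched with layered allocators (all names hypothetical): the "third-party" pool keeps its pooling logic, but draws its backing memory from whatever parent allocator the application hands it.

```d
import core.stdc.stdlib : malloc;

// Stand-in for the application's heap.
struct Mallocator
{
    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }
}

// Stand-in for the third-party library's pool: its pooling logic is
// untouched, only its memory source is parameterized.
struct Pool(Parent)
{
    Parent parent;
    void[] buffer;
    size_t used;

    void[] allocate(size_t n)
    {
        if (buffer is null)
            buffer = parent.allocate(1024 * 1024); // drawn from the app's heap
        if (used + n > buffer.length)
            return null;
        auto b = buffer[used .. used + n];
        used += n;
        return b;
    }
}

void main()
{
    Pool!Mallocator pool;           // the library pool, backed by the app heap
    auto a = pool.allocate(100);
    auto b = pool.allocate(200);
    assert(a.length == 100 && b.length == 200);
}
```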
Sep 23 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 12:31 AM, Jacob Carlborg wrote:
 On 2013-09-23 01:49, Andrei Alexandrescu wrote:

 I am making good progress on the design of std.allocator, and I am
 optimistic about the way it turns out. D's introspection capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a whole new
 level.
I agree with Manu here. I thought the whole point was to come up with a design and API for how the allocators are supposed to be used, and how they integrate with user code.
This is a disconnect. I say "Well, I'm exploring car engines and this Otto cycle looks promising, here's how a transmission would work, and the brake system would be hydraulic..." and you say "but I want to know how to drive a car, and road regulations, and laws!" Both are important, but they are very difficult to handle simultaneously in the same conversation. Andrei
Sep 23 2013
prev sibling next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu 
wrote:
 ...
In general I like it, though I do agree with the concerns mentioned: without speculation about the high-level API and usage examples it is rather hard to evaluate its practical applicability. For example, you have mentioned somewhere in the comments that it is planned to embed the allocator as part of the template site, like in C++, and that does not really sound like a good idea.
Sep 23 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 1:03 AM, Dicebot wrote:
 On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu wrote:
 ...
In general I like it though I do agree with concerns mentioned that without speculation about high-level API and usage examples it is rather hard to evaluate its practical applicability. For example, you have mentioned somewhere in comments that it is planned to embed allocator as part of template site like in C++ and it does not really sound a good idea.
That is undecided at this point. It's possible that we collectively get convinced the efficiency impact of indirect calls is not too high, in which case we can have containers using a dynamic allocator interface. Andrei
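The "dynamic allocator interface" option could look roughly like this (a sketch under assumed names, not the proposal's API): containers hold an interface reference and pay one indirect call per allocation, instead of baking the allocator into their template type.

```d
// Hypothetical dynamic interface a container could store by reference.
interface IAllocator
{
    void[] allocate(size_t n);
    void deallocate(void[] b);
}

// Adapts any statically-typed allocator to the dynamic interface.
class AllocatorObject(A) : IAllocator
{
    A impl;
    void[] allocate(size_t n) { return impl.allocate(n); }
    void deallocate(void[] b) { impl.deallocate(b); }
}

// A trivial static allocator for demonstration.
struct GCAllocator
{
    void[] allocate(size_t n) { return new void[](n); }
    void deallocate(void[] b) { /* the GC reclaims it */ }
}

void main()
{
    IAllocator a = new AllocatorObject!GCAllocator;
    auto b = a.allocate(32); // one virtual call - the efficiency question
    assert(b.length == 32);
}
```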
Sep 23 2013
prev sibling next sibling parent reply "ponce" <contact gm3sfrommrs.fr> writes:
Great news! It looks like a big improvement on awkward C++
allocators.

(For what it's worth I have a working implementation of aligned 
malloc/free/realloc here 
https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which 
can be the basis for an allocator layered upon another)
Sep 23 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 7:22 AM, ponce wrote:
 Great news! It looks like a big improvement on awkward C++ allocators.

 (For what it's worth I have a working implementation of aligned
 malloc/free/realloc here
 https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which can be
 the basis for an allocator layered upon another)
Awesome, thanks. Will look at it. Andrei
Sep 23 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 7:22 AM, ponce wrote:
 Great news! It looks like a big improvement on awkward C++ allocators.

 (For what it's worth I have a working implementation of aligned
 malloc/free/realloc here
 https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which can be
 the basis for an allocator layered upon another)
I gave this a read, nice work. One question: what circumstances require run-time alignment values, and what values would those be? I'm currently under the assumption that alignments are known during compilation. Thanks, Andrei
Sep 23 2013
parent reply "ponce" <contact gm3sfrommrs.fr> writes:
On Monday, 23 September 2013 at 15:45:25 UTC, Andrei Alexandrescu 
wrote:
 On 9/23/13 7:22 AM, ponce wrote:
 Great news! It looks like a big improvement on awkward C++ 
 allocators.

 (For what it's worth I have a working implementation of aligned
 malloc/free/realloc here
 https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, 
 which can be
 the basis for an allocator layered upon another)
I gave this a read, nice work. One question: what circumstances require run-time alignment values, and what values would those be? I'm currently under the assumption that alignments are known during compilation. Thanks, Andrei
I don't know of a use case for run-time alignment values.
Sep 23 2013
parent "ponce" <contact gam3sfrommars.fr> writes:
On Monday, 23 September 2013 at 15:56:11 UTC, ponce wrote:
 On Monday, 23 September 2013 at 15:45:25 UTC, Andrei 
 Alexandrescu wrote:
 On 9/23/13 7:22 AM, ponce wrote:

 One question: what circumstances require run-time alignment 
 values, and what values would those be? I'm currently under 
 the assumption that alignments are known during compilation.


 Thanks,

 Andrei
I don't know of a use case for run-time alignment values.
Actually there is at least one scenario in which run-time alignment values are required. Using "true sharing" for OpenCL memory objects requires first getting the necessary alignment through an API call, then passing accordingly aligned buffers. It can require alignment as high as 4096 bytes. Same business with OpenGL. I just stumbled upon this problem in C++, where it prevents the use of a custom C++ STL allocator.
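One common way to support a run-time alignment value on top of plain malloc (a sketch with hypothetical helper names, not any library's API): over-allocate, round the pointer up, and stash the raw pointer just below the returned block so it can be freed later.

```d
import core.stdc.stdlib : malloc, free;

void[] alignedAllocate(size_t n, size_t alignment)
{
    // alignment must be a nonzero power of two (e.g. 4096 for OpenCL)
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    auto raw = malloc(n + alignment + size_t.sizeof);
    if (raw is null) return null;
    // Leave room for the stashed raw pointer, then round up.
    auto addr = cast(size_t) raw + size_t.sizeof;
    addr = (addr + alignment - 1) & ~(alignment - 1);
    (cast(void**) addr)[-1] = raw;   // remember what malloc returned
    return (cast(void*) addr)[0 .. n];
}

void alignedFree(void[] b)
{
    if (b.ptr !is null)
        free((cast(void**) b.ptr)[-1]);
}

void main()
{
    auto b = alignedAllocate(100, 4096); // run-time alignment, OpenCL-style
    assert(b.length == 100 && (cast(size_t) b.ptr & 4095) == 0);
    alignedFree(b);
}
```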
Oct 24 2013
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
We should really deprecate the new keyword. It'd break like all 
code ever, but with D's templates, there's no need for it, and 
when it is there, it is going to spark problems about replacing 
global allocators or the library allocators being second class 
citizens.

Maybe we could minimize the breakage by just rewriting the 
keyword into a library function call, which could then be 
overridden with local variables or imports  via normal scoping.
Sep 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 8:02 AM, Adam D. Ruppe wrote:
 We should really deprecate the new keyword. It'd break like all code
 ever, but with D's templates, there's no need for it, and when it is
 there, it is going to spark problems about replacing global allocators
 or the library allocators being second class citizens.

 Maybe we could minimize the breakage by just rewriting the keyword into
 a library function call, which could then be overridden with local
 variables or imports  via normal scoping.
I'd think new already is translated into a library call. Walter? Andrei
Sep 23 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-23 17:03, Andrei Alexandrescu wrote:

 I'd think new already is translated into a library call. Walter?
Yes, "new" is lowered to a couple different runtime functions. Here's a list: http://wiki.dlang.org/Runtime_Hooks -- /Jacob Carlborg
Sep 23 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-09-23 20:59, Jacob Carlborg wrote:

 http://wiki.dlang.org/Runtime_Hooks
Search for "new". -- /Jacob Carlborg
Sep 23 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 11:59 AM, Jacob Carlborg wrote:
 On 2013-09-23 17:03, Andrei Alexandrescu wrote:

 I'd think new already is translated into a library call. Walter?
Yes, "new" is lowered to a couple different runtime functions. Here's a list: http://wiki.dlang.org/Runtime_Hooks
Thanks, this is great. Currently "new" is fail because it calls functions passing TypeInfo objects, instead of calling type-parameterized functions. We must change that to have "new T(args)" lower into ".opNew!T(args)". Then object.d defines that function. Andrei
Sep 23 2013
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 23.09.2013 17:02, schrieb Adam D. Ruppe:
 We should really deprecate the new keyword. It'd break like all code
 ever, but with D's templates, there's no need for it, and when it is
 there, it is going to spark problems about replacing global allocators
 or the library allocators being second class citizens.
It's not possible to replace new with a library function in all cases. There are many examples where it breaks (I tried it, believe me). Just let me give a few:

class A
{
    class B
    {
    }

    B m_b;

    this()
    {
        // uses the surrounding context,
        // not possible with a library function
        m_b = new B();
    }
}

Also there is the problem that D does not have perfect forwarding. That means if "new" is a library function it will cause additional copies / moves of structs, which is not the case with the built-in new.

Next there are implicit conversions of literals. Those work just fine with the built-in new, but will break with a templated library new. E.g.

class C
{
    this(ubyte a, ubyte b) { }
}

new C(5, 3); // fine
New!C(5, 3); // Error: cannot implicitly convert int to ubyte

Unless of course you want to write a template metaprogram that does those implicit conversions. Although the problem would remain that you can't know if the value you get comes from a literal or from a function call, variable, etc.

The next thing is arrays. While you get the nice array syntax with the built-in new, a library new just looks ugly.

new int[5];

vs.

NewArray!int(5); // Can't use New! here because it would conflict with the New! for creating objects / structs

I have been using library-implemented new / delete for over a year now, and it's annoying and makes the language look ugly and feel strange to use. See my allocator and New / Delete implementation here: https://github.com/Ingrater/druntime/blob/master/src/core/allocator.d

I would rather want new to be overloadable and have 2 sets of parameters:

new (allocator)(arg1, arg2)

Where "allocator" would go to the overloaded version of new and "arg1" and "arg2" will be passed to the constructor.

I think it's a really bad idea to deprecate new. Replacing Delete works just fine with a library function though.

Kind Regards
Benjamin Thaut
Sep 23 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 I would rather want new to be overloadable and have 2 sets of 
 parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new and 
 "arg1" and "arg2" will be passed to the constructor.
+1
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 3:11 PM, Namespace wrote:
 I would rather want new to be overloadable and have 2 sets of parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new and "arg1"
 and "arg2" will be passed to the constructor.
+1
We're banning that syntax of new. Andrei
Sep 24 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 September 2013 at 22:39:27 UTC, Andrei 
Alexandrescu wrote:
 On 9/24/13 3:11 PM, Namespace wrote:
 I would rather want new to be overloadable and have 2 sets of 
 parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new 
 and "arg1"
 and "arg2" will be passed to the constructor.
+1
We're banning that syntax of new.
It seems to me like typed allocators should try to fit in rather than redefine everything.
Sep 24 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 6:42 PM, deadalnix wrote:
 On Tuesday, 24 September 2013 at 22:39:27 UTC, Andrei Alexandrescu wrote:
 On 9/24/13 3:11 PM, Namespace wrote:
 I would rather want new to be overloadable and have 2 sets of
 parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new and "arg1"
 and "arg2" will be passed to the constructor.
+1
We're banning that syntax of new.
It seem to me like typed allocator should try to fit in rather than redefine everything.
The banning is orthogonal to allocators. Andrei
Sep 24 2013
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Tuesday, 24 September 2013 at 22:39:27 UTC, Andrei 
Alexandrescu wrote:
 On 9/24/13 3:11 PM, Namespace wrote:
 I would rather want new to be overloadable and have 2 sets of 
 parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new 
 and "arg1"
 and "arg2" will be passed to the constructor.
+1
We're banning that syntax of new. Andrei
What's wrong with this syntax? And: *That* syntax or complete 'new' as built-in feature?
Sep 25 2013
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 25.09.2013 00:39, schrieb Andrei Alexandrescu:
 On 9/24/13 3:11 PM, Namespace wrote:
 I would rather want new to be overloadable and have 2 sets of parameters

 new (allocator)(arg1, arg2)

 Where "allocator" would go to the overloaded version of new and "arg1"
 and "arg2" will be passed to the constructor.
+1
We're banning that syntax of new. Andrei
I always love it when people just plain ignore all the arguments I made ... -- Kind Regards Benjamin Thaut
Sep 25 2013
parent reply Johannes Pfau <nospam example.com> writes:
Am Wed, 25 Sep 2013 17:22:28 +0200
schrieb Benjamin Thaut <code benjamin-thaut.de>:

 Am 25.09.2013 00:39, schrieb Andrei Alexandrescu:
 We're banning that syntax of new.

 Andrei
I always love it when people just plain ignore all the arguments I made ...
I think there's a misunderstanding here. new(...) already was a valid syntax in D1, and I think Andrei is referring to that: http://www.digitalmars.com/d/1.0/class.html#allocators

And we really should get rid of that.

The nested class / context pointer argument is a pretty strong one. Of course we could introduce builtins to deal with that, but there's not much benefit to that. Having the compiler rewrite

"new(MallocAllocator) MyClass(a,b,c)"

to

"void[] __buf = MallocAllocator.it.allocate(MyClass.instancesize);"
"__construct!MyClass(__buf, a, b, c);" // Has to handle context ptr

seems like a good solution. It probably has to support both allocator types (for allocators without state) and allocator instances.
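The `__construct` half of such a rewrite can be approximated in today's library terms with std.conv.emplace (names here are illustrative; the nested-class context pointer is exactly the part a plain library helper cannot cover, per the discussion above).

```d
import std.conv : emplace;

// Illustrative stand-in for a compiler-invoked __construct: build a class
// instance in a caller-provided buffer. Does NOT handle nested classes'
// context pointers - that is the case needing compiler support.
T construct(T, Args...)(void[] buf, Args args) if (is(T == class))
{
    assert(buf.length >= __traits(classInstanceSize, T));
    return emplace!T(buf, args);
}

class MyClass
{
    int a, b, c;
    this(int a, int b, int c) { this.a = a; this.b = b; this.c = c; }
}

void main()
{
    // The buffer would come from MallocAllocator.it.allocate(...) in the
    // rewrite; a GC-allocated buffer stands in for it here.
    auto buf = new void[](__traits(classInstanceSize, MyClass));
    auto obj = construct!MyClass(buf, 1, 2, 3);
    assert(obj.a == 1 && obj.c == 3);
}
```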
Sep 25 2013
next sibling parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 25.09.2013 20:27, schrieb Johannes Pfau:
 Am Wed, 25 Sep 2013 17:22:28 +0200
 schrieb Benjamin Thaut <code benjamin-thaut.de>:

 Am 25.09.2013 00:39, schrieb Andrei Alexandrescu:
 We're banning that syntax of new.

 Andrei
I always love it when people just plain ignore all the arguments I made ...
I think there's a misunderstanding here. new(...) already was a valid syntax in D1, I think Andrei is referring to that: http://www.digitalmars.com/d/1.0/class.html#allocators And we really should get rid of that. The nested class/context pointer argument is a pretty strong one. Of course we could introduce builtins to deal with that but there's not much benefit to that. Having the compiler rewrite "new(MallocAllocator) MyClass(a,b,c)" to "void[] __buf = MallocAllocator.it.allocate(MyClass.instancesize);" "__construct!MyClass(__buf, a, b, c);" //Has to handle context ptr seems like a good solution. It probably has to support both allocator types(for allocators without state) and allocator instances.
Well I didn't actually know that a similar syntax existed in D1. I would be fine with a rewrite rule too. Anything that hides the syntactic unglyness of using a templated library method would be fine. We just have to make sure that the rewrite rule also correctly handles the implict conversion of literals. We might also want to allow: MallocAllocator.new MyClass(a,b,c) Kind Regards Benjamin Thaut
Sep 25 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-09-25 20:27, Johannes Pfau wrote:

 Having the compiler rewrite

 "new(MallocAllocator) MyClass(a,b,c)"
 to
 "void[] __buf = MallocAllocator.it.allocate(MyClass.instancesize);"
 "__construct!MyClass(__buf, a, b, c);" //Has to handle context ptr

 seems like a good solution. It probably has to support both allocator
 types(for allocators without state) and allocator instances.
Here's an alternative. The compiler will lower:

new MyClass(a, b, c);

To:

auto __buf = _d_tl_new(MyClass.instancesize);

And:

new shared MyClass(a, b, c);

To:

auto __buf = _d_global_new(MyClass.instancesize);

Then construct the class as above. To change the allocator one just sets the rt.allocator.allocator variable, most likely via some function.

extern (C) void[] _d_tl_new (size_t size)
{
    return _d_new(size, rt.allocator.allocator);
}

extern (C) void[] _d_global_new (size_t size)
{
    return _d_new(size, rt.allocator.globalAllocator);
}

extern (C) void[] _d_new (size_t size, Allocator allocator)
{
    if (auto memory = allocator.allocate(size))
        return memory;

    onOutOfMemoryError();
}

-- 
/Jacob Carlborg
Sep 26 2013
prev sibling parent reply Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Mon, 23 Sep 2013 17:02:09 +0200
"Adam D. Ruppe" <destructionator gmail.com> wrote:

 We should really deprecate the new keyword. It'd break like all 
 code ever, but with D's templates, there's no need for it, and 
 when it is there, it is going to spark problems about replacing 
 global allocators or the library allocators being second class 
 citizens.
 
I think that's addressing the wrong problem; it wouldn't solve much, if anything. We'd just end up with a lot of lib writers (and others) hardcoding in usage of the default allocator. So it'd be exactly the same as now, just with uglier syntax and a ton of breakage.

What's ultimately needed is a way to change the default allocator for not only "new" but also array concatenation, closures, and any other part of the language that might allocate. That way, we actually *do* get the benefits we're looking for, plus we keep the nicer current syntax and no breakage.

tl;dr: "new" is a red herring. The real issue is being able to easily replace the default allocator.
Sep 23 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Sep 23, 2013 at 07:29:39PM -0400, Nick Sabalausky wrote:
 On Mon, 23 Sep 2013 17:02:09 +0200
 "Adam D. Ruppe" <destructionator gmail.com> wrote:
 
 We should really deprecate the new keyword. It'd break like all 
 code ever, but with D's templates, there's no need for it, and 
 when it is there, it is going to spark problems about replacing 
 global allocators or the library allocators being second class 
 citizens.
 
I think that's addressing the wrong problem, it wouldn't solve much, if anything. We'd just end up with a lot of lib writers (and others) hardcoding in usage of the default allocator. So it'd be exactly the same as now, just with uglier syntax and a ton of breakage. What's ultimately needed is a way to change the default allocator for not only "new" but also array concatenation, closures, and any other part of the language that might allocate. That way, we actually *do* get the benefits we're looking for, plus we keep the nicer current syntax and no breakage. tl;dr: "new" is a red herring. The real issue is being able to easily replace the default allocator.
I thought Walter's DIP already addresses the issue of replacing the default allocator?

	http://wiki.dlang.org/DIP46

I get the feeling that we don't have a good handle on the fundamental issues, though. Having a stack for managing the default allocator works for the simpler cases, but it's unclear how things would interact once you throw in other requirements, like class/struct-scoped allocators, block scoped allocators inside closures, user-specified allocators that must interact with the foregoing, etc.. For example, it's relatively easy to understand this:

	void doWork() {
		gc_push(MyAlloc);
		scope(exit) gc_pop();
		... // do work
	}

	void doMoreWork() {
		gc_push(AnotherAlloc);
		scope(exit) gc_pop();
		... // do work
	}

	void main() {
		doWork();
		doMoreWork();
	}

But once you throw in closures and delegates, things become rather murky:

	auto getCallback() {
		gc_push(MyAlloc);
		scope(exit) gc_pop();
		int closed_var;
		return {
			// What does this do?
			closed_var++;
			string x = "abc";
			x ~= "def"; // which allocator does this use?
		};
	}

	void main() {
		auto dg = getCallback();
		gc_push(OtherAlloc);
		scope(exit) gc_pop();
		dg(); // Hmm...
	}

T

-- 
May you live all the days of your life. -- Jonathan Swift
Sep 23 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-09-24 02:03, H. S. Teoh wrote:

 I thought Walter's DIP already addresses the issue of replacing the
 default allocator?

 	http://wiki.dlang.org/DIP46

 I get the feeling that we don't have a good handle on the fundamental
 issues, though. Having a stack for managing the default allocator works
 for the simpler cases, but it's unclear how things would interact once
 you throw in other requirements, like class/struct-scoped allocators,
 block scoped allocators inside closures, user-specified allocators that
 must interact with the foregoing, etc..
Why not just let the user set the allocator explicitly, see: http://forum.dlang.org/thread/l1nvn4$ca3$1 digitalmars.com?page=2#post-l1oqp2:2429fg:241:40digitalmars.com -- /Jacob Carlborg
Sep 23 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu 
wrote:
 Hello,
First of all, awesome!

Now the meeeeh part. I really think the "it" thing is not good. I don't think it is desirable or necessary. We should get rid of it.

You can't deal with ubyte[] like that; that is incorrect in regard to the - unimplemented - aliasing rules. Allocators should deal with void[].

What you call safe really isn't. Allocate something on the GC, store a pointer in a custom allocated location, collect, enjoy the memory corruption. All operations are safe according to your proposal. Allocation can only be safe if the GRAND MASTER GC is aware of it.

Your proposal allocates shared memory. This is common in the C/C++ world as memory is shared by default, but it shouldn't be in D. It is highly desirable to allocate with different methods for different type qualifiers. How does your design adapt to that?

Finally, we have to decide how these basic blocks are used to form typed allocators, and how they interact with language constructs.

Sorry if this has been mentioned before; it is really late here and I can't read the whole thread, especially since Manu is on steroids :D
Sep 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 10:01 AM, deadalnix wrote:
 On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu wrote:
 Hello,
First of all, awesome ! Now the meeeeh part. I really think the it thing is not good. I don't think it is desirable or necessary. We should get rid of it.
The singleton allocator "it" is instrumental for supporting stateful and stateless allocators with ease. It took me a few iterations to get to that, and I'm very pleased with the results. I think it would be difficult to improve on it without making other parts more difficult.
 You can't deal with ubyte[] like that, that is incorrect in regard to -
 unimplemented - aliasing rules. Allocator should deal with void[] .
I think I'll do that.
 What you call safe really isn't. Allocate something on the GC, store a
 pointer on a custom allocated location, collect, enjoy the memory
 corruption.
I don't understand this. There are no pointers at this level, only untyped memory. The main chance of corruption is to access something after it's been freed.
 All operation are safe according to your proposal.
No, for most allocators freeing and reallocating are unsafe.
 Allocation can only be safe if the GRAND MASTER GC is aware of it.
I don't think so.
 You proposal allocate shared memory.
No. It allocates unshared memory off of a shared or unshared allocator. The memory just allocated is as of yet unshared for the simple reason that only one thread has as of yet access to it.
 This is common in C/C++ world as
 memory is shared by default, but shouldn't be in D. It is highly
 desirable to allocate with different methods for different type
 qualifier. How does your design adapt to that ?
The typed allocators will have distinct methods for shared and unshared allocation. Even at this level it's possible to design an allocator that allocates in different ways with a shared vs. an unshared allocator object (overload on shared). So far I've only designed non-shared allocators, or wrap those that are already shared (malloc and new).
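A small sketch (a hypothetical allocator, not from the proposal) of what "overload on shared" can look like: the same allocator synchronizes or not depending on the qualifier of the object it is called on.

```d
struct StatAllocator
{
    size_t allocated; // bytes handed out so far

    // Unshared overload: only one thread can see this instance,
    // so no synchronization is needed.
    void[] allocate(size_t n)
    {
        allocated += n;
        return new void[](n);
    }

    // Shared overload: the instance may be used from many threads,
    // so bookkeeping must be atomic.
    void[] allocate(size_t n) shared
    {
        import core.atomic : atomicOp;
        atomicOp!"+="(allocated, n);
        return new void[](n);
    }
}

void main()
{
    StatAllocator local;
    auto a = local.allocate(16);     // resolves to the unshared overload
    assert(a.length == 16 && local.allocated == 16);

    shared StatAllocator global;
    auto b = global.allocate(16);    // resolves to the shared, atomic overload
    assert(b.length == 16);
}
```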
 Finally, we got to decide how these basics block are used to form typed
 allocators, and interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint the road. Andrei
Sep 23 2013
next sibling parent reply Robert Schadek <realburner gmx.de> writes:
On 09/23/2013 07:16 PM, Andrei Alexandrescu wrote:
 On 9/23/13 10:01 AM, deadalnix wrote:

 Finally, we got to decide how these basics block are used to form typed
 allocators, and interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint the road.
IMO there is a problem with this metaphor. If the road is painted metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I think the overall picture has to be kept in sight.
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 11:55 AM, Robert Schadek wrote:
 On 09/23/2013 07:16 PM, Andrei Alexandrescu wrote:
 On 9/23/13 10:01 AM, deadalnix wrote:

 Finally, we got to decide how these basics block are used to form typed
 allocators, and interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint the road.
IMO there is a problem with this metaphor. If the road is painted metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I think the overall picture has to be kept in sight.
Which makes the metaphor unintentionally better. Yes, we do need to worry about higher concerns because that's how layered abstractions work. Andrei
Sep 23 2013
parent Robert Schadek <realburner gmx.de> writes:
On 09/23/2013 09:11 PM, Andrei Alexandrescu wrote:
 On 9/23/13 11:55 AM, Robert Schadek wrote:
 On 09/23/2013 07:16 PM, Andrei Alexandrescu wrote:
 On 9/23/13 10:01 AM, deadalnix wrote:

 Finally, we got to decide how these basics block are used to form
 typed
 allocators, and interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint the road.
IMO there is a problem with this metaphor. If the road is painted metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I think the overall picture has to be kept in sight.
Which makes the metaphor unintentionally better. Yes, we do need to worry about higher concerns because that's how layerd abstractions work. Andrei
You're right! And I'm too tired to read properly.
Sep 23 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Monday, 23 September 2013 at 17:16:40 UTC, Andrei Alexandrescu 
wrote:
 The singleton allocator "it" is instrumental for supporting 
 stateful and stateless allocators with ease. It took me a few 
 iterations to get to that, and I'm very pleased with the 
 results. I think it would be difficult to improve on it without 
 making other parts more difficult.
The allocator can use a singleton internally without exposing "it". I'm unclear on what problem it solves.
 What you call safe really isn't. Allocate something on the GC, 
 store a
 pointer on a custom allocated location, collect, enjoy the 
 memory
 corruption.
I don't understand this. There are no pointers at this level, only untyped memory. The main chance of corruption is to access something after it's been freed.
If there are no pointers at this level, then the collect method does not make any sense.
 Allocation can only be safe if the GRAND MASTER GC is aware of 
 it.
I don't think so.
When GRAND MASTER GC unhappy, he free live objects.
Sep 23 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 5:12 PM, deadalnix wrote:
 On Monday, 23 September 2013 at 17:16:40 UTC, Andrei Alexandrescu wrote:
 The singleton allocator "it" is instrumental for supporting stateful
 and stateless allocators with ease. It took me a few iterations to get
 to that, and I'm very pleased with the results. I think it would be
 difficult to improve on it without making other parts more difficult.
The allocator can use a singleton internally without using "it". I'm unclear what is the problem solved.
Composability. My first design had stateful allocators define regular methods and stateless (better said global) allocators define static methods. That seemed to work well but then allocators composed with them also had to define static or nonstatic methods depending on whether the parent allocators had state. Using a singleton instance for stateless allocators has simplified things somewhat - everybody defines regular methods, and if they are stateless they define the singleton instance. Andrei
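The convention can be sketched as follows (simplified, without the shared qualifiers of the actual design): a stateless allocator exposes a singleton instance named "it", so composed allocators uniformly store parent instances and call regular methods on them.

```d
import core.stdc.stdlib : malloc;

struct Mallocator
{
    static Mallocator it; // the singleton for the stateless case

    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }
}

// A composed allocator holds parent *instances*. Whether a parent is a
// stateful object or a stateless singleton, the call syntax is the same -
// no static vs. non-static split propagates through the composition.
struct FallbackAllocator(Primary, Secondary)
{
    Primary primary;
    Secondary secondary;

    void[] allocate(size_t n)
    {
        auto b = primary.allocate(n);
        return b !is null ? b : secondary.allocate(n);
    }
}

void main()
{
    auto b = Mallocator.it.allocate(64); // stateless use via the singleton
    assert(b.length == 64);
}
```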
Sep 23 2013
prev sibling parent "BLM768" <blm768 gmail.com> writes:
On Monday, 23 September 2013 at 17:01:47 UTC, deadalnix wrote:
 What you call safe really isn't. Allocate something on the GC, 
 store a pointer on a custom allocated location, collect, enjoy 
 the memory corruption. All operations are safe according to your 
 proposal. Allocation can only be safe if the GRAND MASTER GC is 
 aware of it.
I don't see why the GC can't be allocator-agnostic. In principle, it should be possible to register arbitrary allocators with the GC (possibly by storing a pointer to the allocator in each GC-managed block of memory). The GC would have a default allocator which would be used with the "new" operator, and something like the aforementioned create!() syntax could be used for custom allocators:

SomeClass c1 = new SomeClass("arg"); //Managed by the GC
SomeClass c2 = gcCreate!SomeClass(someAllocator, "arg");
//Another possible syntax (much prettier!):
SomeClass c3 = someAllocator.new SomeClass("arg");

This could also integrate with DIP46; instead of pushing and popping a GC object, one would push and pop the GC's default allocator (for the current thread, of course). There would be a bit of overhead, but that seems to be unavoidable in a sufficiently generic design, and I doubt that it would cut performance to an unacceptable level, especially because really performance-critical code would avoid the GC anyway.
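One way to picture the "pointer to the allocator in each GC-managed block" idea is a header word written in front of every allocation, through which the free path is routed back to the owner. A minimal C sketch (C standing in for D here; the Allocator/BlockHeader names and the interface are purely illustrative, not a proposed API):

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

typedef struct Allocator {
    void *(*allocate)(struct Allocator *self, size_t n);
    void  (*deallocate)(struct Allocator *self, void *p);
} Allocator;

typedef struct BlockHeader {
    Allocator *owner;           /* which allocator produced this block */
} BlockHeader;

/* Allocate n payload bytes, prefixed by a header naming the owner. */
static void *tagged_allocate(Allocator *a, size_t n) {
    BlockHeader *h = (BlockHeader *)a->allocate(a, sizeof(BlockHeader) + n);
    if (!h) return NULL;
    h->owner = a;
    return h + 1;               /* payload starts right after the header */
}

/* Free a payload pointer by routing through the recorded owner,
 * without the caller (or a collector) knowing which allocator it was. */
static void tagged_deallocate(void *payload) {
    if (!payload) return;
    BlockHeader *h = (BlockHeader *)payload - 1;
    h->owner->deallocate(h->owner, h);
}

/* A trivial malloc-backed allocator, for demonstration only. */
static void *malloc_allocate(Allocator *self, size_t n) { (void)self; return malloc(n); }
static void malloc_deallocate(Allocator *self, void *p) { (void)self; free(p); }
```

The per-block pointer is exactly the "bit of overhead" mentioned above: one extra word per allocation plus an indirect call on the free path.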
 Your proposal allocates shared memory. This is common in the C/C++ 
 world, as memory is shared by default, but it shouldn't be in D. It 
 is highly desirable to allocate with different methods for 
 different type qualifiers. How does your design adapt to that?
If the aforementioned thread-local GC reference is implemented, it should be possible to provide decent thread-local allocation. I'm not sure exactly how to design that, but it seems like thread-local GC allocation would require some work at the GC level; a global GC won't necessarily play nicely with thread-local allocators. For non-GC thread-local allocations, I'd imagine that you could just create a thread-local allocator.
Sep 23 2013
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/23/2013 01:49 AM, Andrei Alexandrescu wrote:
 I am making good progress on the design of std.allocator, and I am
 optimistic about the way it turns out. D's introspection capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a whole new level.
 ...
Seems neat.
 Several details are still in flux, but at the top level it seems most
 natural to divide things in two tiers:

 1. Typed allocator, i.e. every request for allocation comes with the
 exact type requested;

 2. Untyped allocator - traffics exclusively in ubyte[].
 ...
Some general remarks:

One issue you will probably run into and maybe want to fix in some way during the typed allocator design is that private constructors cannot be called from templates parameterized on types. E.g.:

module a;

auto New(T,Args...)(Args args){
     return new T(args);
}

// ---

module b;

class A{
     private this(){}
}

void main(){
     auto a = New!A(); // error
}

Is there an intention to also support replacements of instance new allocations, i.e. object.new subobject(args)?
Sep 23 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 1:32 PM, Timon Gehr wrote:
 One issue you will probably run into and maybe want to fix in some way
 during the typed allocator design is that private constructors cannot be
 called from templates parameterized on types.
Hrm, that's a real problem. We've run into it at work when we were trying to replace a macro with a template. The macro is automatically friend with everyone! :o)
 Is there an intention to also support replacements of instance new
 allocations, i.e. object.new subobject(args)?
It's a good point to keep in mind as we move further with this. Andrei
Sep 23 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-09-23 22:32, Timon Gehr wrote:

 Some general remarks:

 One issue you will probably run into and maybe want to fix in some way
 during the typed allocator design is that private constructors cannot be
 called from templates parameterized on types.

 E.g.:

 module a;

 auto New(T,Args...)(Args args){
      return new T(args);
 }

 // ---

 module b;

 class A{
      private this(){}
 }

 void main(){
      auto a = New!A(); // error
 }
Allocate the memory needed for T without using "new". Then a pointer can be used to bypass the protection of the constructor, something like:

extern (C) Object _d_newclass(in ClassInfo);

T New (T, Args ...) (Args args)
{
    auto object = cast(T) _d_newclass(T.classinfo);

    // use explicit type to handle overloads.
    T delegate (Args) dg = &object.__ctor;

    return dg(args);
}

-- 
/Jacob Carlborg
Sep 23 2013
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu 
wrote:
 1. Typed allocator, i.e. every request for allocation comes 
 with the exact type requested;

 2. Untyped allocator - traffics exclusively in ubyte[].
This is a good design. Perhaps this is a later concern, but we should make sure that node-based containers (e.g. linked list, trees) have a way to access the allocation size needed for the node. STL did not do this.
 * alignment is a compile-time constant yielding the alignment 
 of allocated data. Many allocators offer the maximum alignment 
 of the platform (for which I used real.alignof above). This 
 constant is required for all allocators.
What if, for example, I wanted to allocate a 4096 byte aligned block from the GC allocator? Do I have to create a new allocator backed by the GC allocator? What if the alignment is not known at compile time (e.g. hard disk page size or CPU cache line size)? Might be better to pass the desired alignment in the allocate method.
 * available is a property that returns how many total (not 
 necessarily contiguous) bytes are available for allocation. The 
 NullAllocator knows statically it has 0 bytes available so it 
 implements it as an enum. Generally allocators will implement 
 it as a  property. This property is optional.
It would be useful to know the maximum available contiguous block size too, so that you can find out if an allocation will succeed without calling allocate. It's also useful for diagnosing fragmentation issues e.g. "allocation failed, free memory = X, max contiguous = Y". If X is high and Y is low then you are highly fragmented. Of course, this should be optional.
 * allocate(s) returns a ubyte[] with length at least s, or 
 null. (It does not throw on out-of-memory because that would 
 hurt composability; it turns out many elemental allocators do 
 naturally run out of memory.) This method is required for all 
 allocators. In most allocators this method should be  safe.
What are the semantics of allocate(0)? malloc(0) is implementation defined.
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 1:37 PM, Peter Alexander wrote:
 What if, for example, I wanted to allocate a 4096 byte aligned block
 from the GC allocator? Do I have to create a new allocator backed by the
 GC allocator?
I've been thinking a good library component would be AlignmentAllocator that would work like this:

struct AlignmentAllocator(Allocator, size_t desiredAlignment)
if (Allocator.alignment < desiredAlignment)
{
    enum alignment = desiredAlignment;
    ...
}

That allocator would allocate more memory (I suspect there's a gcd calculation somewhere :o)) and then adjust the starting pointer of the allocated block to reach the requested alignment.

I'm just off the phone with Walter discussing this and related issues. It's an interesting problem space.
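The "allocate more and adjust the starting pointer" step is plain bit arithmetic. A minimal C sketch of it, assuming power-of-two alignments (the helper names are illustrative, not part of the proposed API):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Round address p up to the next multiple of align (align must be a
 * power of two). This is the pointer adjustment an AlignmentAllocator-
 * style wrapper would apply to the block it gets from its parent. */
static uintptr_t round_up(uintptr_t p, uintptr_t align) {
    return (p + align - 1) & ~(align - 1);
}

/* Worst-case number of bytes the wrapper must request from the parent
 * so that an aligned sub-block of `size` bytes is guaranteed to fit. */
static size_t padded_size(size_t size, size_t align) {
    return size + align - 1;
}
```

The up-to-align-1 bytes of slack in padded_size is the over-allocation cost discussed in the rest of the thread.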
 What if the alignment is not known at compile time (e.g. hard disk page
 size or CPU cache line size)?
The proposed design assumes compile-time alignments. A trick similar to the one above (overallocate and adjust the pointer) should work. I'd need a handful of good examples where the alignment must be known at runtime. Can you link to some?
 Might be better to pass the desired alignment in the allocate method.
The thing is in 99%+ of the cases you don't need it. Then perhaps in 99%+ of the remaining cases the alignment is known during compilation. Nevertheless it's a change worth investigating.
 * available is a property that returns how many total (not necessarily
 contiguous) bytes are available for allocation. The NullAllocator
 knows statically it has 0 bytes available so it implements it as an
 enum. Generally allocators will implement it as a  property. This
 property is optional.
It would be useful to know the maximum available contiguous block size too, so that you can find out if an allocation will succeed without calling allocate.
I think the best way is to just go ahead and attempt an allocation, especially if the allocator is shared (races!). (There's some flexibility with expand() there, which has a minimum size and a desired size.) But clearly the information of the largest contiguous block is currently missing, and is reasonable to be considered for addition.
 It's also useful for diagnosing fragmentation issues
 e.g. "allocation failed, free memory = X, max contiguous = Y". If X is
 high and Y is low then you are highly fragmented.

 Of course, this should be optional.
Makes sense.
 * allocate(s) returns a ubyte[] with length at least s, or null. (It
 does not throw on out-of-memory because that would hurt composability;
 it turns out many elemental allocators do naturally run out of
 memory.) This method is required for all allocators. In most
 allocators this method should be  safe.
What are the semantics of allocate(0)? malloc(0) is implementation defined.
I defined Mallocator to return whatever malloc returns on allocate(0), but decided realloc() is too messed up and special-cased reallocate(0) to free memory and place null in the buffer. Probably I'll work the same logic in Mallocator.allocate(0) to eliminate platform dependencies. Andrei
Sep 23 2013
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 23 September 2013 at 21:27:36 UTC, Andrei Alexandrescu 
wrote:
 That allocator would allocate more memory (I suspect there's a 
 gcd calculation somewhere :o)) and then adjust the starting 
 pointer of the allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment sizes. Also, how does it work with your deallocate interface? Suppose I request an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests 0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I try to deallocate that buffer, which means your alignment allocator has to deallocate the original buffer. Where does the alignment allocator store the original GC-provided buffer ptr + size? It cannot be determined from the buffer I gave it.
 I'd need a handful of good examples where the alignment must be 
 known at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page size, which isn't constant. http://linux.die.net/man/2/mprotect

Reading a file without buffering on Windows requires that you align the destination buffer on volume sector size boundaries, which is not known at compile time (although many just assume 4096). http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx

As I mentioned previously, you may want to also align to the cache line size (for performance). This is not known at compile time (although again, it is often assumed in practice).
Sep 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 3:12 PM, Peter Alexander wrote:
 On Monday, 23 September 2013 at 21:27:36 UTC, Andrei Alexandrescu wrote:
 That allocator would allocate more memory (I suspect there's a gcd
 calculation somewhere :o)) and then adjust the starting pointer of the
 allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment sizes.
I don't see a way around it unless the allocator natively provides a large alignment size, which it would then advertise in its "alignment" constant. The problem is independent of this particular design.
 Also, how does it work with your deallocate interface? Suppose I request
 an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests
 0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then
 returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I
 try to deallocate that buffer, which means your alignment allocator has
 to deallocate the original buffer. Where does the alignment allocator
 store the original GC-provided buffer ptr + size? It cannot be
 determined from the buffer I gave it.
Walter noted the same issue. I'd need to think about it. A simple solution is to simply store the size just before the pointer returned, in a dope block manner.
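A minimal C sketch of that dope-block scheme, storing the original pointer (which works as well as the size for recovering the parent's block) in the word just before the aligned address; power-of-two alignments assumed:

```c
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

/* Over-allocate, align the returned address, and stash the original
 * pointer just before it, so the free path can recover the block that
 * the parent allocator handed out. align must be a power of two. */
static void *aligned_malloc(size_t size, size_t align) {
    /* one extra pointer slot plus worst-case alignment slack */
    void *raw = malloc(size + align - 1 + sizeof(void *));
    if (!raw) return NULL;
    uintptr_t user = ((uintptr_t)raw + sizeof(void *) + align - 1)
                     & ~(uintptr_t)(align - 1);
    ((void **)user)[-1] = raw;  /* the "dope" word */
    return (void *)user;
}

static void aligned_free(void *p) {
    if (p) free(((void **)p)[-1]);
}
```

Note that even when the parent's block happens to be aligned already, the scheme still pays sizeof(void *) plus up to align-1 bytes of slack per allocation, which is precisely the waste objected to below for large (e.g. GPU-page) alignments.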
 I'd need a handful of good examples where the alignment must be known
 at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page size, which isn't constant. http://linux.die.net/man/2/mprotect
Thanks! I see posix_memalign is a corresponding call that returns a page-aligned chunk of memory. I assume calls like mmap also return page-aligned chunks. Indeed it is a problem for the current design that the alignment must be known during compilation.
 Reading a file without buffering on Windows requires that you align the
 destination buffer on volume sector size boundaries, which is not known
 at compile time (although many just assume 4096).
>
 http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx
I see there's a difference between x86 and Itanium in page size... I'm trying to lean toward handling these as rare/exotic cases without outright eliminating them, while providing simple and efficient handling of alignment for the frequent cases (i.e. allocate ints at addresses multiple of 4 etc).
 As I mentioned previously, you may want to also align to the cache line
 size (for performance). This is not known at compile time (although
 again, is often assumed in practice).
Indeed. Andrei
Sep 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 24 September 2013 09:21, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/13 3:12 PM, Peter Alexander wrote:

 On Monday, 23 September 2013 at 21:27:36 UTC, Andrei Alexandrescu wrote:

 That allocator would allocate more memory (I suspect there's a gcd
 calculation somewhere :o)) and then adjust the starting pointer of the
 allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment sizes.
I don't see a way around it unless the allocator natively provides a large alignment size, which it would then advertise in its "alignment" constant. The problem is independent of this particular design.
There's an important distinction between an allocator advertising a large allocation size (which it always uses), and an allocator being capable of allocating a larger aligned block if requested. Surely the advertised alignment is the alignment you will get in all cases, even if you request a small alignment. You should advertise a minimum and a maximum alignment if that is your solution.

 Also, how does it work with your deallocate interface? Suppose I request
 an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests
 0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then
 returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I
 try to deallocate that buffer, which means your alignment allocator has
 to deallocate the original buffer. Where does the alignment allocator
 store the original GC-provided buffer ptr + size? It cannot be
 determined from the buffer I gave it.
Walter noted the same issue. I'd need to think about it. A simple solution is to simply store the size just before the pointer returned, in a dope block manner.
This means if the allocator returns you memory that is already of the desired alignment, you need to waste an entire aligned block just to store the offset, which would have been 0.

I solve this problem by storing my allocation table in a separate hash table, and entries in that table associate the size and also the alignment offset. But it's not a good solution; it's still wasteful. It's just the best option you have if you don't have any control over the system allocator.

I also use large alignments, which I've detailed in prior discussions. Common ones are cache line (~64-256 bytes) and GPU memory page (~4k). You can't go wasting GPU memory by overallocating every block. It's definitely important that allocators are able to receive an alignment request and be given the opportunity to fulfill a dynamic alignment request without always resorting to an over-allocation strategy.

 I'd need a handful of good examples where the alignment must be known
 at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page size, which isn't constant. http://linux.die.net/man/2/mprotect
Thanks! I see posix_memalign is a corresponding call that returns a page-aligned chunk of memory. I assume calls like mmap also return page-aligned chunks. Indeed it is a problem for the current design that the alignment must be known during compilation.

 Reading a file without buffering on Windows requires that you align the
 destination buffer on volume sector size boundaries, which is not known
 at compile time (although many just assume 4096).
>
 http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx
I see there's a difference between x86 and Itanium in page size... I'm trying to lean toward handling these as rare/exotic cases without outright eliminating them, while providing simple and efficient handling of alignment for the frequent cases (i.e. allocating ints at addresses that are multiples of 4, etc.).

 As I mentioned previously, you may want to also align to the cache line
 size (for performance). This is not known at compile time (although
 again, is often assumed in practice).
Indeed. Andrei
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 9:56 PM, Manu wrote:
 You can't go wasting GPU memory by overallocating every block.
Only the larger chunk may need to be overallocated if all allocations are then rounded up.
 It's definitely important that allocator's are able to receive an
 alignment request, and give them the opportunity to fulfill a dynamic
 alignment request without always resorting to an over-allocation strategy.
I'd need a bit of convincing. I'm not sure everybody needs to pay for a few, and it is quite possible that malloc_align suffers from the same fragmentation issues as the next guy. Also, there's always the possibility of leaving some bits to lower-level functions. Andrei
Sep 23 2013
parent reply Manu <turkeyman gmail.com> writes:
On 24 September 2013 15:31, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/23/13 9:56 PM, Manu wrote:

 You can't go wasting GPU memory by overallocating every block.
Only the larger chunk may need to be overallocated if all allocations are then rounded up.
I don't follow. If I want to allocate 4k aligned, then 8k will be allocated (because it wants to store an offset). Any smaller allocation, let's say 16 bytes, will round up to 4k. You can't waste precious GPU RAM like that.

A minimum and a maximum (guaranteed without over-allocating) alignment may be useful. But I think allocators need to be given the opportunity to do the best they can.

 It's definitely important that allocator's are able to receive an
 alignment request, and give them the opportunity to fulfill a dynamic
 alignment request without always resorting to an over-allocation strategy.
I'd need a bit of convincing. I'm not sure everybody needs to pay for a few, and it is quite possible that malloc_align suffers from the same fragmentation issues as the next guy. Also, there's always the possibility of leaving some bits to lower-level functions.
What are they paying exactly? An extra arg to allocate that can probably be defaulted?

void[] allocate(size_t bytes, size_t align = this.alignment) shared;

Or is it the burden of adding the overallocation boilerplate logic to each allocator, for simple allocators that don't want to deal with alignment in a conservative way? I imagine that could possibly be automated; the boilerplate could be given as a library.

void[] allocate(size_t size, size_t align)
{
   size_t allocSize = std.allocator.getSizeCompensatingForAlignment(size, align);

   void[] mem = ...; // allocation logic using allocSize

   return std.allocator.alignAllocation(mem, align); // adjusts the range, and maybe write the offset to the prior bytes
}
Sep 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/23/13 11:06 PM, Manu wrote:
 On 24 September 2013 15:31, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:

     On 9/23/13 9:56 PM, Manu wrote:

         You can't go wasting GPU memory by overallocating every block.


     Only the larger chunk may need to be overallocated if all
     allocations are then rounded up.


 I don't follow.
 If I want to allocate 4k aligned, then 8k will be allocated (because it
 wants to store an offset).
What do extant GPU allocators do here?
 Any smaller allocation let's say, 16 bytes, will round up to 4k. You
 can't waste precious gpu ram like that.
That's easy, you just segregate allocations by size.
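Segregation by size usually means mapping each request onto a small set of size classes and serving each class from its own region or free list, so tiny requests never consume a large aligned slot. A toy C sketch with power-of-two classes (the boundaries are illustrative only):

```c
#include <stddef.h>
#include <assert.h>

/* Map a request size to a power-of-two size class. A segregated
 * allocator keeps a separate free list (or memory region) per class,
 * so a 16-byte request is never rounded up to a 4k GPU-page slot. */
static size_t size_class(size_t n) {
    size_t c = 16;              /* smallest class */
    while (c < n) c <<= 1;
    return c;
}
```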
 A minimum and a maximum (guaranteed without over-allocating) alignment
 may be useful.
What's the semantics of the minimum?
 But I think allocators need to be given the opportunity to do the best
 it can.

         It's definitely important that allocator's are able to receive an
         alignment request, and give them the opportunity to fulfill a
         dynamic
         alignment request without always resorting to an over-allocation
         strategy.


     I'd need a bit of convincing. I'm not sure everybody needs to pay
     for a few, and it is quite possible that malloc_align suffers from
     the same fragmentation issues as the next guy. Also, there's always
     the possibility of leaving some bits to lower-level functions.


 What are they paying exactly? An extra arg to allocate that can probably
 be defaulted?
    void[] allocate(size_t bytes, size_t align = this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're looking at tens of cycles, no more. An extra argument needs to be passed around and, more importantly, looked at and acted upon. At this level it's a serious dent in the time budget. Part of the matter is that small objects must, in a way, be the fastest to allocate.

For larger objects, it is true to some extent that the caller will do some work with the obtained memory, which offsets the relative cost of allocation. (That being said, Jason Evans told me you can't always assume the caller will do at least "a memset amount of work".) Anyhow, it stands to reason that you don't want to pay for matters related to alignment without even looking.
 Or is it the burden of adding the overallocation boilerplate logic to
 each allocator for simple allocators that don't want to deal with
 alignment in a conservative way?
 I imagine that could possibly be automated, the boilerplate could be
 given as a library.

 void[] allocate(size_t size, size_t align)
 {
    size_t allocSize =
 std.allocator.getSizeCompensatingForAlignment(size, align);

    void[] mem = ...; // allocation logic using allocSize

    return std.allocator.alignAllocation(mem, align); // adjusts the
 range, and maybe write the offset to the prior bytes
 }
One possibility I'm thinking of is to make maximum alignment a static property of the allocator. It may be set during runtime, but a given allocator object has one defined maximum alignment. Would that be satisfactory to all? Andrei
Sep 24 2013
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 24 September 2013 at 15:25:11 UTC, Andrei 
Alexandrescu wrote:
 What are they paying exactly? An extra arg to allocate that 
 can probably
 be defaulted?
   void[] allocate(size_t bytes, size_t align = this.alignment) 
 shared;
For allocating relatively small objects (say up to 32K), we're looking at tens of cycles, no more. An extra argument needs to be passed around and more importantly looked at and acted upon. At this level it's a serious dent in the time budget.
The cost of a few cycles really doesn't matter for memory allocation... If you are really allocating memory so frequently that those few extra cycles matter, then you are probably going to be memory bound anyway. I think this is a situation where you need to justify yourself with something concrete. Can you provide an example of some code whose performance is significantly impacted by the addition of an alignment parameter? It has to be "real code" that does something useful, not just a loop that continually calls allocate.
Sep 24 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 9:58 AM, Peter Alexander wrote:
 On Tuesday, 24 September 2013 at 15:25:11 UTC, Andrei Alexandrescu wrote:
 What are they paying exactly? An extra arg to allocate that can probably
 be defaulted?
   void[] allocate(size_t bytes, size_t align = this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're looking at tens of cycles, no more. An extra argument needs to be passed around and more importantly looked at and acted upon. At this level it's a serious dent in the time budget.
The cost of a few cycles really doesn't matter for memory allocation... If you are really allocating memory so frequently that those few extra cycles matter then you are probably going to be memory bound anyway.
It does. I'm not even going to argue this.
 I think this is a situation where you need to justify yourself with
 something concrete. Can you provide an example of some code whose
 performance is significantly impacted by the addition of an alignment
 parameter? It has to be "real code" that does something useful, not just
 a loop that continually calls allocate.
Strings. Andrei
Sep 24 2013
parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 24 September 2013 at 17:02:18 UTC, Andrei 
Alexandrescu wrote:
 On 9/24/13 9:58 AM, Peter Alexander wrote:
 On Tuesday, 24 September 2013 at 15:25:11 UTC, Andrei 
 Alexandrescu wrote:
 What are they paying exactly? An extra arg to allocate that 
 can probably
 be defaulted?
  void[] allocate(size_t bytes, size_t align = 
 this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're looking at tens of cycles, no more. An extra argument needs to be passed around and more importantly looked at and acted upon. At this level it's a serious dent in the time budget.
The cost of a few cycles really doesn't matter for memory allocation... If you are really allocating memory so frequently that those few extra cycles matter then you are probably going to be memory bound anyway.
It does. I'm not even going to argue this.
Sorry but I find this insulting. Myself and Manu, both professional and senior game developers with a lot of experience in performance are both arguing against you. I'm not saying this makes us automatically right, but I think it's rude to dismiss our concerns as not even worthy of discussion.
 I think this is a situation where you need to justify yourself 
 with
 something concrete. Can you provide an example of some code 
 whose
 performance is significantly impacted by the addition of an 
 alignment
 parameter? It has to be "real code" that does something 
 useful, not just
 a loop that continually calls allocate.
Strings.
Strings what? Just allocating lots of small strings?

Ok, I've put together a benchmark of the simplest allocator I can think of (pointer bump) doing *nothing* but allocating 12 bytes at a time and copying a pre-defined string into the allocated memory: http://dpaste.dzfl.pl/59636d82

On my machine, the difference between the version with alignment and the version without is 1%. I tried changing the allocator to a class so that the allocation was virtual and not inlined, and the difference was still only ~2% (yes, I verified in the generated code that nothing was being omitted). In a real scenario, much more will be going on outside the allocator, making the overhead much less than 1%.

Please let me know if you take issue with the benchmark. I wrote this quickly, so hopefully I have not made any mistakes.
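For readers who can't reach the dpaste link, here is a rough C reconstruction of the two allocation paths being compared: a bump-pointer allocator with and without a runtime alignment step. This is not Peter's actual benchmark code, just a sketch of the shape of the comparison:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef struct Bump {
    unsigned char *cur, *end;   /* next free byte; one past the buffer */
} Bump;

/* Fixed-alignment path: just advance the pointer. */
static void *bump_alloc(Bump *b, size_t n) {
    if ((size_t)(b->end - b->cur) < n) return NULL;
    void *p = b->cur;
    b->cur += n;
    return p;
}

/* Runtime-alignment path: one extra round-up per call (align must be
 * a power of two). The delta between these two is the cost under debate. */
static void *bump_alloc_aligned(Bump *b, size_t n, size_t align) {
    uintptr_t p = ((uintptr_t)b->cur + align - 1) & ~(uintptr_t)(align - 1);
    if (p + n > (uintptr_t)b->end) return NULL;
    b->cur = (unsigned char *)(p + n);
    return (void *)p;
}
```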
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 11:02 AM, Peter Alexander wrote:
 On Tuesday, 24 September 2013 at 17:02:18 UTC, Andrei Alexandrescu wrote:
 On 9/24/13 9:58 AM, Peter Alexander wrote:
 On Tuesday, 24 September 2013 at 15:25:11 UTC, Andrei Alexandrescu
 wrote:
 What are they paying exactly? An extra arg to allocate that can
 probably
 be defaulted?
  void[] allocate(size_t bytes, size_t align = this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're looking at tens of cycles, no more. An extra argument needs to be passed around and more importantly looked at and acted upon. At this level it's a serious dent in the time budget.
The cost of a few cycles really doesn't matter for memory allocation... If you are really allocating memory so frequently that those few extra cycles matter then you are probably going to be memory bound anyway.
It does. I'm not even going to argue this.
Sorry but I find this insulting.
Apologies, you're right. I was bummed straight after having sent that all-too-glib message.
 Myself and Manu, both professional and
 senior game developers with a lot of experience in performance are both
 arguing against you. I'm not saying this makes us automatically right,
 but I think it's rude to dismiss our concerns as not even worthy of
 discussion.
This is not an argument "against me" - I'm looking at ways to address alignment concerns. There's a larger issue at work: certain special allocator APIs are sensible but are unneeded for composition. The focus here is to provide enough primitives that allow composing larger allocators out of smaller components. A top-level specialized allocator may implement some or all of the discussed API, plus a bunch of other specialized functions.
 I think this is a situation where you need to justify yourself with
 something concrete. Can you provide an example of some code whose
 performance is significantly impacted by the addition of an alignment
 parameter? It has to be "real code" that does something useful, not just
 a loop that continually calls allocate.
Strings.
Strings what? Just allocating lots of small strings? Ok, I've put together a benchmark of the simplest allocator I can think of (pointer bump) doing *nothing* but allocating 12 bytes at a time and copying a pre-defined string into the allocated memory: http://dpaste.dzfl.pl/59636d82 On my machine, the difference between the version with alignment and the version without is 1%. I tried changing the allocator to a class so that the allocation was virtual and not inlined, and the difference was still only ~2% (Yes, I verified in the generated code that nothing was being omitted). In a real scenario, much more will be going on outside the allocator, making the overhead much less than 1%. Please let me know if you take issue with the benchmark. I wrote this quickly so hopefully I have not made any mistakes.
There is really no need for a benchmark, for at least two reasons. First, people _will_ do cycle counting, which _will_ influence the bottom line. I work with a dozen such people. I understand it doesn't make a difference for you, Manu, a lot of game developers, and a bunch of others, but I know for a fact it _does_ make a difference for an important category of program(mer)?s. Second, there's no need for a defaulted argument; the aligned allocation can be an optional overload of the one-argument function. I'm looking into ways to compose with that overload.

Andrei
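A minimal sketch of how such an optional overload might compose (names and detection idiom are illustrative, not the actual std.allocator design):

    // Hypothetical allocator exposing aligned allocation as an
    // optional overload rather than a defaulted parameter.
    struct SomeAllocator
    {
        enum alignment = 16;
        void[] allocate(size_t s) { return allocate(s, alignment); }
        // optional two-argument overload
        void[] allocate(size_t s, uint a) { /* aligned allocation */ return null; }
    }

    // Client code selects the overload if present, else falls back.
    void[] allocateAligned(A)(ref A a, size_t s, uint al)
    {
        static if (is(typeof(a.allocate(s, al))))
            return a.allocate(s, al);       // allocator supports alignment
        else
            return a.allocate(s);           // alignment request unsupported
    }

The compile-time check means allocators that never care about alignment pay nothing, while aligned requests statically dispatch to the richer overload where it exists.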
Sep 24 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 September 2013 at 19:27:13 UTC, Andrei 
Alexandrescu wrote:
 Second, there's no need for a defaulted argument; the aligned 
 allocation can be an optional overload of the one-argument 
 function. I'm looking into ways to compose with that overload.
This, or something like:

    allocator.setAlignment(...);
    allocator.allocate(...);
Sep 24 2013
prev sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 24 September 2013 at 16:58:27 UTC, Peter Alexander 
wrote:
 The cost of a few cycles really doesn't matter for memory 
 allocation... If you are really allocating memory so frequently 
 that those few extra cycles matter then you are probably going 
 to be memory bound anyway.
It is true if you are using malloc. Not for a pool allocator or scratchpad allocator or similar performance-enhancing allocators.
Sep 24 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 10:05 AM, Dicebot wrote:
 On Tuesday, 24 September 2013 at 16:58:27 UTC, Peter Alexander wrote:
 The cost of a few cycles really doesn't matter for memory
 allocation... If you are really allocating memory so frequently that
 those few extra cycles matter then you are probably going to be memory
 bound anyway.
It is true if you are using malloc. Not for a pool allocator or scratchpad allocator or similar performance-enhancing allocators.
Even for malloc it's a concern. For small allocations, jemalloc takes only about one hundred cycles to take you from where you are to where you are plus a little memory. Andrei
Sep 24 2013
prev sibling parent "jerro" <a a.com> writes:
 Also, how does it work with your deallocate interface? Suppose 
 I request an 0x100 aligned block of 0x100 bytes. Your alignment 
 allocator requests 0x200 from the GC, which maybe returns 
 [0xffff0040-0xffff0240] and then returns an aligned buffer from 
 that [0xffff0100-0xffff0200]. Later, I try to deallocate that 
 buffer, which means your alignment allocator has to deallocate 
 the original buffer. Where does the alignment allocator store 
 the original GC-provided buffer ptr + size? It cannot be 
 determined from the buffer I gave it.
It can store it just before the address it returns. I have an example of this at https://github.com/jerro/pfft/blob/multidim/pfft/clib.d#L46 . If the deallocate method takes the size of the returned block as a parameter in addition to the pointer, the original pointer could also be stored right after the end of the returned block.
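The scheme described can be sketched like this (an illustration of the idea, not the poster's linked pfft code; assumes a power-of-two alignment):

    // Over-allocate, round the address up to the alignment, and stash
    // the original malloc'd pointer just before the block handed to the
    // caller, so it can be recovered on deallocation.
    import core.stdc.stdlib : malloc, free;

    void* alignedAlloc(size_t size, size_t alignment)
    {
        // worst case: one pointer slot plus (alignment - 1) bytes of slack
        void* raw = malloc(size + alignment + (void*).sizeof);
        if (raw is null) return null;
        auto addr = cast(size_t) raw + (void*).sizeof;
        addr = (addr + alignment - 1) & ~(alignment - 1); // round up
        (cast(void**) addr)[-1] = raw;  // original pointer lives just before
        return cast(void*) addr;
    }

    void alignedFree(void* p)
    {
        if (p !is null)
            free((cast(void**) p)[-1]); // recover and free the original
    }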
Sep 24 2013
prev sibling next sibling parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Sun, 22 Sep 2013 16:49:57 -0700
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 
 * expand(b, minDelta, maxDelta) grows b's length by at least minDelta 
 (and on a best-effort basis by at least maxDelta)
Minor bikeshedding issue, but 'maxDelta' isn't a maximum at all by your description, rather a suggested-but-optional minimum. So it shouldn't be called 'maxDelta' - very misleading. Should be something like 'preferredDelta' or 'idealMinDelta', etc.
Sep 23 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
23-Sep-2013 03:49, Andrei Alexandrescu пишет:
 Hello,


 I am making good progress on the design of std.allocator, and I am
 optimistic about the way it turns out. D's introspection capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a whole new level.

 Several details are still in flux, but at the top level it seems most
 natural to divide things in two tiers:

 1. Typed allocator, i.e. every request for allocation comes with the
 exact type requested;

 2. Untyped allocator - traffics exclusively in ubyte[].
Looks good (s/ubyte[]/void[]/ per current discussion). Do you imagine typed allocators as something more than adapters that simplify a common pattern of allocate + emplace / destroy + deallocate? (create!T/destroy!T)
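The adapter pattern in question might look roughly like this (hypothetical helper names, assuming an untyped allocate/deallocate as in the proposal; structs only for simplicity):

    // A thin typed layer over an untyped allocator, adding in-place
    // construction and destruction.
    import std.conv : emplace;

    T* create(T, A, Args...)(ref A alloc, Args args)
        if (is(T == struct))
    {
        void[] mem = alloc.allocate(T.sizeof);
        if (mem is null) return null;
        return emplace(cast(T*) mem.ptr, args);   // construct in place
    }

    void dispose(T, A)(ref A alloc, T* p)
    {
        if (p is null) return;
        destroy(*p);                              // run the destructor
        alloc.deallocate((cast(void*) p)[0 .. T.sizeof]);
    }

Whether typed allocators end up being only this kind of adapter, or also carry per-type state (freelists per T, dynamic type info for tracing), is exactly the open question.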
 struct NullAllocator
 {
      enum alignment = real.alignof;
      enum size_t available = 0;
      ubyte[] allocate(size_t s) shared { return null; }
      bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
      { assert(b is null); return false; }
      bool reallocate(ref ubyte[] b, size_t) shared
      { assert(b is null); return false; }
      void deallocate(ubyte[] b) shared { assert(b is null); }
      void collect() shared { }
      void deallocateAll() shared { }
      static shared NullAllocator it;
 }

 Primitives:
First things first - can't an allocator return alignment as a run-time value - a property (just like 'available' does)? The implementation contract is that it must be an O(1), vanilla, syscall-free function. (Read this as: consult system info exactly once, at construction.) Thinking more of it - it may also be an immutable size_t? Then it gets a proper value at construction and is never changed afterwards.
 * expand(b, minDelta, maxDelta) grows b's length by at least minDelta
 (and on a best-effort basis by at least maxDelta) and returns true, or
 does nothing and returns false. In most allocators this should be  safe.
 (One key insight is that expand() can be made  safe whereas shrink() or
 realloc() are most often not; such mini-epiphanies are very exciting
 because they put the design on a beam guide with few principles and many
 consequences.) The precondition is that b is null or has been previously
 returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame... (+ WinAPI Heaps fit here)
 * deallocate(b) deallocates the memory allocated for b. b must have been
 previously allocated by the same allocator. This method is usually
 unsafe (but there are notable implementations that may offer safety,
 such as unbounded freelists.) This method is optional.
Does the following implication hold: "has a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have, a la: automatic? free-all on termination? zeros on allocate (more common than one would think)? etc.
 * deallocateAll() deallocates in one shot all memory previously
 allocated by this allocator. This method is optional, and when present
 is almost always unsafe (I see no meaningful  safe implementation.)
 Region allocators are notable examples of allocators that define this
 method.
Okay.. I presume region "mark" works by splitting off a subinstance of the allocator and "release" by deallocateAll().
 * collect() frees unused memory that had been allocated with this
 allocator. This optional method is implemented by tracing collectors and
 is usually  safe.
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I guess they could be more than a simple helper.
 There are quite a few more things to define more precisely, but this
 part of the design has become quite stable. To validate the design, I've
 defined some simple allocators (Mallocator, GCAllocator, Freelist,
 StackRegion, Region etc) but not the more involved ones, such as
 coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in a poor foundation - malloc is so out of fashion (its interface is simply too thin on guarantees), and sbrk is frankly a stone-age syscall. I would love to make a root allocator on top of mmap/VirtualAlloc.

This leads me to my biggest problem with classical memory management - ~20 years of PCs having virtual memory support directly in the CPU, and yet hardly anything outside of the OS takes advantage of it. They (allocators) seem simply not aware of it - nothing in the interface implies anything that would help the user take advantage of it. I'm talking of (potentially) large allocations that may be often resized. (Large allocations are actually almost always a blind spot/fallback in allocators.)

To the point - we have address space and optionally memory committed in there, leading us to a 3-state of an _address range_: available, reserved, committed. The ranges of the 2nd state are dirt cheap (abundant on 64-bit) and could be easily flipped to committed and back.

So I have a pattern I want to try to fit in your design. At present it is a datastructure + allocation logic. What I would want is clean and nice composition for ultimate reuse. An example is a heavy-duty array (could be used for a potentially large stack). The requirements are:

a) Never reallocate on resize - in certain cases it may even be too large to reallocate in RAM (!)
b) Never waste RAM beyond some limit (a few pages or a tiny fraction of size is fine)
c) O(1) amortized appending as in a plain array, and other properties of an array -> it's a dynamic array after all

Currently I use mmap/madvise and related entities directly. It goes as follows. Allocate a sufficiently large - as large as "it" may get (1Mb, 1G, 10G, you name it) - _address range_. Call this size the 'capacity'. Commit some memory (optionally on first resize) - call this the 'committed'. Appending goes on using up committed space as a typical array would do. Whenever it needs more than that, it applies the same extending algorithm as usual, but with the (committed, capacity) pair. Ditto on resize back - (with some amortization) it asks the OS to decommit pages, decreasing the committed amount.

In short - a C++ "vector" that has 2 capacities (current and absolute maximum), with the plus that it never reallocates. What to do if/when it hits the upper bound is up to the specific application (terminate, fall back to something else). It has the danger of sucking up address space on 32 bits though.. but who cares :)

Now how to fit this monster into the current API... Plainly I need a way to tell: hey, Mr. Allocator, RESERVE me 1Gb of RAM but don't chew on it, please. In other words - allocate needs something like a "guaranteed capacity of the block". By default it could be == size.

    struct Allocator
    {
        ...
        // allocate size bytes within a block of no less than capacity bytes
        void[] allocate(size_t size, size_t capacity = size);
        ...
    }

Or maybe somehow add a reserve method explicitly... Then extend will easily do the second part of the job - in-place resize (commit/decommit as required) - and both hints make a lot of sense.

The pattern ideally would lower to: take a generic array, plug in a MemMap allocator, reserve the top amount of memory at start ... profit!

-- 
Dmitry Olshansky
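The reserve/commit mechanics described above can be sketched like this (POSIX-only illustration with hypothetical names; MAP_ANON availability and flags vary by platform):

    // Reserve a large address range with PROT_NONE (no RAM backing),
    // commit pages on demand with mprotect, release with munmap.
    version (Posix)
    {
        import core.sys.posix.sys.mman;

        struct ReservedArena
        {
            void* base;
            size_t capacity;   // reserved address space
            size_t committed;  // bytes actually backed by RAM

            static ReservedArena reserve(size_t capacity)
            {
                void* p = mmap(null, capacity, PROT_NONE,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
                return ReservedArena(p == MAP_FAILED ? null : p, capacity, 0);
            }

            // flip reserved pages to committed (readable/writable)
            bool commit(size_t newSize)
            {
                if (base is null || newSize > capacity) return false;
                if (mprotect(base, newSize, PROT_READ | PROT_WRITE) != 0)
                    return false;
                committed = newSize;
                return true;
            }

            void release() { munmap(base, capacity); }
        }
    }

A dynamic array built on such an arena grows by calling commit with the usual amortized doubling of committed, while its data pointer never moves within the reserved range.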
Sep 24 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 1:46 AM, Dmitry Olshansky wrote:
 23-Sep-2013 03:49, Andrei Alexandrescu пишет:
 Looks good (s/ubyte[]/void[] per current discussion).
Changed.
 Do you imagine Typed allocators as something more than adapters that
 simplify a common pattern of allocate + emplace / destroy + deallocate?
 (create!T/destroy!T)
Well, as a simple matter, tracing allocators will have to store dynamic type information (just like the current GC does). Also, I hope we'll be able to allow allocators to define Pointer(T), Ref(T) etc. that supplant the built-in notions of pointer, reference etc. Then, user code that uses these notions instead of the built-in ones will be guaranteed some nice properties (e.g. automatic reference counting). Unfortunately I don't see a way for an allocator to enforce that its users don't do illegal things such as escaping addresses etc. So it would be a discipline-backed scheme. Notable beneficiaries will be containers.
 First things first - can't allocator return alignment as run-time value
 - a property (just like 'available' does)? The implementation contract
 is that it must be O(1) vanila syscall-free function. (Read this as
 consult system info exactly once, at construction).
It could, but as I mentioned to Manu - at this level any cost is significant. Even changing from a compile-time constant to a global static dynamically-initialized constant has a cost. Making alignment an instance variable turns stateless allocators into stateful ones.
 Thinking more of it - it also may be immutable size_t? Then it gets
 proper value at construction and then is never changed.
I'm hoping to get away with a static function "static uint alignment()" that (a) is CTFEable for most allocators/platforms, (b) uses a dynamically-initialized static constant for a few. Then derived allocators will inherit CTFEability. Would a per-allocator-type alignment suffice?
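A sketch of such a CTFEability check (hypothetical allocator names; the probe idiom tests whether alignment() compiles in a compile-time context):

    // An allocator whose alignment is a CTFEable static function...
    struct CtfeAlignedAllocator
    {
        static uint alignment() { return 16; }
    }

    // ...versus one whose alignment depends on run-time system info.
    struct RuntimeAlignedAllocator
    {
        static uint pageSize;                        // set at program startup
        static uint alignment() { return pageSize; } // not CTFEable
    }

    // true iff A.alignment() can be evaluated at compile time
    template hasCtfeAlignment(A)
    {
        enum hasCtfeAlignment = is(typeof({ enum a = A.alignment(); }));
    }

    static assert( hasCtfeAlignment!CtfeAlignedAllocator);
    static assert(!hasCtfeAlignment!RuntimeAlignedAllocator);

Derived allocators that forward alignment() would then inherit CTFEability exactly when the parent has it.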
 * deallocate(b) deallocates the memory allocated for b. b must have been
 previously allocated by the same allocator. This method is usually
 unsafe (but there are notable implementations that may offer safety,
 such as unbounded freelists.) This method is optional.
Does the following implication hold: "has a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have, a la: automatic? free-all on termination? zeros on allocate (more common than one would think)? etc.
The global GC does offer manual deallocation. It's the presence of collect() that indicates tracing abilities. If deallocateAll() is present, user code must assume it will be called during destruction. That said, I've also been thinking of adding a variety of Boolean telling properties. "Zeros on allocate" has an important relationship with safety of higher-level objects - zeros offer "not really initialized but without forged pointers so memory-safe" objects.
 * deallocateAll() deallocates in one shot all memory previously
 allocated by this allocator. This method is optional, and when present
 is almost always unsafe (I see no meaningful  safe implementation.)
 Region allocators are notable examples of allocators that define this
 method.
Okay.. I presume region "mark" works by splitting off a subinstance of the allocator and "release" by deallocateAll().
Nice, didn't think of this.
 * collect() frees unused memory that had been allocated with this
 allocator. This optional method is implemented by tracing collectors and
 is usually  safe.
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I guess they could be more than a simple helper.
Yah, collect() is not really appropriate at this level...
 There are quite a few more things to define more precisely, but this
 part of the design has become quite stable. To validate the design, I've
 defined some simple allocators (Mallocator, GCAllocator, Freelist,
 StackRegion, Region etc) but not the more involved ones, such as
 coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in poor foundation - malloc is so out of fashion (its interface is simply too thin on guarantees), sbrk is frankly a stone-age syscall. I would love to make a root allocator one on top of mmap/VirtualAlloc. This leads me to my biggest problem with classical memory management - ~20 years of PCs having virtual memory support directly in the CPU, and yet hardly anything outside of OS takes advantage of it. They (allocators) seem simply not aware of it - none of interface implies anything that would help user take advantage of it. I'm talking of (potentially) large allocations that may be often resized. (large allocations actually almost always a blind spot/fallback in allocators) To the point - we have address space and optionally memory committed in there, leading us to a 3-state of an _address range_: available, reserved, committed. The ranges of 2nd state are dirt cheap (abundant on 64bit) and could be easily flipped to committed and back. So I have a pattern I want to try to fit in your design. At present it is a datastructure + allocation logic. What I would want is clean and nice composition for ultimate reuse.
[snip] Interesting. I'll be looking forward to what you come up with. Yes, for large, coarse-granularity allocations, tapping into OS' paging capabilities is the best way to go.
 Now how to fit this monster into the current API...
 Plainly I need a way to tell hey Mr. Allocator RESERVE me a 1Gb of RAM
 but don't chew on, it please. In other words - allocate needs something
 like a "guaranteed capacity of the block". By default it could be == size.

 struct Allocator
 {
 ...
 //allocate size bytes within a block of no less than capacity bytes.
 void[] allocate(size_t size, size_t capacity=size);
 ...
 }

 Or maybe somehow add a reserve method explicitly...
Assuming you have an optional two-argument allocate() primitive, once you get the initial allocation, how do you commit it? By calling expand(), I assume? Would that be enough?

In the reserve()-based API - how would that work? Does reserve return a void[] with size zero and then you expand() it?

Andrei
Sep 24 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Sep-2013 19:56, Andrei Alexandrescu пишет:
 On 9/24/13 1:46 AM, Dmitry Olshansky wrote:
 23-Sep-2013 03:49, Andrei Alexandrescu пишет:
 Do you imagine Typed allocators as something more than adapters that
 simplify a common pattern of allocate + emplace / destroy + deallocate?
 (create!T/destroy!T)
Well, as a simple matter, tracing allocators will have to store dynamic type information (just like the current GC does). Also, I hope we'll be able to allow allocators to define Pointer(T), Ref(T) etc. that supplant the built-in notions of pointer, reference etc. Then, user code that uses these notions instead of the built-in ones will be guaranteed some nice properties (e.g. automatic reference counting). Unfortunately I don't see a way for an allocator to enforce that its users don't do illegal things such as escaping addresses etc. So it would be a discipline-backed scheme. Notable beneficiaries will be containers.
Indeed I thought of Ref!T and Pointer!T for ARC. And just like you said I do not see a way of avoiding any potential abuse. But it is some power to desire.
 First things first - can't allocator return alignment as run-time value
 - a property (just like 'available' does)? The implementation contract
 is that it must be O(1) vanila syscall-free function. (Read this as
 consult system info exactly once, at construction).
It could, but as I mentioned to Manu - at this level any cost is significant. Even changing from a compile-time constant to a global static dynamically-initialized constant has a cost. Making alignment an instance variable turns stateless allocators into stateful ones.
Hm most could just get away with an enum. Am I missing something?
 Thinking more of it - it also may be immutable size_t? Then it gets
 proper value at construction and then is never changed.
I'm hoping to get away with a static function "static uint alignment()" that (a) is CTFEable for most allocators/platforms, (b) uses a dynamically-initialized static constant for a few. Then derived allocators will inherit CTFEability.
A check for CTFE-ability might do the trick. The run-time-dependent ones usually would immediately hit a wall in the form of a syscall or WinAPI call.
 Would a per-allocator-type alignment suffice?
I see alignment as an inherent property of the way an allocator acquires memory. Sad but true, one can't determine the actual alignment of some important ones at CTFE. Even worse, adapters would in turn depend on these system-specific run-time values such as half a page, 2 cache lines, etc.

    auto my2PageAligner = AligningAllocator(page_size*2, rootAllocator);

where page_size is calculated elsewhere.
 * deallocate(b) deallocates the memory allocated for b. b must have been
 previously allocated by the same allocator. This method is usually
 unsafe (but there are notable implementations that may offer safety,
 such as unbounded freelists.) This method is optional.
Does the following implication hold: "has a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have, a la: automatic? free-all on termination? zeros on allocate (more common than one would think)? etc.
The global GC does offer manual deallocation. It's the presence of collect() that indicates tracing abilities. If deallocateAll() is present, user code must assume it will be called during destruction.
I was looking into how some library would have to tweak its code based on presence/absence of some methods. For instance if there is collect, it doesn't have to call deallocate (even if present). If not then it has to call deallocate (again if present). Ditto with deallocateAll with or without explicit deallocate.
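The capability-based dispatch being described might look roughly like this (hypothetical helper; assumes the optional-method protocol from the proposal):

    // Prefer tracing when available; otherwise deallocate explicitly;
    // otherwise assume memory is freed en masse (deallocateAll or
    // allocator destruction).
    import std.traits : hasMember;

    void release(A)(ref A alloc, void[] b)
    {
        static if (hasMember!(A, "collect"))
        {
            // tracing allocator: individual deallocation is unnecessary
        }
        else static if (hasMember!(A, "deallocate"))
        {
            alloc.deallocate(b);   // manual management required
        }
        // else: nothing to do; freed in bulk later
    }

Library code written once against this shape adapts automatically as allocators gain or lose the optional methods.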
 That said, I've also been thinking of adding a variety of Boolean
 telling properties. "Zeros on allocate" has an important relationship
 with safety of higher-level objects - zeros offer "not really
 initialized but without forged pointers so memory-safe" objects.
Great.
 * deallocateAll() deallocates in one shot all memory previously
 allocated by this allocator. This method is optional, and when present
 is almost always unsafe (I see no meaningful  safe implementation.)
 Region allocators are notable examples of allocators that define this
 method.
Okay.. I presume region "mark" works by splitting off a subinstance of the allocator and "release" by deallocateAll().
Nice, didn't think of this.
I made it up 'cause I thought you had thought of this ;) [snip]
 To the point - we have address space and optionally memory committed in
 there, leading us to a 3-state of an  _address range_:
 available, reserved, committed. The ranges of 2nd state are dirt cheap
 (abundant on 64bit) and could be easily flipped to committed and back.

 So I have a pattern I want to try to fit in your design. At present it
 is a datastructure + allocation logic. What I would want is clean and
 nice composition for ultimate reuse.
[snip] Interesting. I'll be looking forward to what you come up with. Yes, for large, coarse-granularity allocations, tapping into OS' paging capabilities is the best way to go.
Would love to do just that.. once we figure out some minor details.
 Now how to fit this monster into the current API...
 Plainly I need a way to tell hey Mr. Allocator RESERVE me a 1Gb of RAM
 but don't chew on, it please. In other words - allocate needs something
 like a "guaranteed capacity of the block". By default it could be ==
 size.

 struct Allocator
 {
 ...
 //allocate size bytes within a block of no less than capacity bytes.
 void[] allocate(size_t size, size_t capacity=size);
 ...
 }

 Or maybe somehow add a reserve method explicitly...
Assuming you have an optional two-argument allocate() primitive, once you get the initial allocation, how do you commit it? By calling expand() I assume? Would that be enough?
expand should do the trick. I think that nicely saves on the primitives count. But we may have to add a 'shrink' if you are so bent on never decreasing size in expand :) And ... reallocate doesn't cut it as long as there is no big red tape on it that says: if decreasing the size, reallocation is ALWAYS in-place (it might not be true for some allocators - e.g. realloc doesn't guarantee even this, IIRC).
 In the reserve()-based API - how would that work? Does reserve returns a
 void[] with size zero and then you expand() it?
In the reserve case, curiously, one has to return a pointer to where the block is reserved _and_ the size of the reserved range. One can't touch said memory range yet, so an actual size of 0 is assumed implicitly.

    void[] reserve(size_t maxSize);

Usage is like this:

    auto range = reserve(1024*1024*1024);
    auto memory = range[0..0];
    auto capacity = range.length;
    expand(memory, someSize, someSize);
    ...

The other option is to be "honest" and return a tuple of (pointer, capacity) and not a slice. It's still a bit hairy either way.

-- 
Dmitry Olshansky
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 1:02 PM, Dmitry Olshansky wrote:
 24-Sep-2013 19:56, Andrei Alexandrescu пишет:
 It could, but as I mentioned to Manu - at this level any cost is
 significant. Even changing from a compile-time constant to a global
 static dynamically-initialized constant has a cost. Making alignment an
 instance variable turns stateless allocators into stateful ones.
Hm most could just get away with an enum. Am I missing something?
Right now I do things like:

    struct DopeAllocator(Allocator, Dope)
    {
        static assert(Allocator.alignment >= Dope.alignof,
            "DopeAllocator does not work with allocators offering a smaller"
            " alignment than the dope alignment.");
        static assert(!hasElaborateDestructor!Dope,
            "DopeAllocator: elaborate destructors for Dope not implemented.");
        enum alignment = Dope.alignof;
        ...
    }

Such code would be significantly more involved if Allocator had the freedom to define alignment as either an enum, a static method, or a member function. (Background: DopeAllocator!(A, D) allocates an object of type D just before every allocation and offers user code access to it. Allocations are made using A.)
 I'm hoping to get away with a static function "static uint alignment()"
 that (a) is CTFEable for most allocators/platforms, (b) uses a
 dynamically-initialized static constant for a few. Then derived
 allocators will inherit CTFEability.
A check for CTFE-ability might do the trick. The run-time-dependent ones usually would immediately hit a wall in the form of a syscall or WinAPI call.
It does. It's just some more work. Guess I'd need to put it in.
 Would a per-allocator-type alignment suffice?
I see alignment as an inherent property of the way an allocator acquires memory.
The question remains, is it a property of the allocator _object_ or a property of the allocator _type_? I.e. do we need to support allocators that store alignment as a nonstatic member variable?
 Sad but true one can't determine the actual alignment of some
 important ones at CTFE. Even worse adapters would in turn depend on
 these system-specific run-time values such as half a page, 2 cache lines
 etc.

 auto my2PageAligner = AligningAllocator(page_size*2, rootAllocator)

 where page_size is calculated elsewhere.
That can be made to work, it's just that page_size would be a static variable that the user must set (not a template argument).
 The global GC does offer manual deallocation. It's the presence of
 collect() that indicates tracing abilities. If deallocateAll() is
 present, user code must assume it will be called during destruction.
I was looking into how some library would have to tweak its code based on presence/absence of some methods. For instance if there is collect, it doesn't have to call deallocate (even if present). If not then it has to call deallocate (again if present). Ditto with deallocateAll with or without explicit deallocate.
Yah, my std.allocator prototype has a bunch of such logic.
 expand should do the trick. I think that nicely saves on primitives
 count. But we may have to add a 'shrink' if you are so bent on never
 decreasing size in expand :)
 And ... reallocate doesn't cut it as long as there is no big red tape on
 it that says - if decreasing the size, reallocation is ALWAYS in-place
 (it might not be true for some allocators - e.g. realloc doesn't
 guarantee even this IIRC).
OK, I'm willing to add the optional method shrink(void[], size_t newSize) if expand() and realloc() are insufficient. Indeed realloc() does not guarantee immovability, and we also shouldn't, because Mallocator must be compatible with the C API.

Andrei
Sep 24 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Sep-2013 00:39, Andrei Alexandrescu пишет:
 On 9/24/13 1:02 PM, Dmitry Olshansky wrote:
 24-Sep-2013 19:56, Andrei Alexandrescu пишет:
 It could, but as I mentioned to Manu - at this level any cost is
 significant. Even changing from a compile-time constant to a global
 static dynamically-initialized constant has a cost. Making alignment an
 instance variable turns stateless allocators into stateful ones.
Hm most could just get away with an enum. Am I missing something?
Right now I do things like:

    struct DopeAllocator(Allocator, Dope)
    {
        static assert(Allocator.alignment >= Dope.alignof,
            "DopeAllocator does not work with allocators offering a smaller"
            " alignment than the dope alignment.");
        static assert(!hasElaborateDestructor!Dope,
            "DopeAllocator: elaborate destructors for Dope not implemented.");
        enum alignment = Dope.alignof;
        ...
    }
But it's boilerplate regardless. What if we were to provide the tools to manipulate said alignment composition (some imaginary templates in action):

    alias alignment = requireAlignment!(Allocator, Dope.alignof).overrideAlignment!(Dope.alignof);

which in this case would unfold to a proper primitive that tests the alignment of Allocator at RT or CT, and then returns a fixed value as the new alignment.
 Such code would be significantly more involved if Allocator had the
 freedom to define alignment as either an enum, a static method, or a
 member function.
Well you solved static vs member function issue already (via singleton).
 I'm hoping to get away with a static function "static uint alignment()"
 that (a) is CTFEable for most allocators/platforms, (b) uses a
 dynamically-initialized static constant for a few. Then derived
 allocators will inherit CTFEability.
A check for CTFE-ability might do the trick. The run-time-dependent ones usually would immediately hit a wall in the form of a syscall or WinAPI call.
It does. It's just some more work. Guess I'd need to put it in.
 Would a per-allocator-type alignment suffice?
I see alignment as an inherent property of the way an allocator acquires memory.
The question remains, is it a property of the allocator _object_ or a property of the allocator _type_? I.e. do we need to support allocators that store alignment as a nonstatic member variable?
 Sad but true one can't determine the actual alignment of some
 important ones at CTFE. Even worse adapters would in turn depend on
 these system-specific run-time values such as half a page, 2 cache lines
 etc.

 auto my2PageAligner = AligningAllocator(page_size*2, rootAllocator)

 where page_size is calculated elsewhere.
That can be made to work, it's just that page_size would be a static variable that the user must set (not a template argument).
Right. So another angle to the problem is that maybe we could just provide a set of run-time "base" values for alignment. Then one can use, say, x-page alignment, defining x at compile time. The only problem I foresee with this is that std.allocator is ill-suited to determine all the base alignments it may need. Say a certain kind of disk sector size (some 3rd-party code), or DMA block size (or whatever it's called) on some system. Maybe it's that 1% that is not worth it, but I dunno.
 expand should do the trick. I think that nicely saves on primitives
 count. But we may have to add a 'shrink' if you are so bent on never
 decreasing size in expand :)
 And ... reallocate doesn't cut it as long as there is no big red tape on
 it that says - if decreasing the size reallocation it is ALWAYS in-place
 (it might not be true for some allocators - e.g. realloc doesn't
 guarantee even this IIRC).
OK, I'm willing to add the optional method shrink(void[], size_t newSize) if expand() and realloc() are insufficient.
Okay, let it be shrink, if the safety of expand is that important and given that it may make no sense to shrink on a call to expand. Honestly, I can see a lot of code doing things like:

    if(newSize < oldSize) shrink(...);
    else if(newSize > oldSize) expand(...);

Maybe it's fine.
 Indeed realloc()
 does not guarantee immovability, and we also shouldn't, because
 Mallocator must be compatible with the C API.
Spawn of Satan :) -- Dmitry Olshansky
Sep 24 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 September 2013 at 15:57:00 UTC, Andrei 
Alexandrescu wrote:
 Also, I hope we'll be able to allow allocators to define 
 Pointer(T), Ref(T) etc. that supplant the 
 built-in notions of pointer, reference etc. Then, user code 
 that uses these notions instead of the built-in ones will be 
 guaranteed some nice properties (e.g. automatic reference 
 counting). Unfortunately I don't see a way for an allocator to 
 enforce that its users don't do illegal things such as escaping 
 addresses etc. So it would be a discipline-backed scheme. 
 Notable beneficiaries will be containers.
It will be fun without tail const.
 The global GC does offer manual deallocation. It's the presence 
 of collect() that indicates tracing abilities. If 
 deallocateAll() is present, user code must assume it will be 
 called during destruction.
It doesn't make any sense at this level. These allocators do not know what a pointer is. And they can't be safe if they do.
Sep 24 2013
parent "sclytrack" <sclytrack layout.com> writes:
On Wednesday, 25 September 2013 at 01:22:46 UTC, deadalnix wrote:
 On Tuesday, 24 September 2013 at 15:57:00 UTC, Andrei 
 Alexandrescu wrote:
 Also, I hope we'll be able to allow allocators to define 
 Pointer(T), Ref(T) etc. that supplant the 
 built-in notions of pointer, reference etc. Then, user code 
 that uses these notions instead of the built-in ones will be 
 guaranteed some nice properties (e.g. automatic reference 
 counting). Unfortunately I don't see a way for an allocator to 
 enforce that its users don't do illegal things such as 
 escaping addresses etc. So it would be a discipline-backed 
 scheme. Notable beneficiaries will be containers.
It will be fun without tail const.
Do you mean something like this?

struct StructName1 { int a; }
struct StructName2 { const int a; }

StructName1 a;
StructName2 b;
b = a; // does not work

Struct sorta like typedef on the name. Nothing to make sure that it is the same field layout, regardless of const.

struct LayoutName1, StructName1 { int a; }
struct LayoutName1, StructName2 { const int a; }
Sep 25 2013
prev sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Also, I hope we'll be able to allow allocators to define 
 Pointer(T), Ref(T) etc. that supplant the 
 built-in notions of pointer, reference etc. Then, user code 
 that uses these notions instead of the built-in ones will be 
 guaranteed some nice properties (e.g. automatic reference 
 counting). Unfortunately I don't see a way for an allocator to 
 enforce that its users don't do illegal things such as escaping 
 addresses etc. So it would be a discipline-backed scheme. 
 Notable beneficiaries will be containers.
Adding some compiler help is not forbidden. (Also for reference counting). Bye, bearophile
Sep 24 2013
prev sibling next sibling parent reply "Brad Anderson" <eco gnuk.net> writes:
On Tuesday, 24 September 2013 at 08:46:36 UTC, Dmitry Olshansky 
wrote:
 23-Sep-2013 03:49, Andrei Alexandrescu пишет:
 Hello,


 I am making good progress on the design of std.allocator, and 
 I am
 optimistic about the way it turns out. D's introspection 
 capabilities
 really shine through, and in places the design does feel really
 archetypal - e.g. "this is the essence of a freelist 
 allocator". It's a
 very good feeling. The overall inspiration comes from Berger's
 HeapLayers, but D's introspection takes that pattern to a 
 whole new level.

 Several details are still in flux, but at the top level it 
 seems most
 natural to divide things in two tiers:

 1. Typed allocator, i.e. every request for allocation comes 
 with the
 exact type requested;

 2. Untyped allocator - traffics exclusively in ubyte[].
Looks good (s/ubyte[]/void[]/ per the current discussion). Do you imagine typed allocators as something more than adapters that simplify a common pattern of allocate + emplace / destroy + deallocate? (create!T / destroy!T)
 struct NullAllocator
 {
     enum alignment = real.alignof;
     enum size_t available = 0;
     ubyte[] allocate(size_t s) shared { return null; }
     bool expand(ref ubyte[] b, size_t minDelta, size_t 
 maxDelta) shared
     { assert(b is null); return false; }
     bool reallocate(ref ubyte[] b, size_t) shared
     { assert(b is null); return false; }
     void deallocate(ubyte[] b) shared { assert(b is null); }
     void collect() shared { }
     void deallocateAll() shared { }
     static shared NullAllocator it;
 }

 Primitives:
First things first: can't an allocator return alignment as a run-time value - a property (just like 'available' does)? The implementation contract would be that it must be an O(1), vanilla, syscall-free function. (Read this as: consult system info exactly once, at construction.) Thinking more of it, it could also be an immutable size_t: it gets the proper value at construction and is never changed afterwards.
 * expand(b, minDelta, maxDelta) grows b's length by at least 
 minDelta
 (and on a best-effort basis by at least maxDelta) and returns 
 true, or
 does nothing and returns false. In most allocators this should 
 be  safe.
 (One key insight is that expand() can be made  safe whereas 
 shrink() or
 realloc() are most often not; such mini-epiphanies are very 
 exciting
 because they put the design on a beam guide with few 
 principles and many
 consequences.) The precondition is that b is null or has been 
 previously
 returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame... (+ WinAPI Heaps fits here)
 * deallocate(b) deallocates the memory allocated for b. b must 
 have been
 previously allocated by the same allocator. This method is 
 usually
 unsafe (but there are notable implementations that may offer 
 safety,
 such as unbounded freelists.) This method is optional.
Does the following implication hold: "has a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have, a la: automatic? frees all on termination? zeros on allocate (more common than one would think)? etc.
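Traits of this kind fall out naturally from D's introspection: an allocator "has" an optional primitive simply by defining it. A small sketch, with hasDeallocate/hasCollect as illustrative names, not part of the proposed API:

```d
import std.traits : hasMember;

enum hasDeallocate(A) = hasMember!(A, "deallocate");
enum hasCollect(A)    = hasMember!(A, "collect");

// A wrapper can then adapt its behavior at compile time:
void dispose(A)(ref A a, ubyte[] b)
{
    static if (hasDeallocate!A)
        a.deallocate(b);    // manual management is available
    // else: rely on collect()/deallocateAll(), or leak by design (regions)
}
```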
 * deallocateAll() deallocates in one shot all memory previously
 allocated by this allocator. This method is optional, and when 
 present
 is almost always unsafe (I see no meaningful  safe 
 implementation.)
 Region allocators are notable examples of allocators that 
 define this
 method.
Okay... I presume a region's "mark" works by splitting off a sub-instance of the allocator, and "release" by deallocateAll().
 * collect() frees unused memory that had been allocated with 
 this
 allocator. This optional method is implemented by tracing 
 collectors and
 is usually  safe.
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I guess they could be more than a simple helper.
 There are quite a few more things to define more precisely, 
 but this
 part of the design has become quite stable. To validate the 
 design, I've
 defined some simple allocators (Mallocator, GCAllocator, 
 Freelist,
 StackRegion, Region etc) but not the more involved ones, such 
 as
 coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in a poor foundation - malloc is so out of fashion (its interface is simply too thin on guarantees), and sbrk is frankly a stone-age syscall. I would love to make the root allocator one on top of mmap/VirtualAlloc.

This leads me to my biggest problem with classical memory management: ~20 years of PCs having virtual memory support directly in the CPU, and yet hardly anything outside of the OS takes advantage of it. Allocators seem simply not aware of it - nothing in their interfaces implies anything that would help the user take advantage of it. I'm talking of (potentially) large allocations that may often be resized. (Large allocations are actually almost always a blind spot/fallback in allocators.)

To the point - we have address space and, optionally, memory committed within it, leading us to a 3-state model for an _address range_: available, reserved, committed. Ranges in the 2nd state are dirt cheap (abundant on 64-bit) and can easily be flipped to committed and back.

So I have a pattern I want to try to fit into your design. At present it is a data structure + allocation logic. What I would want is clean and nice composition for ultimate reuse. An example is a heavy-duty array (which could be used for a potentially large stack). The requirements are:

a) Never reallocate on resize - in certain cases it may even be too large to reallocate in RAM (!)
b) Never waste RAM beyond some limit (a few pages or a tiny fraction of the size is fine)
c) O(1) amortized appending, as in a plain array, plus the other properties of an array -> it's a dynamic array, after all

Currently I use mmap/madvise and related entities directly. It goes as follows. Allocate a sufficiently large _address range_ - as large as "it" may get (1MB, 1GB, 10GB, you name it). Call this size the 'capacity'. Commit some memory (optionally on first resize) - call this the 'committed' amount. Appending uses up committed space as a typical array would.
Whenever it needs more than that, it applies the same extension algorithm as usual but on the (committed, capacity) pair. Ditto on resizing back - (with some amortization) it asks the OS to decommit pages, decreasing the committed amount. In short: a C++ "vector" that has 2 capacities (current and absolute maximum), with the plus that it never reallocates. What to do if/when it hits the upper bound is up to the specific application (terminate, fall back to something else). It has the danger of sucking up address space on 32 bits though... but who cares :)
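The reserve/commit scheme described above can be sketched on POSIX roughly as follows: reserve a large address range once, then flip pages to committed in place as the array grows, so the data never moves. The page size is assumed to be 4096 here; real code would query sysconf(_SC_PAGESIZE). ReservedBuffer is an illustrative name, not part of any proposal.

```d
import core.sys.posix.sys.mman;

enum pageSize = 4096; // assumption, see above

struct ReservedBuffer
{
    ubyte* base;
    size_t committed;          // bytes currently readable/writable
    immutable size_t capacity; // fixed ceiling, must be a page multiple

    this(size_t capacity)
    {
        assert(capacity % pageSize == 0);
        this.capacity = capacity;
        // Reserve address space only; PROT_NONE pages cost almost nothing.
        auto p = mmap(null, capacity, PROT_NONE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
        base = p == MAP_FAILED ? null : cast(ubyte*) p;
    }

    // Grow the usable prefix to at least newSize bytes, in place.
    bool commit(size_t newSize)
    {
        if (newSize > capacity) return false; // hit the hard ceiling
        auto needed = (newSize + pageSize - 1) / pageSize * pageSize;
        if (needed <= committed) return true;
        if (mprotect(base + committed, needed - committed,
                     PROT_READ | PROT_WRITE) != 0)
            return false;
        committed = needed;
        return true;
    }
}
```

Decommitting on shrink would be the mirror image (mprotect back to PROT_NONE plus madvise); on Windows, VirtualAlloc with MEM_RESERVE/MEM_COMMIT plays the same roles.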
Somewhat related: http://probablydance.com/2013/05/13/4gb-per-vector/
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 9:39 AM, Brad Anderson wrote:
 Somewhat related: http://probablydance.com/2013/05/13/4gb-per-vector/
Great insight, long scroll. Please trim quoting appropriately. Andrei
Sep 24 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Sep-2013 20:48, Andrei Alexandrescu пишет:
 On 9/24/13 9:39 AM, Brad Anderson wrote:
 Somewhat related: http://probablydance.com/2013/05/13/4gb-per-vector/
Great insight, long scroll. Please trim quoting appropriately. Andrei
In fact one may have both a deque (just start in the middle) and an array that never needs to reallocate, as long as paying the cost of reserving a large address-space chunk is an option. More than that, the resulting deque has an insanely fast index - it's exactly the same as in an array (it is an array with plenty of space to grow). -- Dmitry Olshansky
Sep 24 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 24.09.2013 10:46, Dmitry Olshansky wrote:
 * expand(b, minDelta, maxDelta) grows b's length by at least minDelta
 (and on a best-effort basis by at least maxDelta) and returns true, or
 does nothing and returns false. In most allocators this should be  safe.
 (One key insight is that expand() can be made  safe whereas shrink() or
 realloc() are most often not; such mini-epiphanies are very exciting
 because they put the design on a beam guide with few principles and many
 consequences.) The precondition is that b is null or has been previously
 returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame... (+ WinAPI Heaps fits here)
expand is nice, but I don't like minDelta and maxDelta as arguments. On a shared allocator this can lead to undesired side effects, i.e. when this function is called concurrently on the same block by two threads, the net result will be the sum of the deltas of the two threads, while later synchronization on the memory block might only allow one thread to actually use the extended memory area. This can happen with gc_extend in the current GC when appending to a shared array.

I'd prefer the function to actually pass the desired sizes:

void[] expand(void[] b, size_t minSize, size_t maxSize);
Sep 24 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Sep-2013 22:28, Rainer Schuetze пишет:

 On 24.09.2013 10:46, Dmitry Olshansky wrote:
 * expand(b, minDelta, maxDelta) grows b's length by at least minDelta
 (and on a best-effort basis by at least maxDelta) and returns true, or
 does nothing and returns false. In most allocators this should be  safe.
 (One key insight is that expand() can be made  safe whereas shrink() or
 realloc() are most often not; such mini-epiphanies are very exciting
 because they put the design on a beam guide with few principles and many
 consequences.) The precondition is that b is null or has been previously
 returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame... (+ WinAPI Heaps fits here)
expand is nice, but I don't like minDelta and maxDelta as arguments. On a shared allocator this can lead to undesired side-effects, i.e. when this function is called concurrently on the same block by two threads, the net result will be the sum of the deltas of the two threads, while later synchronization on the memory block might only allow one thread to actually use the extended memory area. This can happen with gc_extend in the current GC when appending to a shared array. I'd prefer the function to actually pass the desired sizes: void[] expand(void[] b, size_t minSize, size_t maxSize);
+1 -- Dmitry Olshansky
Sep 24 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 11:28 AM, Rainer Schuetze wrote:
 On 24.09.2013 10:46, Dmitry Olshansky wrote:
 * expand(b, minDelta, maxDelta) grows b's length by at least minDelta
 (and on a best-effort basis by at least maxDelta) and returns true, or
 does nothing and returns false. In most allocators this should be  safe.
 (One key insight is that expand() can be made  safe whereas shrink() or
 realloc() are most often not; such mini-epiphanies are very exciting
 because they put the design on a beam guide with few principles and many
 consequences.) The precondition is that b is null or has been previously
 returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame... (+ WinAPI Heaps fits here)
expand is nice, but I don't like minDelta and maxDelta as arguments. On a shared allocator this can lead to undesired side-effects, i.e. when this function is called concurrently on the same block by two threads, the net result will be the sum of the deltas of the two threads, while later synchronization on the memory block might only allow one thread to actually use the extended memory area. This can happen with gc_extend in the current GC when appending to a shared array. I'd prefer the function to actually pass the desired sizes: void[] expand(void[] b, size_t minSize, size_t maxSize);
I don't understand this argument. These parameters can be trivially computed from the others, after which locking etc. proceeds just the same. FWIW I chose the "delta"-based signature because with it the precondition is: assert(minDelta <= maxDelta); In fact behavior can be easily defined at no cost even if this precondition is not satisfied. With the "absolute"-based signature the precondition is: assert(minSize <= maxSize && b.length <= minSize); which causes confusion - people will pass small sizes to expand() expecting it to contract, something that expand() can't support as a matter of principle (safety!!!). Andrei
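The two signatures are indeed mechanically interconvertible; a sketch of the absolute-size form layered over the delta-based primitive (expandTo is an illustrative name). Note that the subtraction happens before the allocator takes any lock, which is where Rainer's concurrent-resize concern applies to shared blocks.

```d
bool expandTo(A)(ref A a, ref ubyte[] b, size_t minSize, size_t maxSize)
{
    assert(minSize <= maxSize && b.length <= minSize);
    return a.expand(b, minSize - b.length, maxSize - b.length);
}
```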
Sep 24 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Sep-2013 23:36, Andrei Alexandrescu пишет:
 On 9/24/13 11:28 AM, Rainer Schuetze wrote:

[snip]
 which causes confusion - people will pass small sizes to expand()
 expecting it to contract, something that expand() can't support as a
 matter of principle (safety!!!).
Showstopper. It has to support contraction.
 Andrei
-- Dmitry Olshansky
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 12:47 PM, Dmitry Olshansky wrote:
 24-Sep-2013 23:36, Andrei Alexandrescu пишет:
 On 9/24/13 11:28 AM, Rainer Schuetze wrote:

[snip]
 which causes confusion - people will pass small sizes to expand()
 expecting it to contract, something that expand() can't support as a
 matter of principle (safety!!!).
Showstopper. It has to support contraction.
reallocate() is allowed contract. Show restarted :o). Andrei
Sep 24 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 12:49 PM, Andrei Alexandrescu wrote:
 On 9/24/13 12:47 PM, Dmitry Olshansky wrote:
 24-Sep-2013 23:36, Andrei Alexandrescu пишет:
 On 9/24/13 11:28 AM, Rainer Schuetze wrote:

[snip]
 which causes confusion - people will pass small sizes to expand()
 expecting it to contract, something that expand() can't support as a
 matter of principle (safety!!!).
Showstopper. It has to support contraction.
reallocate() is allowed contract. Show restarted :o).
s/contract/to contract/ Andrei
Sep 24 2013
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 24.09.2013 21:36, Andrei Alexandrescu wrote:
 On 9/24/13 11:28 AM, Rainer Schuetze wrote:
 expand is nice, but I don't like minDelta and maxDelta as arguments. On
 a shared allocator this can lead to undesired side-effects, i.e. when
 this function is called concurrently on the same block by two threads,
 the net result will be the sum of the deltas of the two threads, while
 later synchronization on the memory block might only allow one thread to
 actually use the extended memory area. This can happen with gc_extend in
 the current GC when appending to a shared array.

 I'd prefer the function to actually pass the desired sizes:

 void[] expand(void[] b, size_t minSize, size_t maxSize);
I don't understand this argument. These parameters can be trivially computed from the others, after which locking etc. proceeds just the same.
Taking the current array implementation as an example, the deltas are computed before the actual GC lock happens inside gc_extend which means that the second of two concurrent requests leads to overallocation.
 FWIW I chose the "delta"-based signature because with it the
 precondition is:

 assert(minDelta <= maxDelta);

 In fact behavior can be easily defined at no cost even if this
 precondition is not satisfied. With the "absolute"-based signature the
 precondition is:

 assert(minSize <= maxSize && b.length <= minSize);

 which causes confusion - people will pass small sizes to expand()
 expecting it to contract, something that expand() can't support as a
 matter of principle (safety!!!).
Why would you expect a function "expand" to reduce the memory size?
Sep 24 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 2:02 PM, Rainer Schuetze wrote:
 On 24.09.2013 21:36, Andrei Alexandrescu wrote:
 On 9/24/13 11:28 AM, Rainer Schuetze wrote:
 expand is nice, but I don't like minDelta and maxDelta as arguments. On
 a shared allocator this can lead to undesired side-effects, i.e. when
 this function is called concurrently on the same block by two threads,
 the net result will be the sum of the deltas of the two threads, while
 later synchronization on the memory block might only allow one thread to
 actually use the extended memory area. This can happen with gc_extend in
 the current GC when appending to a shared array.

 I'd prefer the function to actually pass the desired sizes:

 void[] expand(void[] b, size_t minSize, size_t maxSize);
I don't understand this argument. These parameters can be trivially computed from the others, after which locking etc. proceeds just the same.
Taking the current array implementation as an example, the deltas are computed before the actual GC lock happens inside gc_extend which means that the second of two concurrent requests leads to overallocation.
(I'm confused - which array implementation?) Does this qualify as an implementation bug, or are you referring to the case where the void[] being reallocated is being shared?
 FWIW I chose the "delta"-based signature because with it the
 precondition is:

 assert(minDelta <= maxDelta);

 In fact behavior can be easily defined at no cost even if this
 precondition is not satisfied. With the "absolute"-based signature the
 precondition is:

 assert(minSize <= maxSize && b.length <= minSize);

 which causes confusion - people will pass small sizes to expand()
 expecting it to contract, something that expand() can't support as a
 matter of principle (safety!!!).
Why would you expect a function "expand" to reduce the memory size?
Ask Dmitry :o). Far as I can tell he assumed so. Andrei
Sep 24 2013
next sibling parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 24.09.2013 23:05, Andrei Alexandrescu wrote:
 Taking the current array implementation as an example, the deltas are
 computed before the actual GC lock happens inside gc_extend which means
 that the second of two concurrent requests leads to overallocation.
(I'm confused - which array implementation?)
I mean the dynamic array implementation in rt.lifetime.
 Does this qualify as an
 implementation bug, or are you referring to the case where the void[]
 being reallocated is being shared?
Yes, I'm referring to a shared void[], e.g. strings, which use immutable shared buffers that you can still append to as long as you are not stomping other data. Appending does not need a lock for the whole operation, only an atomic operation when adding to the stored size (though the implementation currently uses a synchronized block to change the size).
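The append scheme described here can be sketched like this: the buffer's used size is bumped with a single CAS, so concurrent appenders each get a private, non-overlapping slice. Type and field names are illustrative, not the rt.lifetime implementation.

```d
import core.atomic : atomicLoad, cas;

struct SharedBuffer
{
    ubyte[] storage;    // fixed-length backing memory
    shared size_t used; // bytes handed out so far

    // Returns a fresh slice of n bytes, or null if the buffer is full.
    ubyte[] append(size_t n)
    {
        for (;;)
        {
            auto old = atomicLoad(used);
            if (old + n > storage.length)
                return null;
            if (cas(&used, old, old + n))
                return storage[old .. old + n];
        }
    }
}
```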
Sep 24 2013
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Sep-2013 01:05, Andrei Alexandrescu пишет:
 On 9/24/13 2:02 PM, Rainer Schuetze wrote:
[snip]
 Why would you expect a function "expand" to reduce the memory size?
Ask Dmitry :o). Far as I can tell he assumed so.
I had read it as reallocate in place. Yeah, my "semantic reading" must suck though it usually speeds things up :o)
 Andrei
-- Dmitry Olshansky
Sep 24 2013
prev sibling next sibling parent reply "Dan Schatzberg" <dschatz hidden.edu> writes:
One thing I'm not sure is addressed by this design is memory 
locality. I know of libnuma http://linux.die.net/man/3/numa which 
allows me to express what NUMA domain my memory should be 
allocated from at run-time for each allocation.

In the case that I want to allocate memory in a specific NUMA 
domain (not just local vs non-local), I believe this design is 
insufficient because the number of domains are only known at 
run-time.

Also, as far as alignment is concerned I will throw in that x86 
is relatively unique in having a statically known cache-line 
size. Both ARM and PowerPC cores can differ in their cache-line 
sizes. I feel this is a significant argument for the ability to 
dynamically express alignment.
Sep 24 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 September 2013 at 11:38:29 UTC, Dan Schatzberg 
wrote:
 One thing I'm not sure is addressed by this design is memory 
 locality. I know of libnuma http://linux.die.net/man/3/numa 
 which allows me to express what NUMA domain my memory should be 
 allocated from at run-time for each allocation.
Not being part of the interface does not mean that the allocator cannot accept parameters - via state, for instance.
Sep 24 2013
parent "Dan Schatzberg" <dschatz hidden.edu> writes:
On Tuesday, 24 September 2013 at 13:21:48 UTC, deadalnix wrote:
 On Tuesday, 24 September 2013 at 11:38:29 UTC, Dan Schatzberg 
 wrote:
 One thing I'm not sure is addressed by this design is memory 
 locality. I know of libnuma http://linux.die.net/man/3/numa 
 which allows me to express what NUMA domain my memory should 
 be allocated from at run-time for each allocation.
Not being part of the interface does not mean that the allocator cannot accept parameters. Via state, for instance.
It is likely that I have a misunderstanding, but as I understand the proposal, I could not use the same allocator to allocate memory from different locality domains. This seems like something that would be desirable for the base allocator of a hierarchy.
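deadalnix's locality-as-state suggestion could be sketched as one allocator instance per NUMA domain, all sharing the same untyped interface. numa_alloc_onnode/numa_free are the real libnuma entry points (link with -lnuma); the struct itself is illustrative only.

```d
extern (C) void* numa_alloc_onnode(size_t size, int node);
extern (C) void numa_free(void* start, size_t size);

struct NumaAllocator
{
    int node; // the NUMA domain this instance allocates from

    ubyte[] allocate(size_t s)
    {
        auto p = numa_alloc_onnode(s, node);
        return p ? (cast(ubyte*) p)[0 .. s] : null;
    }

    void deallocate(ubyte[] b)
    {
        numa_free(b.ptr, b.length);
    }
}

// Usage: auto near = NumaAllocator(0), far = NumaAllocator(1);
```

This sidesteps the fat-API concern: composed allocators see only allocate/deallocate, while the domain count stays a run-time construction argument.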
Sep 24 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 4:38 AM, Dan Schatzberg wrote:
 One thing I'm not sure is addressed by this design is memory locality. I
 know of libnuma http://linux.die.net/man/3/numa which allows me to
 express what NUMA domain my memory should be allocated from at run-time
 for each allocation.

 In the case that I want to allocate memory in a specific NUMA domain
 (not just local vs non-local), I believe this design is insufficient
 because the number of domains are only known at run-time.

 Also, as far as alignment is concerned I will throw in that x86 is
 relatively unique in having a statically known cache-line size. Both ARM
 and PowerPC cores can differ in their cache-line sizes. I feel this is a
 significant argument for the ability to dynamically express alignment.
Could you send a few links so I can take a look? My knee-jerk reaction to this is that NUMA allocators would provide their own additional primitives and not participate naively in compositions with other allocators. Andrei
Sep 24 2013
parent reply "Dan Schatzberg" <dschatz hidden.edu> writes:
On Tuesday, 24 September 2013 at 16:06:39 UTC, Andrei 
Alexandrescu wrote:
 On 9/24/13 4:38 AM, Dan Schatzberg wrote:
 One thing I'm not sure is addressed by this design is memory 
 locality. I
 know of libnuma http://linux.die.net/man/3/numa which allows 
 me to
 express what NUMA domain my memory should be allocated from at 
 run-time
 for each allocation.

 In the case that I want to allocate memory in a specific NUMA 
 domain
 (not just local vs non-local), I believe this design is 
 insufficient
 because the number of domains are only known at run-time.

 Also, as far as alignment is concerned I will throw in that 
 x86 is
 relatively unique in having a statically known cache-line 
 size. Both ARM
 and PowerPC cores can differ in their cache-line sizes. I feel 
 this is a
 significant argument for the ability to dynamically express 
 alignment.
Could you send a few links so I can take a look? My knee-jerk reaction to this is that NUMA allocators would provide their own additional primitives and not participate naively in compositions with other allocators. Andrei
Not sure what kind of links you're looking for. The following is a good discussion of the issue and the current solutions: http://queue.acm.org/detail.cfm?id=2513149

In particular: "The application may want fine-grained control of how the operating system handles allocation for each of its memory segments. For that purpose, system calls exist that allow the application to specify which memory region should use which policies for memory allocations. The main performance issues typically involve large structures that are accessed frequently by the threads of the application from all memory nodes and that often contain information that needs to be shared among all threads. These are best placed using interleaving so that the objects are distributed over all available nodes."

The Linux/libc interfaces are linked in my first comment. Specifically, with the mbind() call one can specify the policy for allocations from a virtual address range (which NUMA node to allocate the backing physical page from). More generally you could imagine specifying this per allocation.

What is your objective though? Aren't you trying to define a hierarchy of allocators where more specific allocators can be composed from general ones? In which case, what is the concern with including locality at the base level? It seems to be one characteristic of memory that programmers might be concerned with, and rather trivially you can compose a non-locality-aware allocator from a locality-aware allocator.
Sep 24 2013
parent reply "Dan Schatzberg" <dschatz hidden.edu> writes:
On Tuesday, 24 September 2013 at 17:38:34 UTC, Dan Schatzberg 
wrote:
 What is your objective though? Aren't you trying to define a 
 hierarchy of allocators where more specific allocators can be 
 composed from general ones? In which case what is the concern 
 with including locality at the base level? It seems to be one 
 characteristic of memory that programmers might be concerned 
 with and rather trivially you can compose a non-locality aware 
 allocator from a locality aware allocator.
I'll elaborate on this point a bit just to make sure I'm clear. I understand your initial post here to be asking if the allocator interface you designed is comprehensive enough that all other allocators (within reason) could be constructed from it.

Whether or not it is inefficient to pass alignment or locality-domain arguments to an allocate call seems orthogonal to whether or not one can construct an allocator which does not take such arguments from one that does.

As I'm sure your original design is intended for, one can imagine allocators which have a fixed size (constructed from the original allocator you proposed). The allocate path can then very efficiently hand off memory from a free list or some other efficient structure. One can make analogous allocators that do not allow for dynamic alignment or NUMA locality.

The only reasonable argument against seems to be whether or not the flexibility is worth the semantic overhead of a larger hierarchy, which is a completely reasonable and subjective argument to be had.
Sep 24 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/24/13 11:20 AM, Dan Schatzberg wrote:
 On Tuesday, 24 September 2013 at 17:38:34 UTC, Dan Schatzberg wrote:
 What is your objective though? Aren't you trying to define a hierarchy
 of allocators where more specific allocators can be composed from
 general ones? In which case what is the concern with including
 locality at the base level? It seems to be one characteristic of
 memory that programmers might be concerned with and rather trivially
 you can compose a non-locality aware allocator from a locality aware
 allocator.
I'll elaborate on this point a bit just to make sure I'm clear. I understand your initial post here to be asking if the allocator interface you designed is comprehensive enough that all other allocators (within reason) could be constructed from it.
That is correct.
 Whether or not it is inefficient to pass alignment, or locality domain
 arguments to an allocate call seems orthogonal to whether or not one can
 construct an allocator which does not take such arguments from one that
 does.
That is correct too. A fat API for a core allocator does make everybody's life harder because they need to support it (e.g. by passing calls transparently or with tweaks). I think this will become a bit clearer after I publish the code.
 As I'm sure your original design is intended for, one can imagine
 allocators which have a fixed size (constructed from the original
 allocator you proposed). The allocate path can then very efficiently
 hand off memory from a free-list or some other efficient structure. One
 can make analogous allocators that do not allow for dynamic alignment or
 NUMA locality.

 The only reasonable argument against seems to be whether or not the
 flexibility is worth the semantic overhead of a larger hierarchy which
 is a completely reasonable and subjective argument to be had.
Yah, that sounds about right. Thanks, Andrei
Sep 24 2013
prev sibling next sibling parent reply "Bigsandwich" <bigsandwich gmail.com> writes:
Hi,

I mostly just lurk around here, but occasionally I just can't 
resist putting in my two cents.  I really want to see D replace 
C++ for AAA games (my industry) and allocators are really 
critical to that.  I think there's an elephant here that most of 
the posts have been dancing around.

In C++ when you roll your own allocators (whether STL allocators 
or some other interface) there is basically, only one set of 
*semantics* to worry about - the classic C++/C pattern of 
new/delete or malloc/free.  This is actually much more 
complicated in D where you have at least three different semantics:

1) C++/C style
2) Tracing/Garbage Collection
3) Region allocators and other oddballs

Consequences:
1) These semantics aren't really interchangeable.  If, for 
instance, a library is implemented with one of them in mind, it 
can only be replaced by an allocator with the same semantics.

2) Tracing/Garbage Collection are required for whatever the 
default allocator is and replacement allocator must implement 
those semantics.

3) It's only possible to change (2) by hacking the runtime so that 
any language features that rely on the GC cause errors.

The design Andrei has presented seems to *allow* the creation of 
allocators with all of these semantics, but it doesn't seem to 
answer the following questions:

1) Can semantics be enforced?
2) Can allocators with different semantics be composed?
3) Can different semantics for the default allocator be 
implemented?
4) If so, how can I reconcile that with libraries that expect 
different default semantics?  A use case that I foresee for 
something like this would be banning the use of GC in low level 
engine code, but allowing it in higher level gameplay code.

5) Are any of these questions even relevant for this part of the 
design or will we have to wait for the rest of it to know the 
answers?
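
For question 1, D's introspection at least makes a compile-time check expressible today. The sketch below is one hypothetical approach, not part of Andrei's design: it assumes that exposing a `collect` member is what marks an allocator as having tracing semantics, and `CAllocator` is a made-up malloc-backed example.

```d
import std.traits : hasMember;
import core.stdc.stdlib : malloc, free;

// Hypothetical trait: treat any allocator exposing collect() as having
// tracing/GC semantics.
enum isCollecting(A) = hasMember!(A, "collect");

// Toy malloc-backed allocator with C-style semantics (no collect()).
struct CAllocator
{
    ubyte[] allocate(size_t s)
    {
        auto p = cast(ubyte*) malloc(s);
        return p is null ? null : p[0 .. s];
    }
    void deallocate(ubyte[] b) { free(b.ptr); }
}

// Low-level engine code bans collecting allocators via a template
// constraint; higher-level gameplay code could accept either kind.
void engineWork(Allocator)(ref Allocator a)
    if (!isCollecting!Allocator)
{
    auto buf = a.allocate(128);
    scope(exit) a.deallocate(buf);
    // ... low-level work ...
}

void main()
{
    CAllocator a;
    engineWork(a); // compiles: CAllocator has no collect()
}
```

Whether the final design blesses such a trait, or enforces semantics some other way, is exactly the open question being asked here.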

Thanks.
Sep 24 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 24 September 2013 at 18:53:56 UTC, Bigsandwich wrote:
 Hi,

 I mostly just lurk around here, but occasionally I just can't 
 resist putting in my two cents.  I really want to see D replace 
 C++ for AAA games (my industry) and allocators are really 
 critical to that.  I think there's an elephant here that most 
 of the posts have been dancing around.

 In C++ when you roll your own allocators (whether STL 
 allocators or some other interface) there is basically, only 
 one set of *semantics* to worry about - the classic C++/C 
 pattern of new/delete or malloc/free.  This is actually much 
 more complicated in D where you have at least three different 
 semantics:

 1) C++/C style
 2) Tracing/Garbage Collection
 3) Region allocators and other oddballs

 Consequences:
 1) These semantics aren't really interchangeable.  If, for 
 instance, a library is implemented with one of them in mind, it 
 can only be replaced by an allocator with the same semantics.

 2) Tracing/Garbage Collection are required for whatever the 
 default allocator is and replacement allocator must implement 
 those semantics.

 3) It's only possible to change (2) by hacking the runtime so 
 that any language features that rely on the GC cause errors.

 The design Andrei has presented seems to *allow* the creation of 
 allocators with all of these semantics, but it doesn't seem to 
 answer the following questions:

 1) Can semantics be enforced?
 2) Can allocators with different semantics be composed?
 3) Can different semantics for the default allocator be 
 implemented?
 4) If so, how can I reconcile that with libraries that expect 
 different default semantics?  A use case that I foresee for 
 something like this would be banning the use of GC in low level 
 engine code, but allowing it in higher level gameplay code.

 5) Are any of these questions even relevant for this part of 
 the design or will we have to wait for the rest of it to know 
 the answers?

 Thanks.
I'd like to see these questions answered. This is one of the most insightful posts of the discussion.
Sep 24 2013
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu 
wrote:
 struct NullAllocator
 {
     enum alignment = real.alignof;
     enum size_t available = 0;
     ubyte[] allocate(size_t s) shared { return null; }
     bool expand(ref ubyte[] b, size_t minDelta, size_t 
 maxDelta) shared
     { assert(b is null); return false; }
     bool reallocate(ref ubyte[] b, size_t) shared
     { assert(b is null); return false; }
     void deallocate(ubyte[] b) shared { assert(b is null); }
     void collect() shared { }
     void deallocateAll() shared { }
     static shared NullAllocator it;
 }
Does the presence of "shared" indicate that all allocators must be thread-safe? Will we be able to define thread-unsafe allocators, designed for use on one thread for performance reasons?
Sep 25 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/25/13 12:03 PM, Peter Alexander wrote:
 On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu wrote:
 struct NullAllocator
 {
     enum alignment = real.alignof;
     enum size_t available = 0;
     ubyte[] allocate(size_t s) shared { return null; }
     bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
     { assert(b is null); return false; }
     bool reallocate(ref ubyte[] b, size_t) shared
     { assert(b is null); return false; }
     void deallocate(ubyte[] b) shared { assert(b is null); }
     void collect() shared { }
     void deallocateAll() shared { }
     static shared NullAllocator it;
 }
Does the presence of "shared" indicate that all allocators must be thread-safe? Will we be able to define thread-unsafe allocators, designed for use on one thread for performance reasons?
Shared is not required. Most allocators I wrote are thread-local. I've only put "shared" there for NullAllocator, Mallocator, GCAllocator to represent in the type system a simple reality - these allocators offer a singleton that multiple threads may use simultaneously. Andrei
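
The distinction Andrei describes can be shown with two stripped-down null allocators. This is only a sketch of the qualifier's effect, not code from the proposal: one type marks its methods `shared` and offers a singleton any thread may use, the other has unqualified methods and is meant to live per thread.

```d
// Shared singleton: methods are 'shared', so the type system records
// that multiple threads may call them on the one shared instance.
struct SharedNull
{
    ubyte[] allocate(size_t s) shared { return null; }
    void deallocate(ubyte[] b) shared { assert(b is null); }
    static shared SharedNull it;
}

// Thread-local allocator: no 'shared' anywhere, so each thread is
// expected to own its instance, and no synchronization is implied.
struct ThreadLocalNull
{
    ubyte[] allocate(size_t s) { return null; }
    void deallocate(ubyte[] b) { assert(b is null); }
}

void main()
{
    auto a = SharedNull.it.allocate(32); // via the shared singleton
    ThreadLocalNull t;                   // per-thread instance
    auto b = t.allocate(32);
    assert(a is null && b is null);
}
```

So `shared` is an opt-in marker for the singleton allocators, not a thread-safety requirement imposed on every allocator.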
Sep 25 2013
parent reply "Yota" <yotaxp thatGoogleMailThing.com> writes:
If we're sharing potential allocators, I just partially 
implemented a design inspired by a rather 'Bright' person. 
(Please shoot me for that.)

http://pastebin.com/14jeVY4n

Based on: 
http://www.drdobbs.com/cpp/increasing-compiler-speed-by-over-75/240158941
Sep 27 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Sep 27, 2013 at 07:10:18PM +0200, Yota wrote:
 If we're sharing potential allocators, I just partially implemented
 a design inspired by a rather 'Bright' person. (Please shoot me for
 that.)
*BANG*

T

-- 
They say that "guns don't kill people, people kill people." Well I think
the gun helps. If you just stood there and yelled BANG, I don't think
you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Sep 27 2013
next sibling parent "Brian Schott" <briancschott gmail.com> writes:
On Friday, 27 September 2013 at 17:29:06 UTC, H. S. Teoh wrote:
 *BANG*
 They say that "guns don't kill people, people kill people." 
 Well I think
 the gun helps. If you just stood there and yelled BANG, I don't 
 think
 you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Was that an amazing coincidence, or did you have your quote generator cheat there?
Sep 27 2013
prev sibling parent reply "Craig Dillabaugh" <craig.dillabaugh gmail.com> writes:
On Friday, 27 September 2013 at 17:29:06 UTC, H. S. Teoh wrote:
 On Fri, Sep 27, 2013 at 07:10:18PM +0200, Yota wrote:
 If we're sharing potential allocators, I just partially 
 implemented
 a design inspired by a rather 'Bright' person. (Please shoot 
 me for
 that.)
*BANG* T -- They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Did you specify that quote, or was it randomly generated?
Sep 27 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Friday, 27 September 2013 at 17:32:28 UTC, Craig Dillabaugh 
wrote:
 On Friday, 27 September 2013 at 17:29:06 UTC, H. S. Teoh wrote:
 On Fri, Sep 27, 2013 at 07:10:18PM +0200, Yota wrote:
 If we're sharing potential allocators, I just partially 
 implemented
 a design inspired by a rather 'Bright' person. (Please shoot 
 me for
 that.)
*BANG* T -- They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Did you specify that quote, or was it randomly generated?
Given the fact that it is not the first time he has done it, I'd be inclined to call it the former.
Sep 27 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Sep 27, 2013 at 07:33:40PM +0200, Dicebot wrote:
 On Friday, 27 September 2013 at 17:32:28 UTC, Craig Dillabaugh
 wrote:
On Friday, 27 September 2013 at 17:29:06 UTC, H. S. Teoh wrote:
On Fri, Sep 27, 2013 at 07:10:18PM +0200, Yota wrote:
If we're sharing potential allocators, I just partially implemented
a design inspired by a rather 'Bright' person. (Please shoot me for
that.)
*BANG* T -- They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill
Did you specify that quote, or was it randomly generated?
Given the fact it is not the first time he has done it, I'd incline to call the former.
I cheated. :-P Now punish me... oh.

(OK, I didn't cheat on this one, but it was a close enough coincidence
for me to take advantage of it. :-P)

T

-- 
I'm still trying to find a pun for "punishment"...
Sep 27 2013