digitalmars.D - why allocators are not discussed here

cybervadim (34/34) Jun 25 2013 I know Andrey mentioned he was going to work on Allocators a year

Adam D. Ruppe (8/9) Jun 25 2013 It would be easier to just pass an allocator object that provides

H. S. Teoh (38/48) Jun 25 2013 It's not too late to introduce a default allocator object that maps to
Robert Schadek (16/21) Jun 26 2013 I did think about this as well, but than I came up with something that

Dmitry Olshansky (4/28) Jun 26 2013 --

Robert Schadek (7/18) Jun 26 2013 IMHO, not really, as the place you get the memory from is not managed by

Marco Leise (5/17) Jun 26 2013 Does it mean 16 extra bytes for every allocation ?

Robert Schadek (3/4) Jun 26 2013 yes, or wrap it, and you have 4 or 8 bytes, but yes you would to have

H. S. Teoh (50/89) Jun 25 2013 That would be nice to get things going. :)

cybervadim (23/101) Jun 25 2013 From my experience all objects may be divided into 2 categories

bearophile (6/18) Jun 25 2013 Many garbage collectors use the same idea (and manage it

cybervadim (4/9) Jun 25 2013 The problem with GC is that it doesn't know which is temporary

Adam D. Ruppe (79/85) Jun 25 2013 Yea, I was thinking this might be a way to go. You'd have a

Adam D. Ruppe (28/28) Jun 25 2013 I was just quickly skimming some criticism of C++ allocators,

Dmitry Olshansky (16/22) Jun 26 2013 Criticisms are:

Jacob Carlborg (5/16) Jun 26 2013 It won't be inlined. You would need to make it a template parameter to
Dmitry Olshansky (37/61) Jun 26 2013 Both suffer from

H. S. Teoh (47/109) Jun 26 2013 How is this different from using malloc() and free() manually? You have

Brian Rogoff (10/16) Jun 26 2013 Maybe he was talking about a freelist allocator over a reap, as
Adam D. Ruppe (11/15) Jun 26 2013 Blargh, I forgot about operator ~ on built ins. For custom types
Dmitry Olshansky (57/137) Jun 26 2013 Why the heck you people think I purpose to use malloc directly as

Adam D. Ruppe (27/27) Jun 26 2013 So to try some ideas, I started implementing a simple container

H. S. Teoh (17/31) Jun 26 2013 Maybe a type distinction akin to C++'s auto_ptr might help? Say we

Adam D. Ruppe (60/63) Jun 26 2013 Yeah, that's what I'm thinking, but I don't really like it.

BLM768 (25/35) Jun 27 2013 Something along those lines would probably be a good solution.
Marco Leise (5/8) Jun 28 2013 Isn't that what scope is for?

Dicebot (4/11) Jun 28 2013 Reading dlang.org makes you guess so but official position is
Adam D. Ruppe (6/7) Jun 28 2013 I don't really know. In practice, it does something else (usually

Dicebot (6/14) Jun 28 2013 It is no-op keyword in current implementation for everything but
Jonathan M Davis (26/34) Jun 28 2013 Per the spec, all scope is supposed to do is prevent references in a par...

Adam D. Ruppe (42/45) Jun 28 2013 That's fairly close to what I'd want. But there's two cases I'm

Jonathan M Davis (7/10) Jun 28 2013 Per the current spec, this would not be a valid use of scope, as scope i...

Jason House (8/43) Jun 26 2013 Bloomberg released an STL alternative called BSL which contains

cybervadim (15/22) Jun 26 2013 I think the problem with such approach is that you have to

H. S. Teoh (20/44) Jun 26 2013 Yeah, that's a major inconvenience with the C++ allocator model. There's

cybervadim (5/23) Jun 26 2013 Yes, being able to change gc_alloc, gc_free would do the work. If

H. S. Teoh (46/66) Jun 26 2013 Adam's idea does this: tie each replacement of gc_alloc/gc_free to some

Dicebot (2/2) Jun 26 2013 Some type system help is required to guarantee that references to

Brad Anderson (4/11) Jun 28 2013 There is also EASTL's (Electronic Arts version of STL for

Dmitry Olshansky (9/32) Jun 26 2013 Awful. What that extra syntax had brought you? Except that now new is

cybervadim (30/37) Jun 26 2013 Extra syntax allows me not touching the existing code.

Dmitry Olshansky (25/60) Jun 26 2013 Here is a chief problem - the assumption that is required to make it

cybervadim (11/32) Jun 26 2013 Not magic, but the tool which is quite powerful and thus it may

Dmitry Olshansky (16/36) Jun 26 2013 I know what kind of thing you are talking about. It's ain't powerful

H. S. Teoh (58/186) Jun 26 2013 Actually, D's frontend leaves something to be desired when it comes to

Adam D. Ruppe (80/94) Jun 26 2013 Let's consider what kinds of allocations we have. We can break

Dicebot (13/13) Jun 26 2013 By the way, while this topic gets some attention, I want to make

Dmitry Olshansky (6/10) Jun 26 2013 Sadly I believe that global allocators would still have to be compatible...

Dicebot (6/10) Jun 26 2013 Nice way to say "we don't really need that embedded, kernel and

Dmitry Olshansky (12/20) Jun 26 2013 Just don't use certain built-ins. Stub them out in run-time if you like....

Dicebot (7/11) Jun 26 2013 It is an API issue. Hacking druntime is, unfortunately,
Adam D. Ruppe (46/49) Jun 26 2013 Actually, I was kinda sorta able to solve this in my minimal d.

Dmitry Olshansky (11/24) Jun 26 2013 [snip a cool hack]

John Colvin (4/39) Jun 27 2013 Old but perhaps relevant?

deadalnix (3/6) Jun 28 2013 Interesting paper. Still concurrency isn't really addressed,

Brian Rogoff (2/10) Jun 28 2013 http://en.wikipedia.org/wiki/Hoard_memory_allocator

"cybervadim" <vadim.goryunov gmail.com> writes:

I know Andrei mentioned he was going to work on Allocators a year 
ago. At DConf 2013 he described the problems he needs to solve 
with Allocators. But I wonder if I am missing the discussion 
around that - I tried searching this forum and found a few threads, 
but none of them was actually a brainstorm on Allocators design.

Please point me in the right direction
or
is there a reason it is not discussed
or
should we open the discussion?


The easiest approach for Allocators design I can imagine would be 
to let user specify which Allocator operator new should get the 
memory from (introducing a new keyword allocator). This gives a 
total control, but assumes user knows what he is doing.

Example:

CustomAllocator ca;
allocator(ca) {
   auto a = new A; // operator new will use CustomAllocator::malloc()
   auto b = new B;

   free(a); // that should call CustomAllocator::free()
   // if free() is missing for an allocated area, it is the user's
   // responsibility to make sure the custom allocator can handle that
}

By default the allocator is druntime's GC, for which free(a) does 
nothing.


If some library defines its own allocator (e.g. a specialized 
container), there should be a way to:
1. override allocator
2. get access to the allocator used

I understand that I spent 5 mins thinking about the way 
Allocators may look.
My point is - if somebody is working on it, can you please share 
your ideas?

Jun 25 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)

It would be easier to just pass an allocator object that provides 
the necessary methods and not use new at all. (I kinda wish new 
wasn't in the language. It'd make this a little more consistent.)

The allocator's create function could also return wrapped types, 
like RefCounted!T or NotNull!T depending on what it does.

Though the devil is in the details here and I don't think I can 
say more without trying to actually do it.

Jun 25 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 12:50:36AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
(introducing a new keyword allocator)

 
 It would be easier to just pass an allocator object that provides
 the necessary methods and don't use new at all. (I kinda wish new
 wasn't in the language. It'd make this a little more consistent.)

It's not too late to introduce a default allocator object that maps to
built-in GC primitives. Maybe something like:

	struct DefaultAllocator
	{
		T* alloc(T, A...)(A args) {
			return new T(args);
		}
		void free(T)(T* ptr) { // `ref` is a keyword in D, so the parameter is named ptr
			// no-op: the GC reclaims the memory
		}
	}

We can then change Phobos to always use allocator.alloc and
allocator.free, which it gets from user code somehow, and in the default
case it would do the Right Thing.
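
To make the "gets from user code somehow" part concrete, here is a minimal sketch, assuming the allocator is passed as a defaulted parameter. MallocAllocator, Point and makePoint are hypothetical names used only for illustration; nothing like them exists in Phobos under these names.

```d
import std.conv : emplace;
static import core.stdc.stdlib;

// Maps straight onto the GC, as in the struct above.
struct DefaultAllocator
{
    T* alloc(T, A...)(A args) { return new T(args); }
    void free(T)(T* ptr) { /* no-op: the GC reclaims the memory later */ }
}

// Hypothetical alternative backed by C malloc/free.
struct MallocAllocator
{
    T* alloc(T, A...)(A args)
    {
        auto mem = cast(T*) core.stdc.stdlib.malloc(T.sizeof);
        assert(mem !is null, "out of memory");
        return emplace(mem, args); // construct T in the raw memory
    }
    void free(T)(T* ptr) { core.stdc.stdlib.free(ptr); }
}

struct Point { int x, y; }

// Hypothetical library routine: uses the GC unless the caller passes an allocator.
Point* makePoint(Alloc = DefaultAllocator)(int x, int y, Alloc allocator = Alloc.init)
{
    return allocator.alloc!Point(x, y);
}

unittest
{
    auto p = makePoint(1, 2);     // default case: GC
    assert(p.x == 1 && p.y == 2);

    MallocAllocator m;
    auto q = makePoint(3, 4, m);  // Alloc is inferred from the argument
    assert(q.x == 3 && q.y == 4);
    m.free(q);
}
```

Ordinary callers never see the allocator, which is the "default case does the Right Thing" property; the cost is one extra template parameter on every routine that allocates.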


 The allocator's create function could also return wrapped types,
 like RefCounted!T or NotNull!T depending on what it does.

So maybe something like:

	struct RefCountedAllocator
	{
		RefCounted!T alloc(T, A...)(A args) {
			return allocRefCounted!T(args); // hypothetical helper
		}
		void free(T)(RefCounted!T rc) { // `ref` is a keyword in D
			dotDotDotMagic(rc);
		}
	}

etc..


 Though the devil is in the details here and I don't think I can say
 more without trying to actually do it.

The main issue I see is how *not* to get stuck in C++'s situation where
you have to specify allocator objects everywhere. That is highly
inconvenient, so people are liable to avoid it, which defeats the
purpose of having allocators. It would be nice, IMO, if we can somehow
let the user specify a custom allocator for, say, the whole of Phobos,
so that people who care about this sorta thing can just replace the GC
wholesale and then use Phobos to their hearts' content without having to
manually specify allocator objects everywhere and risk forgetting a
single case that eventually leads to memory leakage.


T

-- 
Computers shouldn't beep through the keyhole.

Jun 25 2013

Robert Schadek <realburner gmx.de> writes:

On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)

 It would be easier to just pass an allocator object that provides the
 necessary methods and don't use new at all. (I kinda wish new wasn't
 in the language. It'd make this a little more consistent.)

I did think about this as well, but then I came up with something that
IMHO is even simpler.

Imagine we have two delegates:

void* delegate(size_t);  // this one allocs
void delegate(void*);    // this one frees

You pass both to a function that constructs your object. The first is
used for allocating the
memory, the second gets attached to the TypeInfo and is used by the GC
to free
the object. This would be completely transparent to the user.

The use in a container is similar. Just use the alloc delegate to
construct the objects and
attach the free delegate to the typeinfo. You could even mix allocator
strategies in the middle
of the lifetime of the container.
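
A rough sketch of the two-delegate idea, assuming the free delegate is stored in a small handle next to the pointer, since attaching it to the TypeInfo would need runtime support that this sketch does not attempt. Handle, construct and Point are hypothetical names.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

alias AllocFn = void* delegate(size_t); // this one allocs
alias FreeFn  = void delegate(void*);   // this one frees

// Hypothetical handle: keeps the matching free delegate beside the pointer,
// standing in for "attached to the TypeInfo".
struct Handle(T)
{
    T* ptr;
    FreeFn release;

    void dispose()
    {
        if (ptr !is null) { release(ptr); ptr = null; }
    }
}

Handle!T construct(T, A...)(AllocFn alloc, FreeFn release, A args)
{
    auto mem = cast(T*) alloc(T.sizeof);
    assert(mem !is null, "allocation failed");
    return Handle!T(emplace(mem, args), release);
}

struct Point { int x, y; }

// A delegate is a context pointer plus a function pointer: two words.
static assert(FreeFn.sizeof == 2 * size_t.sizeof);

unittest
{
    int live; // number of outstanding allocations
    AllocFn a = (size_t n) { ++live; return malloc(n); };
    FreeFn  f = (void* p) { --live; free(p); };

    auto h = construct!Point(a, f, 1, 2);
    assert(h.ptr.x == 1 && h.ptr.y == 2 && live == 1);
    h.dispose();
    assert(live == 0 && h.ptr is null);
}
```

The static assert shows the storage cost of carrying a free delegate with every object: two machine words per allocation unless the delegate is shared or wrapped.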

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 14:03, Robert Schadek wrote:
 On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)

 It would be easier to just pass an allocator object that provides the
 necessary methods and don't use new at all. (I kinda wish new wasn't
 in the language. It'd make this a little more consistent.)

 I did think about this as well, but then I came up with something that
 IMHO is even simpler.

 Imagine we have two delegates:

 void* delegate(size_t);  // this one allocs
 void delegate(void*);    // this one frees

 You pass both to a function that constructs your object. The first is
 used for allocating the
 memory, the second gets attached to the TypeInfo and is used by the GC
 to free
 the object.

Then it's just GC but with an extra complication.

 This would be completely transparent to the user.

 The use in a container is similar. Just use the alloc delegate to
 construct the objects and
 attach the free delegate to the typeinfo. You could even mix allocator
 strategies in the middle
 of the lifetime of the container.


-- 
Dmitry Olshansky

Jun 26 2013

Robert Schadek <realburner gmx.de> writes:

 Imagine we have two delegates:

 void* delegate(size_t);  // this one allocs
 void delegate(void*);    // this one frees

 You pass both to a function that constructs your object. The first is
 used for allocating the
 memory, the second gets attached to the TypeInfo and is used by the GC
 to free
 the object.

 Then it's just GC but with an extra complication.

IMHO, not really, as the place you get the memory from is not managed by
the GC, or at least not
directly. The GC algorithm would see that there is a "free delegate"
attached to the object and would
use this to free the memory.

The same should hold true for calling GC.free.

Or are you talking about ref counting and such?

Jun 26 2013

Marco Leise <Marco.Leise gmx.de> writes:

Am Wed, 26 Jun 2013 16:30:50 +0200
schrieb Robert Schadek <realburner gmx.de>:

 
 Imagine we have two delegates:

 void* delegate(size_t);  // this one allocs
 void delegate(void*);    // this one frees

 You pass both to a function that constructs your object. The first is
 used for allocating the
 memory, the second gets attached to the TypeInfo and is used by the GC
 to free
 the object.



Does it mean 16 extra bytes for every allocation ?

-- 
Marco

Jun 26 2013

Robert Schadek <realburner gmx.de> writes:

On 06/26/2013 10:06 PM, Marco Leise wrote:
 Does it mean 16 extra bytes for every allocation ?

Yes, or wrap it and you have 4 or 8 bytes, but yes, you would have to
save it somewhere.

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:
 I know Andrei mentioned he was going to work on Allocators a year
 ago. At DConf 2013 he described the problems he needs to solve with
 Allocators. But I wonder if I am missing the discussion around that
 - I tried searching this forum and found a few threads, but none of
 them was actually a brainstorm on Allocators design.
 
 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?

That would be nice to get things going. :)

Ever since I found D and subscribed to this mailing list, I've been
hearing rumors of allocators, but they seem to be rather lacking in the
department of concrete evidence. They're like the Big Foot or Swamp Ape
of D. Maybe it's time we got out into the field and produced some real
evidence of these mythical beasts. :-P


 The easiest approach for Allocators design I can imagine would be to
 let user specify which Allocator operator new should get the memory
 from (introducing a new keyword allocator). This gives a total
 control, but assumes user knows what he is doing.
 
 Example:
 
 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use CustomAllocator::malloc()
   auto b = new B;
 
   free(a); // that should call CustomAllocator::free()
   // if free() is missing for an allocated area, it is the user's
   // responsibility to make sure the custom allocator can handle that
 }
 
 By default allocator is the druntime using GC, free(a) does nothing
 for it.

I believe the current direction is to avoid needing new language
features / syntax. So the above probably won't happen.


 if some library defines its allocator (e.g. specialized container),
 there should be ability to:
 1. override allocator
 2. get access to the allocator used
 
 I understand that I spent 5 mins thinking about the way Allocators
 may look.
 My point is - if somebody is working on it, can you please share
 your ideas?

Well, thanks for getting the ball rolling. Maybe Andrei can pipe up
about any experimental designs he's currently considering.

But barring that, I'm thinking about how allocators would be used in
user code. I think it's pretty much a given that the C++ way of sticking
it to the end of template arguments doesn't really fly: it's just too
much of a hassle to keep passing allocators around in template
arguments, so people just don't bother. So coming back to
square one, how would allocators be used?

1) Usually, the user would just be content with the GC, and not ever
have to worry about allocators. So this means that whatever allocator
design we adopt, it should be practically invisible to ordinary users
unless they're specifically looking to change how memory is allocated.

2) Furthermore, it's unlikely that in the same piece of code, you'd want
to use 3 or 4 different allocators for different objects; while such
cases may exist, it seems to me to be more likely that you want either
(a) a very specific object (say a class instance or container) to use a
particular allocator, or (b) you want to transitively block off an
entire section of code (which may be the entire program in some cases)
to use a particular allocator.

As a first stab at it, I'd say (a) can be implemented by a static class
member reference to an allocator, that can be set from user code.

And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and use
scope guards to revert them back to the values they were before. This
allows us to use the runtime stack to manage which allocator is
currently active. This lets *all* memory allocations be rerouted through
the custom allocator without needing to hand-edit every call to new down
the call graph.

This is just a very crude first stab at the problem, though. In
particular, (a) isn't very satisfactory. And also the interaction of
allocated objects with the call stack: if any custom-allocated objects
in (b) survive past the containing function which sets/resets the
function pointers, there could be problems: if a member function of such
an object needs to allocate memory, it will pick up the ambient
allocator instead of the custom allocator in effect when the object was
first created. Also, we may have the problem of the wrong allocator
being used to free the object.
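
A minimal sketch of (b), assuming a thread-local function pointer stands in for the runtime's allocation hook. The names currentAlloc, gcAlloc, arenaAlloc and withAllocator are hypothetical; druntime's real gc_alloc / gc_free entry points are not overridable like this.

```d
import core.memory : GC;

alias AllocFn = void* function(size_t);

void* gcAlloc(size_t n) { return GC.malloc(n); }

// Module-level variables are thread-local by default in D.
AllocFn currentAlloc = &gcAlloc;

// Toy bump allocator over a fixed thread-local buffer (alignment is ignored).
ubyte[1024] arena;
size_t arenaUsed;

void* arenaAlloc(size_t n)
{
    assert(arenaUsed + n <= arena.length, "arena exhausted");
    auto p = arena.ptr + arenaUsed;
    arenaUsed += n;
    return p;
}

// Swaps the hook for the duration of dg and reverts it with a scope guard.
void withAllocator(AllocFn replacement, scope void delegate() dg)
{
    auto saved = currentAlloc;
    currentAlloc = replacement;
    scope (exit) currentAlloc = saved; // restored even if dg throws
    dg();
}

unittest
{
    immutable before = arenaUsed;
    assert(currentAlloc is &gcAlloc);
    withAllocator(&arenaAlloc, {
        auto p = currentAlloc(16); // routed to the arena
        assert(p is cast(void*) (arena.ptr + before));
    });
    assert(currentAlloc is &gcAlloc);
    assert(arenaUsed == before + 16);
}
```

The sketch also makes the hazard visible: anything allocated inside the delegate that escapes it keeps pointing into the arena, and later allocations or frees on that object's behalf go through whichever hook is active at that time.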

Does anyone have better ideas?


T

-- 
All problems are easy in retrospect.

Jun 25 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:

 That would be nice to get things going. :)

 Ever since I found D and subscribed to this mailing list, I've 
 been
 hearing rumors of allocators, but they seem to be rather 
 lacking in the
 department of concrete evidence. They're like the Big Foot or 
 Swamp Ape
 of D. Maybe it's time we got out into the field and produced 
 some real
 evidence of these mythical beasts. :-P

 Well, thanks for getting the ball rolling. Maybe Andrei can 
 pipe up
 about any experimental designs he's currently considering.

 But barring that, I'm thinking about how allocators would be 
 used in
 user code. I think it's pretty much a given that the C++ way of 
 sticking
 it to the end of template arguments doesn't really fly: it's 
 just too
 much of a hassle to keep having to worry about passing 
 allocators around
 template arguments, that people just don't bother. So coming 
 back to
 square one, how would allocators be used?

 1) Usually, the user would just be content with the GC, and not 
 ever
 have to worry about allocators. So this means that whatever 
 allocator
 design we adopt, it should be practically invisible to ordinary 
 users
 unless they're specifically looking to change how memory is 
 allocated.

 2) Furthermore, it's unlikely that in the same piece of code, 
 you'd want
 to use 3 or 4 different allocators for different objects; while 
 such
 cases may exist, it seems to me to be more likely that you want 
 either
 (a) a very specific object (say a class instance or container) 
 to use a
 particular allocator, or (b) you want to transitively block off 
 an
 entire section of code (which may be the entire program in some 
 cases)
 to use a particular allocator.

 As a first stab at it, I'd say (a) can be implemented by a 
 static class
 member reference to an allocator, that can be set from user 
 code.

 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their 
 values and use
 scope guards to revert them back to the values they were 
 before. This
 allows us to use the runtime stack to manage which allocator is
 currently active. This lets *all* memory allocations be 
 rerouted through
 the custom allocator without needing to hand-edit every call to 
 new down
 the call graph.

 This is just a very crude first stab at the problem, though. In
 particular, (a) isn't very satisfactory. And also the 
 interaction of
 allocated objects with the call stack: if any custom-allocated 
 objects
 in (b) survive past the containing function which sets/resets 
 the
 function pointers, there could be problems: if a member 
 function of such
 an object needs to allocate memory, it will pick up the ambient
 allocator instead of the custom allocator in effect when the 
 object was
 first created. Also, we may have the problem of the wrong 
 allocator
 being used to free the object.

 Anyone has better ideas?


 T

 From my experience all objects may be divided into 2 categories:
1. Temporaries. Programs usually have some kind of event loop. 
During one iteration of this loop some temporary objects are 
created and then discarded. This is the ideal case for a stack (or 
region, or arena) allocator: you define the allocator at the 
beginning of the loop cycle, use it for all temporaries, then free 
all the memory in one go at the end of the iteration.
2. Containers. The program receives an event from the outside and 
puts some data into a container OR updates the data if the record 
already exists.
The important thing here is that when updating the data in a 
container, you may want to resize the existing area.

If you are working with a temporary that should be placed into a 
container, a copy can be made (with the corresponding memory 
allocation from the container's allocator).

Not sure if there is anything better than a stack/arena allocator 
for the first category. For the second category the user should be 
able to choose the default GC or more precise memory handling (e.g. 
explicit malloc/free for resizing).

Anything I am missing in this categorization?

So even if we only get allocators that let us deal with 
temporaries, that will be a huge benefit.
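
For the temporaries category, a minimal sketch of such an allocator, assuming a fixed-capacity buffer that is reset once per loop iteration. Region, makeArray and reset are hypothetical names, and the buffer comes from the GC only to keep the example short.

```d
// Hypothetical region ("arena") allocator for per-iteration temporaries.
struct Region
{
    private ubyte[] buffer;
    private size_t used;

    this(size_t capacity) { buffer = new ubyte[capacity]; }

    // Hands out n elements of T from the region; there is no per-object free.
    // The elements are not initialized to T.init.
    T[] makeArray(T)(size_t n)
    {
        // round the start offset up to T's alignment
        immutable start = (used + T.alignof - 1) & ~(T.alignof - 1);
        immutable end = start + n * T.sizeof;
        assert(end <= buffer.length, "region exhausted");
        used = end;
        return (cast(T*) (buffer.ptr + start))[0 .. n];
    }

    // Frees every temporary at once; slices handed out earlier become invalid.
    void reset() { used = 0; }

    size_t bytesUsed() const { return used; }
}

unittest
{
    auto temporaries = Region(4096);
    foreach (event; 0 .. 3) // stand-in for the event loop
    {
        auto scratch = temporaries.makeArray!int(100);
        scratch[] = event;
        assert(temporaries.bytesUsed == 100 * int.sizeof);
        temporaries.reset(); // end of iteration: everything is freed in one go
    }
    assert(temporaries.bytesUsed == 0);
}
```

Anything that must outlive the iteration has to be copied out before reset(), which matches the copy-into-the-container step described above.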

Jun 25 2013

"bearophile" <bearophileHUGS lycos.com> writes:

cybervadim:

 From my experience all objects may be divided into 2 categories:
 1. Temporaries. Programs usually have some kind of event loop. 
 During one iteration of this loop some temporary objects are 
 created and then discarded. This is the ideal case for a stack (or 
 region, or arena) allocator: you define the allocator at the 
 beginning of the loop cycle, use it for all temporaries, then free 
 all the memory in one go at the end of the iteration.
 2. Containers. The program receives an event from the outside and 
 puts some data into a container OR updates the data if the record 
 already exists.
 The important thing here is that when updating the data in a 
 container, you may want to resize the existing area.

Many garbage collectors use the same idea (and manage it 
automatically), with two or three different generations:

http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

Bye,
bearophile

Jun 25 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

 Many garbage collectors use the same idea (and manage it 
 automatically), with two or three different generations:

 http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

 Bye,
 bearophile

The problem with the GC is that it doesn't know which objects are 
temporary and which are not, so it has to traverse the tree to 
determine that. Allocators in my opinion should let the user 
specify the temporaries explicitly.

Jun 25 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their 
 values and use scope guards to revert them back to the values 
 they were before.

Yea, I was thinking this might be a way to go. You'd have a 
global (well, thread-local) allocator instance that can be set 
and reset through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
      do whatever here
});


or

{
    ChangeAllocator!my_alloc dummy;

    do whatever here
} // dummy's destructor ends the allocator scope


I think the former is a bit nicer, since the dummy variable is a 
bit silly. We'd hope that delegate can be inlined.
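
A sketch of the RAII variant, under the same assumption of a thread-local function pointer as the allocation hook (currentAlloc, gcAlloc and mallocAlloc are hypothetical names). One wrinkle: D structs cannot have a user-defined default constructor, so as far as I can tell a bare `ChangeAllocator!my_alloc dummy;` could not perform the swap on declaration. The sketch takes the replacement as a constructor argument instead.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

alias AllocFn = void* function(size_t);

void* gcAlloc(size_t n)     { return GC.malloc(n); }
void* mallocAlloc(size_t n) { return malloc(n); }

AllocFn currentAlloc = &gcAlloc; // thread-local by default

// RAII guard: the constructor swaps the hook, the destructor swaps it back.
struct ChangeAllocator
{
    private AllocFn saved;

    @disable this();      // a guard that swaps nothing would be a silent bug
    @disable this(this);  // a copy would restore the old hook twice

    this(AllocFn replacement)
    {
        saved = currentAlloc;
        currentAlloc = replacement;
    }

    ~this() { currentAlloc = saved; }
}

unittest
{
    {
        auto guard = ChangeAllocator(&mallocAlloc);
        assert(currentAlloc is &mallocAlloc);
        free(currentAlloc(32)); // allocated and released through malloc/free
    } // guard's destructor ends the allocator scope
    assert(currentAlloc is &gcAlloc);
}
```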



But, the template still has a big advantage: you can change the 
type. And I think that is potentially enormously useful.



Another question is how to tie into output ranges. Take 
std.conv.to.

auto s = to!string(10); // currently, this hits the gc

What if I want it to go on a stack buffer? One option would be to 
rewrite it to use an output range, and then call it like:

char[20] buffer;
// it returns the slice of the buffer it actually used
auto s = to!string(10, buffer);

(and we can do overloads so to!string(10, radix) still works, as 
well as to!string(10, radix, buffer). Hassle, I know...)

Naturally, the default argument is to use the 'global' allocator, 
whatever that is, which does nothing special.



The fun part is the output range works for that, and could also 
work for something like this:

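import core.stdc.stdlib : realloc;

struct malloced_string {
     char* ptr;
     size_t length;
     size_t capacity;
     void put(char c) {
         if(length >= capacity) {
            // grow geometrically; start at 16 so that 0 * 2 never stays 0
            capacity = capacity ? capacity * 2 : 16;
            ptr = cast(char*) realloc(ptr, capacity);
         }
         ptr[length++] = c;
     }

     char[] slice() { return ptr[0 .. length]; }
     alias slice this;
     mixin RefCounted!this; // pretend this works
}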


{
    malloced_string str;
    auto got = to!string(10, str);
} // str is out of scope, so it gets free()'d

Unsafe though: if you stored a copy of got somewhere, it is now a 
pointer to freed memory. I'd kinda like language support of some 
sort to help mitigate that though, like being a borrowed pointer 
that isn't allowed to be stored, but that's another discussion.


And that should work. So then what we might do is provide these 
little output range wrappers for various allocators, and use them 
on many functions.

So we'd write:

import std.allocators;
import std.range;

// mallocator is provided in std.allocators and offers the goods
OutputRange!(char, mallocator) str;

auto got = to!string(10, str);



What's nice here is the output range is useful for more than just 
allocators. You could also to!string(10, my_file) or a delegate, 
blah blah blah. So it isn't too much of a burden, it is something 
you might naturally use anyway.
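
For comparison, current Phobos already allows something close to this through std.format rather than std.conv.to. A small sketch, where FixedSink is a hypothetical output range written for this example and sformat / formattedWrite are the existing std.format functions:

```d
import std.format : formattedWrite, sformat;

// Hypothetical output range over caller-owned storage (stack, malloc, ...).
struct FixedSink
{
    char[] storage;
    size_t used;

    void put(char c) { storage[used++] = c; } // bounds-checked by the slice
    void put(const(char)[] s) { foreach (c; s) put(c); }

    const(char)[] data() const { return storage[0 .. used]; }
}

unittest
{
    // Formatting into a stack buffer: sformat returns the slice it used.
    char[20] buffer;
    char[] s = sformat(buffer[], "%d", 10);
    assert(s == "10");
    assert(s.ptr is buffer.ptr); // no GC allocation for the result

    // The same call shape works with any output range.
    char[32] backing;
    auto sink = FixedSink(backing[]);
    formattedWrite(sink, "%d-%s", 42, "ok");
    assert(sink.data == "42-ok");
}
```

The sink decides where the characters go, so the same formatting code serves a stack buffer, a malloc-backed string or a file.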

 Also, we may have the problem of the wrong allocator
 being used to free the object.

Another reason why encoding the allocator into the type is so 
nice. For the minimal D I've been playing with, the idea I'm 
running with is all allocated memory has some kind of special 
type, and then naked pointers are always assumed to be borrowed, 
so you should never store or free them.

auto foo = HeapArray!char(capacity);

void bar(char[] lol){}

bar(foo); // allowed, foo has an alias this on slice

// but....

struct A {
    // not allowed, because you don't know when lol is going to be freed
    char[] lol;
}


foo frees itself with refcounting.
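
A minimal sketch of what such a HeapArray could look like, assuming a hand-rolled reference count, malloc-family storage and plain-data element types. This is not the minimal-D implementation mentioned above, only an illustration of the owning-type-plus-borrowed-slice split; the rule that slices must not be stored is not enforced here.

```d
import core.stdc.stdlib : calloc, free;

struct HeapArray(T)
{
    private T* ptr;
    private size_t len;
    private size_t* refs; // shared count, also malloc-allocated

    // n must be greater than zero in this sketch.
    this(size_t n)
    {
        ptr = cast(T*) calloc(n, T.sizeof);
        refs = cast(size_t*) calloc(1, size_t.sizeof);
        assert(ptr !is null && refs !is null, "out of memory");
        len = n;
        *refs = 1;
    }

    this(this) { if (refs) ++*refs; } // a copy shares the storage

    ~this()
    {
        if (refs && --*refs == 0) { free(ptr); free(refs); }
    }

    size_t refCount() const { return refs ? *refs : 0; }

    // Borrowed view: callers may use it but should not store or free it.
    T[] slice() { return ptr[0 .. len]; }
    alias slice this;
}

// Takes a plain slice, so it works with any owning container.
void fill(char[] s) { s[] = 'x'; }

unittest
{
    auto foo = HeapArray!char(4);
    fill(foo); // allowed: alias this yields the slice
    assert(foo.slice == "xxxx");
    {
        auto copy = foo; // postblit bumps the count
        assert(foo.refCount == 2);
    } // copy is destroyed, count drops back
    assert(foo.refCount == 1);
}
```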

Jun 25 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

I was just quickly skimming some criticism of C++ allocators, 
since my thought here is similar to what they do. On one hand, 
maybe D can do it right by tweaking C++'s design rather than 
discarding it.

On the other hand, with all the C++ I've done, I have never 
actually used STL allocators, which could say something about me 
or could say something about them.


One thing I saw said making the differently allocated object a 
different type sucks. ...but must it? The complaint there was "so 
much for just doing a function that takes a std::string". But, 
the way I'd want to do it in D is the function would take a 
char[] instead, and our special allocated type provides that via 
opSlice and/or alias this.

So you'd only have to worry about the different type if you 
intend to take ownership of the container yourself. Which we 
already kinda think about in D: if you store a char[], someone 
else could overwrite it, so we prefer to store an 
immutable(char)[] aka string. If you're given a char[] and want 
to store it, you might idup. So I don't think doing a private 
copy with some other allocation scheme is any more of a hassle.

(BTW immutable objects IMO should *always* be garbage collected, 
because part of immutability is infinite lifetime. So we might 
want to be careful with implicit conversions to immutable based 
on allocation method, which I believe we can protect through 
member functions.)


Anyway, bottom line is I don't think that criticism necessarily 
applies to D. But there's surely many others and I'm more or less 
a n00b re c++'s allocators so idk yet.

Jun 25 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 05:24, Adam D. Ruppe wrote:
 I was just quickly skimming some criticism of C++ allocators, since my
 thought here is similar to what they do. On one hand, maybe D can do it
 right by tweaking C++'s design rather than discarding it.

Criticisms are:

A) Allocators were defined to not have any state (as noted in the standard).
B) They are parametrized on type (T), yet a container that is parametrized 
on one may need to allocate something else completely (a node with T).
C) Containers are parametrized on allocators, so e.g. 2 lists with 
different allocators are incompatible in the sense that you can't 
splice pieces of them together.

From the above, IMHO, we can deduce that we
a) Should support stateful allocators, but we have to make sure we don't 
pay storage space for stateless ones (global ones, e.g. mallocator).
b) Should preferably keep allocators typeless and let containers define 
what they allocate.
c) Have a problem that is hardly solvable unless we require a way to 
reassign objects between allocators (at least of similar kinds).
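
Points a) and b) can be sketched as follows, assuming an untyped allocate/deallocate interface over void[]. Mallocator, Region, hasState and Buffer are hypothetical names for this example; the container stores a pointer only when the allocator actually has state.

```d
import core.stdc.stdlib : free, malloc;

// Stateless: everything goes through one shared instance.
struct Mallocator
{
    static Mallocator instance;

    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }
    void deallocate(void[] b) { free(b.ptr); }
}

// Stateful: a bump allocator over a caller-supplied buffer.
struct Region
{
    ubyte[] buffer;
    size_t used;

    void[] allocate(size_t n)
    {
        if (used + n > buffer.length) return null;
        auto b = buffer[used .. used + n];
        used += n;
        return b;
    }
    void deallocate(void[] b) { /* released all at once with the buffer */ }
}

// An allocator has state if it has any non-static fields.
enum hasState(A) = A.tupleof.length != 0;

// The allocator is typeless; the container decides what it allocates.
struct Buffer(Alloc)
{
    static if (hasState!Alloc)
        Alloc* alloc;                  // one pointer for stateful allocators
    else
        alias alloc = Alloc.instance;  // no per-container storage at all

    void[] data;

    void reserve(size_t n) { data = alloc.allocate(n); }
    void release() { alloc.deallocate(data); data = null; }
}

static assert(Buffer!Mallocator.sizeof == (void[]).sizeof);
static assert(Buffer!Region.sizeof == (void[]).sizeof + (void*).sizeof);

unittest
{
    Buffer!Mallocator a;
    a.reserve(64);
    assert(a.data.length == 64);
    a.release();

    ubyte[128] storage;
    auto region = Region(storage[]);
    auto b = Buffer!Region(&region);
    b.reserve(32);
    assert(b.data.ptr is cast(void*) storage.ptr && region.used == 32);
}
```

Point c) is untouched by this: two Buffer instances with different allocator types are still different types.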

 Anyway, bottom line is I don't think that criticism necessarily applies
 to D. But there's surely many others and I'm more or less a n00b re
 c++'s allocators so idk yet.


-- 
Dmitry Olshansky

Jun 26 2013

Jacob Carlborg <doob me.com> writes:

On 2013-06-26 01:16, Adam D. Ruppe wrote:

 You'd want it to be RAII or delegate based, so the scope is clear.

 with_allocator(my_alloc, {
       do whatever here
 });


 or

 {
     ChangeAllocator!my_alloc dummy;

     do whatever here
 } // dummy's destructor ends the allocator scope


 I think the former is a bit nicer, since the dummy variable is a bit
 silly. We'd hope that delegate can be inlined.

It won't be inlined. You would need to make it a template parameter to 
have it inlined.

-- 
/Jacob Carlborg

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 03:16, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their values and
 use scope guards to revert them back to the values they were before.

 Yea, I was thinking this might be a way to go. You'd have a global
 (well, thread-local) allocator instance that can be set and reset
 through stack calls.

 You'd want it to be RAII or delegate based, so the scope is clear.

 with_allocator(my_alloc, {
       do whatever here
 });


 or

 {
     ChangeAllocator!my_alloc dummy;

     do whatever here
 } // dummy's destructor ends the allocator scope

Both suffer from
a) being totally unsafe and in fact bug prone since all references 
obtained in there are now dangling (and there is no indication where 
they came from)
b) imagine you need to use an allocator for a stateful object. Say 
forward range of some other ranges (e.g. std.regex) both scoped/stacked 
to allocate its internal stuff. 2nd one may handle it but not the 1st one.
c) transfer of objects allocated differently up the call graph (scope 
graph?), is pretty much neglected I see.

I'm kind of wondering how our knowledgeable community has come to this.
(We must have been starving w/o allocators for way too long.)

 {
     malloced_string str;
     auto got = to!string(10, str);
 } // str is out of scope, so it gets free()'d. unsafe though: if you
 stored a copy of got somewhere, it is now a pointer to freed memory. I'd
 kinda like language support of some sort to help mitigate that though,
 like being a borrowed pointer that isn't allowed to be stored, but
 that's another discussion.

In contrast 'container as an output range' works both safely and would 
be still customizable.
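
A rough sketch of what "container as an output range" could look like,
under stated assumptions: MallocString and writeDecimal are hypothetical
names invented here, standing in for the malloced_string / to!string pair
quoted above. The algorithm only knows about put(); the container owns the
memory and decides where it comes from:

```d
import core.stdc.stdlib : realloc, free;
import std.range.primitives : isOutputRange;

// Hypothetical malloc-backed character buffer (not a Phobos type).
struct MallocString
{
    private char* ptr;
    private size_t len, cap;

    @disable this(this);        // single owner: no accidental copies
    ~this() { free(ptr); }      // free(null) is a no-op

    // Output-range primitive for one character.
    void put(char c)
    {
        if (len == cap)
        {
            cap = cap ? cap * 2 : 16;
            ptr = cast(char*) realloc(ptr, cap);
            assert(ptr !is null, "out of memory");
        }
        ptr[len++] = c;
    }

    // Convenience overload for whole slices.
    void put(const(char)[] s) { foreach (c; s) put(c); }

    // Borrowed view; valid only while the buffer is alive and unmodified.
    const(char)[] data() const { return ptr[0 .. len]; }
}

static assert(isOutputRange!(MallocString, char));

// The "algorithm": allocator-unaware, it just writes into the sink.
void writeDecimal(Sink)(ref Sink sink, uint value)
{
    char[10] digits;            // uint.max has 10 decimal digits
    size_t i = digits.length;
    do
    {
        digits[--i] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    sink.put(digits[i .. $]);
}

void main()
{
    MallocString str;
    writeDecimal(str, 10);
    assert(str.data == "10");
}   // str's destructor frees the buffer here
```

Note that data() still hands out a plain slice, so the escaping problem
discussed in this thread is not solved by the sketch, only localized.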

IMHO the only place for allocators is in containers; other kinds of code 
may just ignore allocators completely.

std.algorithm and friends should imho be customized on 2 things only:

a) containers to use (instead of array)
b) optionally a memory source (or allocator) if the container is 
temporary (scoped), to tie its lifetime to something.

Want temporary stuff? Use temporary arrays, hashmaps and whatnot i.e. 
types tailored for a particular use case (e.g. with a temporary/scoped 
allocator in mind).
These would all be unsafe though. An alternative is ref-counted pointers 
to an allocator. With the word on the street about ARC, it could be a 
nice direction to pursue.

Allocators (as Andrei points out in his video) have many kinds:
a) persistence: infinite, manual, scoped
b) size: unlimited vs fixed
c) block-size: any, fixed, or *any* up to some maximum size

Most of these ARE NOT interchangeable!
Yet some are composable; however, I'd argue that allocators are not 
composable but have some reusable parts that in turn are composable.

Code would still have to cater for specific flavors of allocators, so 
we'd better reduce this problem to the selection of containers.

-- 
Dmitry Olshansky

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 06:51:54PM +0400, Dmitry Olshansky wrote:
 26-Jun-2013 03:16, Adam D. Ruppe wrote:
On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and
use scope guards to revert them back to the values they were before.

Yea, I was thinking this might be a way to go. You'd have a global
(well, thread-local) allocator instance that can be set and reset
through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
      do whatever here
});


or

{
    ChangeAllocator!my_alloc dummy;

    do whatever here
} // dummy's destructor ends the allocator scope

 
 Both suffer from
 a) being totally unsafe and in fact bug prone since all references
 obtained in there are now dangling (and there is no indication where
 they came from)

How is this different from using malloc() and free() manually? You have
no indication of where a void* came from either, and the danger of
dangling references is very real, as any C/C++ coder knows. And I assume
that *some* people will want to be defining custom allocators that wrap
around malloc/free (e.g. the game engine guys who want total control).


 b) imagine you need to use an allocator for a stateful object. Say
 forward range of some other ranges (e.g. std.regex) both
 scoped/stacked to allocate its internal stuff. 2nd one may handle it
 but not the 1st one.

Yeah this is a complicated area. A container basically needs to know how
to allocate its elements. So somehow that information has to be
somewhere.


 c) transfer of objects allocated differently up the call graph
 (scope graph?), is pretty much neglected I see.

They're incompatible. You can't safely make a linked list that contains
both GC-allocated nodes and malloc() nodes. That's just a bomb waiting
to explode in your face. So in that sense, Adam's idea of using a
different type for differently-allocated objects makes sense. A
container has to declare what kind of allocation its members are using;
any other way is asking for trouble.


 I'm kind of wondering how our knowledgeable community has come to this.
 (must have been starving w/o allocators way too long)

We're just trying to provoke Andrei into responding. ;-)


[...]
 IMHO the only place for allocators is in containers other kinds of
 code may just ignore allocators completely.

But some people clamoring for allocators are doing so because they're
bothered by Phobos using ~ for string concatenation, which implicitly
uses the GC. I don't think we can just ignore that.


 std.algorithm and friends should imho be customized on 2 things only:
 
 a) containers to use (instead of array)
 b) optionally a memory source (or allocator) if the container is
 temporary (scoped), to tie its lifetime to something.
 
 Want temporary stuff? Use temporary arrays, hashmaps and whatnot
 i.e. types tailored for a particular use case (e.g. with a
 temporary/scoped allocator in mind).
 These would all be unsafe though. Alternative is ref-counting
 pointers to an allocator. With word on street about ARC it could be
 nice direction to pursue.

Ref-counting is not fool-proof, though. There's always cycles to mess
things up.


 Allocators (as Andrei points out in his video) have many kinds:
 a) persistence: infinite, manual, scoped
 b) size: unlimited vs fixed
 c) block-size: any, fixed, or *any* up to some maximum size
 
 Most of these ARE NOT interchangeable!
 Yet some are composable however I'd argue that allocators are not
 composable but have some reusable parts that in turn are composable.

I was listening to Andrei's talk this morning, but I didn't quite
understand what he means by composable allocators. Is he talking about
nesting, say, a GC inside a region allocated by a region allocator?


 Code would still have to cater for specific flavors of allocators,
 so we'd better reduce this problem to the selection of containers.

[...]

Hmm. Sounds like we have two conflicting things going on here:

1) En masse replacement of gc_alloc/gc_free in a certain block of code
(which may be the entire program), e.g., for the avoidance of GC in game
engines, etc. Basically, the code is allocator-agnostic, but at some
higher level we want to control which allocator is being used.

2) Specific customization of containers, etc., as to which allocator(s)
should be used, with (hopefully) some kind of support from the type
system to prevent mistakes like dangling pointers, escaping references,
etc. Here, the code is NOT allocator-agnostic; it has to be written
with the specific allocation model in mind. You can't just replace the
allocator with another one without introducing bugs or problems.

These two may interact in complex ways... e.g., you might want to use
malloc to allocate a pool, then use a custom gc_alloc/gc_free to
allocate from this pool in order to support language built-ins like ~
and ~= without needing to rewrite every function that uses strings.

Maybe we should stop conflating these two things so that we stop
confusing ourselves, and hopefully it will be easier to analyse
afterwards.


T

-- 
You have to expect the unexpected. -- RL

Jun 26 2013

"Brian Rogoff" <brogoff gmail.com> writes:

On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
 I was listening to Andrei's talk this morning, but I didn't 
 quite
 understand what he means by composable allocators. Is he 
 talking about
 nesting, say, a GC inside a region allocated by a region 
 allocator?

Maybe he was talking about a freelist allocator over a reap, as
described by the HeapLayers project http://heaplayers.org/ in the
paper from 2001 titled 'Composing High-Performance Memory
Allocators'. I'm pretty sure that web site was referenced in the
talk. A few publications there are from Andrei.

I agree that D should support programming without a GC, with
different GCs than the default one, and custom allocators, and
that features which demand a GC will be troublesome.

-- Brian

Jun 26 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
 malloc to allocate a pool, then use a custom gc_alloc/gc_free to
 allocate from this pool in order to support language built-ins 
 like ~ and ~= without needing to rewrite every function that 
 uses strings.

Blargh, I forgot about operator ~ on built ins. For custom types 
it is easy enough to manage, just overload it. You can even do ~= 
on types that aren't allowed to allocate, if they have a certain 
capacity set up ahead of time (like a stack buffer)

But for built ins, blargh, I don't even think we can hint on them 
to the gc. Maybe we should just go ahead and make the gc 
generational. (If you aren't using gc, I say leave binary ~ 
unimplemented in all cases. Use ~= on a temporary instead 
whenever you would do that. It is easier to follow the lifetime 
if you explicitly declare your temporary.)
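
For the custom-type case, a minimal sketch of a buffer that supports ~=
without ever allocating (StackBuffer is a hypothetical name, not an
existing library type). It deliberately leaves binary ~ unimplemented, as
suggested above:

```d
// Hypothetical fixed-capacity buffer: the storage lives inside the struct,
// so appending never touches the GC or malloc.
struct StackBuffer(T, size_t capacity)
{
    private T[capacity] store;
    private size_t len;

    // Handles both `buf ~= element` and `buf ~= slice`.
    ref StackBuffer opOpAssign(string op, U)(U rhs) if (op == "~")
    {
        static if (is(U : const(T)[]))
        {
            // Appending a whole slice.
            assert(rhs.length <= capacity - len, "StackBuffer is full");
            store[len .. len + rhs.length] = rhs[];
            len += rhs.length;
        }
        else
        {
            // Appending a single element.
            assert(len < capacity, "StackBuffer is full");
            store[len++] = rhs;
        }
        return this;
    }

    // Borrowed view into the buffer's own storage.
    inout(T)[] opSlice() inout { return store[0 .. len]; }
}

void main()
{
    StackBuffer!(char, 32) buf;     // the explicitly declared temporary
    buf ~= "hello";
    buf ~= ' ';
    buf ~= "world";
    assert(buf[] == "hello world");

    // No opBinary is defined, so binary ~ does not compile.
    static assert(!__traits(compiles, buf ~ "x"));
}
```

Overflow is only caught by an assert here; a real type would have to pick
a policy for a full buffer (error, truncate, or fall back to the heap).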

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 21:23, H. S. Teoh wrote:

 Both suffer from
 a) being totally unsafe and in fact bug prone since all references
 obtained in there are now dangling (and there is no indication where
 they came from)

 How is this different from using malloc() and free() manually? You have
 no indication of where a void* came from either, and the danger of
 dangling references is very real, as any C/C++ coder knows. And I assume
 that *some* people will want to be defining custom allocators that wrap
 around malloc/free (e.g. the game engine guys who want total control).

Why the heck do you people think I propose to use malloc directly as 
an alternative to whatever hackish allocator stack was proposed?

Use the darn container. For starters I'd make the allocation strategy a 
parameter of each container. At least they do OWN memory.

Then refactor out common pieces into a framework of allocation helpers. 
In the end I'd personally separate concerns into 3 entities:

1. Memory area objects - think of them as allocators but without the 
circuitry to do the allocation, e.g. a chunk of memory returned by 
malloc/alloca can be wrapped into a memory area object.

2. Allocators (Policies) - a potentially nested combination of such 
"circuitry" that makes use of memory areas. Free-lists, pools, stacks 
etc. Safe ones have ref-counting on memory areas, unsafe ones don't. 
(Though safety largely depends on the way you got that chunk of memory)

3. Containers/Wrappers - as above, objects that handle the life-cycle of 
objects and make use of allocators. In fact allocators are part of the
type, but memory area objects are not.
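
One possible reading of this three-way split, as a sketch only: MemoryArea,
BumpPolicy and Array are hypothetical names, and the bump ("stack") policy
is just the simplest strategy to show. The policy is a template parameter
of the container, while the memory area is supplied at run time:

```d
import core.stdc.stdlib : malloc, free;

// 1. Memory area: a chunk of bytes with no allocation logic of its own.
struct MemoryArea
{
    void[] bytes;
}

// 2. Policy: the "circuitry". Here a bump-pointer strategy that carves
//    blocks out of whatever memory area it is given.
struct BumpPolicy
{
    private MemoryArea* area;
    private size_t used;

    this(MemoryArea* area) { this.area = area; }

    void[] allocate(size_t n)
    {
        enum size_t alignment = 16;   // offset alignment within the area
        immutable start = (used + alignment - 1) & ~(alignment - 1);
        if (start > area.bytes.length || n > area.bytes.length - start)
            return null;              // area exhausted
        used = start + n;
        return area.bytes[start .. used];
    }
}

// 3. Container: the policy is part of the type, the memory area is not.
struct Array(T, Policy)
{
    private Policy* policy;
    private T[] items;
    private size_t len;

    this(Policy* policy, size_t capacity)
    {
        this.policy = policy;
        auto raw = policy.allocate(T.sizeof * capacity);
        assert(raw !is null, "memory area exhausted");
        items = cast(T[]) raw;
    }

    void put(T value)
    {
        assert(len < items.length, "Array is full");
        items[len++] = value;
    }

    inout(T)[] opSlice() inout { return items[0 .. len]; }
}

void main()
{
    void* p = malloc(1024);
    assert(p !is null);
    scope(exit) free(p);

    auto area = MemoryArea(p[0 .. 1024]);
    auto policy = BumpPolicy(&area);

    // Two containers of the same type sharing one policy and one area.
    auto a = Array!(int, BumpPolicy)(&policy, 4);
    auto b = Array!(int, BumpPolicy)(&policy, 4);
    a.put(1); a.put(2); b.put(3);
    assert(a[] == [1, 2]);
    assert(b[] == [3]);
}
```

This is the unsafe flavor: nothing is ref-counted, so the containers must
not outlive the area.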


 b) imagine you need to use an allocator for a stateful object. Say
 forward range of some other ranges (e.g. std.regex) both
 scoped/stacked to allocate its internal stuff. 2nd one may handle it
 but not the 1st one.

 Yeah this is a complicated area. A container basically needs to know how
 to allocate its elements. So somehow that information has to be
 somewhere.


 c) transfer of objects allocated differently up the call graph
 (scope graph?), is pretty much neglected I see.

 They're incompatible. You can't safely make a linked list that contains
 both GC-allocated nodes and malloc() nodes.

What I mean is that if types are the same as built-ins it would be a 
horrible mistake. If not then we are talking about containers anyway.
And if these have a ref-counted pointer to their allocator then the 
whole thing is safe albeit at the cost of performance.

Sadly, alias this to some built-in (e.g. a slice) allows squirreling away 
the underlying reference too easily.

As such I don't believe in either of the 2 *lies*:
a) built-ins can be refurbished to use custom allocators
b) we can add opSlice/alias this or whatever to our custom type to get 
access to the underlying built-ins safely and transparently

Both are just nuclear bombs waiting for a good time to explode.

 That's just a bomb waiting
 to explode in your face. So in that sense, Adam's idea of using a
 different type for differently-allocated objects makes sense.

Yes, but one should be careful here so as not to have an exponential 
explosion in the code size. So some allocators have to be compatible, and 
if there is a way to transfer ownership it'd be bonus points (and a large 
pot of these, mind you).

 A
 container has to declare what kind of allocation its members are using;
 any other way is asking for trouble.

Hence my thoughts to move this piece of "circuitry" to containers 
proper. The whole idea that by swapping malloc with myMalloc you can 
translate to a wildly different allocation scheme doesn't quite hold.

I think it may be interesting to try and put a "wall" in a different 
place, namely in between the allocation strategy and the memory areas it 
works on.


 I'm kind of wondering how our knowledgeable community has come to this.
 (must have been starving w/o allocators way too long)

 We're just trying to provoke Andrei into responding. ;-)

Cool, then keep it coming but ... safety and other holes have to be taken 
care of.

 [...]
 IMHO the only place for allocators is in containers other kinds of
 code may just ignore allocators completely.

 But some people clamoring for allocators are doing so because they're
 bothered by Phobos using ~ for string concatenation, which implicitly
 uses the GC. I don't think we can just ignore that.

~= would work with any sensible array-like container.
~ is sadly only a convenience for scripts and/or apps that are not 
performance (determinism) critical.

 std.algorithm and friends should imho be customized on 2 things only:

 a) containers to use (instead of array)
 b) optionally a memory source (or allocator) if the container is
 temporary (scoped), to tie its lifetime to something.

 Want temporary stuff? Use temporary arrays, hashmaps and whatnot
 i.e. types tailored for a particular use case (e.g. with a
 temporary/scoped allocator in mind).
 These would all be unsafe though. Alternative is ref-counting
 pointers to an allocator. With word on street about ARC it could be
 nice direction to pursue.

 Ref-counting is not fool-proof, though. There's always cycles to mess
 things up.

You surely shouldn't have allocators reference each other cyclically? 
Then I see this as a DAG with the allocator at the bottom and objects 
referencing it.

 Allocators (as Andrei points out in his video) have many kinds:
 a) persistence: infinite, manual, scoped
 b) size: unlimited vs fixed
 c) block-size: any, fixed, or *any* up to some maximum size

 Most of these ARE NOT interchangeable!
 Yet some are composable however I'd argue that allocators are not
 composable but have some reusable parts that in turn are composable.

 I was listening to Andrei's talk this morning, but I didn't quite
 understand what he means by composable allocators. Is he talking about
 nesting, say, a GC inside a region allocated by a region allocator?

I'd say something like: a fixed-size region allocator with GC as fallback.
Or a pool for small allocations + malloc/free with a free-list for bigger 
allocations etc. And the stuff should be as easily composable as I just 
listed.
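
To make that kind of composition concrete, here is a sketch with made-up
names (Region, Mallocator and Fallback are defined right here for
illustration). It swaps the GC fallback for malloc to stay self-contained,
and the only requirement on the primary allocator is an owns() query so
that deallocate can be routed back to the right part:

```d
import core.stdc.stdlib : malloc, free;

// Fixed-size region: bump allocation out of an inline buffer.
// Alignment is ignored to keep the sketch short.
struct Region(size_t size)
{
    private ubyte[size] buffer;
    private size_t used;

    void[] allocate(size_t n)
    {
        if (n > size - used) return null;   // region exhausted
        auto block = buffer[used .. used + n];
        used += n;
        return block;
    }

    // Does this block live inside our buffer?
    bool owns(const void[] b) const
    {
        auto p = cast(const(ubyte)*) b.ptr;
        return p >= buffer.ptr && p < buffer.ptr + size;
    }

    // Individual frees are no-ops; the region goes away as a whole.
    void deallocate(void[] b) { }
}

// Thin wrapper over malloc/free.
struct Mallocator
{
    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    void deallocate(void[] b) { free(b.ptr); }
}

// The composition: try Primary first, fall back to Secondary.
struct Fallback(Primary, Secondary)
{
    Primary primary;
    Secondary secondary;

    void[] allocate(size_t n)
    {
        auto b = primary.allocate(n);
        return b !is null ? b : secondary.allocate(n);
    }

    void deallocate(void[] b)
    {
        if (primary.owns(b)) primary.deallocate(b);
        else secondary.deallocate(b);
    }
}

void main()
{
    Fallback!(Region!64, Mallocator) a;
    auto small = a.allocate(48);    // fits in the region
    auto big = a.allocate(48);      // only 16 bytes left, goes to malloc
    assert(a.primary.owns(small));
    assert(big !is null && !a.primary.owns(big));
    a.deallocate(big);
    a.deallocate(small);
}
```

The composite has the same allocate/deallocate shape as its parts, so it
can itself be nested inside another Fallback.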

 Code would still have to cater for specific flavors of allocators,
 so we'd better reduce this problem to the selection of containers.

 [...]

 Hmm. Sounds like we have two conflicting things going on here:

 1) En masse replacement of gc_alloc/gc_free in a certain block of code
 (which may be the entire program), e.g., for the avoidance of GC in game
 engines, etc.. Basically, the code is allocator-agnostic, but at some
 higher level we want to control which allocator is being used.

There is no allocator agnostic code that allocates. It either happens to 
call free/dispose/destroy manually (implicitly with ref-counts) or it 
does not. It either escapes references to who knows where or doesn't.

 2) Specific customization of containers, etc., as to which allocator(s)
 should be used, with (hopefully) some kind of support from the type
 system to prevent mistakes like dangling pointers, escaping references,
 etc.. Here, the code is NOT allocator-agnostic; it has to be written
 with the specific allocation model in mind. You can't just replace the
 allocator with another one without introducing bugs or problems.

With another one of the same _kind_ I'd say.

 These two may interact in complex ways... e.g., you might want to use
 malloc to allocate a pool, then use a custom gc_alloc/gc_free to
 allocate from this pool in order to support language built-ins like ~
 and ~= without needing to rewrite every function that uses strings.

I guess we have to re-write them. Or don't allocate in string functions.


 Maybe we should stop conflating these two things so that we stop
 confusing ourselves, and hopefully it will be easier to analyse
 afterwards.


-- 
Dmitry Olshansky

Jun 26 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

So to try some ideas, I started implementing a simple container 
with replaceable allocators: a singly linked list.

All was going kinda well until I realized the forward range it 
offers to iterate its contents makes it possible to escape a 
reference to a freed node.

auto range = list.range;
auto range2 = range;
range.removeFront();

range2 now refers to a freed node. Maybe the nodes could be 
refcounted, though a downside there is even the range won't be 
sharable, it would be a different type based on allocation 
method. (I was hoping to make the range be a sharable component, 
even as the list itself changed type with allocators.)

I guess we could @disable copy construction, and make it an 
input range instead of a forward one, but that takes some of the 
legitimate usefulness away.
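
A sketch of that non-copyable variant, assuming malloc'd nodes and a range
that frees each node as it is popped (Node, cons and ConsumingRange are
hypothetical names). With the postblit disabled there can be no second
cursor like range2 left pointing at a freed node:

```d
import core.stdc.stdlib : malloc, free;

struct Node
{
    int payload;
    Node* next;
}

// Prepend a malloc'd node to a list.
Node* cons(int payload, Node* next)
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null, "out of memory");
    n.payload = payload;
    n.next = next;
    return n;
}

// Input range that owns the nodes it walks and frees them as it goes.
struct ConsumingRange
{
    private Node* current;

    @disable this(this);    // no copies, hence no dangling second cursor

    // Free whatever was not consumed.
    ~this() { while (!empty) popFront(); }

    bool empty() const { return current is null; }
    int front() const { return current.payload; }

    void popFront()
    {
        auto dead = current;
        current = current.next;
        free(dead);
    }
}

void main()
{
    auto r = ConsumingRange(cons(1, cons(2, cons(3, null))));

    // The problematic copy from the post is now rejected at compile time.
    static assert(!__traits(compiles, { auto r2 = r; }));

    int sum;
    while (!r.empty)
    {
        sum += r.front;
        r.popFront();
    }
    assert(sum == 6);
}
```

The cost is the one mentioned above: foreach and most range algorithms
take ranges by value, so a range like this has to be driven by hand or
passed by ref.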

Interestingly though, opApply would be ok here, since all it 
would expose is the payload.

(though if the payload is a reference type, does the container 
take ownership of it? How do we indicate that? Perhaps more 
interestingly, how do we indicate the /lack/ of ownership at the 
transfer point?)



This is all fairly easy if we just decide "we're going to do this 
with GC" or "we're going to do this C style" and do the whole 
program like that, libraries and all. But trying to mix and match 
just gets more complicated the more I think about it :( It makes 
the question of "allocators" look trivial.

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Jun 27, 2013 at 12:43:54AM +0200, Adam D. Ruppe wrote:
 So to try some ideas, I started implementing a simple container with
 replaceable allocators: a singly linked list.
 
 All was going kinda well until I realized the forward range it
 offers to iterate its contents makes it possible to escape a
 reference to a freed node.

[...]
 (though if the payload is a reference type, does the container take
 ownership of it? How do we indicate that? Perhaps more interestingly,
 how do we indicate the /lack/ of ownership at the transfer point?)

Maybe a type distinction akin to C++'s auto_ptr might help? Say we
introduce OwnedRef!T vs. plain old T*. So something returning OwnedRef!T
will need to assume ownership of the object, whereas something returning
T* would just be returning a reference, but the container continues to
hold ownership over the object.
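
As a sketch of that distinction (OwnedRef is only the hypothetical name
used above, and Slot is an invented one-element container), the return
type tells the caller whether ownership moves or stays:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical owning pointer. Assumes T is plain data with no destructor.
struct OwnedRef(T)
{
    private T* ptr;

    @disable this(this);            // ownership cannot be duplicated
    ~this() { free(ptr); }          // free(null) is a no-op

    static OwnedRef make(T value)
    {
        auto p = cast(T*) malloc(T.sizeof);
        assert(p !is null, "out of memory");
        *p = value;
        return OwnedRef(p);
    }

    // Non-owning view: the OwnedRef keeps ownership.
    inout(T)* borrow() inout { return ptr; }

    // Ownership transfer: this becomes empty, the result owns the object.
    OwnedRef release()
    {
        auto p = ptr;
        ptr = null;
        return OwnedRef(p);
    }
}

// Invented one-element container to show both kinds of return type.
struct Slot
{
    private OwnedRef!int item;

    int* peek() { return item.borrow(); }           // container still owns it
    OwnedRef!int take() { return item.release(); }  // caller now owns it
}

void main()
{
    auto s = Slot(OwnedRef!int.make(42));

    int* p = s.peek();
    assert(*p == 42);

    auto mine = s.take();
    assert(s.peek() is null);       // the slot gave its object away
    assert(*mine.borrow() == 42);
}   // mine's destructor frees the int; s has nothing left to free
```

Nothing here stops a caller from storing the int* returned by peek(), which
is exactly the hole the rest of this thread is trying to plug.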


 This is all fairly easy if we just decide "we're going to do this
 with GC" or "we're going to do this C style" and do the whole
 program like that, libraries and all. But trying to mix and match
 just gets more complicated the more I think about it :( It makes the
 question of "allocators" look trivial.

Heh. Yeah, I'm starting to wonder if it even makes sense to try to
mix-n-match GC-based and non-GC-based allocators. It seems that maybe we
just have to settle for the fact of life that a GC-based object is
fundamentally incompatible with a pool-allocated object, and both are
also fundamentally incompatible with malloc-allocated objects, 'cos you
need the code to be aware in each instance of what needs to be done to
clean up, etc.


T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge

Jun 26 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:
 Maybe a type distinction akin to C++'s auto_ptr might help?

Yeah, that's what I'm thinking, but I don't really like it. 
Perhaps I'm trying too hard to cover everything, and should be 
happier with just doing what C++ does. Full memory safety is 
prolly out the window anyway.

In std.typecons, there's a Unique!T, but it doesn't look 
complete. A lot of the code is commented out, maybe it was 
started back in the days of bug city.

 Yeah, I'm starting to wonder if it even makes sense to try to 
 mix-n-match GC-based and non-GC-based allocators.

It might not be so bad if we modified D to add a lent storage 
class, or something, similar to some discussions about scope in 
the past.

These would be values you may work with, but never keep; 
assigning them to anything is not allowed and you may only pass 
them to a function or return them from a function if that is also 
marked lent. Any regular reference would be implicitly usable as 
lent.

int* ptr;

void bar(int* a) {
   foo(a); // ok
}

int* foo(lent int* a) {
    bar(a); // error, cannot call bar with lent pointer
    ptr = a; // error, cannot assign lent value to non-lent field
    foo2(a); // ok
    foo(foo2(a)); // ok
    return a; // error, cannot return a lent value
}

lent int* foo2(lent int* a) {
    return a; // ok
}

foo(ptr); // ok (if foo actually compiled)

And finally, if you take the address of a lent reference, that 
itself is lent; &(lent int*) == lent int**.


Then, if possible, it would be cool if:

lent int* a;
{
   int* b;
   a = b;
}


That would be an error, because a outlives b. But since you can't 
store a anywhere, the only time this would happen would be 
something like here. And hell maybe we could hammer around that 
by making lent variables head const and say they must be 
initialized at declaration, so "lent int* a;" is illegal as well 
as "a = b;". But we wouldn't want it transitively const, because 
then:

void fillBuffer(lent char[] buffer) {}

would be disallowed and that is something I would definitely want.



Part of me thinks pure might help with this too.... but eh maybe 
not because even a pure function could in theory escape a 
reference via its other parameters.



But with this kind of thing, we could do a nicer pointer type 
that does:

lent T getThis() { return _this; }
alias getThis this;

and thus implicitly convert our inner pointer to something we can 
use on the outside world with some confidence that they won't 
sneak away any references to it. If combined with @disabling the 
address-of operator on the container itself, we could really lock 
down ownership.

Jun 26 2013

"BLM768" <blm768 gmail.com> writes:

On Wednesday, 26 June 2013 at 23:59:01 UTC, Adam D. Ruppe wrote:
 On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:
 Maybe a type distinction akin to C++'s auto_ptr might help?

 It might not be so bad if we modified D to add a lent storage 
 class, or something, similar to some discussions about scope in 
 the past.

 These would be values you may work with, but never keep; 
 assigning them to anything is not allowed and you may only pass 
 them to a function or return them from a function if that is 
 also marked lent. Any regular reference would be implicitly 
 usable as lent.

Something along those lines would probably be a good solution.

It seems that we're working with three types of objects:

1. Objects that are "owned" by a scope (can be stack-allocated)
2. Objects that are "owned" by another object (C/C++-like 
memory management)
3. Objects that have no single "owner" (GC memory management)

The first two would probably operate under semantics like "lent" 
or "scope", although I'd like to propose an extension to the 
rules: it should be possible to store a weak reference to these 


The third type seems to be pretty much solved, seeing as we have 
a (mostly) working GC.

Something like this might be a nice way to implement it:

class Thing {}


reference


void main() {
   scope Thing t1; //stack-allocated
   doSomething(t1);
   owned Thing t2 = new Thing; //heap-allocated but freed at end of scope
   doSomething(t2);
}

Jun 27 2013

Marco Leise <Marco.Leise gmx.de> writes:

On Thu, 27 Jun 2013 01:59:00 +0200,
"Adam D. Ruppe" <destructionator gmail.com> wrote:

 void fillBuffer(lent char[] buffer) {}
 
 would be disallowed and that is something I would definitely want.

Isn't that what scope is for?

-- 
Marco

Jun 28 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 On Thu, 27 Jun 2013 01:59:00 +0200,
 "Adam D. Ruppe" <destructionator gmail.com> wrote:

 void fillBuffer(lent char[] buffer) {}
 
 would be disallowed and that is something I would definitely 
 want.

 Isn't that what scope is for?

Reading dlang.org makes you guess so, but the official position is 
that 'scope' does not exist, so it is hard to say what it is 
really for.

Jun 28 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?

I don't really know. In practice, it does something else (usually 
nothing, but suppresses heap closure allocation on delegates). 
The DIPs relating to it all talk about returning refs from 
functions and I'm not sure if they relate to the built ins or 
not- I don't think it would quite work for what I have in mind.

Jun 28 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 28 June 2013 at 11:55:46 UTC, Adam D. Ruppe wrote:
 On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?

 I don't really know. In practice, it does something else 
 (usually nothing, but suppresses heap closure allocation on 
 delegates). The DIPs relating to it all talk about returning 
 refs from functions and I'm not sure if they relate to the 
 built ins or not- I don't think it would quite work for what I 
 have in mind.

It is a no-op keyword in the current implementation for everything but 
delegates. DIP speculation was based on 
http://dlang.org/attribute.html#scope and "Parameter Storage 
Classes" in http://dlang.org/function.html but that info is 
obviously outdated.

Jun 28 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, June 28, 2013 13:55:45 Adam D. Ruppe wrote:
 On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
 Isn't that what scope is for?

 
 I don't really know. In practice, it does something else (usually
 nothing, but suppresses heap closure allocation on delegates).
 The DIPs relating to it all talk about returning refs from
 functions and I'm not sure if they relate to the built ins or
 not- I don't think it would quite work for what I have in mind.

Per the spec, all scope is supposed to do is prevent references in a 
parameter from being escaped. To be specific, it says

-------
references in the parameter cannot be escaped (e.g. assigned to a
global variable)
-------

So, in theory, if you had something like

auto foo(scope int[] i) {...}

it would prevent i or anything referring to it from being returned or assigned 
to any variable which will outlive the function call. However, scope currently 
does _nothing_ for anything other than delegates - which is why I think that 
using the in attribute is such an incredibly bad idea. Using either in or 
scope on anything other than delegates could result in all kinds of code 
breakage if/when scope is ever implemented for types other than delegates.

For delegates, it has the advantage of telling the compiler that it doesn't 
need to allocate a closure (since the delegate won't be used past the point 
where its calling scope exists, as could occur if the delegate escaped the 
function it was passed to), but I'm not sure that even that works 100% 
correctly right now.
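
A small sketch of the delegate case (sumWith and scaledSum are made-up
names). It relies on the behaviour described above, namely that a delegate
literal passed to a scope parameter does not need a heap closure; @nogc is
used here only as a way to have the compiler check that no GC allocation
happens:

```d
// The scope parameter promises that dg is not stored past the call.
int sumWith(scope int delegate(int) @nogc dg, const int[] values) @nogc
{
    int total;
    foreach (v; values)
        total += dg(v);
    return total;
}

int scaledSum(const int[] values, int factor) @nogc
{
    // The literal captures the local `factor`. Because the parameter is
    // scope, the context can stay on the stack instead of becoming a
    // heap-allocated closure, so this is accepted inside a @nogc function.
    return sumWith((int v) => v * factor, values);
}

void main()
{
    static immutable int[3] data = [1, 2, 3];
    assert(scaledSum(data[], 2) == 12);
}
```

If the scope on dg is removed, the compiler has to assume the delegate may
escape, and the capture in scaledSum is expected to be rejected under @nogc
because it would require a closure allocation.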

We really should sort out exactly what we're going to do with scope one of 
these days soon.

But the stuff that some of the DIPs do with scope (e.g. returning with scope - 
which is completely against the spec at this point) is a suggestion and not at 
all how it currently works.

- Jonathan M Davis

Jun 28 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Friday, 28 June 2013 at 17:43:21 UTC, Jonathan M Davis wrote:
 it would prevent i or anything referring to it from being 
 returned or assigned to any variable which will outlive the
 function call. However,

That's fairly close to what I'd want. But there are two cases I'm 
not sure it would cover:

1:

struct Unique(T) {
    scope T borrow();
}

If the unique pointer decides to let its reference slip, it 
wouldn't want it going somewhere else and escaping, since that 
breaks the uniqueness guarantee.

This is important for a few cases. Here's one:

   int* foo;
   {
      Unique!(int*) bar;
      foo = bar.borrow;

      // this should be ok, because this never exists outside the
      // same scope as the Unique
      int* ok = bar.borrow;
   }

   // foo now talks to a freed *bar, so that shouldn't be allowed

Similarly, if bar were reassigned, this could cause trouble, but 
what we might do is just disallow such reassignments, but maybe 
it could work if it always goes down in scope. I'd have to think 
about that.


(I'm thinking my borrowed thing might have to be a type 
constructor rather than a storage class. Otherwise, you could get 
around it by:

int* bar(scope int* foo) {
   int* b = foo;
   return b;
}

Unless the compiler is very smart about following where it goes.)


But if scope works on the return value too, it might be ok.


maybe 2:

void bar(scope int* foo, int** bar) {
     *bar = foo;
}


Actually, I'm reasonably sure the spec's scope wording would work 
for this one. But we'd need to be sure - this is one case where 
pure wouldn't help (pure generally would help, since it disallows 
assignments to the outside world, but there's enough holes that 
you could leak a reference).


To be memory safe, these would all have to be guaranteed.

Jun 28 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Friday, June 28, 2013 19:56:44 Adam D. Ruppe wrote:
 struct Unique(T) {
 scope T borrow();
 }

Per the current spec, this would not be a valid use of scope, as scope is 
specifically a parameter storage class and can only be used on function 
parameters (just like in, out, ref, and lazy). scope seems to be specifically 
intended for guaranteeing that an argument passed to a function does not 
escape that function.

- Jonathan M Davis

Jun 28 2013

"Jason House" <jason.james.house gmail.com> writes:

Bloomberg released an STL alternative called BSL which contains 
an alternate allocator model. In a nutshell, objects supporting 
custom allocators can optionally take an allocator pointer as an 
argument. Containers will save the pointer and use it for all 
their allocations. It seems simple enough and does not embed the 
allocator in the type.

https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
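
Translated into D, the shape of that model might look roughly like the
sketch below. Every name in it (Allocator, MallocAllocator,
CountingAllocator, IntStack, defaultAllocator) is hypothetical and this is
not the BSL API; it only shows a container that stores a run-time
allocator reference while its type stays the same:

```d
import core.stdc.stdlib : malloc, free;

// Run-time polymorphic allocator interface.
interface Allocator
{
    void* allocate(size_t n);
    void deallocate(void* p);
}

final class MallocAllocator : Allocator
{
    void* allocate(size_t n) { return malloc(n); }
    void deallocate(void* p) { free(p); }
}

// Same as above, but counts live blocks so the effect is observable.
final class CountingAllocator : Allocator
{
    size_t live;
    void* allocate(size_t n) { ++live; return malloc(n); }
    void deallocate(void* p) { --live; free(p); }
}

// Used when the caller passes null. (The allocator object itself is
// GC-allocated here, purely to keep the sketch short.)
Allocator defaultAllocator()
{
    static Allocator instance;
    if (instance is null) instance = new MallocAllocator;
    return instance;
}

// The allocator is a stored reference, not a template parameter.
struct IntStack
{
    private Allocator alloc;
    private int* data;
    private size_t len, cap;

    @disable this(this);

    this(Allocator a) { alloc = a is null ? defaultAllocator() : a; }
    ~this() { if (data !is null) alloc.deallocate(data); }

    void push(int v)
    {
        if (len == cap)
        {
            immutable newCap = cap ? cap * 2 : 4;
            auto p = cast(int*) alloc.allocate(newCap * int.sizeof);
            assert(p !is null, "out of memory");
            p[0 .. len] = data[0 .. len];
            if (data !is null) alloc.deallocate(data);
            data = p;
            cap = newCap;
        }
        data[len++] = v;
    }

    size_t length() const { return len; }
}

void main()
{
    auto counting = new CountingAllocator;
    {
        auto a = IntStack(counting);    // custom allocator
        auto b = IntStack(null);        // default allocator
        static assert(is(typeof(a) == typeof(b)));  // same type either way
        foreach (i; 0 .. 10) { a.push(i); b.push(i); }
        assert(counting.live == 1);     // one current block for a
    }
    assert(counting.live == 0);         // a's destructor returned it
}
```

The trade-off against the template approach discussed earlier is an
indirect call per allocation and an extra pointer per container, in
exchange for one container type.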

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 I know Andrey mentioned he was going to work on Allocators a 
 year ago. In DConf 2013 he described the problems he needs to 
 solve with Allocators. But I wonder if I am missing the 
 discussion around that - I tried searching this forum, found a 
 few threads that was not actually a brain storm for Allocators 
 design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would 
 be to let user specify which Allocator operator new should get 
 the memory from (introducing a new keyword allocator). This 
 gives a total control, but assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user
   // responsibility to make sure custom Allocator can handle that
 }

 By default allocator is the druntime using GC, free(a) does 
 nothing for it.


 if some library defines its allocator (e.g. specialized 
 container), there should be ability to:
 1. override allocator
 2. get access to the allocator used

 I understand that I spent 5 mins thinking about the way 
 Allocators may look.
 My point is - if somebody is working on it, can you please 
 share your ideas?

Jun 26 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
 Bloomberg released an STL alternative called BSL which contains 
 an alternate allocator model. In a nutshell object supporting 
 custom allocators can optionally take an allocator pointer as 
 an argument. Containers will save the pointer and use it for 
 all their allocations. It seems simple enough and does not 
 embed the allocator in the type.

 https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model

I think the problem with such an approach is that you have to 
maniacally add support for custom allocators to every class if you 
want them to be on a custom allocator.
If we were simply able to say "all memory allocated in this area {} 
should use my custom allocator", that would simplify the code, with 
no need to change the std lib.
The next step is to notify the allocator when the memory should be 
released. But for a stack-based allocator that is not required.
Moreover, if we introduce access to different GCs (e.g. 
mark-n-sweep, semi-copy, ref counted), we should be able to say 
"this {} piece of code is temporary, so use the semi-copy GC; the 
other code is long lived and creates few objects, so use ref 
counting". That is, it is all runtime support, with no need to change 
the library code.

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 04:10:49PM +0200, cybervadim wrote:
 On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
Bloomberg released an STL alternative called BSL which contains an
alternate allocator model. In a nutshell object supporting custom
allocators can optionally take an allocator pointer as an
argument. Containers will save the pointer and use it for all
their allocations. It seems simple enough and does not embed the
allocator in the type.

https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model

 
 I think the problem with such approach is that you have to
 maniacally add support for custom allocator to every class if you
 want them to be on a custom allocator.

Yeah, that's a major inconvenience with the C++ allocator model. There's
no way to say "switch to allocator A within this block of code"; if
you're given a binary-only library that doesn't support allocators,
you're out of luck. And even if you have the source code, you have to
manually modify every single line of code that performs allocation to
take an additional parameter -- not a very feasible approach.


 If we simply able to say - all memory allocated in this area {}
 should use my custom allocator, that would simplify the code and no
 need to change std lib.
 The next step is to notify allocator when the memory should be
 released. But for the stack based allocator that is not required.
 More over, if we introduce access to different GCs (e.g.
 mark-n-sweep, semi-copy, ref counted), we should be able to say this
 {} piece of code is my temporary, so use semi-copy GC, the other
 code is long lived and not much objects created, so use ref counted.
 That is, it is all runtime support and no need changing the library
 code.

Yeah, I think the best approach would be one that doesn't require
changing a whole mass of code to support. Also, one that doesn't require
language changes would be far more likely to be accepted, as the core D
devs are leery of adding yet more complications to the language.

That's why I proposed that gc_alloc and gc_free be made into
thread-global function pointers, that can be swapped with a custom
allocator's version. This doesn't have to be visible to user code; it
can just be an implementation detail in std.allocator, for example. It
allows us to implement custom allocators across a block of code that
doesn't know (and doesn't need to know) what allocator will be used.


T

-- 
Fact is stranger than fiction.

Jun 26 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
 Yeah, I think the best approach would be one that doesn't 
 require
 changing a whole mass of code to support. Also, one that 
 doesn't require
 language changes would be far more likely to be accepted, as 
 the core D
 devs are leery of adding yet more complications to the language.

 That's why I proposed that gc_alloc and gc_free be made into
 thread-global function pointers, that can be swapped with a 
 custom
 allocator's version. This doesn't have to be visible to user 
 code; it
 can just be an implementation detail in std.allocator, for 
 example. It
 allows us to implement custom allocators across a block of code 
 that
 doesn't know (and doesn't need to know) what allocator will be 
 used.

Yes, being able to change gc_alloc and gc_free would do the job. If 
the runtime remembers a stack of gc_alloc/gc_free functions (like 
pushd/popd), that would simplify its usage.
I think this is a very nice and simple solution to the problem.

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 04:31:40PM +0200, cybervadim wrote:
 On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
Yeah, I think the best approach would be one that doesn't require
changing a whole mass of code to support. Also, one that doesn't
require language changes would be far more likely to be accepted, as
the core D devs are leery of adding yet more complications to the
language.

That's why I proposed that gc_alloc and gc_free be made into
thread-global function pointers, that can be swapped with a custom
allocator's version. This doesn't have to be visible to user code; it
can just be an implementation detail in std.allocator, for example.
It allows us to implement custom allocators across a block of code
that doesn't know (and doesn't need to know) what allocator will be
used.

 
 Yes, being able to change gc_alloc, gc_free would do the work. If
 runtime  remembers the stack of gc_alloc/gc_free functions like pushd,
 popd, that would simplify its usage.  I think this is a very nice and
 simple solution to the problem.

Adam's idea does this: tie each replacement of gc_alloc/gc_free to some
stack-based object, that automatically cleans up in the dtor. So
something along these lines:

	struct CustomAlloc(A) {
		void* function(size_t size) old_alloc;
		void  function(void* ptr)   old_free;

		this(A alloc) {
			old_alloc = gc_alloc;
			old_free  = gc_free;

			gc_alloc = &A.alloc;
			gc_free  = &A.free;
		}

		~this() {
			gc_alloc = old_alloc;
			gc_free  = old_free;

			// Cleans up, e.g., region allocator deletes the
			// region
			A.cleanup();
		}
	}

	class C {}

	void main() {
		auto c = new C();	// allocates using default allocator (GC)
		{
			// Construct explicitly: a bare declaration would not
			// run the constructor that swaps the allocator.
			auto outer = CustomAlloc!MyAllocator(MyAllocator.init);

			// Everything from here on until end of block
			// uses MyAllocator

			auto d = new C();	// allocates using MyAllocator

			{
				// Distinct name: D rejects shadowing of locals.
				auto inner = CustomAlloc!AnotherAllocator(AnotherAllocator.init);
				auto e = new C(); // allocates using AnotherAllocator

				// End of scope: auto cleanup, gc_alloc and
				// gc_free reverts back to MyAllocator
			}

			auto f = new C();	// allocates using MyAllocator

			// End of scope: auto cleanup, gc_alloc and
			// gc_free reverts back to default values
		}
		auto g = new C();	// allocates using default allocator
	}


So you effectively have an allocator stack, and user code never has to
directly manipulate gc_alloc/gc_free (which would be dangerous).
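
Since druntime exposes no swappable gc_alloc/gc_free today, the snippet above is only a sketch of the proposal. A version that actually compiles needs stand-ins; in the following, currentAlloc, AllocGuard and the three alloc functions are all hypothetical names, and the "allocators" just forward to malloc while recording who was asked.

```d
import core.stdc.stdlib : malloc, free;

alias AllocFn = void* function(size_t);

string lastUsed; // thread-local by default in D; records who served the request

// Stand-ins for real allocators: each tags itself, then forwards to malloc.
void* defaultAlloc(size_t n) { lastUsed = "default"; return malloc(n); }
void* allocA(size_t n)       { lastUsed = "A"; return malloc(n); }
void* allocB(size_t n)       { lastUsed = "B"; return malloc(n); }

// Hypothetical thread-local hook playing the role of gc_alloc.
AllocFn currentAlloc = &defaultAlloc;

// RAII guard: installs a replacement on construction and restores the
// previous hook on destruction, so nested guards behave like a stack.
struct AllocGuard
{
    private AllocFn saved;

    @disable this();      // forces explicit construction
    @disable this(this);  // a copy would restore the hook twice

    this(AllocFn replacement)
    {
        saved = currentAlloc;
        currentAlloc = replacement;
    }

    ~this() { currentAlloc = saved; }
}

// Code that allocates without knowing which allocator is active.
string whoAllocates()
{
    auto p = currentAlloc(16);
    free(p);
    return lastUsed;
}

void demo()
{
    assert(whoAllocates() == "default");
    {
        auto outer = AllocGuard(&allocA);
        assert(whoAllocates() == "A");
        {
            auto inner = AllocGuard(&allocB);
            assert(whoAllocates() == "B");
        }
        assert(whoAllocates() == "A"); // inner guard restored A
    }
    assert(whoAllocates() == "default");
}
```

Note that this only shows the swapping mechanics. It does nothing about references escaping the scope, and every block here is freed with plain free(), which sidesteps the question of matching each block to the allocator that produced it.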


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen

Jun 26 2013

"Dicebot" <public dicebot.lv> writes:

Some type system help is required to guarantee that references to 
such scope-allocated data won't escape.

Jun 26 2013

"Brad Anderson" <eco gnuk.net> writes:

On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
 Bloomberg released an STL alternative called BSL which contains 
 an alternate allocator model. In a nutshell object supporting 
 custom allocators can optionally take an allocator pointer as 
 an argument. Containers will save the pointer and use it for 
 all their allocations. It seems simple enough and does not 
 embed the allocator in the type.

 https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model

There is also EASTL's (Electronic Arts version of STL for 
gamedev) take on allocators.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html#eastl_allocator

Jun 28 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 02:22, cybervadim writes:
 I know Andrey mentioned he was going to work on Allocators a year ago.
 In DConf 2013 he described the problems he needs to solve with
 Allocators. But I wonder if I am missing the discussion around that - I
 tried searching this forum, found a few threads that was not actually a
 brain storm for Allocators design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would be to let
 user specify which Allocator operator new should get the memory from
 (introducing a new keyword allocator). This gives a total control, but
 assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
    auto a = new A; // operator new will use ScopeAllocator::malloc()
    auto b = new B;

    free(a); // that should call ScopeAllocator::free()
    // if free() is missing for allocated area, it is a user
 responsibility to make sure custom Allocator can handle that
 }

Awful. What has that extra syntax bought you, except that now new is 
unsafe by design?
Other questions: how does this allocation scope propagate into 
functions, and what is the mechanism for passing it up and down the 
call stack?

Last but not least, I fail to see how scoped allocators alone (as 
presented) solve even half of the problem.

-- 
Dmitry Olshansky

Jun 26 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky 
wrote:
 Awful. What that extra syntax had brought you? Except that now 
 new is unsafe by design?
 Other questions involve how does this allocation scope goes 
 inside of functions, what is the mechanism of passing it up and 
 down of call-stack.

 Last but not least I fail to see how scoped allocators alone 
 (as presented) solve even half of the problem.

The extra syntax lets me avoid touching the existing code.
Imagine you have stateless event processing. That is, an event 
comes in, you do some calculation, prepare the answer and send it 
back. It will look like:

void onEvent(Event event)
{
    process();
}

Because it is stateless, you know all the memory allocated during 
processing will not be required afterwards. So the syntax I 
suggested requires very little change in the code. process() may be 
implemented using the std lib, doing several news and resizing.

With new syntax:


void onEvent(Event event)
{
    ScopedAllocator alloc;
    allocator(alloc) {
      process();
    }
}

So now you do not use GC for all that is created inside the 
process().
ScopedAllocator is a simple stack that will free all memory in 
one go.

It is up to the runtime implementation to make sure all memory 
that is allocated inside allocator{} scope is actually allocated 
using ScopedAllocator and not GC.
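
For reference, the kind of ScopedAllocator I have in mind is roughly the following. This is only a sketch: Region, allocate and releaseAll are names I made up, and the backing buffer is supplied by the caller.

```d
// Bump ("stack") allocator over a caller-provided buffer.
struct Region
{
    private ubyte[] buffer; // backing store, owned by the caller
    private size_t used;    // bytes handed out so far

    this(ubyte[] backing) { buffer = backing; }

    // Hand out n bytes at an offset rounded up to a multiple of 16.
    // Returns null when the region is exhausted.
    void[] allocate(size_t n)
    {
        immutable start = (used + 15) & ~cast(size_t) 15;
        if (start > buffer.length || n > buffer.length - start) return null;
        used = start + n;
        return buffer[start .. used];
    }

    // Free everything in one go; no per-object bookkeeping.
    void releaseAll() { used = 0; }

    size_t bytesUsed() const { return used; }
}
```

There is no per-object free at all: the event handler would call releaseAll() once when process() returns, which is why it is cheap, and also why nothing allocated inside may be kept afterwards.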

Does it make sense?

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 18:27, cybervadim writes:
 On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky wrote:
 Awful. What that extra syntax had brought you? Except that now new is
 unsafe by design?
 Other questions involve how does this allocation scope goes inside of
 functions, what is the mechanism of passing it up and down of call-stack.

 Last but not least I fail to see how scoped allocators alone (as
 presented) solve even half of the problem.

 Extra syntax allows me not touching the existing code.
 Imagine you have a stateless event processing. That is event comes, you
 do some calculation, prepare the answer and send it back. It will look
 like:

 void onEvent(Event event)
 {
     process();
 }

 Because it is stateless, you know all the memory allocated during
 processing will not be required afterwards.

Here is the chief problem: the assumption that is required to make it 
magically work.

Now what I see is:

T[] arr; // TLS

// somewhere down the line
if (...)
	arr = ... ;
else {
	...
	allocator(myAlloc) {
		arr = array(filter!....);
	}
	...
}
return arr;

Having an unsafe magic wand that may transmogrify some code to switch 
allocation strategy I consider naive and dangerous.

Whoever told you that process() returns before allocating a few gigs 
of RAM (and hoping for a GC collection)? Right, nobody. Maybe it's an 
event loop that may run forever.

What is missing is that code to date assumes new == GC and works 
_like that_.
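
To spell out the failure mode with something that compiles, here is a small sketch. Scratch and handleEvent are hypothetical names, and the scratch buffer is reused rather than returned to the OS so that the effect is observable without undefined behavior.

```d
// Tiny bump allocator over a fixed array, reset after every event.
struct Scratch
{
    int[8] store;
    size_t used;

    int[] take(size_t n)
    {
        assert(used + n <= store.length);
        auto s = store[used .. used + n];
        used += n;
        return s;
    }

    void reset() { used = 0; }
}

int[] escaped; // thread-local, plays the role of `arr` above

void handleEvent(ref Scratch scratch, int seed)
{
    auto tmp = scratch.take(3);
    tmp[] = seed;
    // Code written with the GC in mind happily keeps a reference...
    if (escaped is null) escaped = tmp;
    // ...and then the scope ends and everything is "freed" in one go.
    scratch.reset();
}
```

After a second event the slice stored by the first one silently shows the second event's data. With a region that really unmaps or frees its memory it would be a dangling pointer instead.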

 So the syntax I suggested
 requires a very little change in code. process() may be implemented
 using std lib, doing several news and resizing.

 With new syntax:


 void onEvent(Event event)
 {
     ScopedAllocator alloc;
     allocator(alloc) {
       process();
     }
 }

 So now you do not use GC for all that is created inside the process().
 ScopedAllocator is a simple stack that will free all memory in one go.

 It is up to the runtime implementation to make sure all memory that is
 allocated inside allocator{} scope is actually allocated using
 ScopedAllocator and not GC.

 Does it make sense?

Yes, but it's horribly broken.

-- 
Dmitry Olshansky

Jun 26 2013

"cybervadim" <vadim.goryunov gmail.com> writes:

On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky 
wrote:
 Here is a chief problem - the assumption that is required to 
 make it magically work.

 Now what I see is:

 T arr[];//TLS

 //somewhere down the line
 arr = ... ;
 else{
 ...
 alloctor(myAlloc){
 	arr = array(filter!....);
 }
 ...
 }
 return arr;

 Having an unsafe magic wand that may transmogrify some code to 
 switch allocation strategy I consider naive and dangerous.

 Who ever told you process does return before allocating a few 
 Gigs of RAM (and hoping on GC collection)? Right, nobody. Maybe 
 it's an event loop that may run forever.

 What is missing is that code up to date assumes new == GC and 
 works _like that_.

Not magic, but a tool which is quite powerful, and thus it may 
shoot you in the foot.
This is unsafe, but if you want it safe, don't use allocators; 
stay with the GC.
In the example above, the first arr gets freed by the GC, and the 
second arr may point to nothing if myAlloc was implemented to free 
it first. Or you may get a proper arr reference if myAlloc used 
malloc and didn't free it. The fact that you may write bad code 
does not make the language (or concept) bad.

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 23:04, cybervadim writes:
 On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky wrote:

 Having an unsafe magic wand that may transmogrify some code to switch
 allocation strategy I consider naive and dangerous.

 Who ever told you process does return before allocating a few Gigs of
 RAM (and hoping on GC collection)? Right, nobody. Maybe it's an event
 loop that may run forever.

 What is missing is that code up to date assumes new == GC and works
 _like that_.

 Not magic, but the tool which is quite powerful and thus it may shoot
 your leg.

I know what kind of thing you are talking about. It ain't powerful, 
it's just a hack that doesn't quite do what is advertised.

 This is unsafe, but if you want it safe, don't use allocators, stay with
 GC.

BTW you were talking about changing the allocation of code you didn't 
write. There is not even a single fact that makes the thing safe. It's 
all working by chance, or because the thing was designed to work with 
a scoped allocator to begin with.

I believe the 2nd case (designed to use scoped allocation) is the one 
where:
a) The behavior is guaranteed (determinism vs GC etc)
b) Safety is assured by the designer, not by pure luck (and a 
reasonable assumption that may not hold)

 In the example above, you get first arr freed by GC, second arr may
 point to nothing if myAlloc was implemented to free it before. Or you
 may get a proper arr reference if myAlloc used malloc and didn't free
 it.

Yeah I know, hence I showed it. BTW forget about malloc, I'm not talking 
about explicit malloc being an alternative to your scheme.

 The fact that you may write bad code does not make the language (or
 concept) bad.

It does, because it makes unreliable and bug-prone usage easy.

-- 
Dmitry Olshansky

Jun 26 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Jun 26, 2013 at 01:16:31AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values
and use scope guards to revert them back to the values they were
before.

 
 Yea, I was thinking this might be a way to go. You'd have a global
 (well, thread-local) allocator instance that can be set and reset
 through stack calls.
 
 You'd want it to be RAII or delegate based, so the scope is clear.
 
 with_allocator(my_alloc, {
      do whatever here
 });
 
 
 or
 
 {
    ChangeAllocator!my_alloc dummy;
 
    do whatever here
 } // dummy's destructor ends the allocator scope
 
 
 I think the former is a bit nicer, since the dummy variable is a bit
 silly. We'd hope that delegate can be inlined.

Actually, D's frontend leaves something to be desired when it comes to
inlining delegates. It *is* done sometimes, but not as often as one may
like. For example, opApply generally doesn't inline its delegate, even
when it's just a thin wrapper around a foreach loop.

But yeah, I think the former has nicer syntax. Maybe we can help the
compiler with inlining by making the delegate a compile-time parameter?
But it forces a switch of parameter order, which is Not Nice (hurts
readability 'cos the allocator argument comes after the block instead of
before).


 But, the template still has a big advantage: you can change the
 type. And I think that is potentially enormously useful.

True. It can use different types for different allocators that do (or
don't) do cleanups at the end of the scope, depending on what the
allocator needs to do.


 Another question is how to tie into output ranges. Take std.conv.to.
 
 auto s = to!string(10); // currently, this hits the gc
 
 What if I want it to go on a stack buffer? One option would be to
 rewrite it to use an output range, and then call it like:
 
 char[20] buffer;
 auto s = to!string(10, buffer); // it returns the slice of the
 buffer it actually used
 
 (and we can do overloads so to!string(10, radix) still works, as
 well as to!string(10, radix, buffer). Hassle, I know...)

I think supporting the multi-argument version of to!string() is a good
thing, but what to do with library code that calls to!string()? It'd be
nice if we could somehow redirect those GC calls without having to comb
through the entire Phobos codebase for stray calls to to!string().


[...]
 The fun part is the output range works for that, and could also work
 for something like this:
 
 struct malloced_string {
     char* ptr;
     size_t length;
     size_t capacity;
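     void put(char c) {
         if(length >= capacity) {
            // grow geometrically; start from a small non-zero size
            capacity = capacity ? capacity * 2 : 16;
            ptr = cast(char*) realloc(ptr, capacity);
         }
         ptr[length++] = c;
     }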
 
     char[] slice() { return ptr[0 .. length]; }
     alias slice this;
     mixin RefCounted!this; // pretend this works
 }
 
 
 {
    malloced_string str;
    auto got = to!string(10, str);
 } // str is out of scope, so it gets free()'d. unsafe though: if you
 stored a copy of got somewhere, it is now a pointer to freed memory.
 I'd kinda like language support of some sort to help mitigate that
 though, like being a borrowed pointer that isn't allowed to be
 stored, but that's another discussion.

Nice!


 And that should work. So then what we might do is provide these
 little output range wrappers for various allocators, and use them on
 many functions.
 
 So we'd write:
 
 import std.allocators;
 import std.range;
 
 // mallocator is provided in std.allocators and offers the goods
 OutputRange!(char, mallocator) str;
 
 auto got = to!string(10, str);

I like this. However, it still doesn't address how to override the
default allocator in, say, Phobos functions.


 What's nice here is the output range is useful for more than just
 allocators. You could also to!string(10, my_file) or a delegate,
 blah blah blah. So it isn't too much of a burden, it is something
 you might naturally use anyway.

Now *that* is a very nice idea. I like having a way of bypassing using a
string buffer, and just writing the output directly to where it's
intended to go. I think to() with an output range parameter definitely
should be implemented. It doesn't address all of the issues, but it's a
very big first step IMO.
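
Until to() grows such an overload, something close can already be had with std.format.formattedWrite, which accepts any output range as its sink. A sketch, where FixedSink and intToBuffer are hypothetical helper names and the sink simply asserts if the buffer is too small:

```d
import std.format : formattedWrite;

// Minimal output range over caller-provided storage.
struct FixedSink
{
    char[] buffer; // where the characters go
    size_t used;   // how many have been written

    void put(char c)
    {
        assert(used < buffer.length, "buffer too small");
        buffer[used++] = c;
    }

    void put(const(char)[] s)
    {
        foreach (c; s) put(c);
    }
}

// Formats value into buffer and returns the slice actually used.
const(char)[] intToBuffer(int value, char[] buffer)
{
    auto sink = FixedSink(buffer);
    formattedWrite(sink, "%d", value);
    return buffer[0 .. sink.used];
}
```

The returned slice is borrowed from the caller's buffer, so the usual caveat applies: don't store it beyond the buffer's lifetime.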


Also, we may have the problem of the wrong allocator
being used to free the object.

 
 Another reason why encoding the allocator into the type is so nice.
 For the minimal D I've been playing with, the idea I'm running with
 is all allocated memory has some kind of special type, and then
 naked pointers are always assumed to be borrowed, so you should
 never store or free them.

Interesting idea. So basically you can tell which allocator was used to
allocate an object just by looking at its type? That's not a bad idea,
actually.


 auto foo = HeapArray!char(capacity);
 
 void bar(char[] lol){}
 
 bar(foo); // allowed, foo has an alias this on slice

This is nice. Hooray for alias this. :)


 // but....
 
 struct A {
    char[] lol; // not allowed, because you don't know when lol is
 going to be freed
 }
 
 
 foo frees itself with refcounting.

This is a bit inconvenient. So your member variables will have to know
what allocation type is being used. Not the end of the world, of course,
but not as pretty as one would like.


On Wed, Jun 26, 2013 at 03:24:57AM +0200, Adam D. Ruppe wrote:
 I was just quickly skimming some criticism of C++ allocators, since
 my thought here is similar to what they do. On one hand, maybe D can
 do it right by tweaking C++'s design rather than discarding it.
 
 On the other hand, with all the C++ I've done, I have never actually
 used STL allocators, which could say something about me or could say
 something about them.
 
 
 One thing I saw said making the differently allocated object a
 different type sucks. ...but must it? The complaint there was "so
 much for just doing a function that takes a std::string". But, the
 way I'd want to do it in D is the function would take a char[]
 instead, and our special allocated type provides that via opSlice
 and/or alias this.

Yeah I think alias this adds a whole new factor into the equation.
Having a distinct type makes it much easier to implement,
and allows you to mix differently-allocated objects without having to
worry about things like calling the right version of gc_free to cleanup
properly. You can even have the same underlying data type be allocated
in two different ways, and the cleanup will happen correctly.

Basically, when you allocate some object O of class C using allocator A,
then it follows that no matter what you do with the gc_alloc/gc_free
function pointers afterwards, O must be freed using A.free. So in a
sense, O needs to carry around a function pointer to A.free in its dtor
(or whoever frees it). So this actually argues for having a distinct
type for an instance of C allocated using A, vs. an instance of C
allocated using a different allocator B. You need to store that function
pointer to A.free and B.free *somewhere*, otherwise things won't work
properly.
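
A rough sketch of what "the type remembers its allocator" could look like, combining it with the alias this trick from above. Mallocator and OwnedArray are hypothetical names, and the live counter exists only to make the cleanup observable.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical allocator type with static entry points.
struct Mallocator
{
    static size_t live; // outstanding blocks, for demonstration only

    static void* allocate(size_t n) { ++live; return malloc(n); }
    static void release(void* p) { --live; free(p); }
}

// Owning array whose type names its allocator A, so the destructor always
// calls A.release regardless of what any global hook points to by then.
struct OwnedArray(T, A)
{
    private T* ptr;
    private size_t len;

    this(size_t n)
    {
        ptr = cast(T*) A.allocate(n * T.sizeof);
        assert(ptr !is null, "allocation failed");
        len = n;
        ptr[0 .. n] = T.init;
    }

    @disable this(this); // single owner, no accidental double free

    ~this()
    {
        if (ptr !is null) A.release(ptr);
    }

    // Borrowed view: callers must not keep it past the owner's lifetime.
    T[] slice() { return ptr[0 .. len]; }
    alias slice this;
}

// Ordinary code keeps taking plain slices and never sees the allocator.
void fill(char[] borrowed, char c) { borrowed[] = c; }
```

With this, `auto foo = OwnedArray!(char, Mallocator)(4); fill(foo, 'x');` compiles because of alias this, and the block goes back to Mallocator when foo leaves scope. Here the "function pointer to A.free" costs nothing at run time, since it is resolved through the type.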


[...]
 Anyway, bottom line is I don't think that criticism necessarily
 applies to D.

Agreed, in D, distinct types per allocator is, at the very least, not as
bad as it is in C++.


 But there's surely many others and I'm more or less a
 n00b re c++'s allocators so idk yet.

Who *isn't* a n00b wrt C++'s allocators, since so few people actually
use them? :-P


T

-- 
He who sacrifices functionality for ease of use, loses both and deserves
neither. -- Slashdotter

Jun 26 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wednesday, 26 June 2013 at 16:40:20 UTC, H. S. Teoh wrote:
 I think supporting the multi-argument version of to!string() is 
 a good thing, but what to do with library code that calls 
 to!string()? It'd be nice if we could somehow redirect those GC 
 calls without having to comb through the entire Phobos codebase 
 for stray calls to to!string().


Let's consider what kinds of allocations we have. We can break 
them up into two broad groups: internal and visible.

Internal allocations, in theory, don't matter. These can be on 
the stack, the gc heap, malloc/free, whatever. The function 
itself is responsible for their entire lifetime.

Changing these either optimizes, in the case of reusing a region, 
or leaks, if you switch it to manual and the function doesn't know 
it.

Visible allocations are important because the caller is 
responsible for freeing them. Here, I really think we want the 
type system's help: either it should return something that we 
know we're responsible for, or take a buffer/output range from us 
to receive the data in the first place.

Either way, the function signature should reflect what's going on 
with visible allocations. It'd possibly return a wrapped type and 
it'd take an output range/buffer/allocator.



With internals though, the only reason I can see why you'd want 
to change them outside the function is to give them a region of 
some sort to work with, especially since you don't know for sure 
what it is doing - these are all local variables to the 
function/call stack. And here, I don't think we want to change 
the allocator wholesale.

At most, we'd want to give it hints that what we're doing are 
short lived. (Or, better yet, have it figure this out on its own, 
like a generational gc.)



So I think this is more about tweaking the gc than replacing it, 
at most adding a couple new functions to it:

GC.hint_short_lived // returns a helper struct with a static refcount:

struct TempGcAllocator {
      static int tempCount = 0;
      static void* localRegion;
      this() { tempCount++; } // pretend this works
      ~this() {
            tempCount--;
            if(tempCount == 0)
                  gc.tryToCollect(localRegion);
      }

      T create(T, Args...)(Args args) {
            return GC.new_short_lived T(args);
      }
}


and gc.tryToCollect() does a quick scan for anything pointing into the 
local region. If there's nothing pointing in there, it frees the whole 
thing. If there is, in the name of memory safety, it just 
reintegrates that local region into the regular memory and gc's 
its components normally.



The reason the count is static is that you don't have to pass 
this thing down the call stack. Any function that wants to adapt 
to this generational hint system just calls hint_short_lived. If 
you're a leaf function, that's ok, the static count means you'll 
inherit the region from the function above you.

You would NOT use this in main(), as that defeats the purpose.


 I think to() with an output range parameter definitely
 should be implemented.

No doubt about it, we should aim for most phobos functions not to 
allocate at all, if given an output range they can use.


 Interesting idea. So basically you can tell which allocator was 
 used to allocate an object just by looking at its type?

Right, then you'll know if you have to free() it. (Or it can free 
itself with its destructor.)


 This is a bit inconvenient. So your member variables will have 
 to know what allocation type is being used. Not the end of the
 world, of course, but not as pretty as one would like.

Yeah, you'd need to know if you own them or not too (are you 
responsible for freeing that string you just got passed? If no, 
are you sure it won't be freed while you're still using it?), but 
I just think that's a part of memory management you can't 
sidestep.

There's two easy answers: 1) always make a private copy of 
anything you store (and perhaps write to) or 2) use a gc and 
trust it to always be the owner.

In any other case, I think you *have* to think about it, and the 
type telling you can help you make that decision.


 and allows you to mix differently-allocated objects without 
 having to

Important to remember though that you are borrowing these 
references, not taking ownership.

I think the rule of all pointers/slices are borrowed is fairly 
workable though. With the gc, that's ok, you don't own anything. 
The garbage collector is responsible for it all, so store away. 
(Though if it is mutable, you might want to idup it so you don't 
get overwritten by someone else. But that's a separate question 
from allocation method.... and already encoded in D's type 
system).

So never free() a naked pointer, unless you know what you're 
doing like interfacing with a C library, prefer to only free a 
ManuallyAllocated!(pointer).

hell a C library binding could change the type too, it'd still be 
binary compatible. RefCounted!T wouldn't be, but 
ManuallyAllocated!T would just be a wrapper around T*.

I think I'm starting to ramble!

Jun 26 2013

"Dicebot" <public dicebot.lv> writes:

By the way, while this topic gets some attention, I want to 
point out that there are actually two orthogonal entities that 
arise when speaking about configurable allocation - allocators 
themselves and global allocation policies. I think a good design 
should address both of those.

For example, swapping the global allocator for a custom one has 
limited usability - you are limited anyway by the language design, 
which makes only GC or ref-counting viable general options. However, 
some way to prohibit automatic allocations at runtime while still 
allowing manual ones may be useful - and it does not matter what 
allocator is actually used to get that memory. Once such an API is 
designed, tighter classification and control may be added with 
time.

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

26-Jun-2013 21:35, Dicebot writes:
 By the way, while this topic gets some attention, I want to make a
 notice that there are actually two orthogonal entities that arise when
 speaking about configurable allocation - allocators itself and global
 allocation policies. I think good design should address both of those.

Sadly I believe that global allocators would still have to be compatible 
with GC (to not break code in hard to track ways) thus basically being a 
GC. Hence we can easily stop talking about them ;)



-- 
Dmitry Olshansky

Jun 26 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky 
wrote:
 Sadly I believe that global allocators would still have to be 
 compatible with GC (to not break code in hard to track ways) 
 thus basically being a GC. Hence we can easily stop talking 
 about them ;)

Nice way to say "we don't really need those embedded, kernel and 
gamedev guys". GC as a safe and obvious approach should be the 
default, but druntime needs to provide means for tight and 
dangerous control upon explicit request.

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

27-Jun-2013 00:53, Dicebot writes:
 On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky wrote:
 Sadly I believe that global allocators would still have to be
 compatible with GC (to not break code in hard to track ways) thus
 basically being a GC. Hence we can easily stop talking about them ;)

 Nice way to say "we don't really need that embedded, kernel and gamedev
 guys". GC as a safe an obvious approach should be the default but
 druntime needs to provide means for tight and dangerous control upon
 explicit request.

Just don't use certain built-ins. Stub them out in run-time if you like. 
The only problematic point I see is closures allocated on heap.

Frankly I see embedded, kernel and gamedev guys using ref-counting and 
custom data structures all the time. They all want that level of control 
and determinism anyway or are so resource constrained that GC is too 
much code space or run-time overhead anyway.

Needless to say, a custom run-time is required for the first 2 categories 
anyway, so just hack druntime. It would be nice to have hooks readily 
available (and documented?) to do so, but hardly anything beyond that.

-- 
Dmitry Olshansky

Jun 26 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
 Needless to say that custom run-time for the first 2 categories 
 is required anyway so just hack the druntime. It would be nice 
 to have hooks readily available (and documented?) to do so but 
 hardly beyond that.

It is an API issue. Hacking druntime is, unfortunately, 
inevitable, but keeping the ability to swap those two with no code 
changes simplifies the development process and makes it less tempting 
to forget about this use case when doing std lib / runtime stuff 
- it has been a second-class citizen for a rather long time.

Jun 26 2013

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
 Just don't use certain built-ins. Stub them out in run-time if 
 you like. The only problematic point I see is closures 
 allocated on heap.

Actually, I was kinda sorta able to solve this in my minimal d.

// this would be used for automatic heap closures, but there's no way to free it...
///*
extern(C)
void* _d_allocmemory(size_t bytes) {
         auto ptr = manual_malloc(bytes);
         debug(allocations) {
                 char[16] buffer;
                 write("warning: automatic memory allocation ", 
intToString(cast(size_t) ptr, buffer));
         }
         return ptr;
}


struct HeapClosure(T) if(is(T == delegate)) {
         mixin SimpleRefCounting!(T, q{
                 char[16] buffer;
                 write("\nfreeing closure ", 
intToString(cast(size_t) payload.ptr, buffer),"\n");
                 manual_free(payload.ptr);
         });
}

HeapClosure!T makeHeapClosure(T)(T t) { // if(__traits(isNested, T)) {
         return HeapClosure!T(t);
}



void closureTest2(HeapClosure!(void delegate()) test) {
         write("\nptr is ", cast(size_t) test.ptr, "\n");
         test();

         auto b = test;
}

void closureTest() {
         string a = "whoa";
         scope(exit) write("\n\nexit\n\n");
         //throw new Exception("test");
         closureTest2( makeHeapClosure({ write(a); }) );
}




It worked in my toy tests. The trick, though, would be to never 
store or use a non-scope builtin delegate. Using RTInfo, I 
believe I can statically verify you don't do this in the whole 
program, but haven't actually tried yet.


I also left built in append unimplemented, but did custom types 
with ~= that are pretty convenient. Binary ~ is a loss though, 
too easy to lose pointers with that.
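
A rough illustration of the "custom types with ~=" idea, for readers following along. This is a minimal sketch and not the code from the post: the name ManualArray is hypothetical, it assumes a hosted C library (malloc-family functions via core.stdc.stdlib) rather than a minimal runtime, and it only handles plain-data element types. Binary ~ is left out on purpose, in line with the remark above.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical manually managed array (assumption: T is plain data).
struct ManualArray(T)
{
    static assert(__traits(isPOD, T), "sketch only handles plain data");

    private T* data;
    private size_t len, cap;

    @disable this(this);    // no copies, so exactly one owner frees the buffer

    ~this() { free(data); } // free(null) is a no-op, so an empty array is fine

    // arr ~= value: grows in place; the old buffer goes back through realloc.
    void opOpAssign(string op)(T value) if (op == "~")
    {
        if (len == cap)
        {
            immutable newCap = cap ? cap * 2 : 4;
            auto p = cast(T*) realloc(data, newCap * T.sizeof);
            assert(p !is null, "out of memory");
            data = p;
            cap = newCap;
        }
        data[len++] = value;
    }

    size_t length() const { return len; }

    // Borrowed view; invalid after the next append or after destruction.
    inout(T)[] opSlice() inout { return data[0 .. len]; }
}

void main()
{
    ManualArray!int a;
    foreach (i; 0 .. 10)
        a ~= i;
    assert(a.length == 10);
    assert(a[][9] == 9);
}
```

Because ~= mutates a single owner, there is always exactly one place that knows about the buffer. A binary ~ would have to hand back a fresh buffer on every use, and every such result would need an owner of its own.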

Jun 26 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

27-Jun-2013 01:05, Adam D. Ruppe wrote:
 On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:
 Just don't use certain built-ins. Stub them out in run-time if you
 like. The only problematic point I see is closures allocated on heap.

 Actually, I was kinda sorta able to solve this in my minimal d.

 // this would be used for automatic heap closures, but there's no way to
 free it...

[snip a cool hack]

Yeah, I suspected something like this might work. Basically defining 
your own ref-counted closure type and forgoing the delegate keyword in 
your codebase (except in the file that defines the heap closure). That 
still leaves chasing code like auto dg = (...){ ... } though.

Maybe having it as a template Closure!(ret-type, arg types...)
and an instantiator function called simply closure could be more
aesthetically pleasing (this is IMHO).
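
A minimal sketch of what that spelling could look like, under stated assumptions: only the names Closure and closure come from the suggestion above, everything else is hypothetical. The reference counting and the freeing of the context pointer from the snipped hack are deliberately left out, so with a stock druntime the context here is still GC-allocated.

```d
// Wrapper type spelled Closure!(Ret, Args...), as suggested above.
struct Closure(Ret, Args...)
{
    private Ret delegate(Args) dg;

    this(Ret delegate(Args) dg) { this.dg = dg; }

    // Lets the wrapper be called like the delegate it holds.
    Ret opCall(Args args) { return dg(args); }
}

// Instantiator: deduces Ret and Args from the delegate passed in,
// so call sites never have to spell out the delegate type.
Closure!(Ret, Args) closure(Ret, Args...)(Ret delegate(Args) dg)
{
    return Closure!(Ret, Args)(dg);
}

void main()
{
    int total = 0;
    auto add = closure((int x) { total += x; });
    add(2);
    add(3);
    assert(total == 5);
}
```

The point of the helper is that the delegate type is written in only one file, which makes stray uses of the builtin type elsewhere easier to spot.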

 It worked in my toy tests. The trick would be though to never store or
 use a non-scope builtin delegate. Using RTInfo, I believe I can
 statically verify you don't do this in the whole program,  but haven't
 actually tried yet.


 I also left built in append unimplemented, but did custom types with ~=
 that are pretty convenient. Binary ~ is a loss though, too easy to lose
 pointers with that.


-- 
Dmitry Olshansky

Jun 26 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 I know Andrey mentioned he was going to work on Allocators a 
 year ago. In DConf 2013 he described the problems he needs to 
 solve with Allocators. But I wonder if I am missing the 
 discussion around that - I tried searching this forum, found a 
 few threads that was not actually a brain storm for Allocators 
 design.

 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?


 The easiest approach for Allocators design I can imagine would 
 be to let user specify which Allocator operator new should get 
 the memory from (introducing a new keyword allocator). This 
 gives a total control, but assumes user knows what he is doing.

 Example:

 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user responsibility to make sure custom Allocator can handle that
 }

 By default allocator is the druntime using GC, free(a) does 
 nothing for it.


 if some library defines its allocator (e.g. specialized 
 container), there should be ability to:
 1. override allocator
 2. get access to the allocator used

 I understand that I spent 5 mins thinking about the way 
 Allocators may look.
 My point is - if somebody is working on it, can you please 
 share your ideas?

Old but perhaps relevant?

http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

(It's an academic article about memory allocation from 2002)

Jun 27 2013

"deadalnix" <deadalnix gmail.com> writes:

On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:
 Old but perhaps relevant?

 http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

 (It's an academic article about memory allocation from 2002)

Interesting paper. Still, concurrency isn't really addressed, 
which is a problem if it is to be future-proof.

Jun 28 2013

"Brian Rogoff" <brogoff gmail.com> writes:

On Friday, 28 June 2013 at 10:57:45 UTC, deadalnix wrote:
 On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:
 Old but perhaps relevant?

 http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

 (It's an academic article about memory allocation from 2002)

 Interesting paper. Still, concurrency isn't really addressed, 
 which is a problem if it is to be future-proof.

http://en.wikipedia.org/wiki/Hoard_memory_allocator

Jun 28 2013
