www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - A few thoughts on std.allocator

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
In C, C++, and D people have allocated memory for a long time in the 
following manner:

1. Allocate as many bytes as needed (e.g. by using malloc);
2. Mess with the memory allocated.

(C++ took this one step further by defining class-specific allocators, 
feature that D copied. That turned out to be little else than a lateral 
step. C++ also offers distinct allocation for arrays vs. anything else, 
but because it lacks typed reallocation primitives, it effectively 
hamstrung that feature's capabilities. Also the STL introduced 
allocators, which turned out to not do much of anything useful except 
being an immense time sink for everyone involved with them in any 
capacity. I call these cumulatively The Big Distraction.)

I've just had a mini-epiphany that's been buzzing around my head for a 
while. Finally it got to a form expressible in words:

====
Type should inform allocation.
====

I.e. you shouldn't first allocate and then brand the memory with a type. 
Instead, you should make the type known _during_ the allocation process.

This is not new and on the face of it not even all that noteworthy: (a) 
some high-level languages did this for years, (b) it's unclear which 
exact properties of a type should be relevant to the allocation.

Within the context of D, there are some interesting aspects of a type 
that a complex modular allocator could use to great effect. Consider:

1. Individual objects (i.e. not arrays)

When allocating an individual object, there is virtually zero chance 
that object will ever need to be resized. (There are exceptions such as 
"the struct hack" but those would be handled together with arrays.) 
There are two important consequences of this:

(a) Individual object sizes are already quantized, i.e. there is a 
relatively small set of distinct sizes compared to the number of objects 
being allocated.

(b) Individual objects can be placed in an allocator that's arbitrarily 
adverse to reallocation (i.e. has no in-place expansion ability and no 
clever reallocation primitives). It seems like FreeTree 
(file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_alloc
tor_free_tree.html) 
would be an excellent fit for allocating individual objects. HeapBlock 
also comes to mind.

2. Arrays

The range of sizes allocated for arrays is quite a bit larger and more 
arbitrary than for individual objects. Arrays still grow in quanta (e.g. 
T.sizeof hops) but the set of distinct sizes is much larger and much 
less predictable.

One other interesting thing is that arrays offer concatenation and 
resizing as primitive operations, so it follows that their underlying 
allocator should be friendly to such. This leads to an allocator design 
for arrays that's radically different from the allocator design for 
individual objects.

Long story short, arrays should sit on a different heap than objects.

3. Thread-local vs. shared objects

Currently in D it's legal to allocate memory in one thread and 
deallocate it in another. (One simple way to look at it is casting to 
shared.) This has a large performance cost that only benefits very few 
actual cases.

It follows that we need to change the notion that you first allocate 
memory and then brand it as shared. The "will be shared" knowledge must 
be present during allocation, and use different methods of allocation 
for the two cases.

4. "Has pointers" vs. "has no pointers"

When collecting a specific object, it's important to know whether its 
type embeds pointers. If it does, more scanning is necessary. Otherwise, 
the object is a "leaf" leading to no others.

It seems to me we stand to gain by physically and conceptually 
separating memory that stores leaf objects from that dedicated to 
non-leaf objects. Making the scan-heavy memory more compact leads to 
better cache friendliness.

Currently D's allocator allows choosing at runtime for each allocated 
block whether is should be scanned. The new design would require making 
the choice upfront. There would be, of course, the "anything goes" heap 
that still allows that level of dynamism.

5. "Plan to deallocate" vs. "plan to let the GC mind this"

Objects that will be deallocated manually deserve different placement 
compared to those that will only be deallocated during a GC cycle.

6. Immutable vs. mutable

I'm not sure how this can be practically exploited. There must be 
something in the notion that, save for a brief write period after 
allocation, the memory will stay unmodified. Still thinking.

7. Strings

On the face of it strings are immutable arrays, so they've been already 
discussed. However, they have some interesting characteristics:

(a) small alignment, e.g. 1 byte

(b) often extremely short, e.g. 1-8 bytes

(c) concatenation intensive (including append)

(d) perhaps other unique properties and patterns that elude me

So... types should inform allocation.


Please chime in!

Andrei
May 10 2015
next sibling parent reply "anonymous" <anonymous example.com> writes:
On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 (file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_allocator_free_tree.html)
bad link
May 10 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/15 3:15 AM, anonymous wrote:
 On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 (file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_allocator_free_tree.html)
bad link
Pardon: erdani.com/d/phobos-prerelease/std_experimental_allocator_free_tree.html -- Andrei
May 10 2015
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects
 
 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is casting to 
 shared.) This has a large performance cost that only benefits very few 
 actual cases.
 
 It follows that we need to change the notion that you first allocate 
 memory and then brand it as shared. The "will be shared" knowledge must 
 be present during allocation, and use different methods of allocation 
 for the two cases.
Shared is implicit in the case of immutable. Think carefully: if you implement this and it has any efficiency benefit for non-shared allocations, const-allocated objects and arrays will become more performant than immutable-allocated ones. People will thus have an incentive to stay away from immutable. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
May 10 2015
next sibling parent "Panke" <tobias pankrath.net> writes:
On Sunday, 10 May 2015 at 10:51:54 UTC, Michel Fortin wrote:
 On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects
 
 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that 
 only benefits very few actual cases.
 
 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.
Shared is implicit in the case of immutable. Think carefully: if you implement this and it has any efficiency benefit for non-shared allocations, const-allocated objects and arrays will become more performant than immutable-allocated ones. People will thus have an incentive to stay away from immutable.
If immutable does not pull its weight in other ways, it's just not worth it.
May 10 2015
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/15 3:51 AM, Michel Fortin wrote:
 Shared is implicit in the case of immutable. Think carefully: if you
 implement this and it has any efficiency benefit for non-shared
 allocations, const-allocated objects and arrays will become more
 performant than immutable-allocated ones. People will thus have an
 incentive to stay away from immutable.
Good point. Well I'd say if things turn out that way, we shouldn't pessimize one case just to keep it aligned with another. There might be optimization opportunities particular to immutable data. -- Andrei
May 10 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/10/15 6:51 AM, Michel Fortin wrote:
 On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and
 deallocate it in another. (One simple way to look at it is casting to
 shared.) This has a large performance cost that only benefits very few
 actual cases.

 It follows that we need to change the notion that you first allocate
 memory and then brand it as shared. The "will be shared" knowledge
 must be present during allocation, and use different methods of
 allocation for the two cases.
Shared is implicit in the case of immutable. Think carefully: if you implement this and it has any efficiency benefit for non-shared allocations, const-allocated objects and arrays will become more performant than immutable-allocated ones. People will thus have an incentive to stay away from immutable.
The whole concept of immutable being implicitly shareable is kind of broken. There are many reasons to have immutable unshared data, and it poisons const to the point where you really should consider any const variable to be also shared. -Steve
May 12 2015
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Tuesday, 12 May 2015 at 15:51:38 UTC, Steven Schveighoffer 
wrote:
 The whole concept of immutable being implicitly shareable is 
 kind of broken. There are many reasons to have immutable 
 unshared data, and it poisons const to the point where you 
 really should consider any const variable to be also shared.

 -Steve
What are the extra implications of having to think of all const variables as shared? I guess it would mean the currently unimplemented memory barriers should happen for const variables as well. That does seem like a problem.
May 12 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/12/15 12:53 PM, Brad Anderson wrote:
 On Tuesday, 12 May 2015 at 15:51:38 UTC, Steven Schveighoffer wrote:
 The whole concept of immutable being implicitly shareable is kind of
 broken. There are many reasons to have immutable unshared data, and it
 poisons const to the point where you really should consider any const
 variable to be also shared.
What are the extra implications of having to think of all const variables as shared? I guess it would mean the currently unimplemented memory barriers should happen for const variables as well. That does seem like a problem.
The one that always comes to my mind is array appending: immutable int[] x = new int[5]; const int[] y = x; x ~= 1; // should this lock; y ~= 1; // should this lock? y = new int[5]; y ~= 1; // should this too? If so, isn't it a waste of cycles? Of course, array appending is an odd duck here, as generally you are not generally able to add data to an immutable piece of data. But there are other cases. Consider a struct like this: struct S { int a; immutable int b; } I can create an S on the heap (or whatever allocator), and s.b could be shared, but s.a could not be. How does that treat the block the entire S is allocated in? As much as it sucks, I'd prefer to see immutable not be implicitly shared, and require a cast. We could specialize the cast so it's not a true "ignore all the rules" cast. Sharing data is too fraught with danger. A concept of uniqueness would help here too. I think shared is broken in general, the only thing that's great about it is *not* shared, which is defined by the absence of shared :) That is something that's easy to wrap your head around. -Steve
May 12 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
As per my udnerstanding `shared` should _never_ result in automatic locking or barriers or whatever. It is simply a tag qualfier, with no extra magical semantics.
May 12 2015
next sibling parent reply "Brad Anderson" <eco gnuk.net> writes:
On Wednesday, 13 May 2015 at 05:35:18 UTC, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
 wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
As per my udnerstanding `shared` should _never_ result in automatic locking or barriers or whatever. It is simply a tag qualfier, with no extra magical semantics.
http://dlang.org/faq.html#shared_memory_barriers Perhaps this is out of date though.
May 12 2015
parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 13 May 2015 at 06:23:40 UTC, Brad Anderson wrote:
 On Wednesday, 13 May 2015 at 05:35:18 UTC, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
 wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
As per my udnerstanding `shared` should _never_ result in automatic locking or barriers or whatever. It is simply a tag qualfier, with no extra magical semantics.
http://dlang.org/faq.html#shared_memory_barriers Perhaps this is out of date though.
I have a feeling that no one know what shared truly means anymore. I remember talking about it with Andrei during last DConf and his explanation was that it is all about creating user types that encapsulate concurrency internally (+ atomics). I may remember wrong of course but was no mention of compiler actually doing anything special for it.
May 13 2015
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/13/15 1:35 AM, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
As per my udnerstanding `shared` should _never_ result in automatic locking or barriers or whatever. It is simply a tag qualfier, with no extra magical semantics.
It's not automatic, the runtime does it. It uses the type information to determine whether it needs to lock or not. It also avoids the thread-local Most Recently Used cache to look up block info, since that may not be accurate. Essentially, the *lack* of shared can be used to optimize in the runtime. Right now, it assumes that const does not mean shared, but that can cause issues if you had a shared immutable array that you were appending to via a const reference. The logic for not locking there is that the vast VAST majority of cases do not have this problem. -Steve
May 13 2015
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 I think shared is broken in general, the only thing that's 
 great about it is *not* shared, which is defined by the absence 
 of shared :) That is something that's easy to wrap your head 
 around.
Yes, «shared» is either broken or lacks definition. It should be deprecated in favour of «local». What the optimizer needs to know is: 1. Can the object be removed from the set of variables affected by a full memory barrier? 2. Is there no aliasing to the object outside of the context: E.g.: x++; y.f(); x--; Is it safe to optimize this to: y.f(); ?
May 12 2015
prev sibling next sibling parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;
Do you mean: immutable(int)[] x = new int[5]; const(int)[] y = x; ? Because you can't append to or reassign an immutable or const slice.
 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
Assuming x and y are head-mutable; Locking is needed because even though the elements themselves do not need locking for inspection[1], the metadata (capacity and whatever else is in there) is always mutable. In other words, immutable(T)[] is implicitly convertible to immutable(T[]) which means it can be freely shared between threads, but when the slice refers to a druntime dynamic array, it *does* actually have mutable indirection. Blergh, yet another reason why conflating slices with dynamic arrays in the type system was probably a bad idea. Or rather, a good reason why T[new] was a good idea for D2. Then again, the cost is only paid if the dynamic array features are actually used :) [1] Immutable guarantees that the elements cannot be mutated from *any* thread, while const guarantees that the data is either immutable or thread-local, unless combined with shared. Remember that there is such a thing as shared(const(T)), but no such thing as shared(immutable(T)).
 y = new int[5];

 y ~= 1; // should this too? If so, isn't it a waste of cycles?
Yeah, it needs to lock and it is a waste... it could avoid it with some kind of virtual dispatch mechanism.
 Of course, array appending is an odd duck here, as generally 
 you are not generally able to add data to an immutable piece of 
 data.
Well, when the case is immutable(T)[], it's really not an odd case: in general you *can* grow a head-mutable container. In the case of an in-place append the existing elements are not touched, and in the case of reallocation the old elements are simply moved, which doesn't violate immutability either. The issue is the invisible, mutable metadata that is inherent in D's arrays, not an issue for head-mutable containers of immutable/const elements in general.
 But there are other cases. Consider a struct like this:

 struct S
 {
    int a;
    immutable int b;
 }

 I can create an S on the heap (or whatever allocator), and s.b 
 could be shared, but s.a could not be. How does that treat the 
 block the entire S is allocated in?
As they are in the same structure, `a` and `b` will always have the same sharedness. In the case of immutable(S), both `a` and `b` are immutable and can be shared, while in the case of mutable S they are both unshared. I don't see it as an issue as long as S vs immutable(S) is provided at time of allocation.
May 13 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/13/15 3:02 AM, Jakob Ovrum wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;
Do you mean: immutable(int)[] x = new int[5]; const(int)[] y = x; ? Because you can't append to or reassign an immutable or const slice.
Yes, that is what I meant. Sorry.
 x ~= 1; // should this lock;

 y ~= 1; // should this lock?
Assuming x and y are head-mutable; Locking is needed because even though the elements themselves do not need locking for inspection[1], the metadata (capacity and whatever else is in there) is always mutable.
OK, then consider that this: void main() { string x; x ~= "hello"; x ~= " world"; } would require locking. That's unacceptable. Nobody would append with strings if this all required locking for no reason. The runtime currently does NOT lock for this case, it considers immutable and const to be thread-local.
 In other words, immutable(T)[] is
 implicitly convertible to immutable(T[]) which means it can be freely
 shared between threads, but when the slice refers to a druntime dynamic
 array, it *does* actually have mutable indirection.
Yeah, that's why this case is an odd duck. It's not generally allowed to access mutable metadata via an immutable pointer.
 Blergh, yet another
 reason why conflating slices with dynamic arrays in the type system was
 probably a bad idea. Or rather, a good reason why T[new] was a good idea
 for D2. Then again, the cost is only paid if the dynamic array features
 are actually used :)
You will never convince me of this :) T[new] was a terrible idea.
 y = new int[5];

 y ~= 1; // should this too? If so, isn't it a waste of cycles?
Yeah, it needs to lock and it is a waste... it could avoid it with some kind of virtual dispatch mechanism.
No, I think the answer is simpler. Introduce shared(immutable), and then we can distinguish between immutable data that is shared and data that is not shared. It also makes implementing local heaps easier. Shared really is orthogonal to mutability.
 Well, when the case is immutable(T)[], it's really not an odd case: in
 general you *can* grow a head-mutable container. In the case of an
 in-place append the existing elements are not touched, and in the case
 of reallocation the old elements are simply moved, which doesn't violate
 immutability either.
Right, I'm not saying it's violating immutability -- the data AFTER the array that's in the block is not referenced anywhere, so it's technically unique. Here's something even weirder: If you append to y (const(int)[]) as a reference to x (immutable(int)[]), then part of the array is immutable and shareable, and the part that was appended is const and not shareable.
 But there are other cases. Consider a struct like this:

 struct S
 {
    int a;
    immutable int b;
 }

 I can create an S on the heap (or whatever allocator), and s.b could
 be shared, but s.a could not be. How does that treat the block the
 entire S is allocated in?
As they are in the same structure, `a` and `b` will always have the same sharedness. In the case of immutable(S), both `a` and `b` are immutable and can be shared, while in the case of mutable S they are both unshared. I don't see it as an issue as long as S vs immutable(S) is provided at time of allocation.
This should work: void main() { auto s = new S; passToOtherThread(&s.b); } Which makes s.a unshared, and s.b shared. But the memory block needs to be shareable. -Steve
May 13 2015
parent reply "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Wednesday, 13 May 2015 at 17:49:38 UTC, Steven Schveighoffer 
wrote:
 OK, then consider that this:

 void main()
 {
    string x;
    x ~= "hello";
    x ~= " world";
 }

 would require locking. That's unacceptable. Nobody would append 
 with strings if this all required locking for no reason. The 
 runtime currently does NOT lock for this case, it considers 
 immutable and const to be thread-local.
Well, it's necessary because the design of druntime arrays is incompatible with D2's type system. Without locking, multi-threaded applications that use dynamic array operations could easily contain some particularly hard to track concurrency bugs. Simply not doing the locking and hoping that everything is fine doesn't sound like a good plan (which I think we agree on, since shared(immutable(T)) would solve it).
 No, I think the answer is simpler. Introduce shared(immutable), 
 and then we can distinguish between immutable data that is 
 shared and data that is not shared. It also makes implementing 
 local heaps easier. Shared really is orthogonal to mutability.
Basically, shared(immutable(T)) would only be useful to allocators, including arrays because they may need to allocate when growing. I don't think it would be useful beyond that; the sharedness of immutable data is probably not interesting to any other kind of code. It would make immutable considerably harder to use than it is today. shared(immutable(T)) would be implicitly convertible to shared(const(T)), but not const(T), which precludes the vast majority of mutation-agnostic D code out there today (I have never seen shared(const(T)) used in the wild). We would no longer be able to do even the simplest things, like passing a path string to another thread and use std.file.read on it.
 Here's something even weirder: If you append to y 
 (const(int)[]) as a reference to x (immutable(int)[]), then 
 part of the array is immutable and shareable, and the part that 
 was appended is const and not shareable.
This should be the case for user-defined containers as well as long as the element type doesn't have mutable indirection. Appending on slices seems to handle this correctly as well, rejecting attempts to append a const(T)[] to an immutable(T)[] when T has mutable indirection.
 This should work:

 void main()
 {
    auto s = new S;
    passToOtherThread(&s.b);
 }

 Which makes s.a unshared, and s.b shared. But the memory block 
 needs to be shareable.
That is an interesting example. Plain shared has the same problem: struct S { int a; shared int b; } void main() { import std.concurrency; auto tid = spawn(() {}); auto s = new S; tid.send(&s.b); } Types that contain shared anywhere within the type would have to be allocated from a global allocator, and with the current state of immutable, the same would have to be done for types that contain immutable anywhere within it. Note that it's only the case for head-shared/head-immutable; types like immutable(T)[] or shared(T)* don't count.
May 13 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/13/15 8:00 PM, Jakob Ovrum wrote:
 On Wednesday, 13 May 2015 at 17:49:38 UTC, Steven Schveighoffer wrote:
 OK, then consider that this:

 void main()
 {
    string x;
    x ~= "hello";
    x ~= " world";
 }

 would require locking. That's unacceptable. Nobody would append with
 strings if this all required locking for no reason. The runtime
 currently does NOT lock for this case, it considers immutable and
 const to be thread-local.
Well, it's necessary because the design of druntime arrays is incompatible with D2's type system. Without locking, multi-threaded applications that use dynamic array operations could easily contain some particularly hard to track concurrency bugs.
I don't think we should go there. I would say it's very unlikely to have this bug occur, but you are right it could occur. I just don't think destroying append performance for all const and immutable array types is worth it to fix the bug opportunity.
 No, I think the answer is simpler. Introduce shared(immutable), and
 then we can distinguish between immutable data that is shared and data
 that is not shared. It also makes implementing local heaps easier.
 Shared really is orthogonal to mutability.
Basically, shared(immutable(T)) would only be useful to allocators, including arrays because they may need to allocate when growing. I don't think it would be useful beyond that; the sharedness of immutable data is probably not interesting to any other kind of code.
Yes, it really only matters for cases where the immutable pointer can be used to obtain mutable data. But these are important cases. The fact that most code doesn't share *at all*, makes it less of an issue. Perhaps we need some way to indicate that when an array is shared, it can no longer be appended in-place. It would be as simple as flipping the APPENDABLE bit. The problem is, you're not always passing the immutable array directly.
 It would make immutable considerably harder to use than it is today.
 shared(immutable(T)) would be implicitly convertible to
 shared(const(T)), but not const(T), which precludes the vast majority of
 mutation-agnostic D code out there today (I have never seen
 shared(const(T)) used in the wild). We would no longer be able to do
 even the simplest things, like passing a path string to another thread
 and use std.file.read on it.
Yeah, there is no attribute that takes shared/unshared. I don't know a good solution to that, and I understand that shared(immutable) screws up most usages for immutable. It's not a good answer. The problem is that when you share an immutable pointer, it changes the rules for that pointer, but no type information has been altered. It's impossible to track. I think the idea of making immutable not shared by default is probably not a good answer, but I don't know of a good one. I have a feeling whatever answer we choose is going to be really painful :( -Steve
May 14 2015
prev sibling parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2015-05-12 17:21:03 +0000, Steven Schveighoffer <schveiguy yahoo.com> said:

 Of course, array appending is an odd duck here, as generally you are 
 not generally able to add data to an immutable piece of data.
A similar odd duck would be reference counting (again, mutable metadata attached to immutable data). -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
May 13 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...
Unless this has been fixed in the interim, I believe DMD lowers new S(args) to [S(args)].ptr for struct S. It should be ensured that the global allocator will not be misinformed about the kind of allocation that takes place, by removing this lowering.
 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and
 deallocate it in another. (One simple way to look at it is casting to
 shared.) This has a large performance cost that only benefits very few
 actual cases.

 It follows that we need to change the notion that you first allocate
 memory and then brand it as shared. The "will be shared" knowledge must
 be present during allocation, and use different methods of allocation
 for the two cases.

 ...
 6. Immutable vs. mutable

 I'm not sure how this can be practically exploited. There must be
 something in the notion that, save for a brief write period after
 allocation, the memory will stay unmodified. Still thinking.
 ...
Keep in mind that currently, entire regions of memory can change from mutable to immutable implicitly when returned from pure functions. Furthermore, as Michel points out, the ways 'immutable' can be leveraged is constrained by the fact that it implies 'shared'.
May 10 2015
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can change from
 mutable to immutable implicitly when returned from pure functions.
 Furthermore, as Michel points out, the ways 'immutable' can be leveraged
 is constrained by the fact that it implies 'shared'.
After sleeping on this for a bit, it seems to me pure function need to identify their allocations in some way to the caller. The simplest way is to have them conservatively use the most conservative heap. We get to refine these later. For now, here's a snapshot of flags that the allocation primitives should know about: enum AllocOptions { /// Allocate an array, not an individual object array, /// Allocate a string of characters string, /// Plan to let the GC take care of this object noFree, /// This object will be shared between threads forSharing, /// This object will be moved between threads forThreadTransfer, /// This object will be mutable after initialization mutableTarget, /// The caller is a pure function, so result may be immutable fromPureFunction, /// Object allocated has pointers hasPointers, /// Typical (default) options typical = array | noFree | forSharing | mutableTarget | hasPointers } Anything to add to this? Andrei
May 11 2015
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,
It should simply say that allocation pointer is unique, the fact that it comes from pure functions is not important on its own. It should be legal to allocate such chunks from other sources too.
May 11 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/15 9:14 AM, Dicebot wrote:
 On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,
It should simply say that allocation pointer is unique, the fact that it comes from pure functions is not important on its own. It should be legal to allocate such chunks from other sources too.
Much clearer, thanks. -- Andrei
May 11 2015
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2015-05-11 17:45, Andrei Alexandrescu wrote:

 For now, here's a snapshot of flags that the allocation primitives
 should know about:

 enum AllocOptions
 {
    /// Allocate an array, not an individual object
    array,
    /// Allocate a string of characters
    string,
    /// Plan to let the GC take care of this object
    noFree,
I would recommend using a positive name, in this case, "garbageCollected".
    /// This object will be shared between threads
    forSharing,
    /// This object will be moved between threads
    forThreadTransfer,
    /// This object will be mutable after initialization
    mutableTarget,
    /// The caller is a pure function, so result may be immutable
    fromPureFunction,
    /// Object allocated has pointers
    hasPointers,
    /// Typical (default) options
    typical = array | noFree | forSharing | mutableTarget | hasPointers
Shouldn't this be the first enum member, then this will be the default value.
 }
-- /Jacob Carlborg
May 11 2015
prev sibling next sibling parent "Morbid.Obesity" <Morbid.Obesity mail.com> writes:
On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
 On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can 
 change from
 mutable to immutable implicitly when returned from pure 
 functions.
 Furthermore, as Michel points out, the ways 'immutable' can be 
 leveraged
 is constrained by the fact that it implies 'shared'.
After sleeping on this for a bit, it seems to me pure function need to identify their allocations in some way to the caller. The simplest way is to have them conservatively use the most conservative heap. We get to refine these later. For now, here's a snapshot of flags that the allocation primitives should know about: enum AllocOptions { /// Allocate an array, not an individual object array, /// Allocate a string of characters string, /// Plan to let the GC take care of this object noFree, /// This object will be shared between threads forSharing, /// This object will be moved between threads forThreadTransfer, /// This object will be mutable after initialization mutableTarget, /// The caller is a pure function, so result may be immutable fromPureFunction, /// Object allocated has pointers hasPointers, /// Typical (default) options typical = array | noFree | forSharing | mutableTarget | hasPointers } Anything to add to this?
fixed or pinned? This will allocate the memory ala malloc and ignore it from hence forward. Similar to noFree but more expansive. It seems to me that you really need to break up the allocation options into both allocation and deallocation along with the AUTO option. The auto option allows the compiler to attempt to determine the best option(scan for ptr's, etc). It would be the default option. Allocation options are basically simpler and could include allocation specific things such as: Fast, Fixed, GC, Manual, Stack, Heap, etc.... The deallocation options would essentially be your options above(since they really are talking about what to do when an object is deallocated). Similarly I believe the GC and other memory managers should be parametrized with some protocol between different managers for cross-management. E.g., Multiple heaps for different object memory orientation. GC uses one heap, ARC uses another. If an object cross the boundary(ptr to an object in another heap) then the memory manager trying to free the object has to inform the other memory manager. This partitioning of memory into more specialized zones will provide more optimized options for the programmer. At some point
May 11 2015
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
 On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can 
 change from
 mutable to immutable implicitly when returned from pure 
 functions.
 Furthermore, as Michel points out, the ways 'immutable' can be 
 leveraged
 is constrained by the fact that it implies 'shared'.
After sleeping on this for a bit, it seems to me pure function need to identify their allocations in some way to the caller. The simplest way is to have them conservatively use the most conservative heap. We get to refine these later. For now, here's a snapshot of flags that the allocation primitives should know about: enum AllocOptions { /// Allocate an array, not an individual object array, /// Allocate a string of characters string, /// Plan to let the GC take care of this object noFree, /// This object will be shared between threads forSharing, /// This object will be moved between threads forThreadTransfer, /// This object will be mutable after initialization mutableTarget, /// The caller is a pure function, so result may be immutable fromPureFunction, /// Object allocated has pointers hasPointers, /// Typical (default) options typical = array | noFree | forSharing | mutableTarget | hasPointers } Anything to add to this?
Would it be better to name these after their interpretation, rather than expected use cases? For example, the allocator doesn't care if something is an array, it cares about resizing, and large block allocations. Perhaps s/array/expectRealloc/ (as an example). Similar for other ones. The benefit here is that if, for example, I have some non-array use case for frequently reallocing then I can express that directly rather than having to pretend it's like an array.
May 12 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/15 6:12 AM, Peter Alexander wrote:
 Would it be better to name these after their interpretation, rather than
 expected use cases? For example, the allocator doesn't care if something
 is an array, it cares about resizing, and large block allocations.
 Perhaps s/array/expectRealloc/ (as an example). Similar for other ones.
Yah, "resizable" is nice and self-explanatory. Thanks! -- Andrei
May 12 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 5/10/15 8:58 AM, Timon Gehr wrote:
 On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...
Unless this has been fixed in the interim, I believe DMD lowers new S(args) to [S(args)].ptr for struct S. It should be ensured that the global allocator will not be misinformed about the kind of allocation that takes place, by removing this lowering.
That has been fixed. The compiler now calls a different runtime hook. -Steve
May 12 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/12/2015 05:52 PM, Steven Schveighoffer wrote:
 On 5/10/15 8:58 AM, Timon Gehr wrote:
 On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...
Unless this has been fixed in the interim, I believe DMD lowers new S(args) to [S(args)].ptr for struct S. It should be ensured that the global allocator will not be misinformed about the kind of allocation that takes place, by removing this lowering.
That has been fixed. The compiler now calls a different runtime hook. -Steve
Great.
May 12 2015
prev sibling next sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that only 
 benefits very few actual cases.

 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.
In addition to an immutable/shared heap, it may also be necessary to transfer data from one thread to another. The shared heap is for data with shared ownership, while other data with (unique) ownership can be sent to other threads.
May 11 2015
parent "Dicebot" <public dicebot.lv> writes:
On Monday, 11 May 2015 at 09:55:36 UTC, Marc Schütz wrote:
 In addition to an immutable/shared heap, it may also be 
 necessary to transfer data from one thread to another. The 
 shared heap is for data with shared ownership, while other data 
 with (unique) ownership can be sent to other threads.
This is starting to sound close to deadalnix proposal about different heaps for "cooked" and raw (unique) data.
May 11 2015
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that only 
 benefits very few actual cases.

 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.
As long as casting to or from shared really doesn't do anything other than change how the type system treats that type, we can't really assume much of anything with regards to where shared _or_ thread-local variables came from. Maybe it's still of benefit for the allocator to know that it's going to be dealing with shared as opposed to thread-local variables when doing the allocation, but I don't see how we can get rid of the possibility of an object being allocated on one thread and deallocated on another without making major changes to shared - and std.concurrency isn't going to work very well if we can't cast to and from shared or immutable to pass objects across threads - not unless we add some way to indicate in the type system that a variable is being passed across threads. It may very well be that we _need_ to change how shared is treated by the language and runtime and how its plumbing works, but as it stands, I don't see how we could possibly assume that whether an object is created as shared or thread-local - or even what thread it's created on - has much of anything to do with how it's _actually_ going to be used. In _most_ cases, an object is going to be used on the same thread that it's created on, and if it were possible to create an object as shared directly (which it usually isn't), then in most cases an object that was created as thread-local would be destroyed on the thread that it's created on, and an object created as shared would be used as shared and destroy on who-knows-which thread, but because of both the possibility and need to cast to and from shared to do anything useful with shared as well as passing objects across threads, we simply cannot assume anything about how an object will be used based on whether it's constructed as shared or not or which thread it's constructed on. For any of that to change, shared needs to change. - Jonathan M Davis
May 11 2015
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 11 May 2015 at 10:54:10 UTC, Jonathan M Davis wrote:
 doing the allocation, but I don't see how we can get rid of the 
 possibility of an object being allocated on one thread and 
 deallocated on another without making major changes to shared - 
 and std.concurrency isn't going to work very well if we can't 
 cast to and from shared or immutable to pass objects across 
 threads - not unless we add some way to indicate in the type 
 system that a variable is being passed across threads.
Behavioural typing is an active research field so if you are to statically prove that a specific program configuration at runtime causes a type to be unshared/immutable you are outside the language design comfort zone. Which I am very much in favour of ;-). More on behavioural typing: http://forum.dlang.org/thread/ovoarcbexpvrrceysnrs forum.dlang.org Of course, the sensible thing to do is to have "behavioural typing light". E.g. to allow passage through gateways (with rather limiting constraints which allows static verification).
May 11 2015