digitalmars.D - A few thoughts on std.allocator

Andrei Alexandrescu (87/87) May 10 2015 In C, C++, and D people have allocated memory for a long time in the

anonymous (2/3) May 10 2015 bad link

Andrei Alexandrescu (4/8) May 10 2015 Pardon:

Michel Fortin (11/22) May 10 2015 Shared is implicit in the case of immutable. Think carefully: if you

Panke (3/21) May 10 2015 If immutable does not pull its weight in other ways, it's just
Andrei Alexandrescu (4/9) May 10 2015 Good point. Well I'd say if things turn out that way, we shouldn't
Steven Schveighoffer (6/24) May 12 2015 The whole concept of immutable being implicitly shareable is kind of

Brad Anderson (6/11) May 12 2015 What are the extra implications of having to think of all const

Steven Schveighoffer (27/37) May 12 2015 The one that always comes to my mind is array appending:

Dicebot (5/10) May 12 2015 As per my udnerstanding `shared` should _never_ result in

Brad Anderson (3/17) May 12 2015 http://dlang.org/faq.html#shared_memory_barriers

Dicebot (7/25) May 13 2015 I have a feeling that no one know what shared truly means

Steven Schveighoffer (11/24) May 13 2015 It's not automatic, the runtime does it.

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (15/19) May 12 2015 Yes, «shared» is either broken or lacks definition. It should be
Jakob Ovrum (40/59) May 13 2015 Do you mean:

Steven Schveighoffer (35/89) May 13 2015 OK, then consider that this:

Jakob Ovrum (44/71) May 13 2015 Well, it's necessary because the design of druntime arrays is

Steven Schveighoffer (22/55) May 14 2015 I don't think we should go there. I would say it's very unlikely to have...

Michel Fortin (7/9) May 13 2015 A similar odd duck would be reference counting (again, mutable metadata

Timon Gehr (9/29) May 10 2015 Unless this has been fixed in the interim, I believe DMD lowers new

Andrei Alexandrescu (30/34) May 11 2015 After sleeping on this for a bit, it seems to me pure function need to

Dicebot (4/6) May 11 2015 It should simply say that allocation pointer is unique, the fact

Andrei Alexandrescu (2/8) May 11 2015 Much clearer, thanks. -- Andrei

Jacob Carlborg (6/29) May 11 2015 Shouldn't this be the first enum member, then this will be the default
Morbid.Obesity (25/62) May 11 2015 fixed or pinned?
Peter Alexander (9/46) May 12 2015 Would it be better to name these after their interpretation,

Andrei Alexandrescu (2/6) May 12 2015 Yah, "resizable" is nice and self-explanatory. Thanks! -- Andrei

Steven Schveighoffer (3/13) May 12 2015 That has been fixed. The compiler now calls a different runtime hook.

Timon Gehr (2/16) May 12 2015 Great.

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> (5/14) May 11 2015 In addition to an immutable/shared heap, it may also be necessary

Dicebot (3/7) May 11 2015 This is starting to sound close to deadalnix proposal about

Jonathan M Davis (33/42) May 11 2015 As long as casting to or from shared really doesn't do anything

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= (11/18) May 11 2015 Behavioural typing is an active research field so if you are to

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

In C, C++, and D people have allocated memory for a long time in the 
following manner:

1. Allocate as many bytes as needed (e.g. by using malloc);
2. Mess with the memory allocated.

(C++ took this one step further by defining class-specific allocators, 
feature that D copied. That turned out to be little else than a lateral 
step. C++ also offers distinct allocation for arrays vs. anything else, 
but because it lacks typed reallocation primitives, it effectively 
hamstrung that feature's capabilities. Also the STL introduced 
allocators, which turned out to not do much of anything useful except 
being an immense time sink for everyone involved with them in any 
capacity. I call these cumulatively The Big Distraction.)

I've just had a mini-epiphany that's been buzzing around my head for a 
while. Finally it got to a form expressible in words:

====
Type should inform allocation.
====

I.e. you shouldn't first allocate and then brand the memory with a type. 
Instead, you should make the type known _during_ the allocation process.

This is not new and on the face of it not even all that noteworthy: (a) 
some high-level languages did this for years, (b) it's unclear which 
exact properties of a type should be relevant to the allocation.

Within the context of D, there are some interesting aspects of a type 
that a complex modular allocator could use to great effect. Consider:

1. Individual objects (i.e. not arrays)

When allocating an individual object, there is virtually zero chance 
that object will ever need to be resized. (There are exceptions such as 
"the struct hack" but those would be handled together with arrays.) 
There are two important consequences of this:

(a) Individual object sizes are already quantized, i.e. there is a 
relatively small set of distinct sizes compared to the number of objects 
being allocated.

(b) Individual objects can be placed in an allocator that's arbitrarily 
adverse to reallocation (i.e. has no in-place expansion ability and no 
clever reallocation primitives). It seems like FreeTree 
(file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_alloc
tor_free_tree.html) 
would be an excellent fit for allocating individual objects. HeapBlock 
also comes to mind.

2. Arrays

The range of sizes allocated for arrays is quite a bit larger and more 
arbitrary than for individual objects. Arrays still grow in quanta (e.g. 
T.sizeof hops) but the set of distinct sizes is much larger and much 
less predictable.

One other interesting thing is that arrays offer concatenation and 
resizing as primitive operations, so it follows that their underlying 
allocator should be friendly to such. This leads to an allocator design 
for arrays that's radically different from the allocator design for 
individual objects.

Long story short, arrays should sit on a different heap than objects.

3. Thread-local vs. shared objects

Currently in D it's legal to allocate memory in one thread and 
deallocate it in another. (One simple way to look at it is casting to 
shared.) This has a large performance cost that only benefits very few 
actual cases.

It follows that we need to change the notion that you first allocate 
memory and then brand it as shared. The "will be shared" knowledge must 
be present during allocation, and use different methods of allocation 
for the two cases.

4. "Has pointers" vs. "has no pointers"

When collecting a specific object, it's important to know whether its 
type embeds pointers. If it does, more scanning is necessary. Otherwise, 
the object is a "leaf" leading to no others.

It seems to me we stand to gain by physically and conceptually 
separating memory that stores leaf objects from that dedicated to 
non-leaf objects. Making the scan-heavy memory more compact leads to 
better cache friendliness.

Currently D's allocator allows choosing at runtime for each allocated 
block whether is should be scanned. The new design would require making 
the choice upfront. There would be, of course, the "anything goes" heap 
that still allows that level of dynamism.

5. "Plan to deallocate" vs. "plan to let the GC mind this"

Objects that will be deallocated manually deserve different placement 
compared to those that will only be deallocated during a GC cycle.

6. Immutable vs. mutable

I'm not sure how this can be practically exploited. There must be 
something in the notion that, save for a brief write period after 
allocation, the memory will stay unmodified. Still thinking.

7. Strings

On the face of it strings are immutable arrays, so they've been already 
discussed. However, they have some interesting characteristics:

(a) small alignment, e.g. 1 byte

(b) often extremely short, e.g. 1-8 bytes

(c) concatenation intensive (including append)

(d) perhaps other unique properties and patterns that elude me

So... types should inform allocation.


Please chime in!

Andrei

May 10 2015

"anonymous" <anonymous example.com> writes:

On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 (file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_allocator_free_tree.html)

bad link

May 10 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/15 3:15 AM, anonymous wrote:
 On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 (file:///Users/aalexandre/code/d/dlang.org/web/phobos-prerelease/std_experimental_allocator_free_tree.html)

 bad link

Pardon: 
erdani.com/d/phobos-prerelease/std_experimental_allocator_free_tree.html 
-- Andrei

May 10 2015

Michel Fortin <michel.fortin michelf.ca> writes:

On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects
 
 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is casting to 
 shared.) This has a large performance cost that only benefits very few 
 actual cases.
 
 It follows that we need to change the notion that you first allocate 
 memory and then brand it as shared. The "will be shared" knowledge must 
 be present during allocation, and use different methods of allocation 
 for the two cases.

Shared is implicit in the case of immutable. Think carefully: if you 
implement this and it has any efficiency benefit for non-shared 
allocations, const-allocated objects and arrays will become more 
performant than immutable-allocated ones. People will thus have an 
incentive to stay away from immutable.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca

May 10 2015

"Panke" <tobias pankrath.net> writes:

On Sunday, 10 May 2015 at 10:51:54 UTC, Michel Fortin wrote:
 On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects
 
 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that 
 only benefits very few actual cases.
 
 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.

 Shared is implicit in the case of immutable. Think carefully: 
 if you implement this and it has any efficiency benefit for 
 non-shared allocations, const-allocated objects and arrays will 
 become more performant than immutable-allocated ones. People 
 will thus have an incentive to stay away from immutable.

If immutable does not pull its weight in other ways, it's just 
not worth it.

May 10 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/15 3:51 AM, Michel Fortin wrote:
 Shared is implicit in the case of immutable. Think carefully: if you
 implement this and it has any efficiency benefit for non-shared
 allocations, const-allocated objects and arrays will become more
 performant than immutable-allocated ones. People will thus have an
 incentive to stay away from immutable.

Good point. Well I'd say if things turn out that way, we shouldn't 
pessimize one case just to keep it aligned with another. There might be 
optimization opportunities particular to immutable data. -- Andrei

May 10 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/10/15 6:51 AM, Michel Fortin wrote:
 On 2015-05-10 09:50:00 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and
 deallocate it in another. (One simple way to look at it is casting to
 shared.) This has a large performance cost that only benefits very few
 actual cases.

 It follows that we need to change the notion that you first allocate
 memory and then brand it as shared. The "will be shared" knowledge
 must be present during allocation, and use different methods of
 allocation for the two cases.

 Shared is implicit in the case of immutable. Think carefully: if you
 implement this and it has any efficiency benefit for non-shared
 allocations, const-allocated objects and arrays will become more
 performant than immutable-allocated ones. People will thus have an
 incentive to stay away from immutable.

The whole concept of immutable being implicitly shareable is kind of 
broken. There are many reasons to have immutable unshared data, and it 
poisons const to the point where you really should consider any const 
variable to be also shared.

-Steve

May 12 2015

"Brad Anderson" <eco gnuk.net> writes:

On Tuesday, 12 May 2015 at 15:51:38 UTC, Steven Schveighoffer 
wrote:
 The whole concept of immutable being implicitly shareable is 
 kind of broken. There are many reasons to have immutable 
 unshared data, and it poisons const to the point where you 
 really should consider any const variable to be also shared.

 -Steve

What are the extra implications of having to think of all const 
variables as shared? I guess it would mean the currently 
unimplemented memory barriers should happen for const variables 
as well. That does seem like a problem.

May 12 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/12/15 12:53 PM, Brad Anderson wrote:
 On Tuesday, 12 May 2015 at 15:51:38 UTC, Steven Schveighoffer wrote:
 The whole concept of immutable being implicitly shareable is kind of
 broken. There are many reasons to have immutable unshared data, and it
 poisons const to the point where you really should consider any const
 variable to be also shared.

 What are the extra implications of having to think of all const
 variables as shared? I guess it would mean the currently unimplemented
 memory barriers should happen for const variables as well. That does
 seem like a problem.

The one that always comes to my mind is array appending:

immutable int[] x = new int[5];

const int[] y = x;

x ~= 1; // should this lock;

y ~= 1; // should this lock?

y = new int[5];

y ~= 1; // should this too? If so, isn't it a waste of cycles?

Of course, array appending is an odd duck here, as generally you are not 
generally able to add data to an immutable piece of data.

But there are other cases. Consider a struct like this:

struct S
{
    int a;
    immutable int b;
}

I can create an S on the heap (or whatever allocator), and s.b could be 
shared, but s.a could not be. How does that treat the block the entire S 
is allocated in?

As much as it sucks, I'd prefer to see immutable not be implicitly 
shared, and require a cast. We could specialize the cast so it's not a 
true "ignore all the rules" cast. Sharing data is too fraught with 
danger. A concept of uniqueness would help here too.

I think shared is broken in general, the only thing that's great about 
it is *not* shared, which is defined by the absence of shared :) That is 
something that's easy to wrap your head around.

-Steve

May 12 2015

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

As per my udnerstanding `shared` should _never_ result in 
automatic locking or barriers or whatever. It is simply a tag 
qualfier, with no extra magical semantics.

May 12 2015

"Brad Anderson" <eco gnuk.net> writes:

On Wednesday, 13 May 2015 at 05:35:18 UTC, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
 wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

 As per my udnerstanding `shared` should _never_ result in 
 automatic locking or barriers or whatever. It is simply a tag 
 qualfier, with no extra magical semantics.

http://dlang.org/faq.html#shared_memory_barriers

Perhaps this is out of date though.

May 12 2015

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 13 May 2015 at 06:23:40 UTC, Brad Anderson wrote:
 On Wednesday, 13 May 2015 at 05:35:18 UTC, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
 wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

 As per my udnerstanding `shared` should _never_ result in 
 automatic locking or barriers or whatever. It is simply a tag 
 qualfier, with no extra magical semantics.

 http://dlang.org/faq.html#shared_memory_barriers

 Perhaps this is out of date though.

I have a feeling that no one know what shared truly means 
anymore. I remember talking about it with Andrei during last 
DConf and his explanation was that it is all about creating user 
types that encapsulate concurrency internally (+ atomics). I may 
remember wrong of course but was no mention of compiler actually 
doing anything special for it.

May 13 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/13/15 1:35 AM, Dicebot wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

 As per my udnerstanding `shared` should _never_ result in automatic
 locking or barriers or whatever. It is simply a tag qualfier, with no
 extra magical semantics.

It's not automatic, the runtime does it.

It uses the type information to determine whether it needs to lock or 
not. It also avoids the thread-local Most Recently Used cache to look up 
block info, since that may not be accurate. Essentially, the *lack* of 
shared can be used to optimize in the runtime.

Right now, it assumes that const does not mean shared, but that can 
cause issues if you had a shared immutable array that you were appending 
to via a const reference. The logic for not locking there is that the 
vast VAST majority of cases do not have this problem.

-Steve

May 13 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 I think shared is broken in general, the only thing that's 
 great about it is *not* shared, which is defined by the absence 
 of shared :) That is something that's easy to wrap your head 
 around.

Yes, «shared» is either broken or lacks definition. It should be 
deprecated in favour of «local». What the optimizer needs to know 
is:

1. Can the object be removed from the set of variables affected 
by a full memory barrier?

2. Is there no aliasing to the object outside of the context:

E.g.:

x++;
y.f();
x--;

Is it safe to optimize this to:

y.f();

?

May 12 2015

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer 
wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

Do you mean:

immutable(int)[] x = new int[5];

const(int)[] y = x;

?

Because you can't append to or reassign an immutable or const 
slice.

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

Assuming x and y are head-mutable;

Locking is needed because even though the elements themselves do 
not need locking for inspection[1], the metadata (capacity and 
whatever else is in there) is always mutable. In other words, 
immutable(T)[] is implicitly convertible to immutable(T[]) which 
means it can be freely shared between threads, but when the slice 
refers to a druntime dynamic array, it *does* actually have 
mutable indirection. Blergh, yet another reason why conflating 
slices with dynamic arrays in the type system was probably a bad 
idea. Or rather, a good reason why T[new] was a good idea for D2. 
Then again, the cost is only paid if the dynamic array features 
are actually used :)

[1] Immutable guarantees that the elements cannot be mutated from 
*any* thread, while const guarantees that the data is either 
immutable or thread-local, unless combined with shared. Remember 
that there is such a thing as shared(const(T)), but no such thing 
as shared(immutable(T)).

 y = new int[5];

 y ~= 1; // should this too? If so, isn't it a waste of cycles?

Yeah, it needs to lock and it is a waste... it could avoid it 
with some kind of virtual dispatch mechanism.

 Of course, array appending is an odd duck here, as generally 
 you are not generally able to add data to an immutable piece of 
 data.

Well, when the case is immutable(T)[], it's really not an odd 
case: in general you *can* grow a head-mutable container. In the 
case of an in-place append the existing elements are not touched, 
and in the case of reallocation the old elements are simply 
moved, which doesn't violate immutability either.

The issue is the invisible, mutable metadata that is inherent in 
D's arrays, not an issue for head-mutable containers of 
immutable/const elements in general.

 But there are other cases. Consider a struct like this:

 struct S
 {
    int a;
    immutable int b;
 }

 I can create an S on the heap (or whatever allocator), and s.b 
 could be shared, but s.a could not be. How does that treat the 
 block the entire S is allocated in?

As they are in the same structure, `a` and `b` will always have 
the same sharedness. In the case of immutable(S), both `a` and 
`b` are immutable and can be shared, while in the case of mutable 
S they are both unshared. I don't see it as an issue as long as S 
vs immutable(S) is provided at time of allocation.

May 13 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/13/15 3:02 AM, Jakob Ovrum wrote:
 On Tuesday, 12 May 2015 at 17:21:04 UTC, Steven Schveighoffer wrote:
 The one that always comes to my mind is array appending:

 immutable int[] x = new int[5];

 const int[] y = x;

 Do you mean:

 immutable(int)[] x = new int[5];

 const(int)[] y = x;

 ?

 Because you can't append to or reassign an immutable or const slice.

Yes, that is what I meant. Sorry.

 x ~= 1; // should this lock;

 y ~= 1; // should this lock?

 Assuming x and y are head-mutable;

 Locking is needed because even though the elements themselves do not
 need locking for inspection[1], the metadata (capacity and whatever else
 is in there) is always mutable.

OK, then consider that this:

void main()
{
    string x;
    x ~= "hello";
    x ~= " world";
}

would require locking. That's unacceptable. Nobody would append with 
strings if this all required locking for no reason. The runtime 
currently does NOT lock for this case, it considers immutable and const 
to be thread-local.

 In other words, immutable(T)[] is
 implicitly convertible to immutable(T[]) which means it can be freely
 shared between threads, but when the slice refers to a druntime dynamic
 array, it *does* actually have mutable indirection.

Yeah, that's why this case is an odd duck. It's not generally allowed to 
access mutable metadata via an immutable pointer.

 Blergh, yet another
 reason why conflating slices with dynamic arrays in the type system was
 probably a bad idea. Or rather, a good reason why T[new] was a good idea
 for D2. Then again, the cost is only paid if the dynamic array features
 are actually used :)

You will never convince me of this :) T[new] was a terrible idea.

 y = new int[5];

 y ~= 1; // should this too? If so, isn't it a waste of cycles?

 Yeah, it needs to lock and it is a waste... it could avoid it with some
 kind of virtual dispatch mechanism.

No, I think the answer is simpler. Introduce shared(immutable), and then 
we can distinguish between immutable data that is shared and data that 
is not shared. It also makes implementing local heaps easier. Shared 
really is orthogonal to mutability.

 Well, when the case is immutable(T)[], it's really not an odd case: in
 general you *can* grow a head-mutable container. In the case of an
 in-place append the existing elements are not touched, and in the case
 of reallocation the old elements are simply moved, which doesn't violate
 immutability either.

Right, I'm not saying it's violating immutability -- the data AFTER the 
array that's in the block is not referenced anywhere, so it's 
technically unique.

Here's something even weirder: If you append to y (const(int)[]) as a 
reference to x (immutable(int)[]), then part of the array is immutable 
and shareable, and the part that was appended is const and not shareable.

 But there are other cases. Consider a struct like this:

 struct S
 {
    int a;
    immutable int b;
 }

 I can create an S on the heap (or whatever allocator), and s.b could
 be shared, but s.a could not be. How does that treat the block the
 entire S is allocated in?

 As they are in the same structure, `a` and `b` will always have the same
 sharedness. In the case of immutable(S), both `a` and `b` are immutable
 and can be shared, while in the case of mutable S they are both
 unshared. I don't see it as an issue as long as S vs immutable(S) is
 provided at time of allocation.

This should work:

void main()
{
    auto s = new S;
    passToOtherThread(&s.b);
}

Which makes s.a unshared, and s.b shared. But the memory block needs to 
be shareable.

-Steve

May 13 2015

"Jakob Ovrum" <jakobovrum gmail.com> writes:

On Wednesday, 13 May 2015 at 17:49:38 UTC, Steven Schveighoffer 
wrote:
 OK, then consider that this:

 void main()
 {
    string x;
    x ~= "hello";
    x ~= " world";
 }

 would require locking. That's unacceptable. Nobody would append 
 with strings if this all required locking for no reason. The 
 runtime currently does NOT lock for this case, it considers 
 immutable and const to be thread-local.

Well, it's necessary because the design of druntime arrays is 
incompatible with D2's type system. Without locking, 
multi-threaded applications that use dynamic array operations 
could easily contain some particularly hard to track concurrency 
bugs.

Simply not doing the locking and hoping that everything is fine 
doesn't sound like a good plan (which I think we agree on, since 
shared(immutable(T)) would solve it).

 No, I think the answer is simpler. Introduce shared(immutable), 
 and then we can distinguish between immutable data that is 
 shared and data that is not shared. It also makes implementing 
 local heaps easier. Shared really is orthogonal to mutability.

Basically, shared(immutable(T)) would only be useful to 
allocators, including arrays because they may need to allocate 
when growing. I don't think it would be useful beyond that; the 
sharedness of immutable data is probably not interesting to any 
other kind of code.

It would make immutable considerably harder to use than it is 
today. shared(immutable(T)) would be implicitly convertible to 
shared(const(T)), but not const(T), which precludes the vast 
majority of mutation-agnostic D code out there today (I have 
never seen shared(const(T)) used in the wild). We would no longer 
be able to do even the simplest things, like passing a path 
string to another thread and use std.file.read on it.

 Here's something even weirder: If you append to y 
 (const(int)[]) as a reference to x (immutable(int)[]), then 
 part of the array is immutable and shareable, and the part that 
 was appended is const and not shareable.

This should be the case for user-defined containers as well as 
long as the element type doesn't have mutable indirection. 
Appending on slices seems to handle this correctly as well, 
rejecting attempts to append a const(T)[] to an immutable(T)[] 
when T has mutable indirection.

 This should work:

 void main()
 {
    auto s = new S;
    passToOtherThread(&s.b);
 }

 Which makes s.a unshared, and s.b shared. But the memory block 
 needs to be shareable.

That is an interesting example. Plain shared has the same problem:

struct S {
     int a;
     shared int b;
}

void main() {
     import std.concurrency;

     auto tid = spawn(() {});
     auto s = new S;
     tid.send(&s.b);
}

Types that contain shared anywhere within the type would have to 
be allocated from a global allocator, and with the current state 
of immutable, the same would have to be done for types that 
contain immutable anywhere within it. Note that it's only the 
case for head-shared/head-immutable; types like immutable(T)[] or 
shared(T)* don't count.

May 13 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/13/15 8:00 PM, Jakob Ovrum wrote:
 On Wednesday, 13 May 2015 at 17:49:38 UTC, Steven Schveighoffer wrote:
 OK, then consider that this:

 void main()
 {
    string x;
    x ~= "hello";
    x ~= " world";
 }

 would require locking. That's unacceptable. Nobody would append with
 strings if this all required locking for no reason. The runtime
 currently does NOT lock for this case, it considers immutable and
 const to be thread-local.

 Well, it's necessary because the design of druntime arrays is
 incompatible with D2's type system. Without locking, multi-threaded
 applications that use dynamic array operations could easily contain some
 particularly hard to track concurrency bugs.

I don't think we should go there. I would say it's very unlikely to have 
this bug occur, but you are right it could occur. I just don't think 
destroying append performance for all const and immutable array types is 
worth it to fix the bug opportunity.

 No, I think the answer is simpler. Introduce shared(immutable), and
 then we can distinguish between immutable data that is shared and data
 that is not shared. It also makes implementing local heaps easier.
 Shared really is orthogonal to mutability.

 Basically, shared(immutable(T)) would only be useful to allocators,
 including arrays because they may need to allocate when growing. I don't
 think it would be useful beyond that; the sharedness of immutable data
 is probably not interesting to any other kind of code.

Yes, it really only matters for cases where the immutable pointer can be 
used to obtain mutable data. But these are important cases.

The fact that most code doesn't share *at all*, makes it less of an 
issue. Perhaps we need some way to indicate that when an array is 
shared, it can no longer be appended in-place. It would be as simple as 
flipping the APPENDABLE bit. The problem is, you're not always passing 
the immutable array directly.

 It would make immutable considerably harder to use than it is today.
 shared(immutable(T)) would be implicitly convertible to
 shared(const(T)), but not const(T), which precludes the vast majority of
 mutation-agnostic D code out there today (I have never seen
 shared(const(T)) used in the wild). We would no longer be able to do
 even the simplest things, like passing a path string to another thread
 and use std.file.read on it.

Yeah, there is no attribute that takes shared/unshared. I don't know a 
good solution to that, and I understand that shared(immutable) screws up 
most usages for immutable. It's not a good answer.

The problem is that when you share an immutable pointer, it changes the 
rules for that pointer, but no type information has been altered. It's 
impossible to track.

I think the idea of making immutable not shared by default is probably 
not a good answer, but I don't know of a good one. I have a feeling 
whatever answer we choose is going to be really painful :(

-Steve

May 14 2015

Michel Fortin <michel.fortin michelf.ca> writes:

On 2015-05-12 17:21:03 +0000, Steven Schveighoffer <schveiguy yahoo.com> said:

 Of course, array appending is an odd duck here, as generally you are 
 not generally able to add data to an immutable piece of data.

A similar odd duck would be reference counting (again, mutable metadata 
attached to immutable data).

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca

May 13 2015

Timon Gehr <timon.gehr gmx.ch> writes:

On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...

Unless this has been fixed in the interim, I believe DMD lowers new 
S(args) to [S(args)].ptr for struct S. It should be ensured that the 
global allocator will not be misinformed about the kind of allocation 
that takes place, by removing this lowering.


 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and
 deallocate it in another. (One simple way to look at it is casting to
 shared.) This has a large performance cost that only benefits very few
 actual cases.

 It follows that we need to change the notion that you first allocate
 memory and then brand it as shared. The "will be shared" knowledge must
 be present during allocation, and use different methods of allocation
 for the two cases.

 ...
 6. Immutable vs. mutable

 I'm not sure how this can be practically exploited. There must be
 something in the notion that, save for a brief write period after
 allocation, the memory will stay unmodified. Still thinking.
 ...

Keep in mind that currently, entire regions of memory can change from 
mutable to immutable implicitly when returned from pure functions. 
Furthermore, as Michel points out, the ways 'immutable' can be leveraged 
is constrained by the fact that it implies 'shared'.

May 10 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can change from
 mutable to immutable implicitly when returned from pure functions.
 Furthermore, as Michel points out, the ways 'immutable' can be leveraged
 is constrained by the fact that it implies 'shared'.

After sleeping on this for a bit, it seems to me pure function need to 
identify their allocations in some way to the caller. The simplest way 
is to have them conservatively use the most conservative heap. We get to 
refine these later.

For now, here's a snapshot of flags that the allocation primitives 
should know about:

enum AllocOptions
{
   /// Allocate an array, not an individual object
   array,
   /// Allocate a string of characters
   string,
   /// Plan to let the GC take care of this object
   noFree,
   /// This object will be shared between threads
   forSharing,
   /// This object will be moved between threads
   forThreadTransfer,
   /// This object will be mutable after initialization
   mutableTarget,
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,
   /// Object allocated has pointers
   hasPointers,
   /// Typical (default) options
   typical = array | noFree | forSharing | mutableTarget | hasPointers
}

Anything to add to this?


Andrei

May 11 2015

"Dicebot" <public dicebot.lv> writes:

On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,

It should simply say that allocation pointer is unique, the fact 
that it comes from pure functions is not important on its own. It 
should be legal to allocate such chunks from other sources too.

May 11 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/15 9:14 AM, Dicebot wrote:
 On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,

 It should simply say that allocation pointer is unique, the fact that it
 comes from pure functions is not important on its own. It should be
 legal to allocate such chunks from other sources too.

Much clearer, thanks. -- Andrei

May 11 2015

Jacob Carlborg <doob me.com> writes:

On 2015-05-11 17:45, Andrei Alexandrescu wrote:

 For now, here's a snapshot of flags that the allocation primitives
 should know about:

 enum AllocOptions
 {
    /// Allocate an array, not an individual object
    array,
    /// Allocate a string of characters
    string,
    /// Plan to let the GC take care of this object
    noFree,

I would recommend using a positive name, in this case, "garbageCollected".

    /// This object will be shared between threads
    forSharing,
    /// This object will be moved between threads
    forThreadTransfer,
    /// This object will be mutable after initialization
    mutableTarget,
    /// The caller is a pure function, so result may be immutable
    fromPureFunction,
    /// Object allocated has pointers
    hasPointers,
    /// Typical (default) options
    typical = array | noFree | forSharing | mutableTarget | hasPointers

Shouldn't this be the first enum member, then this will be the default 
value.

 }



-- 
/Jacob Carlborg

May 11 2015

"Morbid.Obesity" <Morbid.Obesity mail.com> writes:

On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
 On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can 
 change from
 mutable to immutable implicitly when returned from pure 
 functions.
 Furthermore, as Michel points out, the ways 'immutable' can be 
 leveraged
 is constrained by the fact that it implies 'shared'.

 After sleeping on this for a bit, it seems to me pure function 
 need to identify their allocations in some way to the caller. 
 The simplest way is to have them conservatively use the most 
 conservative heap. We get to refine these later.

 For now, here's a snapshot of flags that the allocation 
 primitives should know about:

 enum AllocOptions
 {
   /// Allocate an array, not an individual object
   array,
   /// Allocate a string of characters
   string,
   /// Plan to let the GC take care of this object
   noFree,
   /// This object will be shared between threads
   forSharing,
   /// This object will be moved between threads
   forThreadTransfer,
   /// This object will be mutable after initialization
   mutableTarget,
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,
   /// Object allocated has pointers
   hasPointers,
   /// Typical (default) options
   typical = array | noFree | forSharing | mutableTarget | 
 hasPointers
 }

 Anything to add to this?

fixed or pinned?

This will allocate the memory ala malloc and ignore it from hence 
forward. Similar to noFree but more expansive.

It seems to me that you really need to break up the allocation 
options into both allocation and deallocation along with the AUTO 
option. The auto option allows the compiler to attempt to 
determine the best option(scan for ptr's, etc). It would be the 
default option.

Allocation options are basically simpler and could include 
allocation specific things such as:

Fast, Fixed, GC, Manual, Stack, Heap, etc....

The deallocation options would essentially be your options 
above(since they really are talking about what to do when an 
object is deallocated).

Similarly I believe the GC and other memory managers should be 
parametrized with some protocol between different managers for 
cross-management.

E.g., Multiple heaps for different object memory orientation. GC 
uses one heap, ARC uses another. If an object cross the 
boundary(ptr to an object in another heap) then the memory 
manager trying to free the object has to inform the other memory 
manager.

This partitioning of memory into more specialized zones will 
provide more optimized options for the programmer. At some point

May 11 2015

"Peter Alexander" <peter.alexander.au gmail.com> writes:

On Monday, 11 May 2015 at 15:45:38 UTC, Andrei Alexandrescu wrote:
 On 5/10/15 5:58 AM, Timon Gehr wrote:
 Keep in mind that currently, entire regions of memory can 
 change from
 mutable to immutable implicitly when returned from pure 
 functions.
 Furthermore, as Michel points out, the ways 'immutable' can be 
 leveraged
 is constrained by the fact that it implies 'shared'.

 After sleeping on this for a bit, it seems to me pure function 
 need to identify their allocations in some way to the caller. 
 The simplest way is to have them conservatively use the most 
 conservative heap. We get to refine these later.

 For now, here's a snapshot of flags that the allocation 
 primitives should know about:

 enum AllocOptions
 {
   /// Allocate an array, not an individual object
   array,
   /// Allocate a string of characters
   string,
   /// Plan to let the GC take care of this object
   noFree,
   /// This object will be shared between threads
   forSharing,
   /// This object will be moved between threads
   forThreadTransfer,
   /// This object will be mutable after initialization
   mutableTarget,
   /// The caller is a pure function, so result may be immutable
   fromPureFunction,
   /// Object allocated has pointers
   hasPointers,
   /// Typical (default) options
   typical = array | noFree | forSharing | mutableTarget | 
 hasPointers
 }

 Anything to add to this?

Would it be better to name these after their interpretation, 
rather than expected use cases? For example, the allocator 
doesn't care if something is an array, it cares about resizing, 
and large block allocations. Perhaps s/array/expectRealloc/ (as 
an example). Similar for other ones.

The benefit here is that if, for example, I have some non-array 
use case for frequently reallocing then I can express that 
directly rather than having to pretend it's like an array.

May 12 2015

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/15 6:12 AM, Peter Alexander wrote:
 Would it be better to name these after their interpretation, rather than
 expected use cases? For example, the allocator doesn't care if something
 is an array, it cares about resizing, and large block allocations.
 Perhaps s/array/expectRealloc/ (as an example). Similar for other ones.

Yah, "resizable" is nice and self-explanatory. Thanks! -- Andrei

May 12 2015

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 5/10/15 8:58 AM, Timon Gehr wrote:
 On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...

 Unless this has been fixed in the interim, I believe DMD lowers new
 S(args) to [S(args)].ptr for struct S. It should be ensured that the
 global allocator will not be misinformed about the kind of allocation
 that takes place, by removing this lowering.

That has been fixed. The compiler now calls a different runtime hook.

-Steve

May 12 2015

Timon Gehr <timon.gehr gmx.ch> writes:

On 05/12/2015 05:52 PM, Steven Schveighoffer wrote:
 On 5/10/15 8:58 AM, Timon Gehr wrote:
 On 05/10/2015 11:50 AM, Andrei Alexandrescu wrote:
 In C, C++, and D people have allocated memory for a long time in the
 following manner:
 ...
 Long story short, arrays should sit on a different heap than objects.
 ...

 Unless this has been fixed in the interim, I believe DMD lowers new
 S(args) to [S(args)].ptr for struct S. It should be ensured that the
 global allocator will not be misinformed about the kind of allocation
 that takes place, by removing this lowering.

 That has been fixed. The compiler now calls a different runtime hook.

 -Steve

Great.

May 12 2015

"Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:

On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that only 
 benefits very few actual cases.

 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.

In addition to an immutable/shared heap, it may also be necessary 
to transfer data from one thread to another. The shared heap is 
for data with shared ownership, while other data with (unique) 
ownership can be sent to other threads.

May 11 2015

"Dicebot" <public dicebot.lv> writes:

On Monday, 11 May 2015 at 09:55:36 UTC, Marc Schütz wrote:
 In addition to an immutable/shared heap, it may also be 
 necessary to transfer data from one thread to another. The 
 shared heap is for data with shared ownership, while other data 
 with (unique) ownership can be sent to other threads.

This is starting to sound close to deadalnix proposal about 
different heaps for "cooked" and raw (unique) data.

May 11 2015

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Sunday, 10 May 2015 at 09:50:00 UTC, Andrei Alexandrescu wrote:
 3. Thread-local vs. shared objects

 Currently in D it's legal to allocate memory in one thread and 
 deallocate it in another. (One simple way to look at it is 
 casting to shared.) This has a large performance cost that only 
 benefits very few actual cases.

 It follows that we need to change the notion that you first 
 allocate memory and then brand it as shared. The "will be 
 shared" knowledge must be present during allocation, and use 
 different methods of allocation for the two cases.

As long as casting to or from shared really doesn't do anything 
other than change how the type system treats that type, we can't 
really assume much of anything with regards to where shared _or_ 
thread-local variables came from. Maybe it's still of benefit for 
the allocator to know that it's going to be dealing with shared 
as opposed to thread-local variables when doing the allocation, 
but I don't see how we can get rid of the possibility of an 
object being allocated on one thread and deallocated on another 
without making major changes to shared - and std.concurrency 
isn't going to work very well if we can't cast to and from shared 
or immutable to pass objects across threads - not unless we add 
some way to indicate in the type system that a variable is being 
passed across threads.

It may very well be that we _need_ to change how shared is 
treated by the language and runtime and how its plumbing works, 
but as it stands, I don't see how we could possibly assume that 
whether an object is created as shared or thread-local - or even 
what thread it's created on - has much of anything to do with how 
it's _actually_ going to be used. In _most_ cases, an object is 
going to be used on the same thread that it's created on, and if 
it were possible to create an object as shared directly (which it 
usually isn't), then in most cases an object that was created as 
thread-local would be destroyed on the thread that it's created 
on, and an object created as shared would be used as shared and 
destroy on who-knows-which thread, but because of both the 
possibility and need to cast to and from shared to do anything 
useful with shared as well as passing objects across threads, we 
simply cannot assume anything about how an object will be used 
based on whether it's constructed as shared or not or which 
thread it's constructed on.

For any of that to change, shared needs to change.

- Jonathan M Davis

May 11 2015

"Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:

On Monday, 11 May 2015 at 10:54:10 UTC, Jonathan M Davis wrote:
 doing the allocation, but I don't see how we can get rid of the 
 possibility of an object being allocated on one thread and 
 deallocated on another without making major changes to shared - 
 and std.concurrency isn't going to work very well if we can't 
 cast to and from shared or immutable to pass objects across 
 threads - not unless we add some way to indicate in the type 
 system that a variable is being passed across threads.

Behavioural typing is an active research field so if you are to 
statically prove that a specific program configuration at runtime 
causes a type to be unshared/immutable you are outside the 
language design comfort zone. Which I am very much in favour of 
;-).

More on behavioural typing:

http://forum.dlang.org/thread/ovoarcbexpvrrceysnrs forum.dlang.org

Of course, the sensible thing to do is to have "behavioural 
typing light". E.g. to allow passage through gateways (with 
rather limiting constraints which allows static verification).

May 11 2015

D Programming

C/C++ Programming

Other

digitalmars.D - A few thoughts on std.allocator