www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - We need a typesystem-sanctioned way to cast qualifiers away

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I've gained a crapton of insight while working on collections. It's amazing.

One interesting aspect is the interaction of mutable and functional 
collections with the type qualifiers "const" and "immutable". I managed 
to navigate around issues quite nicely, with two exceptions:

1. Reference counting: it's mutation underneath an immutable appearance. 
For a good while I'd been uncomfortable about that, until I figured that 
this is the "systems" part of D. Other languages do use reference 
counting, but at the compiler level which allows cheating the type 
system. It is somewhat fresh to attempt a principled implementation of 
both reference counting and safe functional collections, simultaneously 
and at library level.

My conclusion is that changing the reference count in an otherwise 
immutable structure is an entirely reasonable thing to want do. We need 
a way to explain the type system "even though the payload is const or 
immutable, I'll change this particular uint so please don't do any 
optimizations that would invalidate the cast". The user is responsible 
for e.g. atomic reference counting in immutable data etc.

2. Allocation: functional containers carry with them a reference to an 
IAllocator interface, which tells them how to do memory allocation. 
Somewhat paradoxically, it is sometimes necessary to allocate memory 
even for immutable objects. For example, in the concatenation "value ~ 
list", the list's allocator must be used.

Again, the reference to IAllocator must be unqualified even inside an 
otherwise qualified object.

These are matters of implementation internals and I'm perfectly 
comfortable that it's necessary to overstep the boundaries of the type 
system. The only matter is finding a way to inform the type system that 
it must make legal certain casts.


Andrei
Jun 19 2015
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
That's only one aspect of the problem? the other side of the coin 
is that you'd like the qualifier to turtle down your collection. 
Right now,

const(Collection!T) is different from 
const(Collection!(const(T))) .

Back to your problem, the only I see to write that in a way the 
compiler won't bite you later is to get a reference to the field 
(refcount of allocator) cast the qualifier of that reference, and 
then manipulating it.

Everything need to go through the casted reference. If you drop 
it and get it back again, the compiler may assume thing you don't 
want it to assume.
Jun 19 2015
prev sibling next sibling parent reply "Daniel N" <ufo orbiting.us> writes:
On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu 
wrote:
 Again, the reference to IAllocator must be unqualified even 
 inside an otherwise qualified object.
no love for a container factory? struct IAllocator{} struct container(T){} template factory(T) { struct factory { T contain; // potentially const IAllocator mutable; alias contain this; } } auto c = factory!(const container!uint)();
Jun 20 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/20/15 12:30 AM, Daniel N wrote:
 On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu wrote:
 Again, the reference to IAllocator must be unqualified even inside an
 otherwise qualified object.
no love for a container factory? struct IAllocator{} struct container(T){} template factory(T) { struct factory { T contain; // potentially const IAllocator mutable; alias contain this; } } auto c = factory!(const container!uint)();
The container must be self-contained (ehm), i.e. be able to destroy itself. Also, people may use by mistake one factory for creating one container and a different one for adding elements to it. -- Andrei
Jun 21 2015
prev sibling next sibling parent reply "Edmund Smith" <edmund.pd.smith gmail.com> writes:
First post, so let's see how this goes:

On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu 
wrote:
 1. Reference counting: it's mutation underneath an immutable 
 appearance. For a good while I'd been uncomfortable about that, 
 until I figured that this is the "systems" part of D. Other 
 languages do use reference counting, but at the compiler level 
 which allows cheating the type system. It is somewhat fresh to 
 attempt a principled implementation of both reference counting 
 and safe functional collections, simultaneously and at library 
 level.

 My conclusion is that changing the reference count in an 
 otherwise immutable structure is an entirely reasonable thing 
 to want do. We need a way to explain the type system "even 
 though the payload is const or immutable, I'll change this 
 particular uint so please don't do any optimizations that would 
 invalidate the cast". The user is responsible for e.g. atomic 
 reference counting in immutable data etc.
This idea makes sense with the type checker, although I'm not sure how it fits D's current memory model. On the page detailing const and immutable (dlang.org/const3.html) it mentions that immutable data 'can be placed in ROM (Read Only Memory) or in memory pages marked by the hardware as read only', which may be problematic if an immutable's mutable reference counter is placed in a read-only page. Would this potentially be changed? It suggests that the compiler needs a way of telling if the struct is a typical immutable struct or is privately-mutable, publicly-immutable. As for the language level, it helps to look at how an immutable value propagates. It can be passed by value or by reference, passed between threads without issue and its value never changes. The thread safety strongly implies that what isn't immutable is shared, so that every field in an immutable struct is still thread-safe. Its exposed value never changing is also crucial IMO, to the point where I think it should be a compiler error if the 'facade' of immutability collapses (e.g. mutable data is public or modifies a public function's result). Passing copies also raises a point about the current state of 'immutable struct S', which has a few things different to normal structs (and block anything but POD types. Is this intentional?). Currently, D lacks destructors and postblit constructors on immutable structs, giving the error 'Error: immutable method SomeStruct.__postblit is not callable using a mutable object'. Throwing an immutable qualifier makes it no longer recognisable as a postblit constructor, so that doesn't work either. This makes perfect sense under the current assumption that 'immutable struct' implies it is a POD struct, but if/when under-the-bonnet mutability goes ahead, this assumption will no longer hold (and postblit constructors are pretty useful with refcounting). The same goes for destructors, too. Without these, passing immutable refcounts by copy isn't safe - the copy keeps the pointer, but doesn't increase the counter. This leads to a potential use-after-free if the copy outlives every 'proper' reference. I'm not sure whether this is all necessary, however - it is possible to just ignore 'immutable struct' and use 'alias MyStruct = immutable(MyActualStruct);', in which case the previous paragraph can be ignored. After playing around with examples of how to do proper immutable reference counting (e.g. static opCall factory) I think the smallest change large enough to work with would be to allow transitive immutability to not transfer immutability to 'private shared' variables (or behave this way in the memory model, so that casting to mutable is safe but the type system remains entirely transitive), while simultaneously making it a compile-time error to read or write to these variables from exposed (public) functions. This may want to be loosened or have exceptions (debug builds etc.), depending on other use-cases. 'shared' seems to be used at the moment for a few different hacks (e.g. GDC having 'shared' and 'volatile' behave identically for a while) and nothing particularly concrete, save some library functions and inter-thread message passing. It also seems suitable for avoiding overeager compiler optimisations, since the 'shared' qualifier already shows that typical assumptions can't be made about it. Also, most existing scenarios I can think of for a private shared variable in an immutable object are rather contrived. Finally, it should push the use of atomic code, or at least basic multithreading good practices. The forcing public code to avoid touching the mutable variables enforces the idea that this is meant for 'systems' code, not typical implementation logic (it would be pointless to use it for this). It also prevents the mutable-ness from leaking, either through return values or other logic ('return (mutaBool)?1:0') and enforces a clean design that the systems code wraps around, not the implementation logic wrapping around (and depending on) systems code. A brief example using this change: struct Container(T) { struct Node { private shared uint count; ContainerImpl!T payload; } Node* node; IAllocator alloc; this() { node = alloc.make!Node(1, ContainerImpl!T()); } this(this) { node.count.atomicOp!"++"; // } auto get() { return node.payload; } alias get this; ~this() { node.count.atomicOp!"++"; if(node.count == 0) alloc.dispose(node); } } //Works like normal with enforced atomicity (maybe overkill in single-threaded code) //When immutable, everything but count is immutable, and the exposed interface //is statically guaranteed to be immutable too; everything is thread safe. Looking for cases where this might not be backwards compatible leads to only a very narrow area that may cause behaviour changes: struct S { int i; private shared T t; ~this() { t.cleanup(); } } struct T { void cleanup(); immutable void cleanup(); } ... { immutable S s1 = S(1, T()); } //The transitively immutable s1.t is no longer immutable, // so calls the mutable cleanup on destruct instead
 2. Allocation: functional containers carry with them a 
 reference to an IAllocator interface, which tells them how to 
 do memory allocation. Somewhat paradoxically, it is sometimes 
 necessary to allocate memory even for immutable objects. For 
 example, in the concatenation "value ~ list", the list's 
 allocator must be used.

 Again, the reference to IAllocator must be unqualified even 
 inside an otherwise qualified object.
The same idea should work here too. Edmund PS - any comments on this comment's style are welcome, I'm new to posting here
Jun 21 2015
next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 21 June 2015 at 09:31, Edmund Smith via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

First post, so let's see how this goes:
 On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu wrote:

 1. Reference counting: it's mutation underneath an immutable appearance.
 For a good while I'd been uncomfortable about that, until I figured that
 this is the "systems" part of D. Other languages do use reference counting,
 but at the compiler level which allows cheating the type system. It is
 somewhat fresh to attempt a principled implementation of both reference
 counting and safe functional collections, simultaneously and at library
 level.

 My conclusion is that changing the reference count in an otherwise
 immutable structure is an entirely reasonable thing to want do. We need a
 way to explain the type system "even though the payload is const or
 immutable, I'll change this particular uint so please don't do any
 optimizations that would invalidate the cast". The user is responsible for
 e.g. atomic reference counting in immutable data etc.
This idea makes sense with the type checker, although I'm not sure how it fits D's current memory model. On the page detailing const and immutable ( dlang.org/const3.html) it mentions that immutable data 'can be placed in ROM (Read Only Memory) or in memory pages marked by the hardware as read only', which may be problematic if an immutable's mutable reference counter is placed in a read-only page. Would this potentially be changed? It suggests that the compiler needs a way of telling if the struct is a typical immutable struct or is privately-mutable, publicly-immutable. As for the language level, it helps to look at how an immutable value propagates. It can be passed by value or by reference, passed between threads without issue and its value never changes. The thread safety strongly implies that what isn't immutable is shared, so that every field in an immutable struct is still thread-safe. Its exposed value never changing is also crucial IMO, to the point where I think it should be a compiler error if the 'facade' of immutability collapses (e.g. mutable data is public or modifies a public function's result). Passing copies also raises a point about the current state of 'immutable struct S', which has a few things different to normal structs (and block anything but POD types. Is this intentional?). Currently, D lacks destructors and postblit constructors on immutable structs, giving the error 'Error: immutable method SomeStruct.__postblit is not callable using a mutable object'. Throwing an immutable qualifier makes it no longer recognisable as a postblit constructor, so that doesn't work either. This makes perfect sense under the current assumption that 'immutable struct' implies it is a POD struct, but if/when under-the-bonnet mutability goes ahead, this assumption will no longer hold (and postblit constructors are pretty useful with refcounting). The same goes for destructors, too. Without these, passing immutable refcounts by copy isn't safe - the copy keeps the pointer, but doesn't increase the counter. This leads to a potential use-after-free if the copy outlives every 'proper' reference. I'm not sure whether this is all necessary, however - it is possible to just ignore 'immutable struct' and use 'alias MyStruct = immutable(MyActualStruct);', in which case the previous paragraph can be ignored. After playing around with examples of how to do proper immutable reference counting (e.g. static opCall factory) I think the smallest change large enough to work with would be to allow transitive immutability to not transfer immutability to 'private shared' variables (or behave this way in the memory model, so that casting to mutable is safe but the type system remains entirely transitive), while simultaneously making it a compile-time error to read or write to these variables from exposed (public) functions. This may want to be loosened or have exceptions (debug builds etc.), depending on other use-cases. 'shared' seems to be used at the moment for a few different hacks (e.g. GDC having 'shared' and 'volatile' behave identically for a while) and nothing particularly concrete, save some library functions and inter-thread message passing. It also seems suitable for avoiding overeager compiler optimisations, since the 'shared' qualifier already shows that typical assumptions can't be made about it. Also, most existing scenarios I can think of for a private shared variable in an immutable object are rather contrived. Finally, it should push the use of atomic code, or at least basic multithreading good practices.
Shared was not well documented (nor were there any strong library support) when it was initially conceived. It's purpose has become clearer with use: http://forum.dlang.org/thread/xhlbenngkdpueqnhdqkg forum.dlang.org?page=1
Jun 21 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/21/15 12:31 AM, Edmund Smith wrote:
 This idea makes sense with the type checker, although I'm not sure how
 it fits D's current memory model. On the page detailing const and
 immutable (dlang.org/const3.html) it mentions that immutable data 'can
 be placed in ROM (Read Only Memory) or in memory pages marked by the
 hardware as read only', which may be problematic if an immutable's
 mutable reference counter is placed in a read-only page. Would this
 potentially be changed? It suggests that the compiler needs a way of
 telling if the struct is a typical immutable struct or is
 privately-mutable, publicly-immutable.
Yah, that would need to be changed. That's why we need a way to tell the typesystem about that kind of data. It can't be put in read-protected memory of any kind.
 As for the language level, it helps to look at how an immutable value
 propagates. It can be passed by value or by reference, passed between
 threads without issue and its value never changes. The thread safety
 strongly implies that what isn't immutable is shared, so that every
 field in an immutable struct is still thread-safe. Its exposed value
 never changing is also crucial IMO, to the point where I think it should
 be a compiler error if the 'facade' of immutability collapses (e.g.
 mutable data is public or modifies a public function's result).
So, shared data member of immutable object would remain shared. Sensible. Interesting, I'll keep that in mind. Thanks.
 Passing copies also raises a point about the current state of 'immutable
 struct S', which has a few things different to normal structs (and block
 anything but POD types. Is this intentional?).
 Currently, D lacks destructors and postblit constructors on immutable
 structs, giving the error 'Error: immutable method SomeStruct.__postblit
 is not callable using a mutable object'. Throwing an immutable qualifier
 makes it no longer recognisable as a postblit constructor, so that
 doesn't work either. This makes perfect sense under the current
 assumption that 'immutable struct' implies it is a POD struct, but
 if/when under-the-bonnet mutability goes ahead, this assumption will no
 longer hold (and postblit constructors are pretty useful with
 refcounting). The same goes for destructors, too. Without these, passing
 immutable refcounts by copy isn't safe - the copy keeps the pointer, but
 doesn't increase the counter. This leads to a potential use-after-free
 if the copy outlives every 'proper' reference.
 I'm not sure whether this is all necessary, however - it is possible to
 just ignore 'immutable struct' and use 'alias MyStruct =
 immutable(MyActualStruct);', in which case the previous paragraph can be
 ignored.
Yah, we need immutable and const for ctors, postblit, and dtors with specific typechecking rules. Andrei
Jun 21 2015
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 6/21/15 8:49 PM, Andrei Alexandrescu wrote:
 On 6/21/15 12:31 AM, Edmund Smith wrote:
 This idea makes sense with the type checker, although I'm not sure how
 it fits D's current memory model. On the page detailing const and
 immutable (dlang.org/const3.html) it mentions that immutable data 'can
 be placed in ROM (Read Only Memory) or in memory pages marked by the
 hardware as read only', which may be problematic if an immutable's
 mutable reference counter is placed in a read-only page. Would this
 potentially be changed? It suggests that the compiler needs a way of
 telling if the struct is a typical immutable struct or is
 privately-mutable, publicly-immutable.
Yah, that would need to be changed. That's why we need a way to tell the typesystem about that kind of data. It can't be put in read-protected memory of any kind.
Or, use a special value for the reference count that signals ROM. -Steve
Jun 21 2015
prev sibling next sibling parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu 
wrote:
 ...
I question how much good something like this would do. I feel like const vs immutable vs shared vs inout is already very confusing as is, something like this would just add more stuff to think about on top of it and just make the whole system more confusing. I know its supposed to be for implementation internals, but I just worry that it will just make things more complicated. I feel like the type system is already pretty ugly right now, do we need to make that worse/is it worth making that worse? Also would it be possible to restructure the container to cleanly separate the mutable and immutable parts? I think if its possible, it would be a much better solution. Just my 2c...
Jun 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/21/15 6:51 PM, Tofu Ninja wrote:
 On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu wrote:
 ...
I question how much good something like this would do. I feel like const vs immutable vs shared vs inout is already very confusing as is, something like this would just add more stuff to think about on top of it and just make the whole system more confusing. I know its supposed to be for implementation internals, but I just worry that it will just make things more complicated. I feel like the type system is already pretty ugly right now, do we need to make that worse/is it worth making that worse? Also would it be possible to restructure the container to cleanly separate the mutable and immutable parts? I think if its possible, it would be a much better solution. Just my 2c...
All good points. I think we need to accept that immutable objects must rely on the garbage collector; reference counting them is not accepted in D's type system. -- Andrei
Jun 21 2015
parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Monday, 22 June 2015 at 02:54:27 UTC, Andrei Alexandrescu 
wrote:
 All good points. I think we need to accept that immutable 
 objects must rely on the garbage collector; reference counting 
 them is not accepted in D's type system. -- Andrei
I am confused, what is wrong with having a mutable ref counted wrapper type that can carry an immutable payload?
Jun 21 2015
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/21/15 8:43 PM, Tofu Ninja wrote:
 On Monday, 22 June 2015 at 02:54:27 UTC, Andrei Alexandrescu wrote:
 All good points. I think we need to accept that immutable objects must
 rely on the garbage collector; reference counting them is not accepted
 in D's type system. -- Andrei
I am confused, what is wrong with having a mutable ref counted wrapper type that can carry an immutable payload?
It doesn't compose - e.g. soon as you want to make an immutable list of lists it comes unglued. -- Andrei
Jun 21 2015
parent "Tofu Ninja" <emmons0 purdue.edu> writes:
On Monday, 22 June 2015 at 04:12:29 UTC, Andrei Alexandrescu 
wrote:
 On 6/21/15 8:43 PM, Tofu Ninja wrote:
 On Monday, 22 June 2015 at 02:54:27 UTC, Andrei Alexandrescu 
 wrote:
 All good points. I think we need to accept that immutable 
 objects must
 rely on the garbage collector; reference counting them is not 
 accepted
 in D's type system. -- Andrei
I am confused, what is wrong with having a mutable ref counted wrapper type that can carry an immutable payload?
It doesn't compose - e.g. soon as you want to make an immutable list of lists it comes unglued. -- Andrei
Hmmm.... I see, I guess I understand better why this is a problem. Is there no way to do immutable container with mutable payload? Is there no way to stop the transitive nature of immutable/const? This seems like it would be a problem for any sort of container type, what sort of problems would happen if there was a way to stop transitive qualifiers from being applied to members?
Jun 21 2015
prev sibling parent "Ulrich =?UTF-8?B?S8O8dHRsZXIi?= <kuettler gmail.com> writes:
On Saturday, 20 June 2015 at 00:07:12 UTC, Andrei Alexandrescu 
wrote:
 I've gained a crapton of insight while working on collections. 
 It's amazing.

 One interesting aspect is the interaction of mutable and 
 functional collections with the type qualifiers "const" and 
 "immutable". I managed to navigate around issues quite nicely, 
 with two exceptions:

 1. Reference counting: it's mutation underneath an immutable 
 appearance. For a good while I'd been uncomfortable about that, 
 until I figured that this is the "systems" part of D. Other 
 languages do use reference counting, but at the compiler level 
 which allows cheating the type system. It is somewhat fresh to 
 attempt a principled implementation of both reference counting 
 and safe functional collections, simultaneously and at library 
 level.

 My conclusion is that changing the reference count in an 
 otherwise immutable structure is an entirely reasonable thing 
 to want do. We need a way to explain the type system "even 
 though the payload is const or immutable, I'll change this 
 particular uint so please don't do any optimizations that would 
 invalidate the cast". The user is responsible for e.g. atomic 
 reference counting in immutable data etc.
A similar situation is hashing: Functional (immutable) collections could be used as hash keys. If you want to use non-trivial hash functions in a pure immutable collection, you either calculate up front "just in case" or reevaluate each time. Caching (lazy evaluation) is off the table. Then again, maybe you navigated around that one already.
Jun 22 2015