digitalmars.D - GC/non-GC memory as part of data type?

Gregor Mückl <gregormueckl gmx.de> writes:
Hi!

This is an attempt to follow up on the DIP1025 discussion: what 
happens if all pointers/arrays/references carry the origin of the 
pointed-to memory region as part of their type? The goal is to 
have a cleaner, more explicit separation of GC and non-GC heaps.

Pointers allocated through malloc() or acquired from external 
code would carry the information that they are non-GC memory as 
part of their type. Let's just designate that with a @native 
attribute for now.

This attribute limits what you can do safely:

- No overwriting a @native pointer address with the result of 
pointer arithmetic. Don't overwrite the pointer that needs to be 
free'd.
- Casting pointers to different types (e.g. to array types) 
transfers the @native-ness of the input pointer.
- No assignment between @native and non-@native pointer-like 
types. Casting to or from @native is a @system operation.
- ~ and ~= for @native arrays always produce a copy on the GC 
heap and the result is therefore not @native. No in-place 
appending. All other implicit language-level copy operations 
perform similar conversions.
- Pointers returned from extern(C)/extern(C++) functions are 
always @native (if that is not the case, then an unsafe wrapper 
must be written).
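
To make the assignment and casting rules above concrete, here is a minimal sketch that approximates them with a library wrapper. The names Native, fromRaw and toRaw are hypothetical and exist only for this illustration; the proposal itself is about a type qualifier in the language, which a wrapper struct can only imitate.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical stand-in for the proposed @native qualifier.
struct Native(T)
{
    private T* ptr;

    // Wrapping a raw pointer plays the role of the unsafe cast to @native.
    static Native fromRaw(T* p) @system { return Native(p); }

    // Unwrapping plays the role of the unsafe cast from @native.
    T* toRaw() @system { return ptr; }

    // Dereferencing is allowed and does not change the stored address.
    ref T opUnary(string op : "*")() { return *ptr; }
}

void main()
{
    auto n = Native!int.fromRaw(cast(int*) malloc(int.sizeof));
    assert(n.toRaw() !is null);
    *n = 42;
    assert(*n == 42);

    int* gc = new int(1);

    // No assignment between the wrapped and the plain pointer type,
    // in either direction, without going through fromRaw/toRaw.
    static assert(!__traits(compiles, n = gc));
    static assert(!__traits(compiles, gc = n));

    free(n.toRaw());
}
```

The wrapper offers no pointer arithmetic at all, which is stricter than the first rule; a real design would have to decide what arithmetic on a copy of the pointer is allowed.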

Unresolved:
- @native-ness could probably be inferred by static analysis 
within a single non-template function. It would have to be 
declared at function interfaces. This grows the attribute zoo.
- Some functions can only work on @native pointers, others only on 
non-@native ones, and a third kind can work with both.
- Pointer arithmetic on @native pointers would need some serious 
lifetime analysis to make it safer.
- No claim of completeness or soundness of these rules is made at 
this time :)
Nov 13
James Lu <jamtlu gmail.com> writes:
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: what 
 happens if all pointers/arrays/references carry the origin of 
 the pointed to memory region as part of their type? The goal is 
 to have a cleaner, more explicit separation of GC and non-GC 
 heaps.

 [...]
We could add @nogc types. That indicates to the implementation that the memory held by that type MAY be ignored by the GC and that any pointers held in that type MUST NOT be moved by a moving GC. Alternatively, "special" pointers could have a .toPointer function that converts them into the memory address they represent. Applications include XOR pointers.
Nov 13
Gregor Mückl <gregormueckl gmx.de> writes:
On Wednesday, 13 November 2019 at 15:25:46 UTC, James Lu wrote:
 On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
 wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: 
 what happens if all pointers/arrays/references carry the 
 origin of the pointed to memory region as part of their type? 
 The goal is to have a cleaner, more explicit separation of GC 
 and non-GC heaps.

 [...]
 We could add @nogc types. That indicates to the implementation 
 that the memory held by that type MAY be ignored by the GC and 
 that any pointers held in that type MUST NOT be moved by a 
 moving GC. Alternatively, "special" pointers could have a 
 .toPointer function that converts them into the memory address 
 they represent. Applications include XOR pointers.
Haven't thought about moving GC allocated memory yet. Does the current D spec even allow this? The way you're allowed to take pointers to GC memory and store them in places unknown to the GC should prevent it.
Nov 14
Nick Treleaven <nick geany.org> writes:
On Wednesday, 13 November 2019 at 15:25:46 UTC, James Lu wrote:
 We could add @nogc types. That indicates to the implementation 
 that the memory held by that type MAY be ignored by the GC and 
 that any pointers held in that type MUST NOT be moved by a 
 moving GC.
But a moving GC would only update a pointer that pointed to GC-allocated memory. @nogc on a field declaration could be useful.
Nov 15
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: what 
 happens if all pointers/arrays/references carry the origin of 
 the pointed to memory region as part of their type? The goal is 
 to have a cleaner, more explicit separation of GC and non-GC 
 heaps.

 [...]
In fact, it's more or less just const/immutability for the pointer, isn't it?
Nov 14
Gregor Mückl <gregormueckl gmx.de> writes:
On Thursday, 14 November 2019 at 14:01:50 UTC, Patrick Schluter 
wrote:
 On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
 wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: 
 what happens if all pointers/arrays/references carry the 
 origin of the pointed to memory region as part of their type? 
 The goal is to have a cleaner, more explicit separation of GC 
 and non-GC heaps.

 [...]
 In fact, it's more or less just const/immutability for the 
 pointer, isn't it?
No, const is different from what I'm trying to describe. You can currently get a const pointer to GC allocated memory and still pass that to free(), for example. If there were a distinction between native and GC pointers, this would be impossible without an explicit, unsafe cast.
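
As a small illustration of this point, the sketch below shows that today a pointer's static type, const or not, says nothing about which heap it came from. The variable names are made up for the example, and the free() call on the GC pointer is only type-checked, never executed.

```d
import core.stdc.stdlib : free, malloc;

void main()
{
    int* fromGC = new int(1);                    // allocated on the GC heap
    int* fromC  = cast(int*) malloc(int.sizeof); // allocated on the C heap
    assert(fromC !is null);

    // Both pointers have exactly the same static type.
    static assert(is(typeof(fromGC) == typeof(fromC)));

    // Passing the GC pointer to free() type-checks without any cast.
    // It is only checked at compile time here; running it would be a bug.
    static assert(__traits(compiles, free(fromGC)));

    free(fromC);
}
```

With an origin-carrying type, the second static assert would be expected to fail unless an explicit cast were added.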
Nov 14
jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: what 
 happens if all pointers/arrays/references carry the origin of 
 the pointed to memory region as part of their type? The goal is 
 to have a cleaner, more explicit separation of GC and non-GC 
 heaps.
 [snip]
I'm sure there's a lot that I haven't considered, but it seems like that should be possible just that you might have to write up some of your own functionality. You could do something like below and then write some different versions of malloc, etc, and GC allocation functions and anything that uses the GC (like dynamic arrays), and probably your own ref too.

    import std.traits : isPointer;

    enum AllocStrategy { GC, malloc, other }

    struct Ptr(T, AllocStrategy allocStrategy)
        if (isPointer!T)
    {
        T x;
        alias x this;
    }

    void main()
    {
        int x = 1;
        auto y = Ptr!(int*, AllocStrategy.GC)(new int(x));
        assert(*y == 1);
    }
Nov 14
Gregor Mückl <gregormueckl gmx.de> writes:
On Thursday, 14 November 2019 at 20:06:53 UTC, jmh530 wrote:
 On Wednesday, 13 November 2019 at 15:19:38 UTC, Gregor Mückl 
 wrote:
 Hi!

 This is an attempt to follow up on the DIP1025 discussion: 
 what happens if all pointers/arrays/references carry the 
 origin of the pointed to memory region as part of their type? 
 The goal is to have a cleaner, more explicit separation of GC 
 and non-GC heaps.
 [snip]
 I'm sure there's a lot that I haven't considered, but it seems 
 like that should be possible just that you might have to write 
 up some of your own functionality. You could do something like 
 below and then write some different versions of malloc, etc, 
 and GC allocation functions and anything that uses the GC (like 
 dynamic arrays), and probably your own ref too.

     import std.traits : isPointer;

     enum AllocStrategy { GC, malloc, other }

     struct Ptr(T, AllocStrategy allocStrategy)
         if (isPointer!T)
     {
         T x;
         alias x this;
     }

     void main()
     {
         int x = 1;
         auto y = Ptr!(int*, AllocStrategy.GC)(new int(x));
         assert(*y == 1);
     }
Pushing the problem into a library type doesn't really solve anything. Any interesting library that you might want to use will not use these wrappers. If you unwrap the pointers (or worse - your slices) and pass them in, you lose all information about the guarantees that should be tracked. Remember that you don't have control over how that external library reallocates memory for data that you pass in, e.g. when it appends to a slice.
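
The slice case can be sketched as follows. Once a slice over malloc'd memory is handed to code that appends to it, the result lives on the GC heap and nothing in the type records the switch. The helper isGCMemory is a hypothetical name for this example; it relies on core.memory.GC.addrOf, which returns null for memory the GC does not manage.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical helper: true if p points into a GC-managed block.
bool isGCMemory(const(void)* p)
{
    return GC.addrOf(p) !is null;
}

void main()
{
    auto raw = cast(int*) malloc(4 * int.sizeof);
    assert(raw !is null);

    int[] s = raw[0 .. 4]; // unwrapped slice over C heap memory
    s[] = 0;
    assert(!isGCMemory(s.ptr));

    // What an external library might do with the slice it receives.
    int[] t = s;
    t ~= 5;

    // The append could not grow the malloc'd block in place, so the
    // runtime copied the data into a fresh GC allocation.
    assert(t.ptr !is s.ptr);
    assert(isGCMemory(t.ptr));
    assert(s.length == 4 && t.length == 5);

    free(raw); // the original block still has to be freed by hand
}
```

Both s and t have type int[], so the caller cannot tell from the types which of them must be freed manually.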
Nov 14
Walter Bright <newshound2 digitalmars.com> writes:
There are many ways to manage memory, and a non-trivial program will often use 
multiple methods:

1. gc
2. stack
3. static
4. malloc/free
5. reference counting
6. various custom allocators

Using type constructors to create two categories, gc and all the others, in the 
end is not particularly helpful because the number of categories is unbounded 
and reliance on such a system would require supporting all categories.
Nov 15