digitalmars.D - One awesome GC feature we will use in Mir!

reply 9il <ilyayaroshenko gmail.com> writes:
I just remembered that D's GC has a NO_SCAN [1] attribute!

This will be added by default for Mir allocations if the type's 
representation tuple has no references. For example, a 
Slice!(double*, 2) should never be scanned by the GC, but it will 
stay in the GC heap as long as something refers to it.

Let me know if you have ideas how to further improve memory 
management and required API in Mir and Lubeck.

[1] https://dlang.org/phobos/core_memory.html#.GC.BlkAttr.NO_SCAN

Best,
Ilya
Sep 18 2018
next sibling parent reply Nordlöw <per.nordlow gmail.com> writes:
On Tuesday, 18 September 2018 at 14:23:44 UTC, 9il wrote:
 I just remembered that D's GC has a NO_SCAN [1] attribute!

 This will be added by default for Mir allocations if the type's 
 representation tuple has no references. For example, a 
 Slice!(double*, 2) should never be scanned by the GC, but it will 
 stay in the GC heap as long as something refers to it.

 Let me know if you have ideas how to further improve memory 
 management and required API in Mir and Lubeck.

 [1] 
 https://dlang.org/phobos/core_memory.html#.GC.BlkAttr.NO_SCAN

 Best,
 Ilya
Can you elaborate on why this is the case?
Sep 18 2018
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 18 September 2018 at 15:55:38 UTC, Nordlöw wrote:
Can you elaborate on why this is the case?
Mir users work with time series, matrices, and tensors: a lot of numeric and scientific data. Almost all structures are plain. mir.series is used instead of associative arrays. Associative arrays are used only to define a data set, and then the AA is converted to a Series of immutable data (represented as two arrays).

In practice 99% of the data are plain arrays, and ~80% of these arrays are composed of doubles, ints, or POD structs. Such types do not contain references to other GC-allocated memory, so we can reduce GC latency 5 times for production code.

If a user allocates new double[], the GC will scan the whole array memory, because it is assumed that the user may reuse this memory for types that have references. So the main idea is that if one allocates a Matrix of doubles, we just turn off scanning of its internal data. Casting from an array of doubles to, say, strings is not safe, so we have a language instrument to prevent memory leaks in user code. As a side effect, this will reduce false pointers too.
Sep 18 2018
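The idea of switching off scanning for a buffer of doubles can be sketched with GC.malloc and the GC.BlkAttr.NO_SCAN flag from core.memory. The helper name allocNoScan is hypothetical and is not a Mir API; the replies further down point out that `new double[]` already receives this flag, so this only illustrates the explicit form.

```d
import core.memory : GC;

// Hypothetical helper (not a Mir API): allocate n doubles in a GC block
// that the collector will not scan for pointers.
double[] allocNoScan(size_t n)
{
    auto p = cast(double*) GC.malloc(n * double.sizeof, GC.BlkAttr.NO_SCAN);
    auto data = p[0 .. n];
    data[] = 0.0; // GC.malloc returns uninitialized memory
    return data;
}

void main()
{
    auto matrix = allocNoScan(3 * 4); // storage for a 3x4 matrix
    // GC.query also accepts interior pointers and reports the block's flags.
    assert(GC.query(matrix.ptr).attr & GC.BlkAttr.NO_SCAN);
}
```

The block is still owned by the GC, so it is freed once nothing refers to it; only the scan of its contents is skipped.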
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Tuesday, 18 September 2018 at 16:15:45 UTC, 9il wrote:
 If a user allocates new double[], GC will scan whole array 
 memory, because it is assumed that user may reuse this memory 
 for types that have references.
Are you sure? That doesn't sound right. I know this is the case for void[] - even though you can't put pointers in it, it could hold "anything", so the GC errs on the safe side and assumes it has pointers. There was (is?) a problem with e.g. std.file.read("a") ~ std.file.read("b") - even though read marked the memory as not containing pointers, the result of concatenation is a new void[], which the GC thinks might contain pointers.
Sep 18 2018
parent reply 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 18 September 2018 at 16:29:30 UTC, Vladimir Panteleev 
wrote:
Are you sure? That doesn't sound right. I know this is the case for void[] - even though you can't put pointers in it, it could hold "anything", so the GC errs on the safe side and assumes it has pointers. There was (is?) a problem with e.g. std.file.read("a") ~ std.file.read("b") - even though read marked the memory as not containing pointers, the result of concatenation is a new void[], which the GC thinks might contain pointers.
Thanks! Is there is information about how GC set flags for `new` on the site?
Sep 18 2018
next sibling parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 18 September 2018 at 17:21:17 UTC, 9il wrote:
 Thanks! Is there is information about how GC set flags for `new` 
 on the site?
Today my English is so bad =/ sorry
Sep 18 2018
prev sibling parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Tuesday, 18 September 2018 at 17:21:17 UTC, 9il wrote:
 Thanks! Is there is information about how GC set flags for 
 `new` on the site?
I think it's something like this:

The compiler lowers `new T[]` to _d_newarrayT or _d_newarrayiT [1]. These functions get a TypeInfo as a parameter. The actual TypeInfo object is generated by the compiler [2]. __arrayAlloc (called by _d_newarray*) then sets BlkAttr.NO_SCAN depending on what's in TypeInfo.flags [3].

[1]: https://github.com/dlang/druntime/blob/542b680f2c2e09e7f4b494898437c61216583fa5/src/rt/lifetime.d#L966-L1014
[2]: https://github.com/dlang/dmd/blob/3adcc9e4a0813d26725bb3c9e747ef8d4a2a8296/src/dmd/typinf.d#L35
[3]: https://github.com/dlang/druntime/blob/542b680f2c2e09e7f4b494898437c61216583fa5/src/rt/lifetime.d#L425
Sep 18 2018
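The effect of that lowering can be observed from user code. The following is a minimal sketch that assumes the default druntime GC; the helper isNoScan is a hypothetical name, while GC.query and GC.BlkAttr.NO_SCAN come from core.memory.

```d
import core.memory : GC;

// True if the GC block containing p carries the NO_SCAN attribute.
bool isNoScan(const(void)* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto numbers = new double[](16); // element type has no pointers
    auto pointers = new void*[](16); // element type is a pointer

    assert( isNoScan(numbers.ptr));  // flagged by the runtime automatically
    assert(!isNoScan(pointers.ptr)); // must be scanned
}
```

GC.query is used rather than GC.getAttr because it is documented to work with a pointer anywhere inside the block.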
prev sibling next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Tuesday, 18 September 2018 at 14:23:44 UTC, 9il wrote:
 I just remembered that D's GC has a NO_SCAN [1] attribute!

 This will be added by default for Mir allocations if the type's 
 representation tuple has no references. For example, a 
 Slice!(double*, 2) should never be scanned by the GC, but it will 
 stay in the GC heap as long as something refers to it.
Not sure if this is what you mean or not, but the D GC already doesn't scan types which do not contain references. This was added in D 1.000, see TypeInfo.flags&1. NO_SCAN is a way to further override that. If you mean that Slice itself (when on the heap) should not be scanned by the GC, I'm not sure that's a good idea. Is it not conceivable that a Slice would be the only reference left pointing at a block of memory in the heap?
Sep 18 2018
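The TypeInfo.flags&1 check mentioned above can be tried directly. This is a small sketch: the struct names Point and Node and the helper mayHoldPointers are made up for illustration, and std.traits.hasIndirections is shown as the compile-time counterpart.

```d
import std.traits : hasIndirections;

struct Point { double x, y; }               // plain data
struct Node  { double value; Node* next; }  // holds a pointer

// Run-time check: bit 0 of TypeInfo.flags is set when the type may
// contain pointers into GC memory.
bool mayHoldPointers(T)()
{
    return (typeid(T).flags & 1) != 0;
}

// Compile-time counterpart from Phobos.
static assert(!hasIndirections!Point);
static assert( hasIndirections!Node);

void main()
{
    assert(!mayHoldPointers!double);
    assert(!mayHoldPointers!Point);
    assert( mayHoldPointers!Node);
    assert( mayHoldPointers!(double*));
}
```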
parent 9il <ilyayaroshenko gmail.com> writes:
On Tuesday, 18 September 2018 at 16:19:23 UTC, Vladimir Panteleev 
wrote:
Not sure if this is what you mean or not, but the D GC already doesn't scan types which do not contain references. This was added in D 1.000, see TypeInfo.flags&1. NO_SCAN is a way to further override that.
Ah, awesome! I did not know about it. I need to review all the allocations in Mir anyway.
 If you mean that Slice itself (when on the heap) should not be 
 scanned by the GC, I'm not sure that's a good idea. Is it not 
 conceivable that a Slice would be the only reference left 
 pointing at a block of memory in the heap?
Sure, a Slice with a GC-allocated pointer should be referenced (btw, it is a struct).
Sep 18 2018
prev sibling parent reply Per Nordlöw <per.nordlow gmail.com> writes:
On Tuesday, 18 September 2018 at 14:23:44 UTC, 9il wrote:
 I just remembered that D's GC has a NO_SCAN [1] attribute!
I thought D libraries like Mir and Lubeck only had to care about when to call GC.addRange after allocations that contain pointers to GC-backed storage and GC.removeRange before their corresponding deallocations. But that's perhaps only when using non-GC-backed allocators (not using new), right?
Sep 18 2018
parent Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Tuesday, 18 September 2018 at 23:01:46 UTC, Per Nordlöw wrote:
 On Tuesday, 18 September 2018 at 14:23:44 UTC, 9il wrote:
 I just remembered that D's GC has a NO_SCAN [1] attribute!
I thought D libraries like Mir and Lubeck only had to care about when to call GC.addRange after allocations that contain pointers to GC-backed storage and GC.removeRange before their corresponding deallocations. But that's perhaps only when using non-GC-backed allocators (not using new), right?
GC.addRange and GC.removeRange are necessary to mark memory blocks allocated outside of the GC when they may contain pointers to memory allocated by the GC. For example, if you have a nogc container, the array backing the container can be allocated via libc's malloc. The users are free to push elements allocated by the GC onto this container, so to be safe, the container must use GC.addRange to tell the GC that the array may point to such elements. When the array grows or shrinks, or the whole container is destroyed, the array needs to be passed to GC.removeRange, so that the GC is prevented from scanning freed memory. When you use the GC, it automatically does the bookkeeping for you. It's only when you manually manage memory that you need to be careful about whether this memory points to GC objects.
Sep 19 2018
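The addRange/removeRange pairing described above might look like the following for a malloc-backed buffer. This is only a sketch: MallocBuffer is a hypothetical type, not taken from Mir or any existing container library, and growth is left out to keep it short.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical fixed-capacity buffer of class references whose storage
// comes from malloc, so the GC does not know about it by default.
struct MallocBuffer
{
    Object* slots;
    size_t capacity;

    @disable this(this); // forbid copies so the range is removed exactly once

    this(size_t capacity)
    {
        this.capacity = capacity;
        slots = cast(Object*) malloc(capacity * Object.sizeof);
        assert(slots !is null, "malloc failed");
        slots[0 .. capacity] = null; // clear garbage before the GC can see it
        // Register the block: it may hold pointers into the GC heap.
        GC.addRange(slots, capacity * Object.sizeof);
    }

    ~this()
    {
        if (slots is null)
            return;
        // Unregister first so the GC never scans freed memory.
        GC.removeRange(slots);
        free(slots);
        slots = null;
    }
}

void main()
{
    auto buffer = MallocBuffer(4);
    buffer.slots[0] = new Object; // reference stored in malloc'ed memory
    GC.collect();                 // the registered range keeps the object reachable
    assert(buffer.slots[0].toString() == "object.Object");
}
```

A growable version would call GC.removeRange on the old block and GC.addRange on the new one around each reallocation.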