digitalmars.D - GC.sizeOf(array.ptr)
- Dicebot (33/33) Sep 30 2014 There is one issue I have encountered during CDGC porting I may
- Steven Schveighoffer (21/51) Sep 30 2014 This is accurate, it stores the "used" size of the array. But it's only
- Dicebot (6/20) Sep 30 2014 Why is it stored in the beginning and not in the end of the block
- Steven Schveighoffer (32/49) Sep 30 2014 First, it is the capacity. It's just that the capacity lives at the
- Dicebot (12/74) Sep 30 2014 Ah that must be what confused me - I looked at small block offset
- Steven Schveighoffer (6/11) Sep 30 2014 I think it is possible, and perhaps more correct, to put the array
- Sean Kelly (11/20) Sep 30 2014 Yes, a lot of this is an artifact of the relatively simplistic
- Dicebot (7/31) Sep 30 2014 I decided to add a similar workaround to CDGC for now and fix the
- ketmar via Digitalmars-d (3/7) Sep 30 2014 wow! no, really, what else can i say? ;-)
- Sean Kelly (10/11) Sep 30 2014 Yes. As far as the GC is concerned, asking for the size of an
- Steven Schveighoffer (9/13) Sep 30 2014 The APPENDABLE bit was added as a solution to avoid having to reserve
- Sean Kelly (8/21) Sep 30 2014 Yeah I know. But when I defined the BlkAttr bitfield I'd
- Steven Schveighoffer (8/31) Sep 30 2014 Hm... looked at the code, I have no idea how the GC would handle
- Sean Kelly (8/14) Sep 30 2014 It wouldn't. The GC was just going to provide storage for some
There is one issue I have encountered during CDGC porting I may need advice with. Consider this simple snippet:

```
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;
    assert(GC.sizeOf(result.ptr) > 0);
}
```

The assertion passes with the D1/Tango runtime but fails with the current D2 runtime. This happens because `result.ptr` is not actually a pointer returned by gc_qalloc from array reallocation, but an interior pointer 16 bytes from the start of that block. Druntime stores some metadata (length/capacity I presume) in the very beginning. As a result, GC.sizeOf(array.ptr) returns 0 (being an interior pointer).

An interesting side effect is that this code:

```
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;
    GC.free(result.ptr);
}
```

...does not actually free anything for the very same reason (result.ptr is an interior pointer); it just silently succeeds doing nothing.

Is such behaviour intended?
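[Editor's note] A workaround sketch for the behaviour described above: resolve the interior pointer back to the block base with GC.addrOf before querying. This leans on the very runtime internals under discussion, so it is illustrative rather than an official recipe:

```d
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;

    // result.ptr points past the block start (array metadata lives
    // there), so ask the GC for the base address of the enclosing block:
    void* base = GC.addrOf(result.ptr);
    assert(base !is null);       // result.ptr is GC memory
    assert(GC.sizeOf(base) > 0); // the base pointer reports the block size
}
```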
Sep 30 2014
On 9/30/14 9:42 AM, Dicebot wrote:
> There is one issue I have encountered during CDGC porting I may need
> advice with. [...] This happens because `result.ptr` is not actually a
> pointer returned by gc_qalloc from array reallocation, but an interior
> pointer 16 bytes from the start of that block. Druntime stores some
> metadata (length/capacity I presume) in the very beginning.

This is accurate, it stores the "used" size of the array. But it's only the case for arrays, not general GC.malloc blocks.

An alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate). But it doesn't cover the same inputs.

Trivial note: you only need to allocate 2047 bytes to get this behavior.

> As a result, GC.sizeOf(array.ptr) returns 0 (being an interior
> pointer). An interesting side effect is that GC.free(result.ptr)
> does not actually free anything for the very same reason (result.ptr
> is an interior pointer); it just silently succeeds doing nothing.

This is a problem. We should provide a definitive way to dealloc an array, if it's not already present (assuming delete goes away).

> Is such behaviour intended?

Depends on what part you are talking about ;) But kidding aside, when I added the array appending update, I didn't intend to affect these functions, nor did I consider the cases above. So it was not intended to break these things, but I think we can repair them, or at least provide alternate means to create the same result.

Interestingly, I think any lookup of block information for an interior pointer costs the same as looking up block info for a base pointer. I think adding a flag that allows performing operations on interior pointers might make this more palatable.

But we absolutely need a way to free an array that does not require knowledge of the inner workings of the runtime.

-Steve
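[Editor's note] Until such an official way exists, a workaround sketch for freeing the whole allocation is to resolve the base address first. Note this relies on the internal block layout discussed in this thread:

```d
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;

    // GC.free(result.ptr) would be a no-op (interior pointer), so free
    // through the block's base address instead:
    void* base = GC.addrOf(result.ptr);
    if (base !is null)
        GC.free(base);

    result = null; // the slice now dangles; drop it immediately
}
```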
Sep 30 2014
On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven Schveighoffer wrote:
> This is accurate, it stores the "used" size of the array. But it's only
> the case for arrays, not general GC.malloc blocks. An alternative is to
> use result.capacity, which essentially looks up the same thing (and
> should be more accurate). But it doesn't cover the same inputs.

Why is it stored in the beginning and not in the end of the block (like capacity)? I'd like to explore options of removing the interior pointer completely before proceeding with adding more special cases to GC functions.
Sep 30 2014
On 9/30/14 10:24 AM, Dicebot wrote:
> Why is it stored in the beginning and not in the end of the block (like
> capacity)? I'd like to explore options of removing the interior pointer
> completely before proceeding with adding more special cases to GC
> functions.

First, it is the capacity. It's just that the capacity lives at the beginning of larger blocks.

The reason is the ability to extend pages. With smaller blocks (2048 bytes or less), the page is divided into equal portions, and those can NEVER be extended. Any attempt to extend results in a realloc into another block. Putting the capacity at the end makes sense for two reasons:
1. One byte is already reserved to prevent cross-block pointers.
2. It doesn't cause alignment issues. We can't very well offset a 16-byte block by 16 bytes.

But importantly, the capacity field does not move.

However, for page size and above (4096+ bytes), the original (D1 and early D2) runtime would attempt to extend into the next page, without moving the data. Thus we save the copy of data into a new block, and just set some bits and we're done.

But this poses a problem when the capacity field is stored at the end -- especially since we are caching the block info. The block info can change with a call to GC.extend (whereas for a fixed-size block, the block info CANNOT change). Depending on what "version" of the block info you have, the "end" can be different, and you may end up corrupting data. This is especially important for shared or immutable array blocks, where multiple threads could be appending at the same time.

So I made the call to put it at the beginning of the block, which obviously doesn't change, and offset everything by 16 bytes to maintain alignment.

It may very well be that we can put it at the end of the block instead, and you can probably do so without much effort in the runtime (everything uses CTFE functions to calculate padding and the location of the capacity). It has been such a long time since I did that, I'm not very sure of all the reasons not to do it. A look through the mailing list archives might be useful.

-Steve
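[Editor's note] The layout described above can be observed from user code. This sketch assumes the metadata placement Steven describes (at the end for fixed-size bins, at the front for page-sized blocks); the exact offsets are runtime internals and may differ between versions:

```d
import core.memory;

void main()
{
    // Small array: fixed-size bin, capacity stored at the end,
    // so .ptr is the base address of the block.
    ubyte[] small;
    small.length = 16;
    assert(GC.addrOf(small.ptr) is cast(void*) small.ptr);

    // Page-sized array: capacity stored at the front, so .ptr is
    // offset from the block base (16 bytes in the layout above).
    ubyte[] big;
    big.length = 4096;
    auto off = big.ptr - cast(ubyte*) GC.addrOf(big.ptr);
    assert(off > 0);
}
```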
Sep 30 2014
On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven Schveighoffer wrote:
> First, it is the capacity. It's just that the capacity lives at the
> beginning of larger blocks. [...]

Ah, that must be what confused me - I looked at the small block offset calculation originally and blindly assumed the same logic for other sizes. Sorry, my fault!

> It may very well be that we can put it at the end of the block instead,
> and you can probably do so without much effort in the runtime
> (everything uses CTFE functions to calculate padding and the location
> of the capacity). [...]

I think it should be possible. That way the actual block size will simply be considered a bit smaller, and extending happens before the reserved space is hit. But of course I have only a very vague knowledge of druntime acquired while porting CDGC, so I may need to think about it a bit more and probably chat with Leandro too :)

Have created a bugzilla issue for now: https://issues.dlang.org/show_bug.cgi?id=13558
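[Editor's note] For context on how the reserved capacity behaves from user code, a small sketch: growing within the reported .capacity should extend the array in place, without disturbing the reserved metadata (assuming the array owns its block, as it does right after allocation):

```d
import core.memory;

void main()
{
    ubyte[] a;
    a.length = 100;

    // .capacity reports how far the array can grow inside this block.
    auto cap = a.capacity;
    assert(cap >= a.length);

    // Growing up to the reported capacity should not move the data.
    auto oldPtr = a.ptr;
    a.length = cap;
    assert(a.ptr is oldPtr);
}
```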
Sep 30 2014
On 9/30/14 12:01 PM, Dicebot wrote:
> I think it should be possible. That way the actual block size will
> simply be considered a bit smaller, and extending happens before the
> reserved space is hit. But of course I have only a very vague knowledge
> of druntime acquired while porting CDGC, so I may need to think about
> it a bit more and probably chat with Leandro too :)

I think it is possible, and perhaps more correct, to put the array append size into a separate memory block for larger sizes. The placement at the end works very well and has good properties for smaller sizes. This is similar to how the flags work.

-Steve
Sep 30 2014
On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven Schveighoffer wrote:
> So I made the call to put it at the beginning of the block, which
> obviously doesn't change, and offset everything by 16 bytes to maintain
> alignment. [...]

Yes, a lot of this is an artifact of the relatively simplistic manner in which the current GC tracks memory. If large blocks had a header, for example, then this could theoretically live there and not cause any problems. As we move towards supporting precise scanning, the GC will need to be aware of the types of data it holds, and so some portion of the array appendability strategy should probably migrate into the GC. A redefinition of the GC interface is many years overdue. This just needs to be considered when it happens.
Sep 30 2014
On Tuesday, 30 September 2014 at 17:28:02 UTC, Sean Kelly wrote:
> Yes, a lot of this is an artifact of the relatively simplistic manner
> in which the current GC tracks memory. If large blocks had a header,
> for example, then this could theoretically live there and not cause any
> problems. [...]

I decided to add a similar workaround to CDGC for now and fix the way it is stored in druntime a bit later :)

On a related note: the CDGC D2 port passes the druntime test suite starting today (with only shared library tests disabled), so the initial upstream PR should happen very soon (just need to clean up all the trace and "omg hax" commits :))
Sep 30 2014
On Tue, 30 Sep 2014 17:43:04 +0000 Dicebot via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On a related note: the CDGC D2 port passes the druntime test suite
> starting today (with only shared library tests disabled), so the
> initial upstream PR should happen very soon (just need to clean up all
> the trace and "omg hax" commits :))

wow! no, really, what else can i say? ;-)
Sep 30 2014
On Tuesday, 30 September 2014 at 13:42:14 UTC, Dicebot wrote:
> Is such behaviour intended?

Yes. As far as the GC is concerned, asking for the size of an interior pointer is asking for the size of a slice of the data, and as slices are not extendable it will always return 0. All of the APPENDABLE stuff takes place outside the GC (except for the definition of the APPENDABLE BlkAttr, which really should be defined externally and within the user-reserved range of the bitfield instead of the GC-reserved range, but I digress...) and so it has no way of knowing that someone is using the blocks this way.
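[Editor's note] The point above holds for any interior pointer, not just the array runtime's offset one - e.g. a pointer into the middle of a plain GC.malloc block. A sketch, relying on the GC.sizeOf behaviour for interior pointers described in this message:

```d
import core.memory;

void main()
{
    // A plain GC.malloc block: the returned pointer is the base address.
    auto block = cast(ubyte*) GC.malloc(256);
    assert(GC.sizeOf(block) >= 256);

    // Any interior pointer is treated like a slice into the block and
    // reports a size of 0:
    assert(GC.sizeOf(block + 16) == 0);
}
```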
Sep 30 2014
On 9/30/14 1:23 PM, Sean Kelly wrote:
> (except for the definition of the APPENDABLE BlkAttr, which really
> should be defined externally and within the user-reserved range of the
> bitfield instead of the GC-reserved range, but I digress...)

The APPENDABLE bit was added as a solution to avoid having to reserve that memory for all allocations. Basically, if you try to append to a block that doesn't have the bit, it simply reallocates conservatively. So it does have to be part of the GC metadata, because the most important effect is on blocks that AREN'T allocated via the array runtime. Otherwise, the append mechanism creeps into all aspects of memory allocation.

-Steve
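[Editor's note] The distinction Steven describes can be checked with GC.getAttr; a block allocated directly with GC.malloc carries no APPENDABLE bit. A sketch assuming the BlkAttr interface in core.memory:

```d
import core.memory;

void main()
{
    // Direct GC.malloc: no APPENDABLE bit, so the array runtime would
    // reallocate conservatively on append rather than grow in place.
    auto p = GC.malloc(64);
    assert((GC.getAttr(p) & GC.BlkAttr.APPENDABLE) == 0);

    // Requesting the bit explicitly marks the block as appendable.
    auto q = GC.malloc(64, GC.BlkAttr.APPENDABLE);
    assert(GC.getAttr(q) & GC.BlkAttr.APPENDABLE);
}
```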
Sep 30 2014
On Tuesday, 30 September 2014 at 17:51:18 UTC, Steven Schveighoffer wrote:
> The APPENDABLE bit was added as a solution to avoid having to reserve
> that memory for all allocations. [...] So it does have to be part of
> the GC metadata, because the most important effect is on blocks that
> AREN'T allocated via the array runtime.

Yeah, I know. But when I defined the BlkAttr bitfield I'd reserved one portion of the range for internal GC stuff and another portion for user-defined stuff. APPENDABLE should probably have landed in the user-defined portion. I don't see any of those comments in the current code or I'd point to them. I guess they were deleted at some point.
Sep 30 2014
On 9/30/14 2:19 PM, Sean Kelly wrote:
> Yeah, I know. But when I defined the BlkAttr bitfield I'd reserved one
> portion of the range for internal GC stuff and another portion for
> user-defined stuff. APPENDABLE should probably have landed in the
> user-defined portion.

Hm... looked at the code, I have no idea how the GC would handle user-defined stuff. It seems to only deal with bits it knows about (i.e. APPENDABLE is specifically handled in gc/gc.d).

I think I just took the next bit available; if there was a warning about GC internal bits when I added APPENDABLE, I either missed it or dismissed it.

-Steve
Sep 30 2014
On Tuesday, 30 September 2014 at 19:19:18 UTC, Steven Schveighoffer wrote:
> Hm... looked at the code, I have no idea how the GC would handle
> user-defined stuff. It seems to only deal with bits it knows about
> (i.e. APPENDABLE is specifically handled in gc/gc.d)

It wouldn't. The GC was just going to provide storage for some number of user-defined bits.

> I think I just took the next bit available; if there was a warning
> about GC internal bits when I added APPENDABLE, I either missed it or
> dismissed it.

With BlkAttr defined independently in a bunch of different places at the time, the comment would have been easy to miss. I don't know that it really matters anyway, but it's something I noticed today.
Sep 30 2014