
digitalmars.D - GC.sizeOf(array.ptr)

reply "Dicebot" <public dicebot.lv> writes:
There is one issue I have encountered during CDGC porting that I 
may need advice with.
Consider this simple snippet:

```
import core.memory;

void main()
{
     ubyte[] result;
     result.length = 4096;
     assert(GC.sizeOf(result.ptr) > 0);
}
```

The assertion passes with the D1/Tango runtime but fails with the 
current D2 runtime. This happens because `result.ptr` is not 
actually the pointer returned by gc_qalloc from the array 
reallocation, but an interior pointer 16 bytes past the start of 
that block. Druntime stores some metadata (length/capacity, I 
presume) at the very beginning.

As a result, GC.sizeOf(array.ptr) returns 0 (array.ptr being an 
interior pointer).

Interesting side effect is that this code:

```
void main()
{
     ubyte[] result;
     result.length = 4096;
     GC.free(result.ptr);
}
```

...does not actually free anything, for the very same reason 
(result.ptr is an interior pointer); it just silently succeeds, 
doing nothing.
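
For reference, the documented way to map an interior pointer back to its block base is `GC.addrOf`. A minimal sketch (against the D2 runtime behaviour described here) showing that the base pointer behaves as expected where the interior pointer does not:

```d
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;

    // GC.addrOf maps any pointer into a GC block back to the block's
    // base address, so it "sees through" the metadata offset:
    void* base = GC.addrOf(result.ptr);
    assert(base !is null);
    assert(GC.sizeOf(base) > 0);   // the base pointer does report a size

    // Freeing via the base address actually releases the block,
    // unlike GC.free(result.ptr):
    GC.free(base);
}
```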

Is such behaviour intended?
Sep 30 2014
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/30/14 9:42 AM, Dicebot wrote:
 There is one issue I have encountered during CDGC porting I may need
 advice with.
 Consider this simple snippet:

 ```
 import core.memory;

 void main()
 {
      ubyte[] result;
      result.length = 4096;
      assert(GC.sizeOf(result.ptr) > 0);
 }
 ```

 Assertion passes with D1/Tango runtime but fails with current D2
 runtime. This happens because `result.ptr` is not actually a pointer
 returned by gc_qalloc from array reallocation, but interior pointer 16
 bytes from the start of that block. Druntime stores some metadata
 (length/capacity I presume) in the very beginning.
This is accurate: it stores the "used" size of the array. But that is only the case for arrays, not for general GC.malloc blocks.

An alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate), but it doesn't cover the same inputs.

Trivial note: you only need to allocate 2047 bytes to get this behavior.
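
A minimal illustration of the `capacity` alternative (a sketch assuming the runtime behaviour described in this thread, where `GC.sizeOf` on the interior pointer yields 0):

```d
import core.memory;

void main()
{
    ubyte[] result;
    result.length = 4096;

    // capacity goes through the array runtime, which understands the
    // metadata layout, so it works where GC.sizeOf(result.ptr) does not:
    assert(result.capacity >= result.length);

    // By contrast, sizeOf treats the interior pointer as a mere slice
    // (on the runtime version discussed in this thread):
    assert(GC.sizeOf(result.ptr) == 0);
}
```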
 As a result GC.sizeOf(array.ptr) results in 0 (being an interior pointer).

 Interesting side effect is that this code:

 ```
 void main()
 {
      ubyte[] result;
      result.length = 4096;
      GC.free(result.ptr);
 }
 ```

 ..does not actually free anything for the very same reason (result.ptr
 is interior pointer), it just silently succeeds doing nothing.
This is a problem. We should provide a definitive way to deallocate an array, if one is not already present (assuming delete goes away).
 Is such behaviour intended?
Depends on what part you are talking about ;) But kidding aside, when I added the array appending update, I didn't intend to affect these functions, nor did I consider the cases above. So it was not intended to break these things, but I think we can repair them, or at least provide alternate means to create the same result.

Interestingly, I think any lookup of block information for an interior pointer costs the same as looking up block info for a base pointer. I think adding a flag that allows performing operations on interior pointers might make this more palatable.

But we absolutely need a way to free an array that does not require knowledge of the inner workings of the runtime.

-Steve
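
In the meantime, something like the following helper can serve (a sketch, not part of druntime; the name `freeArray` is invented here). It leans on `GC.addrOf` to recover the block base before freeing:

```d
import core.memory;

/// Hypothetical helper: free the GC block backing a dynamic array,
/// resolving any metadata offset first. Only safe if no other slice
/// references the same block.
void freeArray(T)(ref T[] arr)
{
    if (arr.ptr is null)
        return;
    // addrOf maps the (possibly interior) pointer back to the block
    // base, so a metadata prefix on large array blocks is handled.
    GC.free(GC.addrOf(arr.ptr));
    arr = null;
}

void main()
{
    ubyte[] result;
    result.length = 4096;
    freeArray(result);
    assert(result is null);
}
```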
Sep 30 2014
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven 
Schveighoffer wrote:
 Assertion passes with D1/Tango runtime but fails with current 
 D2
 runtime. This happens because `result.ptr` is not actually a 
 pointer
 returned by gc_qalloc from array reallocation, but interior 
 pointer 16
 bytes from the start of that block. Druntime stores some 
 metadata
 (length/capacity I presume) in the very beginning.
This is accurate, it stores the "used" size of the array. But it's only the case for arrays, not general GC.malloc blocks. Alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate). But it doesn't cover the same inputs.
Why is it stored at the beginning and not at the end of the block (like capacity)? I'd like to explore options for removing the interior pointer completely before proceeding with adding more special cases to GC functions.
Sep 30 2014
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/30/14 10:24 AM, Dicebot wrote:
 On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven Schveighoffer wrote:
 Assertion passes with D1/Tango runtime but fails with current D2
 runtime. This happens because `result.ptr` is not actually a pointer
 returned by gc_qalloc from array reallocation, but interior pointer 16
 bytes from the start of that block. Druntime stores some metadata
 (length/capacity I presume) in the very beginning.
This is accurate, it stores the "used" size of the array. But it's only the case for arrays, not general GC.malloc blocks. Alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate). But it doesn't cover the same inputs.
Why is it stored in the beginning and not in the end of the block (like capacity)? I'd like to explore options of removing interior pointer completely before proceeding with adding more special cases to GC functions.
First, it is the capacity. It's just that the capacity lives at the beginning of larger blocks.

The reason is the ability to extend pages. With smaller blocks (2048 bytes or less), the page is divided into equal portions, and those can NEVER be extended. Any attempt to extend results in a realloc into another block. Putting the capacity at the end makes sense for two reasons:

1. One byte is already reserved to prevent cross-block pointers.
2. It doesn't cause alignment issues. We can't very well offset a 16-byte block by 16 bytes.

But importantly, the capacity field does not move.

However, for page size and above (4096+ bytes), the original (D1 and early D2) runtime would attempt to extend into the next page, without moving the data. Thus we save the copy of data into a new block, and just set some bits and we're done.

But this poses a problem when the capacity field is stored at the end -- especially since we are caching the block info. The block info can change with a call to GC.extend (whereas for a fixed-size block, the block info CANNOT change). Depending on what "version" of the block info you have, the "end" can be different, and you may end up corrupting data. This is especially important for shared or immutable array blocks, where multiple threads could be appending at the same time.

So I made the call to put it at the beginning of the block, which obviously doesn't change, and offset everything by 16 bytes to maintain alignment.

It may very well be that we can put it at the end of the block instead, and you can probably do so without much effort in the runtime (everything uses CTFE functions to calculate padding and the location of the capacity). It has been such a long time since I did that, I'm not very sure of all the reasons not to do it. A look through the mailing list archives might be useful.

-Steve
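
The two layouts can be observed from user code. A sketch (assuming the metadata placement described in this thread; the exact offsets are runtime internals, not a stable API):

```d
import core.memory;

void main()
{
    // Small block (<= 2048 bytes): the capacity sits at the END of
    // the block, so the array data starts at the block base.
    ubyte[] small;
    small.length = 100;
    assert(GC.addrOf(small.ptr) is cast(void*) small.ptr);

    // Page-sized block (4096+ bytes): the capacity sits in a prefix
    // at the START, so the data pointer is interior to the block.
    ubyte[] large;
    large.length = 4096;
    auto offset = large.ptr - cast(ubyte*) GC.addrOf(large.ptr);
    assert(offset > 0);   // 16 bytes on the runtime described here
}
```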
Sep 30 2014
next sibling parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven 
Schveighoffer wrote:
 On 9/30/14 10:24 AM, Dicebot wrote:
 On Tuesday, 30 September 2014 at 14:01:17 UTC, Steven 
 Schveighoffer wrote:
 Assertion passes with D1/Tango runtime but fails with 
 current D2
 runtime. This happens because `result.ptr` is not actually a 
 pointer
 returned by gc_qalloc from array reallocation, but interior 
 pointer 16
 bytes from the start of that block. Druntime stores some 
 metadata
 (length/capacity I presume) in the very beginning.
This is accurate, it stores the "used" size of the array. But it's only the case for arrays, not general GC.malloc blocks. Alternative is to use result.capacity, which essentially looks up the same thing (and should be more accurate). But it doesn't cover the same inputs.
Why is it stored in the beginning and not in the end of the block (like capacity)? I'd like to explore options of removing interior pointer completely before proceeding with adding more special cases to GC functions.
First, it is the capacity. It's just that the capacity lives at the beginning of larger blocks.

The reason is due to the ability to extend pages. With smaller blocks (2048 bytes or less), the page is divided into equal portions, and those can NEVER be extended. Any attempt to extend results in a realloc into another block. Putting the capacity at the end makes sense for 2 reasons: 1. 1 byte is already reserved to prevent cross-block pointers, 2. It doesn't cause alignment issues. We can't very well offset a 16 byte block by 16 bytes. But importantly, the capacity field does not move.

However, for page and above size (4096+ bytes), the original (D1 and early D2) runtime would attempt to extend into the next page, without moving the data. Thus we save the copy of data into a new block, and just set some bits and we're done.
Ah, that must be what confused me - I looked at the small block offset calculation originally and blindly assumed the same logic for the other sizes. Sorry, my fault!
 But this poses a problem for when the capacity field is stored 
 at the end -- especially since we are caching the block info. 
 The block info can change with a call to GC.extend (whereas a 
 fixed-size block, the block info CANNOT change). Depending on 
 what "version" of the block info you have, the "end" can be 
 different, and you may end up corrupting data. This is 
 especially important for shared or immutable array blocks, 
 where multiple threads could be appending at the same time.

 So I made the call to put it at the beginning of the block, 
 which obviously doesn't change, and offset everything by 16 
 bytes to maintain alignment.

 It may very well be that we can put it at the end of the block 
 instead, and you can probably do so without much effort in the 
 runtime (everything uses CTFE functions to calculate padding 
 and location of the capacity). It has been such a long time 
 since I did that, I'm not very sure of all the reasons not to 
 do it. A look through the mailing list archives might be useful.
I think it should be possible. That way the actual block size will simply be considered a bit smaller, and extending will happen before the reserved space is hit. But of course I have only a very vague knowledge of druntime, acquired while porting CDGC, so I may need to think about it a bit more and probably chat with Leandro too :)

I have created a Bugzilla issue for now: https://issues.dlang.org/show_bug.cgi?id=13558
Sep 30 2014
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/30/14 12:01 PM, Dicebot wrote:

 I think it should be possible. That way actual block size will be simply
 considered a bit smaller and extending happen before reserved space is
 hit. But of course I have only a very vague knowledge of druntime
 acquired while porting cdgc so may need to think about it a bit more
 and probably chat with Leandro too :)
I think it is possible, and perhaps more correct, to put the array append size into a separate memory block for larger sizes. The placement at the end works very well and has good properties for smaller sizes. This is similar to how the flags work. -Steve
Sep 30 2014
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven
Schveighoffer wrote:
 So I made the call to put it at the beginning of the block, 
 which obviously doesn't change, and offset everything by 16 
 bytes to maintain alignment.

 It may very well be that we can put it at the end of the block 
 instead, and you can probably do so without much effort in the 
 runtime (everything uses CTFE functions to calculate padding 
 and location of the capacity). It has been such a long time 
 since I did that, I'm not very sure of all the reasons not to 
 do it. A look through the mailing list archives might be useful.
Yes, a lot of this is an artifact of the relatively simplistic manner in which the current GC tracks memory. If large blocks had a header, for example, then this could theoretically live there and not cause any problems.

As we move towards supporting precise scanning, the GC will need to be aware of the types of data it holds, and so some portion of the array appendability strategy should probably migrate into the GC. A redefinition of the GC interface is many years overdue. This just needs to be considered when it happens.
Sep 30 2014
parent reply "Dicebot" <public dicebot.lv> writes:
On Tuesday, 30 September 2014 at 17:28:02 UTC, Sean Kelly wrote:
 On Tuesday, 30 September 2014 at 15:46:54 UTC, Steven
 Schveighoffer wrote:
 So I made the call to put it at the beginning of the block, 
 which obviously doesn't change, and offset everything by 16 
 bytes to maintain alignment.

 It may very well be that we can put it at the end of the block 
 instead, and you can probably do so without much effort in the 
 runtime (everything uses CTFE functions to calculate padding 
 and location of the capacity). It has been such a long time 
 since I did that, I'm not very sure of all the reasons not to 
 do it. A look through the mailing list archives might be 
 useful.
Yes, a lot of this is an artifact of the relatively simplistic manner that the current GC tracks memory. If large blocks had a header, for example, then this could theoretically live there and not cause any problems. As we move towards supporting precise scanning, the GC will need to be aware of the types of data it holds, and so some portion of the array appendability strategy should probably migrate into the GC. A redefinition of the GC interface is many years overdue. This just needs to be considered when it happens.
I decided to add a similar workaround to CDGC for now and fix the way it is stored in druntime a bit later :)

On a related note: the CDGC D2 port passes the druntime test suite as of today (with only the shared library tests disabled), so the initial upstream PR should happen very soon (just need to clean up all the trace and "omg hax" commits :))
Sep 30 2014
parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 30 Sep 2014 17:43:04 +0000
Dicebot via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On a related note : CDGC D2 port passes druntime test suite
 starting with today (with only shared library tests disabled), so
 initial upstream PR should happen very soon (just need to clean
 up all trace and "omg hax" commits :))
wow! no, really, what else can i say? ;-)
Sep 30 2014
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 30 September 2014 at 13:42:14 UTC, Dicebot wrote:
 Is such behaviour intended?
Yes. As far as the GC is concerned, asking for the size of an interior pointer is asking for the size of a slice of the data, and as slices are not extendable it will always return 0.

All of the APPENDABLE stuff takes place outside the GC (except for the definition of the APPENDABLE BlkAttr, which really should be defined externally and within the user-reserved range of the bitfield instead of the GC-reserved range, but I digress...) and so it has no way of knowing that someone is using the blocks this way.
Sep 30 2014
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/30/14 1:23 PM, Sean Kelly wrote:

 (except for the
 definition of the APPENDABLE BlkAttr, which really should be
 defined externally and within the user-reserved range of the
 bitfield instead of the GC-reserved range, but I digress...)
The APPENDABLE bit was added as a solution to avoid having to reserve that memory for all allocations. Basically, if you try to append to a block that doesn't have the bit, it simply reallocates conservatively.

So it does have to be part of the GC metadata, because the most important effect is on blocks that AREN'T allocated via the array runtime. Otherwise, the append mechanism creeps into all aspects of memory allocation.

-Steve
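
A sketch of how the bit can be observed through the public API, using `GC.getAttr` and `GC.BlkAttr.APPENDABLE` from `core.memory`:

```d
import core.memory;

void main()
{
    // An array grown through the array runtime is marked APPENDABLE:
    ubyte[] arr;
    arr.length = 100;
    assert(GC.getAttr(GC.addrOf(arr.ptr)) & GC.BlkAttr.APPENDABLE);

    // A raw GC.malloc block is not, so appending through a slice of
    // it falls back to a conservative reallocation:
    void* p = GC.malloc(100);
    assert(!(GC.getAttr(p) & GC.BlkAttr.APPENDABLE));
}
```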
Sep 30 2014
parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 30 September 2014 at 17:51:18 UTC, Steven
Schveighoffer wrote:
 On 9/30/14 1:23 PM, Sean Kelly wrote:

 (except for the
 definition of the APPENDABLE BlkAttr, which really should be
 defined externally and within the user-reserved range of the
 bitfield instead of the GC-reserved range, but I digress...)
The APPENDABLE bit was added as a solution to avoid having to reserve that memory for all allocations. Basically, if you try to append to a block that doesn't have the bit, it simply reallocates conservatively. So it does have to be part of the GC metadata, because the most important effect is on blocks that AREN'T allocated via the array runtime. Otherwise, the append mechanism creeps into all aspects of memory allocation.
Yeah I know. But when I defined the BlkAttr bitfield I'd reserved one portion of the range for internal GC stuff and another portion for user-defined stuff. APPENDABLE should probably have landed in the user-defined portion. I don't see any of those comments in the current code or I'd point to them. I guess they were deleted at some point.
Sep 30 2014
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 9/30/14 2:19 PM, Sean Kelly wrote:
 On Tuesday, 30 September 2014 at 17:51:18 UTC, Steven
 Schveighoffer wrote:
 On 9/30/14 1:23 PM, Sean Kelly wrote:

 (except for the
 definition of the APPENDABLE BlkAttr, which really should be
 defined externally and within the user-reserved range of the
 bitfield instead of the GC-reserved range, but I digress...)
The APPENDABLE bit was added as a solution to avoid having to reserve that memory for all allocations. Basically, if you try to append to a block that doesn't have the bit, it simply reallocates conservatively. So it does have to be part of the GC metadata, because the most important effect is on blocks that AREN'T allocated via the array runtime. Otherwise, the append mechanism creeps into all aspects of memory allocation.
Yeah I know. But when I defined the BlkAttr bitfield I'd reserved one portion of the range for internal GC stuff and another portion for user-defined stuff. APPENDABLE should probably have landed in the user-defined portion. I don't see any of those comments in the current code or I'd point to them. I guess they were deleted at some point.
Hm... looking at the code, I have no idea how the GC would handle user-defined stuff. It seems to only deal with bits it knows about (i.e. APPENDABLE is specifically handled in gc/gc.d).

I think I just took the next bit available; if there was a warning about GC internal bits when I added APPENDABLE, I either missed it or dismissed it.

-Steve
Sep 30 2014
parent "Sean Kelly" <sean invisibleduck.org> writes:
On Tuesday, 30 September 2014 at 19:19:18 UTC, Steven
Schveighoffer wrote:
 Hm... looked at the code, I have no idea how the GC would 
 handle user-defined stuff. It seems to only deal with bits it 
 knows about (i.e. APPENDABLE is specifically handled in gc/gc.d)
It wouldn't. The GC was just going to provide storage for some number of user-defined bits.
 I think I just took the next bit available, if there was a 
 warning about GC internal bits when I added APPENDABLE, I 
 either missed it or dismissed it.
With BlkAttr defined independently in a bunch of different places at the time, the comment would have been easy to miss. I don't know that it really matters anyway, but it's something I noticed today.
Sep 30 2014