www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Dynamic array as stack and GC.BlkAttr.APPENDABLE

reply "IgorStepanov" <wazar mail.ru> writes:
Recently I encountered the following problem.
I need a simple stack of uint.
I want to push uints back and pop it. I don't want to copy this 
stack and I want it to work fast.

In a first approximation, the problem seems easy.

uint[] my_stack;
my_stack.reserve(256);

my_stack ~= 1; //push
uint head = my_stack[$ - 1]; //top
my_stack.length--; //pop
my_stack ~= 2; //push

However, when I changes the length and append new elem to the 
array after it, array is reallocated.

Ok, good point. Seems reasonable. I might create slice of array 
copy before changing of the length. And by default changing array 
length or creating slice should cause an array reallocation at 
appending.

However I don't plan to copy or creating slices of this array, 
thus I need to some magic for avoid reallocation.

I've found a GC.BlkAttr.APPENDABLE flag at core.memory.

How ever it hasn't helped me.
Is there another way to do it without manually managing of memory?
Nov 14 2014
next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Friday, 14 November 2014 at 22:16:42 UTC, IgorStepanov wrote:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy this 
 stack and I want it to work fast.
Nick wrote an article about this very thing. https://www.semitwist.com/articles/article/view/don-t-use-arrays-as-stacks
Nov 14 2014
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations. -- Dmitry Olshansky
Nov 14 2014
next sibling parent "IgorStepanov" <wazar mail.ru> writes:
On Friday, 14 November 2014 at 22:25:20 UTC, Dmitry Olshansky 
wrote:
 15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy 
 this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
Thanks! Your magic is works!
Nov 14 2014
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/14/14 5:25 PM, Dmitry Olshansky wrote:
 15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
In actuality, you do not need this before every push. You only need it between a pop and a push. I highly recommend not doing arrays-as-stacks, because you end up writing your own type. At that point, you may as well remove the funky array/slice semantics, and store the capacity yourself. -Steve
Nov 14 2014
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
15-Nov-2014 01:38, Steven Schveighoffer пишет:
 On 11/14/14 5:25 PM, Dmitry Olshansky wrote:
 15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
In actuality, you do not need this before every push. You only need it between a pop and a push. I highly recommend not doing arrays-as-stacks, because you end up writing your own type.
+1
 At that point, you may as well remove the funky
 array/slice semantics, and store the capacity yourself.
That's the second approximation ;) The third goes to C's malloc. The 4-th starts with some small array on stack, then goes to C's malloc. -- Dmitry Olshansky
Nov 14 2014
parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Sat, Nov 15, 2014 at 02:02:29AM +0300, Dmitry Olshansky via Digitalmars-d
wrote:
 15-Nov-2014 01:38, Steven Schveighoffer пишет:
On 11/14/14 5:25 PM, Dmitry Olshansky wrote:
15-Nov-2014 01:16, IgorStepanov пишет:
Recently I encountered the following problem.
I need a simple stack of uint.
I want to push uints back and pop it. I don't want to copy this stack
and I want it to work fast.

In a first approximation, the problem seems easy.

uint[] my_stack;
my_stack.reserve(256);

my_stack ~= 1; //push
uint head = my_stack[$ - 1]; //top
my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
In actuality, you do not need this before every push. You only need it between a pop and a push. I highly recommend not doing arrays-as-stacks, because you end up writing your own type.
+1
At that point, you may as well remove the funky array/slice
semantics, and store the capacity yourself.
That's the second approximation ;) The third goes to C's malloc. The 4-th starts with some small array on stack, then goes to C's malloc.
[...] This smells like it should be a Phobos module. ;-) No, seriously, I've reimplemented this myself a few times already. Mostly I've only gotten as far as assumeSafeAppend, but at least one instance seems to be pushing towards the second/third stage. T -- Trying to define yourself is like trying to bite your own teeth. -- Alan Watts
Nov 14 2014
prev sibling parent reply "IgorStepanov" <wazar mail.ru> writes:
On Friday, 14 November 2014 at 22:38:53 UTC, Steven Schveighoffer 
wrote:
 On 11/14/14 5:25 PM, Dmitry Olshansky wrote:
 15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy 
 this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
In actuality, you do not need this before every push. You only need it between a pop and a push. I highly recommend not doing arrays-as-stacks, because you end up writing your own type.
I'm working on AssociativeArray implementation and I need to array which contains indices of free entries (entries is an array which contain Entry objects. When we need to find empty Entry to add a new value to the AA, we are getting the first element of the stack. (If the stack is empty, we need to add new element to entries). If entry has been removed from AA, it index is added to the stack). Thus I don't want to add extra fields, and this stack is an encapsulated entity.
Nov 14 2014
parent reply "IgorStepanov" <wazar mail.ru> writes:
On Friday, 14 November 2014 at 23:12:05 UTC, IgorStepanov wrote:
 On Friday, 14 November 2014 at 22:38:53 UTC, Steven 
 Schveighoffer wrote:
 On 11/14/14 5:25 PM, Dmitry Olshansky wrote:
 15-Nov-2014 01:16, IgorStepanov пишет:
 Recently I encountered the following problem.
 I need a simple stack of uint.
 I want to push uints back and pop it. I don't want to copy 
 this stack
 and I want it to work fast.

 In a first approximation, the problem seems easy.

 uint[] my_stack;
 my_stack.reserve(256);

 my_stack ~= 1; //push
 uint head = my_stack[$ - 1]; //top
 my_stack.length--; //pop
Just make push into: my_stack.assumeSafeAppend(); my_stack ~= value; To avoid relocations.
In actuality, you do not need this before every push. You only need it between a pop and a push. I highly recommend not doing arrays-as-stacks, because you end up writing your own type.
I'm working on AssociativeArray implementation and I need to array which contains indices of free entries (entries is an array which contain Entry objects. When we need to find empty Entry to add a new value to the AA, we are getting the first element of the stack. (If the stack is empty, we need to add new element to entries). If entry has been removed from AA, it index is added to the stack). Thus I don't want to add extra fields, and this stack is an encapsulated entity.
And if we talk about AA: What does the NO_INTERIOR flag? Why the table (buckets array) is allocated with GC.calloc and NO_INTERIOR flag instead of new Entry[size];
Nov 14 2014
parent reply ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 14 Nov 2014 23:23:17 +0000
IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 What does the NO_INTERIOR flag?
it stops GC to acknowledge pointers inside allocated area as anchors. i.e. if there is no pointer to the head (first address) of allocated memory, it is assumed to be garbage. this way we have much less "false pointers", and GC not doing pointer->block conversions. for buckets we certainly has "head pointer" and can't have pointers to bucket elements without "head pointer". so it's safe to tell GC that it shouldn't do unnecessary work.
Nov 14 2014
parent reply "IgorStepanov" <wazar mail.ru> writes:
On Friday, 14 November 2014 at 23:49:00 UTC, ketmar via 
Digitalmars-d wrote:
 On Fri, 14 Nov 2014 23:23:17 +0000
 IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com> 
 wrote:

 What does the NO_INTERIOR flag?
it stops GC to acknowledge pointers inside allocated area as anchors. i.e. if there is no pointer to the head (first address) of allocated memory, it is assumed to be garbage. this way we have much less "false pointers", and GC not doing pointer->block conversions. for buckets we certainly has "head pointer" and can't have pointers to bucket elements without "head pointer". so it's safe to tell GC that it shouldn't do unnecessary work.
In other words, if buckets array will contain only uint-s there is no reason to mark buckets with NO_INTERIOR?
Nov 14 2014
next sibling parent ketmar via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Sat, 15 Nov 2014 01:56:17 +0000
IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Friday, 14 November 2014 at 23:49:00 UTC, ketmar via=20
 Digitalmars-d wrote:
 On Fri, 14 Nov 2014 23:23:17 +0000
 IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com>=20
 wrote:

 What does the NO_INTERIOR flag?
it stops GC to acknowledge pointers inside allocated area as=20 anchors. i.e. if there is no pointer to the head (first address) of=20 allocated memory, it is assumed to be garbage. this way we have much less "false pointers", and GC not doing pointer->block conversions. for buckets we certainly has "head pointer" and can't have=20 pointers to bucket elements without "head pointer". so it's safe to tell GC=20 that it shouldn't do unnecessary work.
=20 In other words, if buckets array will contain only uint-s there=20 is no reason to mark buckets with NO_INTERIOR?
that's not about bucket contents, that's about other pointers to buckets. with NO_INTERIOR you guarantees that things like `&element[1]`, `&element[2]` and so on cannot exist without `&element[0]`. i.e. there ALWAYS be live pointer to the head. and if there is no such pointer, GC will mark memory as "garbage" even if there are still pointers to other elements, like `&element[1]`, etc. it's not about what buckets themselves contains. to stop scanning *contents* one using NO_SCAN, but that's obviously wrong for buckets, as they will contain pointers to keys and values. as AA buckets are internal structures user can't slice 'em in his code, it's safe to set NO_INTERIOR flag. btw, core.memory documents all this flags. maybe it should be fixed if it's not clear enough? i don't know if it is clear for someone without expirience in GC.
Nov 14 2014
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/14/14 8:56 PM, IgorStepanov wrote:
 On Friday, 14 November 2014 at 23:49:00 UTC, ketmar via Digitalmars-d
 wrote:
 On Fri, 14 Nov 2014 23:23:17 +0000
 IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 What does the NO_INTERIOR flag?
it stops GC to acknowledge pointers inside allocated area as anchors. i.e. if there is no pointer to the head (first address) of allocated memory, it is assumed to be garbage. this way we have much less "false pointers", and GC not doing pointer->block conversions. for buckets we certainly has "head pointer" and can't have pointers to bucket elements without "head pointer". so it's safe to tell GC that it shouldn't do unnecessary work.
In other words, if buckets array will contain only uint-s there is no reason to mark buckets with NO_INTERIOR?
In case ketmar's reply doesn't drive it home... NO_INTERIOR means that the GC should NOT consider pointers to a block as valid references if those pointers don't point to the HEAD of the block. In other words, they point to the interior. An example int * x = new int; *x = 5; byte *b = cast(byte *)x; b++; // b is now an interior pointer, yet x is not. If we had marked x's block as NO_INTERIOR, we are fine as long as x remains. If we set x to null, there is a danger that the block is collected, and now b is dangling. Now, this isn't a whole lot more efficient, you still have to look up what b points at and see that it has the flag set. BUT, where it DOES help is if you had some size_t somewhere on a stack, that happens to be the same value as b, the GC may think it's a pointer, but correctly not keep the block alive just for that "false pointer". The larger the block, the more helpful NO_INTERIOR is. -Steve
Nov 14 2014
parent reply "IgorStepanov" <wazar mail.ru> writes:
On Saturday, 15 November 2014 at 03:41:56 UTC, Steven 
Schveighoffer wrote:
 On 11/14/14 8:56 PM, IgorStepanov wrote:
 On Friday, 14 November 2014 at 23:49:00 UTC, ketmar via 
 Digitalmars-d
 wrote:
 On Fri, 14 Nov 2014 23:23:17 +0000
 IgorStepanov via Digitalmars-d <digitalmars-d puremagic.com> 
 wrote:

 What does the NO_INTERIOR flag?
it stops GC to acknowledge pointers inside allocated area as anchors. i.e. if there is no pointer to the head (first address) of allocated memory, it is assumed to be garbage. this way we have much less "false pointers", and GC not doing pointer->block conversions. for buckets we certainly has "head pointer" and can't have pointers to bucket elements without "head pointer". so it's safe to tell GC that it shouldn't do unnecessary work.
In other words, if buckets array will contain only uint-s there is no reason to mark buckets with NO_INTERIOR?
In case ketmar's reply doesn't drive it home... NO_INTERIOR means that the GC should NOT consider pointers to a block as valid references if those pointers don't point to the HEAD of the block. In other words, they point to the interior. An example int * x = new int; *x = 5; byte *b = cast(byte *)x; b++; // b is now an interior pointer, yet x is not. If we had marked x's block as NO_INTERIOR, we are fine as long as x remains. If we set x to null, there is a danger that the block is collected, and now b is dangling. Now, this isn't a whole lot more efficient, you still have to look up what b points at and see that it has the flag set. BUT, where it DOES help is if you had some size_t somewhere on a stack, that happens to be the same value as b, the GC may think it's a pointer, but correctly not keep the block alive just for that "false pointer". The larger the block, the more helpful NO_INTERIOR is. -Steve
Thanks for explanations. I undersood that now. However I have got a strange bug with NO_INTERIOR. I have the following function: static Bucket[] newBuckets(in size_t len) trusted pure nothrow { Bucket[] ret = new Bucket[len]; if (!__ctfe) { GC.setAttr(ret.ptr, GC.BlkAttr.NO_INTERIOR); } return ret; } The Bucket declaration has a two uint fields and one of those has a explicit initializer: private enum uint EmptyBucket = uint.max; static struct Bucket { uint depth; uint index = EmptyBucket; property bool occupied() const { return index != EmptyBucket; } } I call this function for creating new bucket array (in init() function, at first value inserting and at rehashing). I don't create a copy, slice, I don't get a array pointer. Before array creating and rehashing I use buckets only for access by index. However I have got a error, when `index` field initializes with some strange value when newBuckets argument is very big. (buckets.length is big value). When I comment a GC.setAttr(ret.ptr, GC.BlkAttr.NO_INTERIOR); line, code start works fine. Do I any fundamental error in this code? May be Bucket[] ret = new Bucket[len]; ret.ptr is not base pointer?
Nov 15 2014
parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 15.11.2014 23:40, IgorStepanov wrote:
 Do I any fundamental error in this code?
 May be Bucket[] ret = new Bucket[len]; ret.ptr is not base pointer?
Yes, for arrays larger than 2kB, the allocation length information is placed before the actual data, and ret.ptr has an offset of 16 bytes into the allocation block. As a consequence, applying BlkAttr.NO_INTERIOR will very likely cause the block to be collected. NO_INTERIOR has no effect for smaller blocks. That's why the existing AA code explicitely uses GC.malloc instead of new[].
Nov 16 2014
parent "IgorStepanov" <wazar mail.ru> writes:
On Sunday, 16 November 2014 at 09:28:25 UTC, Rainer Schuetze 
wrote:
 On 15.11.2014 23:40, IgorStepanov wrote:
 Do I any fundamental error in this code?
 May be Bucket[] ret = new Bucket[len]; ret.ptr is not base 
 pointer?
Yes, for arrays larger than 2kB, the allocation length information is placed before the actual data, and ret.ptr has an offset of 16 bytes into the allocation block. As a consequence, applying BlkAttr.NO_INTERIOR will very likely cause the block to be collected. NO_INTERIOR has no effect for smaller blocks. That's why the existing AA code explicitely uses GC.malloc instead of new[].
Thank you very much. That explains a lot.
Nov 16 2014
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 11/14/14 5:16 PM, IgorStepanov wrote:
 I've found a GC.BlkAttr.APPENDABLE flag at core.memory.
And BTW, don't use this flag. Only the runtime should use it. -Steve
Nov 14 2014