digitalmars.D.learn - Greedy memory handling

"monarch_dodra" <monarchdodra gmail.com> writes:

I have a function that will *massively* benefit from having a 
persistent internal buffer it can re-use (and grow) from call to 
call, instead of re-allocating on every call.

What I don't want is either of:
1. Setting a fixed size limit, in case the user ends up making 
repeated calls with something larger than that limit.
2. Having a single big call allocate a HUGE internal buffer that 
then consumes all my memory.

What I need is some sort of lazy buffer. Basically, the 
allocation holds, but I don't want it to prevent the GC from 
collecting the buffer if the GC deems it has gotten too big, or 
needs more memory.

Any idea on how to do something like that? Or literature?

Sep 11 2013

"Gary Willoughby" <dev nomad.so> writes:

On Wednesday, 11 September 2013 at 08:06:37 UTC, monarch_dodra 
wrote:
 I have a function that will *massively* benefit from having a 
 persistent internal buffer it can re-use (and grow) from call 
 to call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up 
 making repeated calls to something larger to my fixed size.
 2. For a single big call which will allocate a HUGE internal 
 buffer that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the 
 allocation holds, but I don't want it to prevent the GC from 
 collecting it if it deems it has gotten too big, or needs more 
 memory.

 Any idea on how to do something like that? Or literature?

I've done something similar before and the general rule then was 
to start with a small buffer and if you need more just double it. 
So start with something like 4k(?) (depending on what you need) 
and before each call make sure you have enough, if not double the 
buffer by reallocating. This way you grow the buffer but only 
when needed. Also doubling makes sure you are not reallocating 
for each call.

Take a look at the core.memory module in the runtime for the GC 
methods. The ones of interest to you are: GC.malloc(size) and 
GC.realloc(*buffer, newSize) or GC.extend(*buffer, minSize, 
desiredSize). You can then let the GC handle it or free it 
yourself with GC.free(*buffer).
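
The doubling rule described above could be sketched roughly as follows. This is only an illustration: the helper name `ensureCapacity` and the 4096-byte starting size are assumptions, and the buffer is expected to come from this helper alone (never from `new`), so that `buf.ptr` is always the base address of a GC block.

```d
import core.memory : GC;

// Hypothetical helper: returns a buffer of at least `needed` bytes,
// doubling the capacity whenever the current one is too small.
ubyte[] ensureCapacity(ubyte[] buf, size_t needed, size_t initial = 4096)
{
    assert(initial > 0);
    if (buf.length >= needed)
        return buf; // big enough already: no reallocation

    size_t newCap = buf.length ? buf.length : initial;
    while (newCap < needed)
        newCap *= 2;

    // With a null pointer GC.realloc behaves like GC.malloc.
    // NO_SCAN: the block holds raw bytes, not pointers.
    auto p = cast(ubyte*) GC.realloc(buf.ptr, newCap, GC.BlkAttr.NO_SCAN);
    return p[0 .. newCap];
}
```

Because the capacity only ever doubles, a long run of slowly growing requests triggers a logarithmic number of reallocations rather than one per call.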

Sep 11 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 11 September 2013 at 10:28:37 UTC, Gary Willoughby 
wrote:
 You can then let the GC handle it or free it yourself with 
 GC.free(*buffer).

But if the buffer is stored in a static variable, the GC will 
never collect it. I *could* also free it myself, but why/when 
would I do that?

Did you just let your buffer grow, and never let it get 
collected?

Is there a way to do something like "I'm using this buffer, but 
if you want to collect it, then go ahead. I'll reallocate a new 
one *if/when* I need it again"

Sep 11 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get collected?

 Is there a way to do something like "I'm using this buffer, but if you want to
 collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"

How about GC.addRoot and GC.removeRoot ... ?

Sep 11 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 11/09/13 13:14, Joseph Rushton Wakeling wrote:
 On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get collected?

 Is there a way to do something like "I'm using this buffer, but if you want to
 collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"

 How about GC.addRoot and GC.removeRoot ... ?

I should clarify that a bit more.  I mean, from what I understand, you want to 
be able to do something like this:

     void foo(/* vars */)
     {
         // 1. if buffer not allocated, allocate as necessary

         // 2. send GC a message: "Hey, I'm using this buffer!  Don't free!"

         // 3. carry out your calculations

         // 4. send GC a message: "Hey, this buffer can be freed if you need to."
     }

If I understand right, GC.addRoot should take care of (2) and GC.removeRoot can 
take care of (4).  Then, if there's a collection cycle in-between calls to foo, 
fine; if not, next time you enter foo(), the new call to GC.addRoot will protect 
the memory for the lifetime of the calculation.

But this is conjecture, not speaking from experience :-)

Sep 11 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 11 September 2013 at 11:19:27 UTC, Joseph Rushton
Wakeling wrote:
 On 11/09/13 13:14, Joseph Rushton Wakeling wrote:
 On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will 
 never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get 
 collected?

 Is there a way to do something like "I'm using this buffer, 
 but if you want to
 collect it, then go ahead. I'll reallocate a new one 
 *if/when* I need it again"

 How about GC.addRoot and GC.removeRoot ... ?

 I should clarify that a bit more.  I mean, from what I 
 understand, you want to be able to do something like this:

     void foo(/* vars */)
     {
         // 1. if buffer not allocated, allocate as necessary

         // 2. send GC a message: "Hey, I'm using this buffer!  
 Don't free!

         // 3. carry out your calculations

         // 4. send GC a message: "Hey, this buffer can be freed 
 if you need to."
     }

 If I understand right, GC.addRoot should take care of (2) and 
 GC.removeRoot can take care of (4).  Then, if there's a 
 collection cycle in-between calls to foo, fine; if not, next 
 time you enter foo(), the new call to GC.addRoot will protect 
 the memory for the lifetime of the calculation.

 But this is conjecture, not speaking from experience :-)

That's somewhat better, as it would allow the GC to collect my 
buffer, if it wants to, but I wouldn't actually know about it 
afterwards which leaves me screwed.

I *think* addRoot and removeRoot are really designed for passing 
GC memory to functions that aren't GC-scanned...
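
For illustration, that use case might look like the sketch below: a GC-allocated value whose only reference lives in malloc'ed memory, which the GC never scans. The struct `CContext` and the function `roundTrip` are made-up names for this example, not part of any library.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Hypothetical C-side structure living in malloc'ed memory,
// which the GC never scans.
struct CContext
{
    void* userData;
}

int roundTrip()
{
    auto ctx = cast(CContext*) malloc(CContext.sizeof);
    assert(ctx !is null, "malloc failed");
    scope (exit) free(ctx);

    int* value = new int;
    *value = 7;
    ctx.userData = value;

    // Pin the block: the GC cannot see the pointer stored in the C heap.
    GC.addRoot(ctx.userData);
    scope (exit) GC.removeRoot(ctx.userData); // runs before free(ctx)

    value = null;  // drop the D-side reference
    GC.collect();  // the root keeps the int alive across this collection
    return *cast(int*) ctx.userData;
}
```

Note that this only pins memory; it gives no notification when an unpinned block has been collected, which is the missing piece for the buffer problem.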

Sep 11 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my buffer, if it
 wants to, but I wouldn't actually know about it afterwards which leaves me
screwed.

Just to clarify, is this buffer meant only for internal use in your function or 
is it meant to be externally accessed as well?  I'd kind of assumed the former.

Either way, isn't it sufficient to have some kind of

     if (buf is null)
     {
         // allocate the buffer
     }

check in place?  The basic model seems right -- at the moment when you need the 
buffer, you check if it's allocated (and if not, allocate it as needed); you 
indicate to the GC that it shouldn't collect the memory; you use the buffer; 
and the moment it's no longer needed, you indicate to the GC that it's 
collectable again.

It means having to be very careful to check the buffer's allocation status 
whenever you want to use it, but I think that's an unavoidable consequence of 
wanting a static variable that can be freed if needed.

The alternative I thought of was something like comparing the size difference 
between the currently-needed buffer and the last-needed buffer (... or if you 
want to be over-the-top, compare to a running average :-), and if the current 
one is sufficiently smaller, free the old one and re-alloc a new one; but 
that's a bit _too_ greedy in the free-up-memory stakes, I think.
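
A rough sketch of that size-comparison alternative, assuming a made-up helper `fitBuffer` and an arbitrary shrink factor of 4; the buffer must only ever come from this helper, so that `GC.free` receives the base address of a block obtained from `GC.malloc`:

```d
import core.memory : GC;

// Hypothetical helper: reuse `buf` when it is big enough but not more than
// `shrinkFactor` times larger than needed; otherwise free it and start over.
ubyte[] fitBuffer(ubyte[] buf, size_t needed, size_t shrinkFactor = 4)
{
    if (buf.length >= needed && buf.length / shrinkFactor <= needed)
        return buf; // close enough in size: keep using it

    // Too small, or far larger than the current request.
    if (buf.ptr !is null)
        GC.free(buf.ptr);
    if (needed == 0)
        return null;

    // NO_SCAN: the block holds raw bytes, not pointers.
    auto p = cast(ubyte*) GC.malloc(needed, GC.BlkAttr.NO_SCAN);
    return p[0 .. needed];
}
```

The old contents are deliberately not preserved here, since the buffer is scratch space that is refilled on every call.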

Sep 11 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

11-Sep-2013 17:33, Joseph Rushton Wakeling пишет:
 On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my buffer,
 if it
 wants to, but I wouldn't actually know about it afterwards which
 leaves me screwed.

 Just to clarify, is this buffer meant only for internal use in your
 function or is it meant to be externally accessed as well?  I'd kind of
 assumed the former.

 Either way, isn't it sufficient to have some kind of

      if (buf is null)
      {
          // allocate the buffer
      }

 check in place?  The basic model seems right -- at the moment when you
 need the buffer, you check if it's allocated (and if not, allocate it as
 needed); you indicate to the GC that it shouldn't collect the memory;
 you use the buffer; and the moment it's no longer needed, you indicate
 to the GC that it's collectable again.

 It means having to be very careful to check the buffer's allocation
 status whenever you want to use it, but I think that's an unavoidable
 consequence of wanting a static variable that can be freed if needed.

Problem is - said GC-freed memory could be then reused in some way. I 
can't imagine how you'd test that the block that is allocated is *still 
your old* block.

-- 
Dmitry Olshansky

Sep 11 2013

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 11/09/13 15:45, Dmitry Olshansky wrote:
 Problem is - said GC-freed memory could be then reused in some way. I can't
 imagine how you'd test that the block that is allocated is *still your old*
block.

Ahh, nasty.  I'd assumed that the buffer would have been reset to null in the 
event that the GC freed its memory.

Sep 11 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 11 September 2013 at 13:33:23 UTC, Joseph Rushton 
Wakeling wrote:
 On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my 
 buffer, if it
 wants to, but I wouldn't actually know about it afterwards 
 which leaves me screwed.

 Just to clarify, is this buffer meant only for internal use in 
 your function or is it meant to be externally accessed as well?
  I'd kind of assumed the former.

 Either way, isn't it sufficient to have some kind of

     if (buf is null)
     {
         // allocate the buffer
     }

 check in place?  The basic model seems right -- at the moment 
 when you need the buffer, you check if it's allocated (and if 
 not, allocate it as needed); you indicate to the GC that it 
 shouldn't collect the memory; you use the buffer; and the 
 moment it's no longer needed, you indicate to the GC that it's 
 collectable again.

 It means having to be very careful to check the buffer's 
 allocation status whenever you want to use it, but I think 
 that's an unavoidable consequence of wanting a static variable 
 that can be freed if needed.

 The alternative I thought of was something like comparing the 
 size difference between the currently-needed buffer and the 
 last-needed buffer (... or if you want to be over-the-top, 
 compare to a running average:-), and if the current one is 
 sufficiently smaller, free the old one and re-alloc a new one; 
 but that's a bit _too_ greedy in the free-up-memory stakes, I 
 think.

The buffer is meant strictly for internal use. It never escapes 
the function it is used in, which is not re-entrant either.

Basically, I'm storing the buffer in a "static ubyte[]", and if 
there isn't enough room for what I'm doing, I simply make it 
grow. No problems there.

The issue I'm trying to solve is the "and the moment it's no 
longer needed" part. The function is really just a free function, 
in a library. The user could use it only once, or use it very 
repeatedly, I don't know. In particular, the amount of buffer 
needed has a 1:1 correlation with the user's input size. The user 
could repeatedly call me with input in the size of a couple of 
bytes, or just once or twice with input in the megabytes.

I *could* just allocate and forget about it, but I was curious 
about having a mechanism where the buffer would just be 
"potentially collected" between two calls. As a form of 
"failsafe" if it got too greedy, or if the user just hasn't used 
the function in a while.

Sep 11 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

11-Sep-2013 14:34, monarch_dodra пишет:
 On Wednesday, 11 September 2013 at 10:28:37 UTC, Gary Willoughby wrote:
 You can then let the GC handle it or free it yourself with
 GC.free(*buffer).

 But if the buffer is stored in a static variable, the GC will never
 collect it. I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get collected?

 Is there a way to do something like "I'm using this buffer, but if you
 want to collect it, then go ahead. I'll reallocate a new one *if/when* I
 need it again"

You need weak references. With a manually registered finalizer for your 
buffer plus a flag you might pull it off (but be extremely careful).
There is something like this in the upcoming std.signals2 IIRC.

Basically the sequence should be: pin the pointer with a strong ref if 
it's valid, use it, unpin. If it wasn't valid, it got collected, so 
allocate a new buffer and repeat. The "was valid" check is the ugly part, 
and prone to race conditions (a collection can be triggered from another 
thread, got to disable/enable etc.).

All in all this is the kind of stuff that:
a) Druntime/Phobos should provide
b) Is actually needed for other things as well

I'd file an enhancement if there isn't one already.
-- 
Dmitry Olshansky

Sep 11 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Wednesday, 11 September 2013 at 08:06:37 UTC, monarch_dodra 
wrote:
 I have a function that will *massively* benefit from having a 
 persistent internal buffer it can re-use (and grow) from call 
 to call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up 
 making repeated calls to something larger to my fixed size.
 2. For a single big call which will allocate a HUGE internal 
 buffer that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the 
 allocation holds, but I don't want it to prevent the GC from 
 collecting it if it deems it has gotten too big, or needs more 
 memory.

 Any idea on how to do something like that? Or literature?

I do not know if it fits, but I had a similar problem some time 
ago:
http://forum.dlang.org/thread/wsxajhlsupnraevowcgd forum.dlang.org

Sep 11 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-11 10:06, monarch_dodra wrote:
 I have a function that will *massively* benefit from having a persistent
 internal buffer it can re-use (and grow) from call to call, instead of
 re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up making
 repeated calls to something larger to my fixed size.
 2. For a single big call which will allocate a HUGE internal buffer that
 will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the allocation
 holds, but I don't want it to prevent the GC from collecting it if it
 deems it has gotten too big, or needs more memory.

 Any idea on how to do something like that? Or literature?

How about keeping a stack or static buffer. If that gets too small use a 
new buffer. When you're done with the new buffer set it to null to allow 
the GC to collect it. Then repeat.
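
Something along these lines, as a minimal sketch; the function `work` and the 4096-byte fixed size are placeholders, and the summing loop only stands in for the real computation:

```d
enum size_t fixedSize = 4096; // assumption: tune to the common input size

// Hypothetical worker: fills n bytes of scratch space and returns their sum,
// just so the example has an observable result.
size_t work(size_t n)
{
    static ubyte[fixedSize] fixed;   // thread-local, reused on every call
    ubyte[] buf = n <= fixed.length
        ? fixed[0 .. n]              // common case: no allocation at all
        : new ubyte[](n);            // oversized request: temporary GC block

    buf[] = 1;
    size_t sum = 0;
    foreach (b; buf)
        sum += b;

    buf = null; // drop the only reference so the GC may reclaim a large block
    return sum;
}
```

Small requests never allocate, and a single huge request costs one allocation that the GC is free to reclaim afterwards.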

-- 
/Jacob Carlborg

Sep 11 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
 On 2013-09-11 10:06, monarch_dodra wrote:
I have a function that will *massively* benefit from having a
persistent internal buffer it can re-use (and grow) from call to
call, instead of re-allocating on every call.

What I don't want is either of:
1. To set a fixed limitation of size, if the user ends up making
repeated calls to something larger to my fixed size.
2. For a single big call which will allocate a HUGE internal buffer
that will consume all my memory.

What I need is some sort of lazy buffer. Basically, the allocation
holds, but I don't want it to prevent the GC from collecting it if
it deems it has gotten too big, or needs more memory.

Any idea on how to do something like that? Or literature?

 
 How about keeping a stack or static buffer. If that gets too small
 use a new buffer. When you're done with the new buffer set it to
 null to allow the GC to collect it. Then repeat.

[...]

The problem is, he wants to reuse the buffer next time if the GC hasn't
collected it yet.

Here's an idea, though. It doesn't completely solve the problem, but it
just occurred to me that "weak pointers" (i.e., ignored by the GC for
the purposes of marking) can be simulated by XOR'ing the pointer value
with some mask so that it's not recognized as a pointer by the GC. This
can be encapsulated by a weak pointer struct that automatically does the
translation:

	struct WeakPointer(T) {
		enum size_t mask = 0xdeadbeef;
		union Impl {
			T* ptr;
			size_t uintVal;
		}
		Impl impl;
		void set(T* ptr) @system {
			impl.ptr = ptr;
			impl.uintVal ^= mask;
		}
		T* get() @system {
			Impl i = impl;
			i.uintVal ^= mask;
			return i.ptr;
		}
	}

	WeakPointer!Buffer bufferRef;

	void doWork(Args...) {
		T* buffer;
		if (bufferRef.get() is null) {
			// Buffer hasn't been allocated yet
			buffer = allocateNewBuffer();
			bufferRef.set(buffer);
		} else {
			void *p;
			core.memory.GC.getAttr(p);
			if (p is null || p != bufferRef.get()) {
				// GC has collected previous buffer
				buffer = allocateNewBuffer();
				bufferRef.set(buffer);
			}
		}
		useBuffer(buffer);
		...
	}

Note that the inner if block is not 100% safe, because there's no
guarantee that even if the base pointer of the block hasn't changed, the
GC hasn't reallocated the block to somebody else. So this part is still
yet to be solved.
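
For anyone who wants to experiment, here is a self-contained variant of just the masking part, as a sketch. It differs from the struct above in one respect, which is my own change: it stores the masked value in a plain size_t that starts out as a masked null, because a zero-initialised union would unmask to 0xdeadbeef rather than null before the first set(). The name `MaskedPtr` is made up, and it does nothing about detecting whether the block has been collected.

```d
struct MaskedPtr(T)
{
    // Assumption: an arbitrary mask. Whether the masked value can still be
    // mistaken for a heap address depends on the platform's address layout.
    enum size_t mask = 0xdeadbeef;

    // Integer storage, initialised to a masked null pointer.
    private size_t masked = mask;

    void set(T* p) @system nothrow @nogc
    {
        masked = cast(size_t) p ^ mask;
    }

    T* get() const @system nothrow @nogc
    {
        return cast(T*)(masked ^ mask);
    }
}
```

Usage is the same as before: `MaskedPtr!Buffer bufferRef;` followed by `bufferRef.set(p)` and `bufferRef.get()`.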


T

-- 
It is widely believed that reinventing the wheel is a waste of time; but
I disagree: without wheel reinventers, we would be still be stuck with
wooden horse-cart wheels.

Sep 12 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

12-Sep-2013 17:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
 On 2013-09-11 10:06, monarch_dodra wrote:
 I have a function that will *massively* benefit from having a
 persistent internal buffer it can re-use (and grow) from call to
 call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up making
 repeated calls to something larger to my fixed size.
 2. For a single big call which will allocate a HUGE internal buffer
 that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the allocation
 holds, but I don't want it to prevent the GC from collecting it if
 it deems it has gotten too big, or needs more memory.

 Any idea on how to do something like that? Or literature?

 How about keeping a stack or static buffer. If that gets too small
 use a new buffer. When you're done with the new buffer set it to
 null to allow the GC to collect it. Then repeat.

 [...]

 The problem is, he wants to reuse the buffer next time if the GC hasn't
 collected it yet.

 Here's an idea, though. It doesn't completely solve the problem, but it
 just occurred to me that "weak pointers" (i.e., ignored by the GC for
 the purposes of marking) can be simulated by XOR'ing the pointer value
 with some mask so that it's not recognized as a pointer by the GC. This
 can be encapsulated by a weak pointer struct that automatically does the
 translation:

 	struct WeakPointer(T) {
 		enum size_t mask = 0xdeadbeef;
 		union Impl {
 			T* ptr;
 			size_t uintVal;
 		}
 		Impl impl;
 		void set(T* ptr) @system {
 			impl.ptr = ptr;
 			impl.uintVal ^= mask;
 		}
 		T* get() @system {
 			Impl i = impl;
 			i.uintVal ^= mask;
 			return i.ptr;
 		}
 	}

 	WeakPointer!Buffer bufferRef;

 	void doWork(Args...) {
 		T* buffer;
 		if (bufferRef.get() is null) {
 			// Buffer hasn't been allocated yet
 			buffer = allocateNewBuffer();
 			bufferRef.set(buffer);
 		} else {
 			void *p;
 			core.memory.GC.getAttr(p);

The line above is not a 100% good idea, at least with 0xdeadbeef as the mask.

If we know which OS we compile for, we may just flip, say, the upper bit 
and get a pointer into kernel space (which surely isn't in the GC pool). 
Even then your last paragraph pretty much destroys it.


A better option is to have a finalizer hooked up to set some flag. Then 
_after_ restoring the pointer we consult that flag variable.

 			if (p is null || p != bufferRef.get()) {
 				// GC has collected previous buffer
 				buffer = allocateNewBuffer();
 				bufferRef.set(buffer);
 			}
 		}
 		useBuffer(buffer);
 		...
 	}

 Note that the inner if block is not 100% safe, because there's no
 guarantee that even if the base pointer of the block hasn't changed, the
 GC hasn't reallocated the block to somebody else. So this part is still
 yet to be solved.


 T


-- 
Dmitry Olshansky

Sep 12 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 17:51, H. S. Teoh пишет:

[...]
	struct WeakPointer(T) {
		enum size_t mask = 0xdeadbeef;
		union Impl {
			T* ptr;
			size_t uintVal;
		}
		Impl impl;
		void set(T* ptr) @system {
			impl.ptr = ptr;
			impl.uintVal ^= mask;
		}
		T* get() @system {
			Impl i = impl;
			i.uintVal ^= mask;
			return i.ptr;
		}
	}

	WeakPointer!Buffer bufferRef;

	void doWork(Args...) {
		T* buffer;
		if (bufferRef.get() is null) {
			// Buffer hasn't been allocated yet
			buffer = allocateNewBuffer();
			bufferRef.set(buffer);
		} else {
			void *p;
			core.memory.GC.getAttr(p);

 
 This line above is not 100% good idea .. at least with deadbeaf as
 mask.
 
 If we do know what OS you compile for we may just flip the say upper
 bit and get a pointer into kernel space (and surely that isn't in GC
 pool). Even then your last paragraph pretty much destroys it.

Well, that was just an example value. :) If we know which OS it is and
how it assigns VM addresses, then we can adjust the mask appropriately.

But yeah, calling GC.getAttr is unreliable since you can't tell whether
the block is what you had before, or somebody else's new data.


[...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.

Good idea. The problem is, how to set a finalizer on a memory block that
can change in size? The OP's original situation was that the buffer can
be extended while in use, but I don't know of any D type that can
associate a dtor with a ubyte[] array (note that the GC collecting the
wrapper struct/class around the ubyte[] is not the same as collecting
the actual memory block storing the ubyte[] -- the former can happen
without the latter).


T

-- 
People tell me that I'm skeptical, but I don't believe it.

Sep 12 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

12-Sep-2013 20:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 17:51, H. S. Teoh пишет:

 [...]
 	struct WeakPointer(T) {
 		enum size_t mask = 0xdeadbeef;
 		union Impl {
 			T* ptr;
 			size_t uintVal;
 		}
 		Impl impl;
 		void set(T* ptr) @system {
 			impl.ptr = ptr;
 			impl.uintVal ^= mask;
 		}
 		T* get() @system {
 			Impl i = impl;
 			i.uintVal ^= mask;
 			return i.ptr;
 		}
 	}

 	WeakPointer!Buffer bufferRef;

 	void doWork(Args...) {
 		T* buffer;
 		if (bufferRef.get() is null) {
 			// Buffer hasn't been allocated yet
 			buffer = allocateNewBuffer();
 			bufferRef.set(buffer);
 		} else {
 			void *p;
 			core.memory.GC.getAttr(p);

 This line above is not 100% good idea .. at least with deadbeaf as
 mask.

 If we do know what OS you compile for we may just flip the say upper
 bit and get a pointer into kernel space (and surely that isn't in GC
 pool). Even then your last paragraph pretty much destroys it.

 Well, that was just an example value. :) If we know which OS it is and
 how it assigns VM addresses, then we can adjust the mask appropriately.

 But yeah, calling GC.getAttr is unreliable since you can't tell whether
 the block is what you had before, or somebody else's new data.

It occurred to me that there are modes where the full address space is 
available, typically for an x86 app running on top of an x64 kernel (e.g. 
Windows WoW64 could do that; Linux also has the so-called x32 ABI).

 [...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.

 Good idea. The problem is, how to set a finalizer on a memory block that
 can change in size? The OP's original situation was that the buffer can
 be extended while in use, but I don't know of any D type that can
 associate a dtor with a ubyte[] array (note that the GC collecting the
 wrapper struct/class around the ubyte[] is not the same as collecting
 the actual memory block storing the ubyte[] -- the former can happen
 without the latter).

Double indirection? Allocate a class that has a finalizer, hold that via 
weak-ref. The wrapper in turn contains a pointer to the buffer. The 
interesting point then is that one may allocate said buffer via C's realloc.

Then once the helper object is collected the finalizer is called and this 
is where we call free to clean up C's heap.

I'm thinking this actually is going to work.

-- 
Dmitry Olshansky

Sep 12 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Sep 12, 2013 at 11:13:30PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 20:51, H. S. Teoh пишет:
On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:


[...]
Better option is to have finalizer hooked up to set some flag. Then
_after_ restoring the pointer we consult that flag variable.

Good idea. The problem is, how to set a finalizer on a memory block
that can change in size? The OP's original situation was that the
buffer can be extended while in use, but I don't know of any D type
that can associate a dtor with a ubyte[] array (note that the GC
collecting the wrapper struct/class around the ubyte[] is not the
same as collecting the actual memory block storing the ubyte[] -- the
former can happen without the latter).

 
 Double indirection? Allocate a class that has finalizer, hold that
 via weak-ref. The wrapper in turn contains a pointer to the buffer.
 The interesting point then is that one may allocate said buffer via
 C's realloc.
 
 Then once helper struct is collected the finalizer is called and
 this is where we call free to cleanup C's heap.
 
 I'm thinking this actually is going to work.

[...]

Interesting idea, use C's malloc/realloc to hold the actual buffer. Only
possible catch is, will that cause the GC to collect when it runs out of
memory (which is the whole point of the OP's question)? I.e., does it
make a difference in GC behaviour to allocate, say, 10MB from the GC vs.
allocating 10MB from malloc/realloc?

Assuming we have that settled, something like this should work:

	bool isValid;
	final class BufWrapper {
		void* ptrToMallocedBuf;
		this(void* ptr) {
			// We need this, 'cos otherwise we don't know if
			// our weak ref to BufWrapper is still valid!
			isValid = true;

			ptrToMallocedBuf = ptr;
		}
		~this() {
			// If we're being collected, free the real
			// buffer too.
			free(ptrToMallocedBuf);
			isValid = false;
		}
	}

	// WeakPointer masks the pointer to BufWrapper in some suitable
	// way so that the GC will collect it when needed.
	WeakPointer!BufWrapper wrappedBufRef;

	void doWork(...) {
		void* buf;
		if (!isValid) {
			buf = realloc(null, bufSize);
			wrappedBufRef.set(buf);
		} else {
			buf = wrappedBufRef.get();
		}

		// use buf here.
	}


T

-- 
Public parking: euphemism for paid parking. -- Flora

Sep 12 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Sep-2013 00:11, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 11:13:30PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 20:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:


 [...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.

 Good idea. The problem is, how to set a finalizer on a memory block
 that can change in size? The OP's original situation was that the
 buffer can be extended while in use, but I don't know of any D type
 that can associate a dtor with a ubyte[] array (note that the GC
 collecting the wrapper struct/class around the ubyte[] is not the
 same as collecting the actual memory block storing the ubyte[] -- the
 former can happen without the latter).

 Double indirection? Allocate a class that has finalizer, hold that
 via weak-ref. The wrapper in turn contains a pointer to the buffer.
 The interesting point then is that one may allocate said buffer via
 C's realloc.

 Then once helper struct is collected the finalizer is called and
 this is where we call free to cleanup C's heap.

 I'm thinking this actually is going to work.

 [...]

 Interesting idea, use C's malloc/realloc to hold the actual buffer. Only
 possible catch is, will that cause the GC to collect when it runs out of
 memory (which is the whole point of the OP's question)? I.e., does it
 make a difference in GC behaviour to allocate, say, 10MB from the GC vs.
 allocating 10MB from malloc/realloc?

The only problem I can foresee is that when the collection runs (*and* 
we are tight on RAM) the C heap will not return said chunk back to the 
OS. Then the GC won't pick up that memory, and we'd run out of RAM.

I would safely assume however that for big buffers mmap/munmap (or its 
analogue) is called and hence memory is returned back to the OS. That's 
what all allocators do for huge chunks anyway.

Otherwise we are still in a good shape, the memory will eventually be 
freed, yet we get to reuse it quite cheaply in a tight loop.  I don't 
expect collections to run in these all that often ;)

 Assuming we have that settled, something like this should work:

 	bool isValid;
 	final class BufWrapper {
 		void* ptrToMallocedBuf;
 		this(void* ptr) {
 			// We need this, 'cos otherwise we don't know if
 			// our weak ref to BufWrapper is still valid!
 			isValid = true;

 			ptrToMallocedBuf = ptr;
 		}
 		~this() {
 			// If we're being collected, free the real
 			// buffer too.
 			free(ptrToMallocedBuf);
 			isValid = false;
 		}
 	}

 	// WeakPointer masks the pointer to BufWrapper in some suitable
 	// way so that the GC will collect it when needed.
 	WeakPointer!BufWrapper wrappedBufRef;

 	void doWork(...) {
 		void* buf;

Careful here - you really have first to get a pointer ... THEN check if 
it's valid.

 		if (!isValid) {
 			buf = realloc(null, bufSize);
 			wrappedBufRef.set(buf);
 		} else {

//otherwise at this point GC.collect runs and presto, memory is freed
//too bad such a thing will never show up in unittests
 			buf = wrappedBufRef.get();
 		}

 		// use buf here.
 	}

Checking the flag should be somehow part of weak ref job.
I'd rather make it less error prone:

void* buf;
//unmask pointer, do the flag check - false means was freed
if(!weakRef.readTo(buf)){
	//create & set new buf
	buf = realloc(...);
}

... //use buf

weakRef.set(buf);

I think I'd code it up if nobody beats me to it as I need the same exact 
pattern for std.regex anyway.
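
Until then, a rough single-threaded sketch of the whole scheme might look like this. All names (`CBuffer`, `acquire`, `grow`, `wrapperAlive`, `maskedWrapper`) are made up for illustration, the mask is the example value from earlier in the thread, and it assumes the conservative stack scan keeps the wrapper alive while the caller holds the returned reference.

```d
import core.stdc.stdlib : free, realloc;

enum size_t mask = 0xdeadbeef; // assumption: example mask from the thread

// __gshared: a finalizer may run on whichever thread triggers the
// collection, so a thread-local flag would not be reliable.
__gshared bool wrapperAlive = false;
__gshared size_t maskedWrapper = mask; // masked null: no wrapper yet

final class CBuffer
{
    void* data;      // C-heap block, invisible to the GC
    size_t capacity;

    this() { wrapperAlive = true; }

    ~this()
    {
        // Runs when the GC collects the wrapper (or on destroy()).
        free(data);
        data = null;
        wrapperAlive = false;
    }

    void grow(size_t n)
    {
        if (n <= capacity)
            return;
        auto p = realloc(data, n);
        if (p is null)
            assert(0, "out of memory");
        data = p;
        capacity = n;
    }
}

CBuffer acquire(size_t n)
{
    // 1. Unmask first: from here on the stack holds a strong reference.
    auto w = cast(CBuffer) cast(void*)(maskedWrapper ^ mask);
    // 2. Only then consult the flag; if it is false, w is stale and unused.
    if (!wrapperAlive)
    {
        w = new CBuffer;
        maskedWrapper = cast(size_t) cast(void*) w ^ mask;
    }
    w.grow(n);
    return w;
}
```

The caller keeps the returned reference for the duration of the work and simply lets it go afterwards; only the masked copy survives between calls, so the GC is free to finalize the wrapper and release the C buffer.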

-- 
Dmitry Olshansky

Sep 12 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 12 September 2013 at 19:13:40 UTC, Dmitry Olshansky
wrote:
 Double indirection? Allocate a class that has finalizer, hold 
 that via weak-ref. The wrapper in turn contains a pointer to 
 the buffer. The interesting point then is that one may allocate 
 said buffer via C's realloc.

 Then once helper struct is collected the finalizer is called 
 and this is where we call free to cleanup C's heap.

 I'm thinking this actually is going to work.

Yum. I like this.

I was going to say: "At the end of the day, if the GC doesn't
*tell* us the collection happened, then the problem is not
solve-able. We'd need a way that would allow the GC to tell us
the memory was *finalized*". And then I'd go on to say "since our
GC is non-finalizing, there is simply no solution".

But then classes. Derp.

I'd be really interested in having a finalized solution. The
"details" of memory addressing are not my strong suit, so I
wouldn't trust myself with all those union{ptr/size_t} things.

Thanks, I'll start toying around with this :)

Sep 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-12 15:51, H. S. Teoh wrote:

 The problem is, he wants to reuse the buffer next time if the GC hasn't
 collected it yet.

I was thinking he could reuse the stack/static buffer. Basically using 
two buffers, one static and one dynamic.

-- 
/Jacob Carlborg

Sep 12 2013
