digitalmars.D.learn - Often repeated array allocations

Namespace (12/12) Jul 20 2013 Let us assume we have a method of a class which is used often and

Dmitry Olshansky (5/16) Jul 20 2013 5. Keep a TLS scratch pad buffer (static class member) for said

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (15/21) Jul 20 2013 As long as the function finishes with that buffer, there shouldn't be

Dmitry Olshansky (5/26) Jul 21 2013 Yes, this case is problematic. It cannot be called recursively even with...

Namespace (2/24) Jul 20 2013 TLS scratch pad buffer?

Dmitry Olshansky (22/41) Jul 21 2013 My analogy goes as follows: a chunk of memory for temporary needs =>

Namespace (6/26) Jul 21 2013 I really like the idea.

bearophile (6/10) Jul 21 2013 An idea is to add some debug{} code that computes statistics for

Namespace (3/13) Jul 21 2013 Too much effort. But if some of you have this experiences

monarch_dodra (14/33) Jul 22 2013 It would really help to know what your function is doing exactly.

Namespace (5/5) Jul 22 2013 It's part of Dgame and it's the shape draw method. The method

David (10/15) Jul 22 2013 This will be over 1024 vertices in no time, I am currently drawing

Namespace (6/32) Jul 22 2013 Hm, but Dgame isn't designed for that sort of thing. It's more

Dmitry Olshansky (4/10) Jul 21 2013 No one size fits all. Measure and profit.

bearophile (6/7) Jul 20 2013 D also supports alloca(), that for 1D arrays is almost bearable.

Namespace (3/10) Jul 20 2013 Yeah, but aren't 4000 elements a bit much for a stack allocated

bearophile (10/12) Jul 20 2013 4000 floats take about 16 KB. If your function is not recursive

Namespace (2/14) Jul 20 2013 So, you would never allocate with float[]?

bearophile (18/19) Jul 20 2013 Generally in D I allocate on the heap with new, and once in a

Namespace (12/12) Jul 20 2013 But D isn't like Ada. It's more like C++ and there Heap

Jacob Carlborg (12/23) Jul 21 2013 Perhaps:

Namespace (44/68) Jul 21 2013 But then I have mostly far too much and maybe a few times a bit

Jacob Carlborg (6/9) Jul 22 2013 You said you needed between 100 and 4000 floats. My suggestion will

bearophile (4/5) Jul 21 2013 But D should be a bit more like Ada, for such things :-)

Namespace (4/9) Jul 21 2013 I agree, but how? ;)

bearophile (26/29) Jul 21 2013 I think there is only a small number of Ada ideas worth copying,

"Namespace" <rswhite4 googlemail.com> writes:

Let us assume we have a method of a class which is used often and 
the method is called periodically and must allocate every time a 
array between 100 and 4000 elements. What would you do?

1. Simple: float[] array;
2. Reserve: float[] array; array.reserve(N); /// N is a parameter 
value
3. Use manual memory allocation (e.g. with malloc) and free the 
memory immediately at the end of the scope.
4. Use stack allocated memory (But maybe 4000 is too big?)


I can then control when _exactly_ the memory is released. But I 
want to hear other opinions. :)

Jul 20 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

21-Jul-2013 00:19, Namespace пишет:
 Let us assume we have a method of a class which is used often and the
 method is called periodically and must allocate every time a array
 between 100 and 4000 elements. What would you do?

 1. Simple: float[] array;
 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value
 3. Use manual memory allocation (e.g. with malloc) and free the memory
 immediately at the end of the scope.
 4. Use stack allocated memory (But maybe 4000 is too big?)


 then control when _exactly_ the memory is released. But I want to hear
 other opinions. :)

5. Keep a TLS scratch pad buffer (static  class member) for said 
100-4000 floats and re-use it.



-- 
Dmitry Olshansky

Jul 20 2013

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 07/20/2013 01:22 PM, Dmitry Olshansky wrote:

 21-Jul-2013 00:19, Namespace пишет:
 Let us assume we have a method of a class which is used often and the
 method is called periodically and must allocate every time a array
 between 100 and 4000 elements. What would you do?


 5. Keep a TLS scratch pad buffer (static  class member) for said
 100-4000 floats and re-use it.

As long as the function finishes with that buffer, there shouldn't be 
reentrancy issues in general. But if the function calls the same 
function, say on another object perhaps indirectly, then the two objects 
would be sharing the same buffer:

class C
{
     void foo() {
         /* modify the static buffer here */
         auto c = new C();
         c.foo();
         /* the static buffer has changed */
     }
}

Ali

Jul 20 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

21-Jul-2013 00:42, Ali Çehreli пишет:
 On 07/20/2013 01:22 PM, Dmitry Olshansky wrote:

  > 21-Jul-2013 00:19, Namespace пишет:
  >> Let us assume we have a method of a class which is used often and the
  >> method is called periodically and must allocate every time a array
  >> between 100 and 4000 elements. What would you do?

  > 5. Keep a TLS scratch pad buffer (static  class member) for said
  > 100-4000 floats and re-use it.

 As long as the function finishes with that buffer, there shouldn't be
 reentrancy issues in general. But if the function calls the same
 function, say on another object perhaps indirectly, then the two objects
 would be sharing the same buffer:

Yes, this case is problematic. It cannot be called recursively even with 
another instance.

 class C
 {
      void foo() {
          /* modify the static buffer here */
          auto c = new C();
          c.foo();
          /* the static buffer has changed */
      }
 }

 Ali

-- 
Dmitry Olshansky

Jul 21 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Saturday, 20 July 2013 at 20:22:56 UTC, Dmitry Olshansky wrote:
 21-Jul-2013 00:19, Namespace пишет:
 Let us assume we have a method of a class which is used often 
 and the
 method is called periodically and must allocate every time a 
 array
 between 100 and 4000 elements. What would you do?

 1. Simple: float[] array;
 2. Reserve: float[] array; array.reserve(N); /// N is a 
 parameter value
 3. Use manual memory allocation (e.g. with malloc) and free 
 the memory
 immediately at the end of the scope.
 4. Use stack allocated memory (But maybe 4000 is too big?)


 because I can
 then control when _exactly_ the memory is released. But I want 
 to hear
 other opinions. :)

 5. Keep a TLS scratch pad buffer (static  class member) for 
 said 100-4000 floats and re-use it.

TLS scratch pad buffer?

Jul 20 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

21-Jul-2013 00:46, Namespace пишет:
 On Saturday, 20 July 2013 at 20:22:56 UTC, Dmitry Olshansky wrote:
 21-Jul-2013 00:19, Namespace пишет:
 Let us assume we have a method of a class which is used often and the
 method is called periodically and must allocate every time a array
 between 100 and 4000 elements. What would you do?

 1. Simple: float[] array;
 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value
 3. Use manual memory allocation (e.g. with malloc) and free the memory
 immediately at the end of the scope.
 4. Use stack allocated memory (But maybe 4000 is too big?)


 then control when _exactly_ the memory is released. But I want to hear
 other opinions. :)

 5. Keep a TLS scratch pad buffer (static  class member) for said
 100-4000 floats and re-use it.

 TLS scratch pad buffer?

My analogy goes as follows: a chunk of memory for temporary needs => 
scratch pad (as in sheet of paper for quick notes/sketches).

Something along the lines of:

class A{
	static float[] buffer;
	static this(){
		buffer = new float[as_big_as_it_gets];
	}

	void foo(){
		float[] tempAlloc = buffer[0..need_this_much];
		tempAlloc[] = 0.0;
		...
	}	
}

As long as foo is not called recursively should just work. Other thing 
that may wreck this is if foo is called in Fiber context and uses yeild 
internally.

One may as well fall back to option 3 in rare cases where scratch pad is 
too small to fit the bill.

-- 
Dmitry Olshansky

Jul 21 2013

"Namespace" <rswhite4 googlemail.com> writes:

 My analogy goes as follows: a chunk of memory for temporary 
 needs => scratch pad (as in sheet of paper for quick 
 notes/sketches).

 Something along the lines of:

 class A{
 	static float[] buffer;
 	static this(){
 		buffer = new float[as_big_as_it_gets];
 	}

 	void foo(){
 		float[] tempAlloc = buffer[0..need_this_much];
 		tempAlloc[] = 0.0;
 		...
 	}	
 }

 As long as foo is not called recursively should just work. 
 Other thing that may wreck this is if foo is called in Fiber 
 context and uses yeild internally.

 One may as well fall back to option 3 in rare cases where 
 scratch pad is too small to fit the bill.

I really like the idea.
I've changed it a bit:
I have a float[1024] buffer which is used, as long as the 
requested size is less than 1024. If it's greater, I will 
temporary allocate the whole array with new float[Size];
Any improvements? Or is 1024 to small / big?

Jul 21 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 I have a float[1024] buffer which is used, as long as the 
 requested size is less than 1024. If it's greater, I will 
 temporary allocate the whole array with new float[Size];

That's one of the ways I solve similar problems.


 Any improvements? Or is 1024 to small / big?

An idea is to add some debug{} code that computes statistics for 
the array sizes...

Bye,
bearophile

Jul 21 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Sunday, 21 July 2013 at 21:31:08 UTC, bearophile wrote:
 Namespace:

 I have a float[1024] buffer which is used, as long as the 
 requested size is less than 1024. If it's greater, I will 
 temporary allocate the whole array with new float[Size];

 That's one of the ways I solve similar problems.


 Any improvements? Or is 1024 to small / big?

 An idea is to add some debug{} code that computes statistics 
 for the array sizes...

 Bye,
 bearophile

Too much effort. But if some of you have this experiences 
already, I would gladly profit from them. ;)

Jul 21 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 21 July 2013 at 21:35:15 UTC, Namespace wrote:
 On Sunday, 21 July 2013 at 21:31:08 UTC, bearophile wrote:
 Namespace:

 I have a float[1024] buffer which is used, as long as the 
 requested size is less than 1024. If it's greater, I will 
 temporary allocate the whole array with new float[Size];

 That's one of the ways I solve similar problems.


 Any improvements? Or is 1024 to small / big?

 An idea is to add some debug{} code that computes statistics 
 for the array sizes...

 Bye,
 bearophile

 Too much effort. But if some of you have this experiences 
 already, I would gladly profit from them. ;)

It would really help to know what your function is doing exactly.

Personally, I think you are over thinking it. Just do this:

//----
void foo ()
{
     static float[] scratchPadBuffer;
     // use scratchPadBuffer here
}
//----

And simply make buffer grow as it needs, and never shorten it. 
You could consider this to be "controlled leak" I guess. But at 
least, with this approach, you can't "guess wrong", and you'll 
only very rarely allocate.

Jul 22 2013

"Namespace" <rswhite4 googlemail.com> writes:

It's part of Dgame and it's the shape draw method. The method 
collects the vertices and give them to the vertex buffer.
I will try it with my current way: static stack buffer with a 
size of 1024 and otherwise I will temporary allocate memory. But 
how much shapes grow larger than 1024 vertices ;).

Jul 22 2013

David <d dav1d.de> writes:

Am 22.07.2013 10:41, schrieb Namespace:
 It's part of Dgame and it's the shape draw method. The method collects
 the vertices and give them to the vertex buffer.
 I will try it with my current way: static stack buffer with a size of
 1024 and otherwise I will temporary allocate memory. But how much shapes
 grow larger than 1024 vertices ;).

This will be over 1024 vertices in no time, I am currently drawing
vertices with a size of ~500MiB depending on the terrain. The CPU
doesn't need to know of all vertices at once, so collect them and upload
them  to the GPU if they go over a certain limit and begin to fill your
buffer from the start again. I use a malloc allocated buffer, its size
is never decreased, I initially allocate an educated guess + some extra
to be safe, then if it really happens to run ot of memory I simple
resize it depending on how much is left with another educated guess +
some extra.

Jul 22 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Monday, 22 July 2013 at 09:26:33 UTC, David wrote:
 Am 22.07.2013 10:41, schrieb Namespace:
 It's part of Dgame and it's the shape draw method. The method 
 collects
 the vertices and give them to the vertex buffer.
 I will try it with my current way: static stack buffer with a 
 size of
 1024 and otherwise I will temporary allocate memory. But how 
 much shapes
 grow larger than 1024 vertices ;).

 This will be over 1024 vertices in no time, I am currently 
 drawing
 vertices with a size of ~500MiB depending on the terrain. The 
 CPU
 doesn't need to know of all vertices at once, so collect them 
 and upload
 them  to the GPU if they go over a certain limit and begin to 
 fill your
 buffer from the start again. I use a malloc allocated buffer, 
 its size
 is never decreased, I initially allocate an educated guess + 
 some extra
 to be safe, then if it really happens to run ot of memory I 
 simple
 resize it depending on how much is left with another educated 
 guess +
 some extra.

Hm, but Dgame isn't designed for that sort of thing. It's more 
like SFML for C++. So do you think that such Framework will ever 
use more vertices than 1024?
Anyway, if it's larger, I allocate with malloc enough memory and 
free it at the end of the scope.

Jul 22 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

22-Jul-2013 01:19, Namespace пишет:

 I really like the idea.
 I've changed it a bit:
 I have a float[1024] buffer which is used, as long as the requested size
 is less than 1024. If it's greater, I will temporary allocate the whole
 array with new float[Size];
 Any improvements? Or is 1024 to small / big?

No one size fits all. Measure and profit.

-- 
Dmitry Olshansky

Jul 21 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 But I want to hear other opinions. :)

D also supports alloca(), that for 1D arrays is almost bearable.

See also:
http://d.puremagic.com/issues/show_bug.cgi?id=9832

Bye,
bearophile

Jul 20 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Saturday, 20 July 2013 at 20:34:50 UTC, bearophile wrote:
 Namespace:

 But I want to hear other opinions. :)

 D also supports alloca(), that for 1D arrays is almost bearable.

 See also:
 http://d.puremagic.com/issues/show_bug.cgi?id=9832

 Bye,
 bearophile

Yeah, but aren't 4000 elements a bit much for a stack allocated 
array?

Jul 20 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 Yeah, but aren't 4000 elements a bit much for a stack allocated 
 array?

4000 floats take about 16 KB. If your function is not recursive 
and it's not much transitively recursive, then I think it's 
acceptable. But how much stack are your other functions using? 
You have to estimate if you have enough stack space. Today it's 
very easy to have 50+ MB of stack, if necessary. DMD on Windows 
supports a linker command like this:

  -L/STACK:200000000

Bye,
bearophile

Jul 20 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Saturday, 20 July 2013 at 21:39:31 UTC, bearophile wrote:
 Namespace:

 Yeah, but aren't 4000 elements a bit much for a stack 
 allocated array?

 4000 floats take about 16 KB. If your function is not recursive 
 and it's not much transitively recursive, then I think it's 
 acceptable. But how much stack are your other functions using? 
 You have to estimate if you have enough stack space. Today it's 
 very easy to have 50+ MB of stack, if necessary. DMD on Windows 
 supports a linker command like this:

  -L/STACK:200000000

 Bye,
 bearophile

So, you would never allocate with float[]?

Jul 20 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 So, you would never allocate with float[]?

Generally in D I allocate on the heap with new, and once in a 
while with minimallyInitializedArray. Stack-allocation is left 
for situations where max performance is needed. Like you have to 
build a tree of string distances, and you use a Levenshtein 
distance to compute them. Usually a Levenshtein distance needs 
one buffer, or two. If you want to put one million of strings in 
such tree of distances, you can't allocate one billions of 
buffers, it's too much wasted time. For such case I allocate the 
Levenshtein buffer with an alloca (or I allocate the buffer 
before the function, etc, there are many ways).

I think the usual D code uses too much heap allocation. 
Allocations on the heap are slow. If you take a look at modern 
Ada code (like Ada2012 code) you see very little heap 
allocations, or no heap allocations at all :-) Probably Rust code 
will avoid lot of of heap allocations.

Bye,
bearophile

Jul 20 2013

"Namespace" <rswhite4 googlemail.com> writes:

But D isn't like Ada. It's more like C++ and there Heap 
allocations is often used too.
It would be really cool if we had allocators already.
Something like:
----
with (AllocatorX) { /// will use malloc and free instead of 
calling the GC
     float[] arr;
     arr ~= 42;
}
----

And I still don't know what a 'TLS scratch pad buffer' is.

Jul 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-07-21 08:45, Namespace wrote:
 But D isn't like Ada. It's more like C++ and there Heap allocations is
 often used too.
 It would be really cool if we had allocators already.
 Something like:
 ----
 with (AllocatorX) { /// will use malloc and free instead of calling the GC
      float[] arr;
      arr ~= 42;
 }
 ----

 And I still don't know what a 'TLS scratch pad buffer' is.

Perhaps:

float[4000] scratchPadBuffer;

void foo ()
{
     // use scratchPadBuffer here
}

I guess he just refers to a some temporary data you need during the 
execution of a function and at the end of the function you don't care 
about it.

-- 
/Jacob Carlborg

Jul 21 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Sunday, 21 July 2013 at 09:16:47 UTC, Jacob Carlborg wrote:
 On 2013-07-21 08:45, Namespace wrote:
 But D isn't like Ada. It's more like C++ and there Heap 
 allocations is
 often used too.
 It would be really cool if we had allocators already.
 Something like:
 ----
 with (AllocatorX) { /// will use malloc and free instead of 
 calling the GC
     float[] arr;
     arr ~= 42;
 }
 ----

 And I still don't know what a 'TLS scratch pad buffer' is.

 Perhaps:

 float[4000] scratchPadBuffer;

 void foo ()
 {
     // use scratchPadBuffer here
 }

 I guess he just refers to a some temporary data you need during 
 the execution of a function and at the end of the function you 
 don't care about it.

But then I have mostly far too much and maybe a few times a bit 
too less store. It's not flexible. But maybe with a smaller 
granule.
What's about this:

----
struct Chunk(T, ushort maxSize = 1024) {
public:
	static T[maxSize] _chunk;

	T* ptr;
	size_t length;
	size_t capacity;

	this(size_t capacity) {
		this.capacity = capacity;

		if (capacity < maxSize)
			this.ptr = &_chunk[0];
		else
			this.ptr = cast(T*) .malloc(this.capacity * T.sizeof);
	}

	 disable
	this(this);

	~this() {
		if (this.ptr && this.capacity > maxSize)
			.free(this.ptr);
	}

	void ensureAddable(size_t capacity) {
		if (capacity > this.capacity) {
			this.capacity = capacity;

			if (this.capacity > maxSize)
				this.ptr = cast(T*) .realloc(this.ptr, this.capacity * 
T.sizeof);
		}
	}

	void opOpAssign(string op : "~", U)(auto ref U item) {
		this.ptr[this.length++] = cast(T) item;
	}

	void opOpAssign(string op : "~", U, size_t n)(auto ref U[n] 
items) {
		foreach (ref U item; items) {
			this.ptr[this.length++] = cast(T) item;
		}
	}
}
----

Jul 21 2013

Jacob Carlborg <doob me.com> writes:

On 2013-07-21 12:18, Namespace wrote:

 But then I have mostly far too much and maybe a few times a bit too less
 store. It's not flexible. But maybe with a smaller granule.
 What's about this:

You said you needed between 100 and 4000 floats. My suggestion will 
allocate 4000 floats once per thread. Use what you need from the array.

I guess your solution with Chunk encapsulate this usage pattern.

-- 
/Jacob Carlborg

Jul 22 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 But D isn't like Ada.

But D should be a bit more like Ada, for such things :-)

Bye,
bearophile

Jul 21 2013

"Namespace" <rswhite4 googlemail.com> writes:

On Sunday, 21 July 2013 at 10:30:16 UTC, bearophile wrote:
 Namespace:

 But D isn't like Ada.

 But D should be a bit more like Ada, for such things :-)

 Bye,
 bearophile

I agree, but how? ;)
You know by yourself that Walter & Andrei are very hard to 
convince by such things.

Jul 21 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Namespace:

 I agree, but how? ;)
 You know by yourself that Walter & Andrei are very hard to 
 convince by such things.

I think there is only a small number of Ada ideas worth copying, 
like:
- Integral overflows;
- Ranged integers;
- A library of collections allocated in-place in the standard 
library (Bounded Collections of Ada2012);
- More deterministic multi processing;
- Nicer and more handy enumerated sets;
- No undefined behaviours;
- Stack-allocated fixed-length arrays;
- etc.

Some of such things are work in progress (no undefined 
behaviours), some of them can go in the D standard library 
(bounded collections, variable length arrays with a bit of help 
from the compiler), some of them can be implemented with help 
from the compiler (partially library-defined integral type with 
overflow tests). Some of them maybe can be implemented once some 
bugs are sorted out and alias this gets better (ranged integers). 
Some of them are just currently not present in D (more 
deterministic multi processing), but D is supposed to have 
something to replace them ("scope", a better "shared", Walter is 
introducing a bit of object ownership behind the curtains, etc).

And then it becomes a matter of using some of those idioms.

Bye,
bearophile

Jul 21 2013

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Often repeated array allocations