digitalmars.D.learn - Often repeated array allocations
- Namespace (12/12) Jul 20 2013 Let us assume we have a method of a class which is used often and
- Dmitry Olshansky (5/16) Jul 20 2013 5. Keep a TLS scratch pad buffer (static class member) for said
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (15/21) Jul 20 2013 As long as the function finishes with that buffer, there shouldn't be
- Dmitry Olshansky (5/26) Jul 21 2013 Yes, this case is problematic. It cannot be called recursively even with...
- Namespace (2/24) Jul 20 2013 TLS scratch pad buffer?
- Dmitry Olshansky (22/41) Jul 21 2013 My analogy goes as follows: a chunk of memory for temporary needs =>
- Namespace (6/26) Jul 21 2013 I really like the idea.
- bearophile (6/10) Jul 21 2013 An idea is to add some debug{} code that computes statistics for
- Namespace (3/13) Jul 21 2013 Too much effort. But if some of you have this experiences
- monarch_dodra (14/33) Jul 22 2013 It would really help to know what your function is doing exactly.
- Dmitry Olshansky (4/10) Jul 21 2013 No one size fits all. Measure and profit.
- bearophile (6/7) Jul 20 2013 D also supports alloca(), that for 1D arrays is almost bearable.
- Namespace (3/10) Jul 20 2013 Yeah, but aren't 4000 elements a bit much for a stack allocated
- bearophile (10/12) Jul 20 2013 4000 floats take about 16 KB. If your function is not recursive
- Namespace (2/14) Jul 20 2013 So, you would never allocate with float[]?
- bearophile (18/19) Jul 20 2013 Generally in D I allocate on the heap with new, and once in a
- Namespace (12/12) Jul 20 2013 But D isn't like Ada. It's more like C++ and there Heap
- Jacob Carlborg (12/23) Jul 21 2013 Perhaps:
- Namespace (44/68) Jul 21 2013 But then I have mostly far too much and maybe a few times a bit
- Jacob Carlborg (6/9) Jul 22 2013 You said you needed between 100 and 4000 floats. My suggestion will
- bearophile (4/5) Jul 21 2013 But D should be a bit more like Ada, for such things :-)
- Namespace (4/9) Jul 21 2013 I agree, but how? ;)
- bearophile (26/29) Jul 21 2013 I think there is only a small number of Ada ideas worth copying,
Let us assume we have a method of a class which is used often and the method is called periodically and must allocate every time a array between 100 and 4000 elements. What would you do? 1. Simple: float[] array; 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value 3. Use manual memory allocation (e.g. with malloc) and free the memory immediately at the end of the scope. 4. Use stack allocated memory (But maybe 4000 is too big?) I can then control when _exactly_ the memory is released. But I want to hear other opinions. :)
Jul 20 2013
21-Jul-2013 00:19, Namespace пишет:Let us assume we have a method of a class which is used often and the method is called periodically and must allocate every time a array between 100 and 4000 elements. What would you do? 1. Simple: float[] array; 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value 3. Use manual memory allocation (e.g. with malloc) and free the memory immediately at the end of the scope. 4. Use stack allocated memory (But maybe 4000 is too big?) then control when _exactly_ the memory is released. But I want to hear other opinions. :)5. Keep a TLS scratch pad buffer (static class member) for said 100-4000 floats and re-use it. -- Dmitry Olshansky
Jul 20 2013
On 07/20/2013 01:22 PM, Dmitry Olshansky wrote:21-Jul-2013 00:19, Namespace пишет:Let us assume we have a method of a class which is used often and the method is called periodically and must allocate every time a array between 100 and 4000 elements. What would you do?5. Keep a TLS scratch pad buffer (static class member) for said 100-4000 floats and re-use it.As long as the function finishes with that buffer, there shouldn't be reentrancy issues in general. But if the function calls the same function, say on another object perhaps indirectly, then the two objects would be sharing the same buffer: class C { void foo() { /* modify the static buffer here */ auto c = new C(); c.foo(); /* the static buffer has changed */ } } Ali
Jul 20 2013
21-Jul-2013 00:42, Ali Çehreli пишет:On 07/20/2013 01:22 PM, Dmitry Olshansky wrote: > 21-Jul-2013 00:19, Namespace пишет: >> Let us assume we have a method of a class which is used often and the >> method is called periodically and must allocate every time a array >> between 100 and 4000 elements. What would you do? > 5. Keep a TLS scratch pad buffer (static class member) for said > 100-4000 floats and re-use it. As long as the function finishes with that buffer, there shouldn't be reentrancy issues in general. But if the function calls the same function, say on another object perhaps indirectly, then the two objects would be sharing the same buffer:Yes, this case is problematic. It cannot be called recursively even with another instance.class C { void foo() { /* modify the static buffer here */ auto c = new C(); c.foo(); /* the static buffer has changed */ } } Ali-- Dmitry Olshansky
Jul 21 2013
On Saturday, 20 July 2013 at 20:22:56 UTC, Dmitry Olshansky wrote:21-Jul-2013 00:19, Namespace пишет:TLS scratch pad buffer?Let us assume we have a method of a class which is used often and the method is called periodically and must allocate every time a array between 100 and 4000 elements. What would you do? 1. Simple: float[] array; 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value 3. Use manual memory allocation (e.g. with malloc) and free the memory immediately at the end of the scope. 4. Use stack allocated memory (But maybe 4000 is too big?) because I can then control when _exactly_ the memory is released. But I want to hear other opinions. :)5. Keep a TLS scratch pad buffer (static class member) for said 100-4000 floats and re-use it.
Jul 20 2013
21-Jul-2013 00:46, Namespace пишет:On Saturday, 20 July 2013 at 20:22:56 UTC, Dmitry Olshansky wrote:My analogy goes as follows: a chunk of memory for temporary needs => scratch pad (as in sheet of paper for quick notes/sketches). Something along the lines of: class A{ static float[] buffer; static this(){ buffer = new float[as_big_as_it_gets]; } void foo(){ float[] tempAlloc = buffer[0..need_this_much]; tempAlloc[] = 0.0; ... } } As long as foo is not called recursively should just work. Other thing that may wreck this is if foo is called in Fiber context and uses yeild internally. One may as well fall back to option 3 in rare cases where scratch pad is too small to fit the bill. -- Dmitry Olshansky21-Jul-2013 00:19, Namespace пишет:TLS scratch pad buffer?Let us assume we have a method of a class which is used often and the method is called periodically and must allocate every time a array between 100 and 4000 elements. What would you do? 1. Simple: float[] array; 2. Reserve: float[] array; array.reserve(N); /// N is a parameter value 3. Use manual memory allocation (e.g. with malloc) and free the memory immediately at the end of the scope. 4. Use stack allocated memory (But maybe 4000 is too big?) then control when _exactly_ the memory is released. But I want to hear other opinions. :)5. Keep a TLS scratch pad buffer (static class member) for said 100-4000 floats and re-use it.
Jul 21 2013
My analogy goes as follows: a chunk of memory for temporary needs => scratch pad (as in sheet of paper for quick notes/sketches). Something along the lines of: class A{ static float[] buffer; static this(){ buffer = new float[as_big_as_it_gets]; } void foo(){ float[] tempAlloc = buffer[0..need_this_much]; tempAlloc[] = 0.0; ... } } As long as foo is not called recursively should just work. Other thing that may wreck this is if foo is called in Fiber context and uses yeild internally. One may as well fall back to option 3 in rare cases where scratch pad is too small to fit the bill.I really like the idea. I've changed it a bit: I have a float[1024] buffer which is used, as long as the requested size is less than 1024. If it's greater, I will temporary allocate the whole array with new float[Size]; Any improvements? Or is 1024 to small / big?
Jul 21 2013
Namespace:I have a float[1024] buffer which is used, as long as the requested size is less than 1024. If it's greater, I will temporary allocate the whole array with new float[Size];That's one of the ways I solve similar problems.Any improvements? Or is 1024 to small / big?An idea is to add some debug{} code that computes statistics for the array sizes... Bye, bearophile
Jul 21 2013
On Sunday, 21 July 2013 at 21:31:08 UTC, bearophile wrote:Namespace:Too much effort. But if some of you have this experiences already, I would gladly profit from them. ;)I have a float[1024] buffer which is used, as long as the requested size is less than 1024. If it's greater, I will temporary allocate the whole array with new float[Size];That's one of the ways I solve similar problems.Any improvements? Or is 1024 to small / big?An idea is to add some debug{} code that computes statistics for the array sizes... Bye, bearophile
Jul 21 2013
On Sunday, 21 July 2013 at 21:35:15 UTC, Namespace wrote:On Sunday, 21 July 2013 at 21:31:08 UTC, bearophile wrote:It would really help to know what your function is doing exactly. Personally, I think you are over thinking it. Just do this: //---- void foo () { static float[] scratchPadBuffer; // use scratchPadBuffer here } //---- And simply make buffer grow as it needs, and never shorten it. You could consider this to be "controlled leak" I guess. But at least, with this approach, you can't "guess wrong", and you'll only very rarely allocate.Namespace:Too much effort. But if some of you have this experiences already, I would gladly profit from them. ;)I have a float[1024] buffer which is used, as long as the requested size is less than 1024. If it's greater, I will temporary allocate the whole array with new float[Size];That's one of the ways I solve similar problems.Any improvements? Or is 1024 to small / big?An idea is to add some debug{} code that computes statistics for the array sizes... Bye, bearophile
Jul 22 2013
It's part of Dgame and it's the shape draw method. The method collects the vertices and give them to the vertex buffer. I will try it with my current way: static stack buffer with a size of 1024 and otherwise I will temporary allocate memory. But how much shapes grow larger than 1024 vertices ;).
Jul 22 2013
Am 22.07.2013 10:41, schrieb Namespace:It's part of Dgame and it's the shape draw method. The method collects the vertices and give them to the vertex buffer. I will try it with my current way: static stack buffer with a size of 1024 and otherwise I will temporary allocate memory. But how much shapes grow larger than 1024 vertices ;).This will be over 1024 vertices in no time, I am currently drawing vertices with a size of ~500MiB depending on the terrain. The CPU doesn't need to know of all vertices at once, so collect them and upload them to the GPU if they go over a certain limit and begin to fill your buffer from the start again. I use a malloc allocated buffer, its size is never decreased, I initially allocate an educated guess + some extra to be safe, then if it really happens to run ot of memory I simple resize it depending on how much is left with another educated guess + some extra.
Jul 22 2013
On Monday, 22 July 2013 at 09:26:33 UTC, David wrote:Am 22.07.2013 10:41, schrieb Namespace:Hm, but Dgame isn't designed for that sort of thing. It's more like SFML for C++. So do you think that such Framework will ever use more vertices than 1024? Anyway, if it's larger, I allocate with malloc enough memory and free it at the end of the scope.It's part of Dgame and it's the shape draw method. The method collects the vertices and give them to the vertex buffer. I will try it with my current way: static stack buffer with a size of 1024 and otherwise I will temporary allocate memory. But how much shapes grow larger than 1024 vertices ;).This will be over 1024 vertices in no time, I am currently drawing vertices with a size of ~500MiB depending on the terrain. The CPU doesn't need to know of all vertices at once, so collect them and upload them to the GPU if they go over a certain limit and begin to fill your buffer from the start again. I use a malloc allocated buffer, its size is never decreased, I initially allocate an educated guess + some extra to be safe, then if it really happens to run ot of memory I simple resize it depending on how much is left with another educated guess + some extra.
Jul 22 2013
22-Jul-2013 01:19, Namespace пишет:I really like the idea. I've changed it a bit: I have a float[1024] buffer which is used, as long as the requested size is less than 1024. If it's greater, I will temporary allocate the whole array with new float[Size]; Any improvements? Or is 1024 to small / big?No one size fits all. Measure and profit. -- Dmitry Olshansky
Jul 21 2013
Namespace:But I want to hear other opinions. :)D also supports alloca(), that for 1D arrays is almost bearable. See also: http://d.puremagic.com/issues/show_bug.cgi?id=9832 Bye, bearophile
Jul 20 2013
On Saturday, 20 July 2013 at 20:34:50 UTC, bearophile wrote:Namespace:Yeah, but aren't 4000 elements a bit much for a stack allocated array?But I want to hear other opinions. :)D also supports alloca(), that for 1D arrays is almost bearable. See also: http://d.puremagic.com/issues/show_bug.cgi?id=9832 Bye, bearophile
Jul 20 2013
Namespace:Yeah, but aren't 4000 elements a bit much for a stack allocated array?4000 floats take about 16 KB. If your function is not recursive and it's not much transitively recursive, then I think it's acceptable. But how much stack are your other functions using? You have to estimate if you have enough stack space. Today it's very easy to have 50+ MB of stack, if necessary. DMD on Windows supports a linker command like this: -L/STACK:200000000 Bye, bearophile
Jul 20 2013
On Saturday, 20 July 2013 at 21:39:31 UTC, bearophile wrote:Namespace:So, you would never allocate with float[]?Yeah, but aren't 4000 elements a bit much for a stack allocated array?4000 floats take about 16 KB. If your function is not recursive and it's not much transitively recursive, then I think it's acceptable. But how much stack are your other functions using? You have to estimate if you have enough stack space. Today it's very easy to have 50+ MB of stack, if necessary. DMD on Windows supports a linker command like this: -L/STACK:200000000 Bye, bearophile
Jul 20 2013
Namespace:So, you would never allocate with float[]?Generally in D I allocate on the heap with new, and once in a while with minimallyInitializedArray. Stack-allocation is left for situations where max performance is needed. Like you have to build a tree of string distances, and you use a Levenshtein distance to compute them. Usually a Levenshtein distance needs one buffer, or two. If you want to put one million of strings in such tree of distances, you can't allocate one billions of buffers, it's too much wasted time. For such case I allocate the Levenshtein buffer with an alloca (or I allocate the buffer before the function, etc, there are many ways). I think the usual D code uses too much heap allocation. Allocations on the heap are slow. If you take a look at modern Ada code (like Ada2012 code) you see very little heap allocations, or no heap allocations at all :-) Probably Rust code will avoid lot of of heap allocations. Bye, bearophile
Jul 20 2013
But D isn't like Ada. It's more like C++ and there Heap allocations is often used too. It would be really cool if we had allocators already. Something like: ---- with (AllocatorX) { /// will use malloc and free instead of calling the GC float[] arr; arr ~= 42; } ---- And I still don't know what a 'TLS scratch pad buffer' is.
Jul 20 2013
On 2013-07-21 08:45, Namespace wrote:But D isn't like Ada. It's more like C++ and there Heap allocations is often used too. It would be really cool if we had allocators already. Something like: ---- with (AllocatorX) { /// will use malloc and free instead of calling the GC float[] arr; arr ~= 42; } ---- And I still don't know what a 'TLS scratch pad buffer' is.Perhaps: float[4000] scratchPadBuffer; void foo () { // use scratchPadBuffer here } I guess he just refers to a some temporary data you need during the execution of a function and at the end of the function you don't care about it. -- /Jacob Carlborg
Jul 21 2013
On Sunday, 21 July 2013 at 09:16:47 UTC, Jacob Carlborg wrote:On 2013-07-21 08:45, Namespace wrote:But then I have mostly far too much and maybe a few times a bit too less store. It's not flexible. But maybe with a smaller granule. What's about this: ---- struct Chunk(T, ushort maxSize = 1024) { public: static T[maxSize] _chunk; T* ptr; size_t length; size_t capacity; this(size_t capacity) { this.capacity = capacity; if (capacity < maxSize) this.ptr = &_chunk[0]; else this.ptr = cast(T*) .malloc(this.capacity * T.sizeof); } disable this(this); ~this() { if (this.ptr && this.capacity > maxSize) .free(this.ptr); } void ensureAddable(size_t capacity) { if (capacity > this.capacity) { this.capacity = capacity; if (this.capacity > maxSize) this.ptr = cast(T*) .realloc(this.ptr, this.capacity * T.sizeof); } } void opOpAssign(string op : "~", U)(auto ref U item) { this.ptr[this.length++] = cast(T) item; } void opOpAssign(string op : "~", U, size_t n)(auto ref U[n] items) { foreach (ref U item; items) { this.ptr[this.length++] = cast(T) item; } } } ----But D isn't like Ada. It's more like C++ and there Heap allocations is often used too. It would be really cool if we had allocators already. Something like: ---- with (AllocatorX) { /// will use malloc and free instead of calling the GC float[] arr; arr ~= 42; } ---- And I still don't know what a 'TLS scratch pad buffer' is.Perhaps: float[4000] scratchPadBuffer; void foo () { // use scratchPadBuffer here } I guess he just refers to a some temporary data you need during the execution of a function and at the end of the function you don't care about it.
Jul 21 2013
On 2013-07-21 12:18, Namespace wrote:But then I have mostly far too much and maybe a few times a bit too less store. It's not flexible. But maybe with a smaller granule. What's about this:You said you needed between 100 and 4000 floats. My suggestion will allocate 4000 floats once per thread. Use what you need from the array. I guess your solution with Chunk encapsulate this usage pattern. -- /Jacob Carlborg
Jul 22 2013
Namespace:But D isn't like Ada.But D should be a bit more like Ada, for such things :-) Bye, bearophile
Jul 21 2013
On Sunday, 21 July 2013 at 10:30:16 UTC, bearophile wrote:Namespace:I agree, but how? ;) You know by yourself that Walter & Andrei are very hard to convince by such things.But D isn't like Ada.But D should be a bit more like Ada, for such things :-) Bye, bearophile
Jul 21 2013
Namespace:I agree, but how? ;) You know by yourself that Walter & Andrei are very hard to convince by such things.I think there is only a small number of Ada ideas worth copying, like: - Integral overflows; - Ranged integers; - A library of collections allocated in-place in the standard library (Bounded Collections of Ada2012); - More deterministic multi processing; - Nicer and more handy enumerated sets; - No undefined behaviours; - Stack-allocated fixed-length arrays; - etc. Some of such things are work in progress (no undefined behaviours), some of them can go in the D standard library (bounded collections, variable length arrays with a bit of help from the compiler), some of them can be implemented with help from the compiler (partially library-defined integral type with overflow tests). Some of them maybe can be implemented once some bugs are sorted out and alias this gets better (ranged integers). Some of them are just currently not present in D (more deterministic multi processing), but D is supposed to have something to replace them ("scope", a better "shared", Walter is introducing a bit of object ownership behind the curtains, etc). And then it becomes a matter of using some of those idioms. Bye, bearophile
Jul 21 2013