
digitalmars.D.learn - Allocate N elements

reply "Namespace" <rswhite4 googlemail.com> writes:
Is there no way (besides the ugly malloc or any wrappers) to 
allocate _exactly_ N elements at runtime (no static array)?
I tried
----
int[] arr = new int[sizeOfItems];
----
but if sizeOfItems is e.g. 512, it allocates 1024.
So is there no way to allocate a fixed memory block without the 
usage of C functions?
Jul 15 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Monday, 15 July 2013 at 11:46:54 UTC, Namespace wrote:
 int[] arr = new int[sizeOfItems];
Did you also try int[] arr = new int[](sizeOfItems)?
Jul 15 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Monday, 15 July 2013 at 12:23:21 UTC, Adam D. Ruppe wrote:
 On Monday, 15 July 2013 at 11:46:54 UTC, Namespace wrote:
 int[] arr = new int[sizeOfItems];
Did you also try int[] arr = new int[](sizeOfItems)?
Example code:
----
import std.stdio;

void main() {
    int[] arr1 = new int[512];
    writeln(arr1.length, "::", arr1.capacity);
    int[] arr2 = new int[](512);
    writeln(arr2.length, "::", arr2.capacity);
}
----
Output:
512::1019
512::1019
Jul 15 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 void main()
 {
 	int[] arr1 = new int[512];
 	writeln(arr1.length, "::", arr1.capacity);
 	int[] arr2 = new int[](512);
 	writeln(arr2.length, "::", arr2.capacity);
 }
 ----

 Output:
 512::1019
 512::1019
import std.stdio, std.array;

void main() {
    auto a1 = new int[512];
    writeln(a1.length, " ", a1.capacity);
    auto a2 = new int[](512);
    writeln(a2.length, " ", a2.capacity);
    auto a3 = minimallyInitializedArray!(int[])(512);
    writeln(a3.length, " ", a3.capacity);
}

Output:
512 1019
512 1019
512 0

But that (of new arrays) is a bad design, it wastes too much memory, and I think it should be fixed. In Python this doesn't overallocate:

a4 = [0] * 512

An alternative is to introduce a function like std.array.InitializedArray.

Bye,
bearophile
Jul 15 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 15 July 2013 at 13:13:42 UTC, bearophile wrote:
 Output:
 512 1019
 512 1019
 512 0

 But that (of new arrays) is a bad design, it wastes too much 
 memory, and I think it should be fixed. In Python this doesn't 
 overallocate:
So what? The only thing you showed, is that minimallyInitialized doesn't know how much it allocated. If you allocate 513 elements with malloc, you'll way over allocate too. What's your point?

//--------
import std.stdio, core.memory;

void main()
{
    auto u = GC.qalloc(513);
    writeln(u.size); //prints 1024. Oops.
}
//--------

You'll waste memory either way. The only difference is a 1-2 byte difference for arrays smaller than 2K, and 16 bytes for arrays larger than 2K. The comparison is very unfair, and the wasted room is minimal. It's just more visible... yet more exploitable: dynamic GC arrays have a tendency to grow in place and use all their capacity, whereas malloc always eagerly relocates.

To answer the original question:
 Is there no way (besides the ugly malloc or any wrappers) to 
 allocate _exactly_ N elements at runtime (no static array)?
 I tried
No. *EVEN* with an "ugly malloc", you'll still over allocate (see above). Hell, at the end of the day, it's even worse, because you over-allocate, yet you don't know it, nor exploit the over-allocated data (unless you *very* manually use qalloc).
 Another question:
 I have this code:
 [...]
 Is it possible to prevent the slice from being resized,
 to avoid GC allocations?
No. The only way this works is *if* there is a GC to safety-net catch you if relocation must occur.

That said, you can very easily write your own:

//----
//Disclaimer: VERY unsafe.
ref T[] unsafeAppend(T, U)(ref T[] arr, U u)
{
    immutable len = arr.length;
    arr.ptr[len] = u;        //write one past the slice's end, into the block's spare capacity
    arr = arr[0 .. len + 1];
    return arr;              //return the grown slice so calls can be chained
}
//----
import std.stdio, core.memory;

void main()
{
    auto u = GC.qalloc(512);
    int[] arr = (cast(int*)u.base)[0 .. 0];
    arr.unsafeAppend(1).unsafeAppend(2).unsafeAppend(3).writeln();
}
//----

Disclaimer: This will give you 0 overflow protection. Also, do NOT use this with GC arrays: the added elements will not be "seen" by the GC, and will also be clobbered by normal appends.
Jul 15 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
monarch_dodra:

 But that (of new arrays) is a bad design, it wastes too much 
 memory, and I think it should be fixed. In Python this doesn't 
 overallocate:
So what? The only thing you showed, is that minimallyInitialized doesn't know how much it allocated. If you allocate 513 elements with malloc, you'll way over allocate too. What's your point? You'll waste memory either way.
I didn't know it, sorry. I forgot. Can we let minimallyInitializedArray know the capacity?

Regarding the small arrays, so to avoid the memory wasting you need to allocate a larger one and slice it.

Bye,
bearophile
Jul 15 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 15 July 2013 at 15:54:57 UTC, bearophile wrote:
 monarch_dodra:

 But that (of new arrays) is a bad design, it wastes too much 
 memory, and I think it should be fixed. In Python this 
 doesn't overallocate:
So what? The only thing you showed, is that minimallyInitialized doesn't know how much it allocated. If you allocate 513 elements with malloc, you'll way over allocate too. What's your point? You'll waste memory either way.
I didn't know it, sorry. I forgot. Can we let minimallyInitializedArray know the capacity?
I'm working on it ;)
 Regarding the small arrays, so to avoid the memory wasting you 
 need to allocate a larger one and slice it.

 Bye,
 bearophile
One of the problems is that when you want an arbitrarily sized allocation, it is usually standard to allocate 2^N in size. While this is optimal for "traditional" malloc, it's actually the *worst* allocation size you could ask for a GC array with appendable info (eg, traditional new) :/
Jul 15 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jul 15, 2013 at 07:32:37PM +0200, monarch_dodra wrote:
 On Monday, 15 July 2013 at 15:54:57 UTC, bearophile wrote:
monarch_dodra:

But that (of new arrays) is a bad design, it wastes too much
memory, and I think it should be fixed. In Python this doesn't
overallocate:
So what? The only thing you showed, is that minimallyInitialized doesn't know how much it allocated. If you allocate 513 elements with malloc, you'll way over allocate too. What's your point? You'll waste memory either way.
I didn't know it, sorry. I forgot. Can we let minimallyInitializedArray know the capacity?
I'm working on it ;)
Regarding the small arrays, so to avoid the memory wasting you
need to allocate a larger one and slice it.

Bye,
bearophile
One of the problems is that when you want an arbitrarily sized allocation, it is usually standard to allocate 2^N in size. While this is optimal for "traditional" malloc, it's actually the *worst* allocation size you could ask for a GC array with appendable info (eg, traditional new) :/
This is wrong. The 2^N size should be applied *after* GC appendable info (or whatever else bookkeeping info you need) has been accounted for, not before. If the GC header/whatever requires, say, 16 bytes, then the ideal array size should be 2^N-16, so that everything fits neatly within a power of 2. Otherwise you end up with 2^N+16 bytes that are actually allocated, and it just goes downhill from there.


T

-- 
What do you call optometrist jokes? Vitreous humor.
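A minimal sketch of the effect described here (assuming druntime's conservative GC with power-of-two size classes and the appendable bookkeeping detailed later in the thread; the exact capacities printed depend on the compiler/runtime version):

----
import std.stdio;

void main()
{
    // Requesting a "round" power of two spills into the next GC size class
    // once the appendable bookkeeping is added on top.
    auto big = new ubyte[](4096);
    writeln(big.length, " requested, capacity ", big.capacity);

    // Leaving room for the bookkeeping keeps the allocation in the smaller block.
    auto snug = new ubyte[](4096 - 32);
    writeln(snug.length, " requested, capacity ", snug.capacity);
}
----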
Jul 15 2013
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 15 July 2013 at 17:39:43 UTC, H. S. Teoh wrote:
 On Mon, Jul 15, 2013 at 07:32:37PM +0200, monarch_dodra wrote:
 On Monday, 15 July 2013 at 15:54:57 UTC, bearophile wrote:
monarch_dodra:

But that (of new arrays) is a bad design, it wastes too much
memory, and I think it should be fixed. In Python this 
doesn't
overallocate:
So what? The only thing you showed, is that minimallyInitialized doesn't know how much it allocated. If you allocate 513 elements with malloc, you'll way over allocate too. What's your point? You'll waste memory either way.
I didn't know it, sorry. I forgot. Can we let minimallyInitializedArray know the capacity?
I'm working on it ;)
Regarding the small arrays, so to avoid the memory wasting you
need to allocate a larger one and slice it.

Bye,
bearophile
One of the problems is that when you want an arbitrarily sized allocation, it is usually standard to allocate 2^N in size. While this is optimal for "traditional" malloc, it's actually the *worst* allocation size you could ask for a GC array with appendable info (eg, traditional new) :/
 This is wrong. The 2^N size should be applied *after* GC appendable
 info (or whatever else bookkeeping info you need) has been accounted
 for, not before. If the GC header/whatever requires, say, 16 bytes,
 then the ideal array size should be 2^N-16, so that everything fits
 neatly within a power of 2. Otherwise you end up with 2^N+16 bytes
 that are actually allocated, and it just goes downhill from there.

 T
Just to be clear, the GC itself has no overhead. Only the APPENDABLE bookkeeping adds any overhead. E.g.:

GC.malloc(1024);

This will allocate a 1024-byte block. No problem here. HOWEVER, if you write:

int[] arr = new int[](1000); //Allocates a 4Ki block
int[] arr = new int[](1024); //Allocates an 8Ki block

The example is a bit extreme, but you should get what I mean.

If you do a malloc with GC.BlkAttr.APPENDABLE, you will also get exactly what you asked for. However, it will be the user's responsibility to take into account that some of those allocated bytes will not actually be usable (the APPENDABLE block).

FYI, the way APPENDABLE works:
Arrays of 1-255 bytes: 1 extra byte is appended for used-capacity info at the end of the array.
Arrays of 256-2046 bytes: 2 extra bytes are appended.
2047+ bytes: the array will *start* with a 16-byte block for appendable info. A dummy 1-byte block will also be appended to the end of the block, for "arr[$ .. $]". (So yes, you need to account for a *17*-byte overhead.)

That said, allocating an array using a straight-up malloc(APPENDABLE) is *very* complicated, and it is all very specific to the current GC anyway: it is very, very error prone, and non-portable.
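A minimal sketch of that raw route (assuming core.memory's GC.qalloc and GC.BlkAttr.APPENDABLE; the block sizes printed are specific to the current GC):

----
import std.stdio;
import core.memory;

void main()
{
    // Plain GC allocation: you get a block at least as large as requested.
    auto plain = GC.qalloc(1024);
    writeln("plain block size: ", plain.size);

    // Same request, marked APPENDABLE: the GC hands back the same kind of
    // block, but it is up to you to reserve the bookkeeping bytes described
    // above before treating the rest as array payload.
    auto appendable = GC.qalloc(1024, GC.BlkAttr.APPENDABLE);
    writeln("appendable block size: ", appendable.size);
}
----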
Jul 16 2013
prev sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
 No. *EVEN* with an "ugly malloc", you'll still over allocate 
 (see above).
----
int* ptr = cast(int*) malloc(513 * int.sizeof);
int[] arr = ptr[0 .. 513];
writeln(arr.length, "::", arr.capacity);
----
Output:
513::0

Where is the over allocation?
Jul 15 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Monday, 15 July 2013 at 18:16:45 UTC, Namespace wrote:
 writeln(arr.length, "::", arr.capacity);
arr.capacity checks the GC block, and since you malloced it, there is no gc block for it to check. So it simply doesn't know if there's any extra capacity there and reports 0 just to be safe.
Jul 15 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Monday, 15 July 2013 at 18:29:12 UTC, Adam D. Ruppe wrote:
 On Monday, 15 July 2013 at 18:16:45 UTC, Namespace wrote:
 writeln(arr.length, "::", arr.capacity);
arr.capacity checks the GC block, and since you malloced it, there is no gc block for it to check. So it simply doesn't know if there's any extra capacity there and reports 0 just to be safe.
Ah, good to know. But anyway malloc allocates exactly N elements, without ugly overhead.
Would be really good if there were a way to keep the GC from taking over the memory.
Jul 15 2013
next sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 07/15/2013 12:53 PM, Namespace wrote:

 But anyway malloc allocates exactly N elements, without ugly overhead.
I doubt it. If it's allocating from a bucket, then what actually gets used is the size of that bucket.

Ali
Jul 15 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 Ah, good to know. But anyway malloc allocates exactly N elements,
 without ugly overhead.
 Would be really good if there were a way to keep the GC from
 taking over the memory.
That wasted space is what you pay to reduce memory fragmentation with a non-copying GC.

Bye,
bearophile
Jul 15 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jul 15, 2013 at 09:53:32PM +0200, Namespace wrote:
 On Monday, 15 July 2013 at 18:29:12 UTC, Adam D. Ruppe wrote:
On Monday, 15 July 2013 at 18:16:45 UTC, Namespace wrote:
writeln(arr.length, "::", arr.capacity);
arr.capacity checks the GC block, and since you malloced it, there is no gc block for it to check. So it simply doesn't know if there's any extra capacity there and reports 0 just to be safe.
 Ah, good to know. But anyway malloc allocates exactly N elements, without ugly overhead.
[...]

I doubt malloc has no overhead. AFAIK, many malloc implementations store some kind of bookkeeping info in the area of memory just before the pointer that's returned. After all, malloc/free has to somehow know which memory blocks are allocated and which are free. Some implementations also store canary values in that area in order to detect memory corruption.


T

-- 
An old friend is better than two new ones.
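A quick way to see this on platforms whose malloc exposes it (this assumes glibc's malloc_usable_size, declared here by hand; it is not portable and is only meant to illustrate the point):

----
import std.stdio;
import core.stdc.stdlib : malloc, free;

// glibc-specific query for the real size of a malloc'd block
// (assumption: the C library provides it; it is not part of the C standard).
extern (C) size_t malloc_usable_size(void* ptr);

void main()
{
    enum request = 513 * int.sizeof;
    void* p = malloc(request);
    writeln("requested: ", request, ", usable: ", malloc_usable_size(p));
    free(p);
}
----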
Jul 15 2013
prev sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
Another question:
I have this code:

----
import core.stdc.stdlib : malloc;

void main() {
	int* ptr = cast(int*) malloc(11 * int.sizeof);
	int[] arr = ptr[0 .. 11];

	assert(arr.ptr is ptr);

	arr ~= 42;

	assert(ptr !is arr.ptr);
}
----

Is it possible to prevent the slice from being resized, to avoid GC
allocations?
Jul 15 2013
parent "bearophile" <bearophileHUGS lycos.com> writes:
Namespace:

 Is it possible to prevent the slice from being resized, to avoid
 GC allocations?
For that I think you need to define a struct that disables the append and uses an alias this. But then some of your array function argument signatures need to change, because a lot of D code uses "raw" array signatures like:

void foo(int[] arg) {}

Instead of:

alias MArr = int[];
void foo(MArr arg) {}

Or, using Typedef (untested):

alias MArr = Typedef!(int[]);
void foo(MArr arg) {}

Bye,
bearophile
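A minimal sketch of such a struct (untested in the same spirit as above; the name FixedSlice and the @disable'd operator are just one possible shape of the idea, not a library type):

----
import std.stdio;

struct FixedSlice(T)
{
    T[] data;
    alias data this;   // still usable wherever a T[] is expected

    // Appending would otherwise be forwarded to `data` via alias this and
    // could trigger a GC reallocation, so disable the operator on the wrapper.
    @disable void opOpAssign(string op : "~", U)(U rhs) {}
}

void main()
{
    auto s = FixedSlice!int([1, 2, 3]);
    writeln(s.length, " ", s[0]);  // reading through the wrapper works
    // s ~= 4;                     // intended to be a compile-time error
}
----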
Jul 15 2013