digitalmars.D.bugs - [Issue 2504] New: Reserve for associative arrays
- d-bugmail puremagic.com (19/19) Dec 10 2008 http://d.puremagic.com/issues/show_bug.cgi?id=2504
- d-bugmail puremagic.com (22/28) Sep 23 2009 http://d.puremagic.com/issues/show_bug.cgi?id=2504
- d-bugmail puremagic.com (7/7) Nov 30 2009 http://d.puremagic.com/issues/show_bug.cgi?id=2504
- d-bugmail puremagic.com (17/19) Nov 30 2009 http://d.puremagic.com/issues/show_bug.cgi?id=2504
- d-bugmail puremagic.com (33/33) Apr 17 2010 http://d.puremagic.com/issues/show_bug.cgi?id=2504
http://d.puremagic.com/issues/show_bug.cgi?id=2504 Summary: Reserve for associative arrays Product: D Version: 2.021 Platform: PC OS/Version: Windows Status: NEW Severity: enhancement Priority: P2 Component: DMD AssignedTo: bugzilla digitalmars.com ReportedBy: dsimcha yahoo.com It appears that adding an element to an associative array always triggers a memory allocation. Especially in multithreaded code, this is inefficient. It would be nice if associative arrays had a .reserve(size_t n) property, which reserved enough space for n objects, and stored the capacity internally. The idea is that, until the reserve buffer is exhausted, no interaction of any kind with the GC would be needed to add an element to the AA. --
Dec 10 2008
http://d.puremagic.com/issues/show_bug.cgi?id=2504 ZY Zhou <rinick gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rinick gmail.comIt appears that adding an element to an associative array always triggers a memory allocation. Especially in multithreaded code, this is inefficient. It would be nice if associative arrays had a .reserve(size_t n) property, which reserved enough space for n objects, and stored the capacity internally. The idea is that, until the reserve buffer is exhausted, no interaction of any kind with the GC would be needed to add an element to the AA.It's not all about memory int[int] a; foreach(i;0..20_000_000) a[i] = i; //takes 2 _minutes_ foreach(i;0..20_000_000) a[i] = i; //do it again, only 2 seconds 60 times slower. If it's memory allocation problem, the result should be similar to Dynamic array: int[] t; foreach(i;0..20_000_000) t ~= i; // 2 seconds foreach(i;0..20_000_000) t[i] = i; // 0.3 seconds I guess AA rehashs the whole array when capacity changes, and rehashing is much slower than memory allocation. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Sep 23 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2504 No, this is because, on the second run, the program has already reserved a bunch of memory from the OS, so the GC doesn't run as often. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 30 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2504No, this is because, on the second run, the program has already reserved a bunch of memory from the OS, so the GC doesn't run as often.try this: import std.stdio; void main(){ int[int] a; foreach(i;0..20_000_000){ a[i] = i; if((i&0xFFFF) == 0) writeln(i); } } You can see how slow it becomes when AA is large. I don't think memory allocation could take so much time. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Nov 30 2009
http://d.puremagic.com/issues/show_bug.cgi?id=2504 Michael Rynn <y0uf00bar gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |y0uf00bar gmail.com --- See enhancement 4098. I have experimented with a NodeHeap for the AA, that allocates Nodes in much bigger blocks, and this does speed by up a considerable amount the insertion and cleanup times. The NodeHeap idea is a single sized block dedicated heap manager created for each AA instance. No advantage at all for lookup times. The disadvantage is that Nodes released by a remove, are not returned to the GC pool until the entire block of Nodes has been removed. Of course when it does get freed, its in a big blocks at once. Also when allocating Nodes one after the other, I suppose a smart memory heap might be carving up much bigger blocks anyway, with various Pools for different object sizes. Having a manual AA.clear helps the GC as well. Even so, having a node heap and using for the first time, the system must get a new big bunch of memory, and maybe that will always take some time. Its not in the current 4098 set (which is a lot to swallow already), but I it could be an optional run time extra parameter to the HashMap/HashSet setup wrapper. It uses lots of really small nodes. The AA management object would then have a non-null heap manager object, to allocate and free blocks from a dedicated instance pool (Just like the Tango HashMap, which does exactly this). I have already proposed to add some more fields to the hash map management object, and one more non-default runtime option (only available from HashMap template wrapper) will hopefully not be a heap of trouble. If it was an option, would people use it wisely? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Apr 17 2010