digitalmars.D - Where is my memory?

Ozan Süel <ozan.sueel gmail.com> writes:
Hi!
I'm working on a Big Data project, where a huge amount of RAM is 
needed. Using D I've run into a - let's call it - memory leak. 
I tested this with the following code:

	foreach(i; 0..1000) {
		int[] ints;
		foreach(j; 0..1000) {
			ints ~= uniform(0, 100);
		}
	}

Without the first foreach this code uses only 220KB,
with the first foreach this code needs 2,5MB.
(220KB x 1000 is something around 2,5MB).

But why is GC (Garbage Collector) not running? Following the 
explanations in http://wiki.dlang.org/Memory_Management memory 
usage should be something around  220KB.

I used GC.minimize, slowed down the loop, and replaced 
uniform... nothing worked.
(By the way: after a while my Linux server killed my big data 
project because it ran out of RAM and swap space.)


Thanks for your advice,
Ozan
Mar 22 2015
Rikki Cattermole <alphaglosined gmail.com> writes:
Since you know the exact sizes, perhaps a rewrite is needed?

	int[] buffer;
	buffer.length = 1000 * 1000;

	size_t offset;
	foreach(i; 0 .. 1000) {
		foreach(j; offset .. offset + 1000) {
			buffer[j] = uniform(0, 100);
		}
		offset += 1000;
	}

Really, that buffer should be malloc'ed without the GC knowing about it,
and later manually free'd.

In fact, I would recommend moving that buffer to a static array:

	static int[1000 * 1000] buffer;

	void myfunc() {
		size_t offset;
		foreach(i; 0 .. 1000) {
			foreach(j; offset .. offset + 1000) {
				buffer[j] = uniform(0, 100);
			}
			offset += 1000;
		}

		// use buffer
	}

This method will not allocate per execution. But be careful: each run of
myfunc will overwrite what is in buffer.
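
For reference, the malloc suggestion above could look roughly like the sketch below. It assumes the buffer is only needed inside one function; `fillAndSum` is a hypothetical helper name, not something from this thread. It uses `core.stdc.stdlib` for `malloc`/`free` and slices the raw pointer so the rest of the code can treat it as an ordinary `int[]`.

```d
import core.stdc.stdlib : malloc, free;
import std.random : uniform;

/// Hypothetical helper: fills a malloc'ed buffer of rows * cols ints
/// with random values and returns their sum. The buffer is freed
/// before the function returns.
long fillAndSum(size_t rows, size_t cols)
{
    immutable count = rows * cols;

    // Allocate outside the GC heap; the GC neither scans nor frees this block.
    auto raw = cast(int*) malloc(count * int.sizeof);
    if (raw is null)
        assert(0, "malloc failed");
    scope (exit) free(raw); // manual release on every exit path

    int[] buffer = raw[0 .. count]; // slice view over the C allocation
    foreach (ref x; buffer)
        x = uniform(0, 100);

    long sum = 0;
    foreach (x; buffer)
        sum += x;
    return sum;
}

void main()
{
    import std.stdio : writeln;
    writeln(fillAndSum(1000, 1000));
}
```

One caveat with this approach: appending to `buffer` with `~=` would copy the data into a fresh GC allocation, so the slice should only be indexed, not grown.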
Mar 22 2015
"Martin Nowak" <code dawg.eu> writes:
On Sunday, 22 March 2015 at 09:42:41 UTC, Ozan Süel wrote:
 But why is GC (Garbage Collector) not running? Following the 
 explanations in http://wiki.dlang.org/Memory_Management memory 
 usage should be something around  220KB.
The GC maps memory from the underlying OS in pool-sized chunks. The
smallest pool is already 4MB, so the GC doesn't trigger before that pool
is completely used, and it also can't free a pool until that pool no
longer contains live data.

If you want to run a collection, you need to call GC.collect;
GC.minimize unmaps pools.

If you're running on 32-bit (the default on Windows), that loop might pin
memory because of false pointers.

Out of interest, do you intend to work on some ML library? There exist
a few math libraries, particularly estate and SciD, but we're lacking a
nice ML package.
http://code.dlang.org/?sort=updated&category=library.scientific
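
To watch this behaviour, something like the following sketch may help. It reruns the loop from the original post, then calls `GC.collect` and `GC.minimize` and prints the GC's own counters after each step. It assumes a druntime recent enough to provide `GC.stats`; `churn` and `report` are hypothetical helper names. The numbers printed depend on platform and runtime version, so none are shown here.

```d
import core.memory : GC;
import std.random : uniform;
import std.stdio : writefln;

/// Hypothetical helper: allocates and drops `rounds` short-lived
/// arrays of `n` ints each, like the loop in the original post.
void churn(size_t rounds, size_t n)
{
    foreach (i; 0 .. rounds)
    {
        int[] ints;
        foreach (j; 0 .. n)
            ints ~= uniform(0, 100);
    }
}

/// Hypothetical helper: prints the GC's used and free byte counts.
void report(string label)
{
    auto s = GC.stats();
    writefln("%-16s used: %s bytes, free: %s bytes",
             label, s.usedSize, s.freeSize);
}

void main()
{
    churn(1000, 1000);
    report("after loop");

    GC.collect();  // run a full collection
    report("after collect");

    GC.minimize(); // hand unused pools back to the OS
    report("after minimize");
}
```

Note that `GC.stats` describes the GC heap only; the resident size reported by the OS can differ from it.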
Mar 22 2015
"anonymous" <anonymous example.com> writes:
On Sunday, 22 March 2015 at 09:42:41 UTC, Ozan Süel wrote:
 Hi!
 I'm working on a Big Data project, where a huge amount of RAM 
 is needed. Using D I've run into a - let's call it - memory 
 leak. I tested this with the following code:

 	foreach(i; 0..1000) {
 		int[] ints;
 		foreach(j; 0..1000) {
 			ints ~= uniform(0, 100);
 		}
 	}

 Without the first foreach this code uses only 220KB,
 with the first foreach this code needs 2,5MB.
 (220KB x 1000 is something around 2,5MB).
220KB x 1000 would be around 200MB.
 But why is GC (Garbage Collector) not running? Following the 
 explanations in http://wiki.dlang.org/Memory_Management memory 
 usage should be something around  220KB.
The GC is running. 1000 x 1000 ints would be 4MB. That alone is more than the 2.5MB you're observing.
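
The arithmetic behind both figures can be checked at compile time. This is just a sketch of the calculation; the constant names are made up for illustration, and the 220KB figure is taken at face value from the original post.

```d
enum size_t arrays = 1000;       // outer loop iterations
enum size_t intsPerArray = 1000; // inner loop iterations

// Payload of one array, and of all arrays if none were ever reclaimed.
enum size_t bytesPerArray = intsPerArray * int.sizeof;
enum size_t bytesTotal = arrays * bytesPerArray;

static assert(int.sizeof == 4);
static assert(bytesPerArray == 4_000);
static assert(bytesTotal == 4_000_000);   // about 4MB

// 220KB per iteration over 1000 iterations, expressed in KB.
enum size_t claimedTotalKB = 220 * 1000;
static assert(claimedTotalKB == 220_000); // roughly 220MB, not 2.5MB

void main() {}
```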
 I used GC.minimize, slowed down the loop, and replaced 
 uniform... nothing worked.
 (By the way: after a while my Linux server killed my big data 
 project because it ran out of RAM and swap space.)
GC.minimize alone can't do anything if the memory isn't collected. Use
GC.collect(), too.

Also, if you know the final size of the array beforehand, you can
`reserve` it. That avoids a lot of copying (and garbage creation) when
appending.

	int[] ints;
	ints.reserve(1000); /* <- */
	foreach(j; 0..1000) {
		ints ~= uniform(0, 100);
	}
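
A small sketch of what `reserve` buys here: the helper below (the name `countMoves` is hypothetical) appends `n` ints and counts how often the array's storage moves to a new block after the first element has been written. With `reserve` the count should stay at zero; without it the array is relocated several times as it grows, and each relocation leaves the old block behind as garbage.

```d
import std.random : uniform;

/// Hypothetical helper: appends n random ints (n >= 1) and returns how
/// many times the array's storage moved after the first append.
size_t countMoves(size_t n, bool useReserve)
{
    int[] ints;
    if (useReserve)
    {
        ints.reserve(n);
        assert(ints.capacity >= n); // reserve may round up, never down
    }

    ints ~= uniform(0, 100);
    auto last = ints.ptr;
    size_t moves = 0;

    foreach (j; 1 .. n)
    {
        ints ~= uniform(0, 100);
        if (ints.ptr !is last)
        {
            ++moves;         // the runtime copied the data to a new block
            last = ints.ptr;
        }
    }
    return moves;
}

void main()
{
    import std.stdio : writeln;
    writeln("moves with reserve:    ", countMoves(1000, true));
    writeln("moves without reserve: ", countMoves(1000, false));
}
```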
Mar 22 2015
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 3/22/15 6:32 AM, anonymous wrote:
 GC.minimize alone can't do anything if the memory isn't collected. Use
 GC.collect(), too.

 Also, if you know the final size of the array beforehand, you can
 `reserve` it. That avoids a lot of copying (and garbage creation) when
 appending.

 	int[] ints;
 	ints.reserve(1000); /* <- */
 	foreach(j; 0..1000) {
 		ints ~= uniform(0, 100);
 	}
Another tip: don't throw away that memory you allocated!

	int[] ints;
	ints.reserve(1000);
	foreach(i; 0..1000) {
		ints.length = 0;
		ints.assumeSafeAppend;
		... // inner loop
	}

-Steve
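
Filled in with the inner loop from the original post, the reuse pattern might look like the sketch below. The helper name `movesWhenReusing` is hypothetical; it counts how often the storage changes between rounds, which should be never once the block has been reserved. This assumes nothing else still holds a slice of the old contents, since `assumeSafeAppend` lets later appends overwrite them.

```d
import std.random : uniform;

/// Hypothetical helper: runs the outer loop `rounds` times over one
/// reserved block and returns how often the storage moved between rounds.
size_t movesWhenReusing(size_t rounds, size_t n)
{
    int[] ints;
    ints.reserve(n);

    int* last = null;
    size_t moves = 0;

    foreach (i; 0 .. rounds)
    {
        ints.length = 0;         // keep the block, drop the contents
        ints.assumeSafeAppend(); // allow appending in place again

        foreach (j; 0 .. n)
            ints ~= uniform(0, 100);

        if (last !is null && ints.ptr !is last)
            ++moves;
        last = ints.ptr;
    }
    return moves;
}

void main()
{
    import std.stdio : writeln;
    writeln("moves across rounds: ", movesWhenReusing(1000, 1000));
}
```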
Mar 23 2015