digitalmars.D.bugs - [Issue 1561] New: AA's create many false references for garbage collector

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1561

           Summary: AA's create many false references for garbage collector
           Product: D
           Version: 1.022
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: major
          Priority: P2
         Component: Phobos
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: wbaxter gmail.com


A program that uses a lot of AAs will leak memory.
The likely reason is that the aaA structs, which contain the hash value, are
allocated as void[size], so the garbage collector scans them and always treats
the hash value as a possible pointer.

Tango's version of aaA.d does it slightly differently, checking the size and
setting the gc NO_SCAN bit if the key and value types can't hold a pointer:

        // Not found, create new elem
        //printf("create new one\n");
        size_t size = aaA.sizeof + keysize + valuesize;
        uint   bits = keysize   < (void*).sizeof &&
                      keysize   > (void).sizeof  &&
                      valuesize < (void*).sizeof &&
                      valuesize > (void).sizeof  ? BlkAttr.NO_SCAN : 0;
        e = cast(aaA *) gc_calloc(size, bits);


A test case for the leak using AAs with Phobos is below.  I'm not sure what it
does with Tango, but I think it will probably still fail on a 32-bit
architecture: there (void*).sizeof is 4, so int.sizeof (also 4) fails the
"< (void*).sizeof" test and the block still gets scanned.
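
To make the size test easier to reason about, here is a minimal sketch that
isolates the condition from the Tango snippet quoted above.  The names
wouldSkipScan and intAAWouldSkipScan are hypothetical helpers written for this
illustration; they are not part of Tango or Phobos, and the sketch only
evaluates the comparison, it does not allocate anything.

```d
// Hypothetical helper (not part of Tango or Phobos): returns true when the
// quoted size test would select BlkAttr.NO_SCAN for a newly created node.
bool wouldSkipScan(size_t keysize, size_t valuesize)
{
    return keysize   < (void*).sizeof &&
           keysize   > (void).sizeof  &&   // (void).sizeof is 1 in D
           valuesize < (void*).sizeof &&
           valuesize > (void).sizeof;
}

// The int[int] case from the test program.  This is true only where a
// pointer is wider than an int, so it is false on a 32-bit target.
bool intAAWouldSkipScan()
{
    return wouldSkipScan(int.sizeof, int.sizeof);
}
```

Under this reading, only key and value sizes strictly between 1 byte and the
pointer size qualify, which on a 32-bit target means sizes 2 and 3.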

It seems like a better approach would be to a) use keyti's TypeInfo to decide
if it's a pointer type or not and b) arrange for _aaGet to be called with the
value type's TypeInfo too, instead of just a size, and also use that to decide
if the alloc'ed memory should have the NO_SCAN bit set or not.
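
A rough sketch of that idea follows.  It assumes TypeInfo exposes a flags
property whose low bit means "may contain pointers into GC memory", as it does
in current D runtimes; whether the Phobos version discussed here behaves the
same way is an assumption.  The helpers mayHoldPointers and nodeCanSkipScan are
hypothetical names, not existing runtime functions, and the sketch only covers
the key and value: whether a whole node could be marked NO_SCAN would also
depend on the other fields of aaA, which are not shown in this report.

```d
// Hypothetical helper: true when a type may hold pointers into GC memory.
// Assumes bit 0 of TypeInfo.flags carries that meaning.
bool mayHoldPointers(TypeInfo ti)
{
    return (ti.flags & 1) != 0;
}

// Hypothetical helper: the key/value part of a node is safe to leave
// unscanned only if neither the key type nor the value type can hold a pointer.
bool nodeCanSkipScan(TypeInfo keyti, TypeInfo valueti)
{
    return !mayHoldPointers(keyti) && !mayHoldPointers(valueti);
}
```

With helpers like these, an int[int] would qualify regardless of pointer width,
while an AA whose values are arrays or pointers would not.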

It could be that all of the above diagnostic guesswork is wrong.  But I am
certain that the program below leaks memory, whatever the actual reason.

---------------< test case >-----------------------
import std.stdio;
import std.gc;

// Just an ordinary AA with a lot of values.
// neither keys nor values look like pointers.
class BigAA
{
    int[int] aa;
    this() { 
        for(int i=0;i<1000;i++) {
            aa[i] = i;
        }
    }
}

void main()
{
    int nloops = 10_000;
    auto b = new BigAA[100];

    for(int i=0; i<nloops; ++i)
    {
        // Create some AAs (overwriting old ones)
        foreach(ref v; b) { v = new BigAA; }

        // See how we're doing
        std.gc.GCStats stats;
        std.gc.fullCollect();
        std.gc.getStats(stats);
        writefln("Loop %-5s - poolsize=%-10s   %s Mbytes  (%s KB)", 
                 i, stats.poolsize, 
                 stats.usedsize/1024/1024, 
                 stats.usedsize/1024);

    }
}


-- 
Oct 09 2007
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1561

I compiled and ran the program against Tango using dmd 1.020. Since Tango
doesn't seem to have functionality equivalent to GCStats, I removed the
statistics code and monitored the memory usage with top.

I terminated the process after eight minutes of it running: it consumed a
constant and small amount of memory during its execution.

With dmd 1.021 and phobos, I see the memory usage growing over time.


-- 
Oct 10 2007
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1561

Thanks for trying it out with Tango.  But if Tango doesn't show the problem
then that almost certainly means my diagnosis is incorrect.


-- 
Oct 10 2007