
digitalmars.D.bugs - [Issue 4650] New: Static data that must be scanned by the GC should be grouped

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4650

           Summary: Static data that must be scanned by the GC should be
                    grouped
           Product: D
           Version: D1 & D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: llucax gmail.com



PDT ---
Now the GC scans all the static data of the program, since it uses the libc
variables __data_start and _end to get its limits.
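
To make the cost concrete, here is a minimal C sketch of that kind of conservative scan. It assumes a Linux/glibc toolchain where the symbols __data_start and _end are available; HeapRange, example_root and both function names are hypothetical and are not taken from the actual GC sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: Linux/glibc toolchain, where these symbols delimit the
   initialized data and BSS of the program. */
extern char __data_start[];
extern char _end[];

/* Hypothetical description of the GC heap: addresses in [lo, hi). */
typedef struct { uintptr_t lo, hi; } HeapRange;

/* A static variable that may hold a pointer into the GC heap. */
void *example_root;

/* Number of bytes a conservative scan of the static data has to visit. */
size_t static_data_size(void) {
    return (size_t)((uintptr_t)_end - (uintptr_t)__data_start);
}

/* Walk every aligned word in [__data_start, _end) and count the ones
   whose value falls inside the heap. Every such word is treated as a
   pointer, whether it really is one or not (false pointers). */
size_t count_candidate_pointers(HeapRange heap) {
    assert(heap.lo <= heap.hi);
    const uintptr_t align = sizeof(void *);
    uintptr_t p = ((uintptr_t)__data_start + align - 1) & ~(align - 1);
    uintptr_t end = (uintptr_t)_end;
    size_t hits = 0;
    for (; p + align <= end; p += align) {
        uintptr_t word;
        memcpy(&word, (const void *)p, sizeof word); /* avoids aliasing issues */
        if (word >= heap.lo && word < heap.hi)
            hits++;
    }
    return hits;
}
```

The loop visits the whole range no matter how little of it belongs to the program's own pointer-bearing variables, so anything that grows the static data also grows the scan.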

A lot of the static data doesn't need to be scanned, most notably the
TypeInfos[*], which make up a large portion of it. The static data of C
libraries, for example, is scanned too, even though it makes no sense to do
so.

I saw a 20% increase in the total static data of a small program[1] just by
adding about 5 more types to the GC implementation, which translates to an
appreciable loss in performance because of the extra scanning and probably the
extra false pointers.

It would be nice if the compiler could group together all the static data that
really must be scanned (the program's static variables) and make the limits of
that group available to the GC. It would be even nicer to leave static
variables that have no pointers out of that group, and nicer still to create a
pointer map for it, like the one in the patch from bug 3463 that allows precise
heap scanning. That way the only memory in the program that would have to be
scanned conservatively would be the stack.
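
As a rough illustration of the grouping idea, the following C sketch uses a GCC/Clang extension rather than anything DMD does today: variables are placed in a dedicated section, and GNU ld (and lld) provide __start_/__stop_ symbols for sections whose names are valid C identifiers. The section name gc_roots, the GC_ROOT macro and the variable names are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: put a variable in the "gc_roots" section.
   "used" keeps the variable even if nothing references it. */
#define GC_ROOT __attribute__((section("gc_roots"), used))

/* Assumption: GNU ld or lld, which synthesize these bounds for any
   section whose name is a valid C identifier. */
extern char __start_gc_roots[];
extern char __stop_gc_roots[];

GC_ROOT void *cache_head = NULL;   /* may point into the GC heap: grouped */
GC_ROOT void *symbol_table = NULL; /* may point into the GC heap: grouped */
long call_counter = 0;             /* holds no pointers: left out */

/* True if addr lies inside the grouped range the GC would scan. */
int in_gc_roots(const void *addr) {
    uintptr_t a = (uintptr_t)addr;
    return a >= (uintptr_t)__start_gc_roots && a < (uintptr_t)__stop_gc_roots;
}

/* Size in bytes of the only static range the GC would need to scan. */
size_t gc_roots_size(void) {
    assert((uintptr_t)__stop_gc_roots >= (uintptr_t)__start_gc_roots);
    return (size_t)((uintptr_t)__stop_gc_roots - (uintptr_t)__start_gc_roots);
}
```

With this layout the GC would scan only [__start_gc_roots, __stop_gc_roots) instead of everything between __data_start and _end, so call_counter and any C library data would never be visited.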

[*] This is not entirely true, since IIRC the TypeInfo stores the .init
property, which can be overwritten by the user, storing a pointer to the GC
heap there. But I think that is a rare enough case to be ignored, and I don't
think imposing that limitation would be a problem in real-life programs.

[1] http://codepad.org/xGDCS3KO

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 15 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4650




PDT ---
I think you can omit the [*] entirely. The .init property being writable was,
fortunately, just a product of my imagination...

Aug 15 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4650


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nfxjfg gmail.com



Generally, the GC should only scan data for which hasPointers() returns true
and which isn't logically constant (TypeInfo instances, for example, are
logically constant even though they can contain pointers/references).

The simplest implementation might be to add a pointer range to ModuleInfo that
tells the GC exactly what should be scanned. Ideally, static variables for
which hasPointers() is false would not be included in this range. This should
drastically reduce the amount of data that the GC needs to scan, because the C
data segment is not included.

An extended implementation could accompany the pointer range with a precise
pointer map.
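
A minimal C sketch of what such a per-module range plus pointer map could look like. This is not the real ModuleInfo layout: ModuleRoots, MarkFn, scan_module and count_mark are hypothetical names, and the one-bit-per-word bitmap is just one possible encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-module record of the static data the GC must scan. */
typedef struct {
    const char *name;        /* module name, for diagnostics only */
    void **begin;            /* first word of the module's scannable statics */
    size_t words;            /* length of the range, in pointer-sized words */
    const uint8_t *ptr_map;  /* bit i set: word i holds a pointer;
                                NULL: no map, scan the range conservatively */
} ModuleRoots;

/* Callback invoked for every value the scan treats as a pointer. */
typedef void (*MarkFn)(void *candidate, void *ctx);

/* Example callback: counts how many values were handed to the marker. */
void count_mark(void *candidate, void *ctx) {
    (void)candidate;
    ++*(size_t *)ctx;
}

/* Scan one module's range and return the number of words examined.
   With a pointer map, words known not to be pointers are skipped. */
size_t scan_module(const ModuleRoots *m, MarkFn mark, void *ctx) {
    assert(m != NULL && mark != NULL);
    size_t examined = 0;
    for (size_t i = 0; i < m->words; i++) {
        if (m->ptr_map && !((m->ptr_map[i / 8] >> (i % 8)) & 1))
            continue;              /* precisely known non-pointer word */
        examined++;
        if (m->begin[i] != NULL)
            mark(m->begin[i], ctx);
    }
    return examined;
}
```

For a three-word range laid out as pointer, integer, pointer, a map of 0x05 makes the scan examine two words and never mistake the integer for a pointer. Passing NULL as the map falls back to examining all three conservatively.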

Aug 15 2010