digitalmars.D.learn - How the GC distinguishes code from data
- %u (9/9) Jan 05 2011 Hi,
- Simen kjaeraas (16/25) Jan 05 2011 s =
- Steven Schveighoffer (11/21) Jan 05 2011 There is another problem that I recently ran into. If you allocate a
- %u (5/6) Jan 05 2011 NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.
- Pelle (4/10) Jan 06 2011 It assumes everything on the stack is pointers, at the moment, I believe...
- %u (5/7) Jan 07 2011 tell it to.
- Pelle (6/13) Jan 07 2011 Kinda sorta. I haven't had any problems from that. If you allocate very
- %u (4/6) Jan 07 2011 But if I need to do that, then what would be the difference between void...
- Simen kjaeraas (5/9) Jan 07 2011 None what so ever. If you want to mark some memory with special bits,
- %u (5/6) Jan 07 2011 Huh.. then what about what is said in this link?
- Steven Schveighoffer (41/48) Jan 07 2011 First, you should understand that the GC does not know what data is in a...
- %u (4/6) Jan 07 2011 conservatively marked as containing pointers.
- Jérôme M. Berger (20/33) Jan 06 2011 to
Hi, There's a question that's been lurking in the back of my mind ever since I learned about D: How does the GC distinguish code from data when determining the objects to collect? (E.g. void[] from uint[], size_t from void*, etc.?) If I have a large uint[], it's practically guaranteed to have data that looks like pointers, and that might cause memory leaks. Furthermore, if the GC moves things around, it would corrupt my data. How is this handled? Thank you!
Jan 05 2011
%u <wfunction hotmail.com> wrote:

> Hi, There's a question that's been lurking in the back of my mind ever since I learned about D: How does the GC distinguish code from data when determining the objects to collect? (E.g. void[] from uint[], size_t from void*, etc.?)

This is hardly the code/data dualism (data can easily hold pointers), but simply POD/pointers.

> If I have a large uint[], it's practically guaranteed to have data that looks like pointers, and that might cause memory leaks.

If you have allocated a large uint[], most likely it will be flagged NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.

> Furthermore, if the GC moves things around, it would corrupt my data. How is this handled?

The current GC does not move things. One could write such a GC for D (I believe), and in such a case data would be marked NO_MOVE if for whatever reason it cannot be moved.

--
Simen
Jan 05 2011
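The NO_SCAN flag described above can be inspected from user code. The following is a minimal sketch, assuming a reasonably recent druntime where `core.memory` provides `GC.query` and `GC.BlkAttr.NO_SCAN`; the helper name `isNoScan` is made up for this illustration.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: true if the GC block containing `p` has NO_SCAN set.
// GC.query is used instead of GC.getAttr because an array's .ptr may point
// into the interior of its block.
bool isNoScan(const(void)* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto numbers  = new uint[](1024);   // element type holds no pointers
    auto pointers = new void*[](1024);  // element type is itself a pointer

    writeln("uint[]  NO_SCAN: ", isNoScan(numbers.ptr));
    writeln("void*[] NO_SCAN: ", isNoScan(pointers.ptr));
}
```

The flag is a property of the memory block and is chosen when the block is allocated, based on the element type.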
On Wed, 05 Jan 2011 16:56:47 -0500, Simen kjaeraas <simen.kjaras gmail.com> wrote:

> %u <wfunction hotmail.com> wrote:
>> If I have a large uint[], it's practically guaranteed to have data that looks like pointers, and that might cause memory leaks.
> If you have allocated a large uint[], most likely it will be flagged NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.

There is another problem that I recently ran into. If you allocate a large memory block, even one marked as not containing pointers, there is a medium probability that a 'fake' pointer exists that points *at* that block, not from it. This means that uint[] may never get collected unless you manually free it.

>> Furthermore, if the GC moves things around, it would corrupt my data. How is this handled?
> The current GC does not move things. One could write such a GC for D (I believe), and in such a case data would be marked NO_MOVE if for whatever reason it cannot be moved.

A moving GC cannot exist without precise scanning. Anything that is marked from a conservative block (one that has no pointer map) would not be able to move.

-Steve
Jan 05 2011
> If you have allocated a large uint[], most likely it will be flagged NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.

Ah, but the trouble is, no one said that this array has to be in the GC heap! I could easily have a void[] and a uint[] that both point to non-GC managed memory. Or I might even have a uint[] allocated on the stack! How does the GC distinguish these, when there's no "attribute" it can mark? (Or does it?!)
Jan 05 2011
On 01/06/2011 07:31 AM, %u wrote:

>> If you have allocated a large uint[], most likely it will be flagged NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.
> Ah, but the trouble is, no one said that this array has to be in the GC heap! I could easily have a void[] and a uint[] that both point to non-GC managed memory. Or I might even have a uint[] allocated on the stack! How does the GC distinguish these, when there's no "attribute" it can mark? (Or does it?!)

It assumes everything on the stack is pointers, at the moment, I believe. If it's not on the garbage collected heap, it won't scan it unless you tell it to.
Jan 06 2011
> It assumes everything on the stack is pointers, at the moment, I believe

Uh-oh... not the answer I wanted to hear, but I was half-expecting this. So doesn't that mean that, at the moment, D will leak memory?

> If it's not on the garbage collected heap, it won't scan it unless you tell it to.

But what if it's a void[] on a non-GC heap? Doesn't the language say that needs to be scanned too?
Jan 07 2011
On 01/07/2011 06:47 PM, %u wrote:

>> It assumes everything on the stack is pointers, at the moment, I believe
> Uh-oh... not the answer I wanted to hear, but I was half-expecting this. So doesn't that mean that, at the moment, D will leak memory?

Kinda sorta. I haven't had any problems from that. If you allocate very large blocks in the garbage collector you may face trouble :-)

>> If it's not on the garbage collected heap, it won't scan it unless you tell it to.
> But what if it's a void[] on a non-GC heap? Doesn't the language say that needs to be scanned too?

You have to add it to the garbage collector's list of roots, I'm not sure what it's named exactly. Note that you only have to do that if there actually are pointers to the gc heap there.
Jan 07 2011
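For reference, the registration calls alluded to above live in `core.memory`. `GC.addRoot` pins a single GC block, while `GC.addRange` asks the collector to scan a region of memory for pointers. For a malloc'ed buffer that stores references to GC objects, `GC.addRange` is the relevant call. A minimal sketch, in which the `Node` class and the helpers `makeSlots`/`freeSlots` are invented for illustration:

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

// Illustrative GC-allocated class (not from the thread).
class Node
{
    int value;
    this(int v) { value = v; }
}

// Allocates `count` zeroed reference slots on the C heap and registers them,
// so the GC treats their contents as possible pointers into its heap.
Node* makeSlots(size_t count)
{
    auto slots = cast(Node*) calloc(count, Node.sizeof);
    assert(slots !is null, "calloc failed");
    GC.addRange(slots, count * Node.sizeof);
    return slots;
}

// Unregisters the range before handing the memory back to the C heap.
void freeSlots(Node* slots)
{
    GC.removeRange(slots);
    free(slots);
}

void main()
{
    auto slots = makeSlots(4);
    slots[0] = new Node(42);   // the only reference lives in malloc'ed memory
    GC.collect();              // the registered range keeps the Node reachable
    assert(slots[0].value == 42);
    freeSlots(slots);
}
```

Without the `GC.addRange` call the object could be collected while the buffer still refers to it. The assert passing does not by itself prove the range was needed, because a stale copy of the reference on the stack can also keep the object alive.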
> Kinda sorta. I haven't had any problems from that. If you allocate very large blocks in the garbage collector you may face trouble :-)

Haha okay, thanks. :) (This makes me shiver quite a bit...)

> You have to add it to the garbage collector's list of roots

But if I need to do that, then what would be the difference between void[] and ubyte[]?
Jan 07 2011
%u <wfunction hotmail.com> wrote:

>> You have to add it to the garbage collector's list of roots
> But if I need to do that, then what would be the difference between void[] and ubyte[]?

None what so ever. If you want to mark some memory with special bits, use setattr in core.memory.

--
Simen
Jan 07 2011
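In current druntime the function mentioned above is spelled `GC.setAttr`, with `GC.clrAttr` as its counterpart. A small sketch of toggling the scan bit on a raw GC block, where the helper `noScan` is a name made up here:

```d
import core.memory : GC;

// Hypothetical helper: reports whether NO_SCAN is set on the block whose
// base address is `p` (GC.getAttr returns 0 for interior pointers).
bool noScan(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // A raw block requested without attributes is scanned conservatively.
    void* block = GC.malloc(4096);
    assert(!noScan(block));

    // Promise the GC that this block will never hold pointers.
    GC.setAttr(block, GC.BlkAttr.NO_SCAN);
    assert(noScan(block));

    // Take the promise back before storing GC pointers in the block.
    GC.clrAttr(block, GC.BlkAttr.NO_SCAN);
    assert(!noScan(block));

    GC.free(block);
}
```

Setting NO_SCAN on a block that does hold the only reference to a GC object would let that object be collected, so the bit is best treated as a promise rather than a hint.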
> None what so ever.

Huh.. then what about what is said in this link?
http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1
I was told that void[] could contain references, but that ubyte[] would not, and that the GC would need to scan the former but not the latter. Is that wrong? Thank you!
Jan 07 2011
On Fri, 07 Jan 2011 16:39:20 -0500, %u <wfunction hotmail.com> wrote:

>> None what so ever.
> Huh.. then what about what is said in this link?
> http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1
> I was told that void[] could contain references, but that ubyte[] would not, and that the GC would need to scan the former but not the latter. Is that wrong?

First, you should understand that the GC does not know what data is in a memory block. It has no idea that the block is a void[] or a ubyte[] or a class instance or whatever it is. All it knows is that it's data. What makes it scan a block is a bit set on the block indicating that it contains pointers. This bit is set by the higher-level runtime routines (like the ones that create an array) which use the TypeInfo to determine whether to set the NO_SCAN bit or not.

Second, memory that is not part of D's allocation is *not* scanned or marked, no matter where it is. Essentially the mark routine goes like this (pseudocode):

    foreach(root; roots)
        if(root.hasPointers) // notice this has nothing to do with type
            foreach(pointer; root)
                if(pointer.pointsAt.GCHeapBlock)
                    pointer.heapBlock.mark = true;

    while(changesWereMade)
        foreach(heapBlock; heap)
            if(heapBlock.hasPointers)
                foreach(pointer; heapBlock)
                    if(pointer.pointsAt.GCHeapBlock)
                    {
                        pointer.heapBlock.mark = true;
                        changesWereMade = true;
                    }

    // free memory
    foreach(heapBlock; heap)
        if(!heapBlock.mark)
            free(heapBlock)

So essentially, you can see if you allocated memory for example with malloc, and you didn't add it as a root, it's neither scanned nor marked. It does not participate whatsoever with the collection cycle, no matter what the type of the data is.

Now, you should also realize that just because an array is a void[] doesn't necessarily make it marked as containing pointers. It is quite possible to implicitly cast a ubyte[] to a void[], and this does not change the NO_SCAN bit in the memory block. Data *allocated* as a void[] (which I highly recommend *not* doing) will be conservatively marked as containing pointers. This is probably where you get the notion that void[] contains pointers.

-Steve
Jan 07 2011
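The pseudocode in the post above can be turned into a small runnable toy. This is only a sketch on a simulated heap, not druntime's actual collector. The `Block` struct, the use of array indices in place of pointers, and the `collect` function are all assumptions made for illustration. Two details are tightened relative to the pseudocode so that the loop terminates and behaves like a mark phase: only blocks that are already marked get scanned, and the loop stops once a pass marks nothing new.

```d
import std.stdio : writeln;

// One block on a simulated GC heap.
struct Block
{
    bool hasPointers;  // false plays the role of NO_SCAN
    size_t[] refs;     // indices of heap blocks this block's words "point at"
    bool marked;
}

// Marks every block reachable from `roots` (indices found on the stack,
// in globals, or in registered ranges) and returns the indices of the
// blocks a sweep would free.
size_t[] collect(Block[] heap, const size_t[] roots)
{
    foreach (ref b; heap)
        b.marked = false;
    foreach (r; roots)
        heap[r].marked = true;

    bool changed = true;
    while (changed)
    {
        changed = false;
        foreach (ref b; heap)
            if (b.marked && b.hasPointers)   // NO_SCAN blocks are never read
                foreach (target; b.refs)
                    if (!heap[target].marked)
                    {
                        heap[target].marked = true;
                        changed = true;
                    }
    }

    size_t[] garbage;
    foreach (i, ref b; heap)
        if (!b.marked)
            garbage ~= i;
    return garbage;
}

void main()
{
    auto heap = [
        Block(true,  [1]),  // 0: reachable from a root, points at block 1
        Block(false, [2]),  // 1: NO_SCAN; its "pointer" to block 2 is ignored
        Block(true,  []),   // 2: only referenced from a NO_SCAN block
        Block(true,  [2]),  // 3: unreachable, so its reference does not count
    ];
    writeln(collect(heap, [0]));  // prints [2, 3]
}
```

Block 1 shows why a large uint[] flagged NO_SCAN does not keep other blocks alive, however pointer-like its contents look. Memory that is neither on the simulated heap nor listed in `roots` never enters the loop at all.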
> First, you should understand that the GC does not know what data is in a memory block.

That is exactly why I was wondering how it figures things out. :)

> Data *allocated* as a void[] (which I highly recommend *not* doing) will be conservatively marked as containing pointers.

Ah, all right, that clears things up! Thank you!!
Jan 07 2011
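The distinction between an array's static type and the way its block was allocated can be checked directly. A minimal sketch, assuming `GC.query` from `core.memory`; the helper `isNoScan` is a made-up name:

```d
import core.memory : GC;

// Hypothetical helper: true if the GC block containing `p` has NO_SCAN set.
bool isNoScan(const(void)* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    ubyte[] bytes = new ubyte[](64);  // allocated as ubyte[]: NO_SCAN is set
    void[] view = bytes;              // implicit conversion, same memory block

    assert(view.ptr is bytes.ptr);
    assert(isNoScan(view.ptr));       // the conversion did not touch the bit

    void[] raw = new void[](64);      // allocated as void[]
    assert(!isNoScan(raw.ptr));       // conservatively treated as pointers
}
```

So what decides scanning is how the block was created. The type of the slice that currently refers to it plays no part.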
%u wrote:

> Hi,
>
> There's a question that's been lurking in the back of my mind ever since I learned about D:
>
> How does the GC distinguish code from data when determining the objects to collect? (E.g. void[] from uint[], size_t from void*, etc.?)
>
> If I have a large uint[], it's practically guaranteed to have data that looks like pointers, and that might cause memory leaks. Furthermore, if the GC moves things around, it would corrupt my data. How is this handled?
>
> Thank you!

The GC knows about global variables, the stack, everything that was allocated through it and everything that you tell it to scan (which allows using C malloc without seeing an object disappear because the only remaining pointers are in a malloc'ed buffer).

Moreover, for GC-allocated data (and maybe the globals too), the GC knows that some data cannot contain pointers and will refrain from scanning it (it will always assume that anything on the stack or that you tell it to scan contains pointers). The GC keeps track internally of the memory where it knows there are no pointers and the memory where there may be pointers.

Jerome
--
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Jan 06 2011