D - Should all Arrays be more like Objects
- Mike Wynn (32/32) Jan 22 2002 As currently implemented Static arrays are on the stack, and you can wit...
- Walter (16/47) Jan 22 2002 without
- Russ Lewis (10/10) Jan 23 2002 The GC could check reference counts on stack objects as they go out of s...
- Mike Wynn (24/34) Jan 23 2002 As I read it the GC is a Mark and sweep, conservative collector.
- Russ Lewis (33/45) Jan 23 2002 You're right, sorry. I forgot about it. I keep thinking about it as a
- Walter (9/16) Jan 23 2002 counting
- Juan Carlos Arevalo Baeza (21/24) Jan 23 2002 only
- Russ Lewis (58/61) Jan 24 2002 I've been thinking about that. I see several possible solutions, none o...
- Walter (48/109) Jan 24 2002 The solutions you propose I think will work, but there'll be considerabl...
As currently implemented Static arrays are on the stack, and you can without warnings pass them to functions or constructors as dynamic arrays, which may cause the dynamic array reference to remain live after the stack frame has gone. more alarming than this is my hypothosis that static arrays within objects are not noticed by the GC and thus if I hold onto a reference to an array within an object but not the object. The array will may be freed from under my feet by the GC. I understand the need for structures and static arrays, and the desire to get down and dirty. but you can do that and remain safe if constrained a little. as point of reference C and C++ have the register keyword, and in C is means behaves LIKE a register (not IS A register) but like one; so it has no address but in C++ it means optimise this value's use, the former is IMHO a much nicer behaviour, as a programmer I can restrict what I can do with the variable allowing the compiler to optimise if it feels in the mood to do so. if the objectives of D are to reduce the pit falls when writing OO programmes then and classes as all dynamic ... either all Arrays and Structures are dynamic too or an automatic or stack attribute for method parameters two rules (I think), references to stack items can not be stored in dynamic items and references to stack items of the current frame can not be put into any items parameters refer to. so a stack parameter can only be put into stack items of the current frame or passed to methods that take stack parameters without manual intervention. The programmer would have to create a dynamic copy of the static item. i.e. dup the array. because D is static, the compiler could do the analysis for you. Mike.
Jan 22 2002
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message news:a2k4gh$cc7$1 digitaldaemon.com...As currently implemented Static arrays are on the stack, and you canwithoutwarnings pass them to functions or constructors as dynamic arrays, whichmaycause the dynamic array reference to remain live after the stack frame has gone.Yes, that is a general problem with anything on the stack you take a pointer to.more alarming than this is my hypothosis that static arrays within objects are not noticed by the GC and thus if I hold onto a reference to an array within an object but not the object. The array will may be freed fromundermy feet by the GC.No, the GC will recognize that case.I understand the need for structures and static arrays, and the desire to get down and dirty. but you can do that and remain safe if constrained a little. as point of reference C and C++ have the register keyword, and in C ismeansbehaves LIKE a register (not IS A register) but like one; so it has no address but in C++ it means optimise this value's use, the former is IMHO a much nicer behaviour, as a programmer I can restrict what I can do with the variable allowing the compiler to optimise if it feels in the mood to doso.if the objectives of D are to reduce the pit falls when writing OO programmes then and classes as all dynamic ... either all Arrays and Structures are dynamic too or an automatic or stack attribute for method parameters two rules (I think), references to stack items can not be stored indynamicitems and references to stack items of the current frame can not be put into any items parameters refer to. so a stack parameter can only be put into stack items of the current frame or passed to methods that take stack parameters without manualintervention.The programmer would have to create a dynamic copy of the static item.i.e.dup the array. because D is static, the compiler could do the analysis for you.Those are good points, and I thought a long time about them in the design of D. I eventually came to the conclusion that needing to make copies of arrays and objects all the time was too much overhead.
Jan 22 2002
The GC could check reference counts on stack objects as they go out of scope. You could throw a "ReferencesRemainException" if somebody is still pointing to it. This is much like my idea that delete should throw an exception if references remain.... -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Jan 23 2002
As I read it the GC is a Mark and sweep, conservative collector. so no ref counts to use, to add them would waste lots of time for all the cases when not required and to only have them for stack arrays would require the same analysis as for automatic promotion to heap arrays. so the only way to see if it's live is to walk the whole heap every time you exit a non leaf method that has passed an array as a parameter. and to put more spanners in the works, a conservative collector can get it wrong so it may not be live, you may just have an int with the same value. Who gets the exception., the thread who's stack has been used or the thread(s) with the array. instead of going to the expence of trying to detect it, why no disallow it, or make sure it never happens if all array where objects except those within objects as the GC keeps the parent alive you never have a problem. the compiler can optimise arrays in leaf methods, and with good memory and one that does not coalesce too argessively then the right size block will be available. Mike. "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message news:3C4EE3D3.30B44B51 deming-os.org...The GC could check reference counts on stack objects as they go out ofscope.You could throw a "ReferencesRemainException" if somebody is stillpointing toit. This is much like my idea that delete should throw an exception ifreferencesremain.... -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Jan 23 2002
Mike Wynn wrote:As I read it the GC is a Mark and sweep, conservative collector. so no ref counts to use, to add them would waste lots of time for all the cases when not required and to only have them for stack arrays would require the same analysis as for automatic promotion to heap arrays.You're right, sorry. I forgot about it. I keep thinking about it as a reference-counting collector because I think that it should be one :) NOTE TO WALTER: David Bacon of IBM Research has implemented a reference counting collector that does loop detection without any other algorithms. That means no mark-and-sweep, stack traversal, or anything like that. It's called Recycler and is implemented on the Jalapeno JVM. http://www.research.ibm.com/people/d/dfb is his home page, and http://www.research.ibm.com/people/d/dfb/papers.html#Bacon01Java describes how he does it.instead of going to the expence of trying to detect it, why no disallow it, or make sure it never happens if all array where objects except those within objects as the GC keeps the parent alive you never have a problem. the compiler can optimise arrays in leaf methods, and with good memory and one that does not coalesce too argessively then the right size block will be available.I thought about proposing a function pointer type modifier that would allow you to declare pointers (passed as parameters) that couldn't be saved after the function returns...syntax error if you did otherwise. However, that breaks this code: int StartJob(char command[]); // returns an id int WaitForJobToFinish(int ID); void MyFunc() { char command = "do stuff"; int id = StartJob(command); WaitForJobToFinish(id); }; You want to ensure that all pointers to 'command' are gone before MyFunc() exits. But we can't because the thing we're calling has to save a pointer to it between the calls to StartJob() and WaitForJobToFinish()...unless we force StartJob() to copy the buffer, but then we're getting unnecessary overhead again. I'm open to ideas here... -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Jan 23 2002
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message news:3C4F034C.BD4D051A deming-os.org...NOTE TO WALTER: David Bacon of IBM Research has implemented a referencecountingcollector that does loop detection without any other algorithms. Thatmeans nomark-and-sweep, stack traversal, or anything like that. It's calledRecyclerand is implemented on the Jalapeno JVM. http://www.research.ibm.com/people/d/dfb is his home page, and http://www.research.ibm.com/people/d/dfb/papers.html#Bacon01Java describeshowhe does it.Thanks! But I don't think reference counting can work with D, since the only pointer to an object might be an interior pointer. I don't see how that can work with reference counts.
Jan 23 2002
"Walter" <walter digitalmars.com> wrote in message news:a2nt3q$2v4f$2 digitaldaemon.com...Thanks! But I don't think reference counting can work with D, since theonlypointer to an object might be an interior pointer. I don't see how thatcanwork with reference counts.Well... There's an option that sounds interesting to me: assuming that pointers are somewhat deprecated, as they should be in a higher-level language, they could be implemented more inefficiently as a pointer to an object, and an offet within that object. Then, reference counting would work. This could be even more interesting if you consider the possibility of implementing an extra level on indirection in object pointers (references), where the pointer is an index into an array of active "real object pointers". In that case, you can easily implement a self-compacting heap. Anyway, it's an idea. Oh, and you could still have pointers into real memory by making the object be at position zero. For OS programming (if you still want to have that as an option), you could also have a special type (call it "address") which would be just an untyped raw pointer (just like in assembler), and would be used with casts. Salutaciones, JCAB
Jan 23 2002
Walter wrote:Thanks! But I don't think reference counting can work with D, since the only pointer to an object might be an interior pointer. I don't see how that can work with reference counts.I've been thinking about that. I see several possible solutions, none of which is optimal. Be sure to read (5), which I think has the most promise: SIMPLE IDEAS: 1) All pointers are actually double pointers; one element points to the actual thing being referenced, while one points to the garbage-collectable thing which it is inside of; 2) When setting (or unsetting) a pointer, do a lookup into a table of objects to find what the pointer is inside of. If this was a hash table, then you would have a reasonably good chance of getting an immediate hit as long as the pointer is NOT an interior pointer... MORE COMPLEX ONES: 4) Require that all reference-counted objects be on n-byte boundaries (perhaps 32 byte boundaries). Then, you can find the start of the object as follows: a) Put a magic value as the first 32 bits of every reference counted object that table. c) When looking for the "root" of an object, first go back to the 32-byte boundary and check for the magic value. If you don't find it, keep going back 32 bytes at a time until you do. d) When you find a magic value, guess that it's the root, and look at the "pointer into the memory table." Confirm that this pointer is to a valid address in the table and that the table at that entry point points back to this object. If so, then you have the object root. If not, return to (c) and continue. 5) Make all members of an object also be garbage collected. However, you set a flag (or a pointer or offset value) that indicates that the method of "garbage collecting" a member is simply to decrement the reference count of the parent object. a) When using a pointer to an object to get a pointer to a member of it, you check the count on the child first. If this count is 0, then you increment the count on the parent as well as the child, as you are making the child "refer" to the parent. b) Later, when you lose the pointer to the child, you decrement its count. If the child ends up "garbage collected", then you don't delete the memory...instead, you decrement the count on the parent. Then the parent might or might not get collected. c) Dynamic arrays will have to be implemented as 12 bytes instead of 8; each array has a pointer to a "root" (root of the entire array, not this slice of it), and a pointer to the "root of this slice", and a length field. Thus, you can always know where to find the reference count when an array reference is destroyed. d) You can't have any GC overhead inside structs (so while you can GC a struct, you can't GC its members). Also, it's not practical to have a GC record for every 'int' inside a class. Perhaps if you take a pointer to such elements, you have to uses a special non-GC pointer. This isn't actually so bad, if you think about it... e) If you absolutely demand GC-ing members of a class: For small members of classes, you could group them into n-byte chunks (each of which starts with a GC record) and require that all classes be allocated on a similar boundary. Thus, except for pointers into a struct (which have to be non-GC pointers), you can always find a GC record by taking the pointer and rounding down to the n-byte boundary. -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Jan 24 2002
The solutions you propose I think will work, but there'll be considerable added overhead in both execution time and memory consumption. Also, some of the interoperability with C objects will no longer be possible. I've written some programs using conservative GC and benchmarked them against fast reference counting schemes (written by others). My GC outperformed them, so I'm not sold on the idea. The GC I used is the one that comes with the Phobos library source, and it's a pretty basic one. Upgrading it to a generational collector would produce a big increment in performance. -Walter "Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message news:3C5039CF.397B4A12 deming-os.org...Walter wrote:onlyThanks! But I don't think reference counting can work with D, since thecanpointer to an object might be an interior pointer. I don't see how thatwhichwork with reference counts.I've been thinking about that. I see several possible solutions, none ofis optimal. Be sure to read (5), which I think has the most promise: SIMPLE IDEAS: 1) All pointers are actually double pointers; one element points to theactualthing being referenced, while one points to the garbage-collectable thingwhichit is inside of; 2) When setting (or unsetting) a pointer, do a lookup into a table ofobjects tofind what the pointer is inside of. If this was a hash table, then youwouldhave a reasonably good chance of getting an immediate hit as long as thepointeris NOT an interior pointer... MORE COMPLEX ONES: 4) Require that all reference-counted objects be on n-byte boundaries(perhaps32 byte boundaries). Then, you can find the start of the object asfollows:a) Put a magic value as the first 32 bits of every reference countedobjecttothat table. c) When looking for the "root" of an object, first go back to the32-byteboundary and check for the magic value. If you don't find it, keep goingback32 bytes at a time until you do. d) When you find a magic value, guess that it's the root, and look atthe"pointer into the memory table." Confirm that this pointer is to a valid address in the table and that the table at that entry point points back tothisobject. If so, then you have the object root. If not, return to (c) and continue. 5) Make all members of an object also be garbage collected. However, youset aflag (or a pointer or offset value) that indicates that the method of"garbagecollecting" a member is simply to decrement the reference count of theparentobject. a) When using a pointer to an object to get a pointer to a member ofit, youcheck the count on the child first. If this count is 0, then youincrement thecount on the parent as well as the child, as you are making the child"refer" tothe parent. b) Later, when you lose the pointer to the child, you decrement itscount.If the child ends up "garbage collected", then you don't delete the memory...instead, you decrement the count on the parent. Then the parentmightor might not get collected. c) Dynamic arrays will have to be implemented as 12 bytes instead of8; eacharray has a pointer to a "root" (root of the entire array, not this sliceofit), and a pointer to the "root of this slice", and a length field. Thus,youcan always know where to find the reference count when an array referenceisdestroyed. d) You can't have any GC overhead inside structs (so while you can GCastruct, you can't GC its members). Also, it's not practical to have a GCrecordfor every 'int' inside a class. Perhaps if you take a pointer to suchelements,you have to uses a special non-GC pointer. This isn't actually so bad, ifyouthink about it... e) If you absolutely demand GC-ing members of a class: For smallmembersof classes, you could group them into n-byte chunks (each of which startswith aGC record) and require that all classes be allocated on a similarboundary.Thus, except for pointers into a struct (which have to be non-GCpointers), youcan always find a GC record by taking the pointer and rounding down to the n-byte boundary. -- The Villagers are Online! villagersonline.com .[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ] .[ (a version.of(English).(precise.more)) is(possible) ] ?[ you want.to(help(develop(it))) ]
Jan 24 2002