D - Garbage Collection is bad... other comments
- rafael baptista (65/65) Jan 12 2003 I just saw the D spec today... and it is quite wonderful. There are so m...
- Clay (12/13) Jan 12 2003 I disagree. For business applications, it is nice not to have to spend ...
- Evan McClanahan (23/40) Jan 13 2003 I know Mr. Baptisa by reputation, so I think that I can anticipate some
- rafael b aptista (37/43) Jan 13 2003 Yeah, that describes me pretty well. Wow. I had no idea I had a reputati...
- Evan McClanahan (21/75) Jan 13 2003 I applied to your company a long time ago (assuming that you're who I
- Sean L. Palmer (45/88) Jan 13 2003 reputation!
- Mike Wynn (12/21) Jan 13 2003 and you can therefore create all your required objects up front, and man...
- Walter (55/77) Jan 23 2003 Some great comments! My replies are inline below.
- Ilya Minkov (26/26) Jan 28 2003 Hello.
- Russell Lewis (17/27) Jan 13 2003 gc.disable();
- Ilya Minkov (52/52) Jan 13 2003 Please, believe me garbage collection (GC) is the future of game design....
- Burton Radons (18/25) Jan 13 2003 Type-based searching isn't implemented yet. DMD should produce a
- rafael baptista (35/54) Jan 13 2003 You are right. That is why game programmers avoid accessing slow devices...
- Ross Judson (13/23) Jan 13 2003 Dude, what do you think Java (and most other GC-enabled environments)
- Evan McClanahan (11/24) Jan 14 2003 But D is nicer than C, and I'd rather use it, and have it be a general
- Bill Cox (22/25) Jan 24 2003 Hi.
- Ilya Minkov (83/115) Jan 24 2003 Hello.
- Bill Cox (26/40) Jan 25 2003 I didn't make myself clear. I don't write recursive destructors. Our
- Ilya Minkov (32/54) Jan 25 2003 What further work does it do?
- Bill Cox (137/178) Jan 25 2003 Ok. Here's a description of our CASE tool:
- Ilya Minkov (30/32) Jan 31 2003 Seems to be kept secret though. I was searching for it, and was unable
- Sean L. Palmer (18/50) Feb 01 2003 This is why functional languages always copy data.
- Mike Wynn (21/29) Feb 01 2003 the
- Walter (9/16) Feb 05 2003 the
- Ilya Minkov (12/17) Feb 05 2003 Granted the GC is on. If it's off, these all will pollute the memory.
- Walter (4/15) Feb 05 2003 I hear you, but I really think it's worth just taking advantage of the G...
- Bill Cox (36/72) Feb 02 2003 Try: www.viasic.com/download/datadraw.tar.gz
- rafael baptista (33/55) Jan 14 2003 Thats right, a GC will normally allocate memory in large blocks, which i...
- Ross Judson (12/26) Jan 14 2003 Two things come to mind...first, some customers care about performance.
- Ilya Minkov (34/81) Jan 14 2003 OK. Let me see. I've got an OpenGL game somewhere which is written in
- Ilya Minkov (12/13) Jan 14 2003 ^^^^^^^
- Walter (12/14) Jan 23 2003 stalling
- Achillefs Margaritis (91/156) Jan 30 2003 C++ can be fully garbage-collected. Just make a GC super class of all
- Bill Cox (32/91) Jan 31 2003 This looks like fairly useful code in some cases, but it's not real GC.
- Ilya Minkov (51/53) Jan 31 2003 Walter considers C++ too flexible. And he doesn't rob you of C++, he
- Burton Radons (11/16) Jan 31 2003 Let's see what happens when type-based scanning is in first; nobody
- Walter (6/16) Feb 03 2003 I have a lot of ideas about improving GC, such as making it a copying
- Evan McClanahan (4/24) Feb 09 2003 Implementing hardware write barriers? I didn't think that the hardware
- Mike Wynn (4/11) Feb 09 2003 I took that to be; marking the page as read only. and catching the page
- Evan McClanahan (12/18) Feb 09 2003 From what I have read, can't find the link right now, that usually
- Walter (7/10) Feb 03 2003 global
I just saw the D spec today... and it is quite wonderful. There are so many ways that C and C++ could be better, and D addresses them. But I think D is a little too ambitious, to its detriment.

It has a good side: it fixes syntactic and semantic problems with C/C++. It eliminates all kinds of misfeatures and cruft in C++. And D has a bad side: it attempts to implement all kinds of features that are best left to the programmer, not the language spec: built-in string comparison, dynamic arrays, associative arrays - but the worst in my mind is the inclusion of garbage collection. As is pointed out in the D overview, garbage collection makes D unusable for real time programming - and that makes it unusable for kernel programming, device drivers, computer games, any real time graphics, industrial control, etc. It limits the language in a way that is really not necessary. Why not just add language features that make it easier to integrate GC, without making it a part of the language?

For my part I wouldn't use GC on any project in any case. My experience has been that GC is inferior to manually managing memory in every case anyway. Why?

Garbage collection is slower. There is an argument in the garbage collection section of the D documents that GC is somehow faster than manually managing memory. Essentially the argument boils down to this: totally bungling manual memory management is slower than GC. Ok, that is true. If I program with tons of redundant destructors, and have massive memory leaks, and spend lots of time manually allocating and deallocating small buffers, and constantly twiddle thousands of ref counts, then managing memory manually is a disaster. But most programs manage memory properly, and then the overhead of memory management is not a problem.

The argument that GC does not suffer from memory leaks is also invalid. In a GC'd system, if I leave around references to memory that I never intend to use again, I am leaking memory just as surely as if I did not call delete on an allocated buffer in C++.

Finally there is the argument that manually managing memory is somehow difficult and error prone. In C++, managing memory is simple: you create a few templatized container classes at the start of a project. They will allocate and free *all* of the memory in your program. You verify that they work correctly, and then write the rest of your program using these classes. If you are calling "new" all over your C++ code you are asking for trouble. The advantage is that you have full control over how memory is managed in your program. I can plan exactly how I will deal with the different types of memory and where my data structures are placed in memory. I can code explicitly to make sure that certain data structures will be in cache memory at the same time.

I had a similar reaction to having the compiler automatically build me hash tables, string classes and dynamic arrays. I normally code these things myself, with particular attention to how they are going to be used.

The unit test stuff is also redundant. I can choose to implement that stuff whether the language supports it or not. I normally make unit test programs that generate output that I can regress against earlier builds - but I wouldn't want to saddle arbitrary programs with having to run my often quite extensive and time consuming unit tests at startup. Imaginary numbers and bit fields should similarly be left up to the programmer.
I suppose I can just choose not to use these features if I don't want to - but it strikes me as unnecessary cruft - and I subscribe to the idea, beautifully expressed in the D overview, that removing cruft makes a language better. The GC is worse, though, because it's in there operating, arbitrarily stalling my program whether I use it or not.

I urge the implementor of D to think about it this way: you have lots of ideas. Many of them are really good - but it is likely that, despite your best efforts, not all of them are. It would make your project much more likely to succeed if you decoupled the ideas as much as possible, such that the good ones can succeed without being weighed down by the bad. So wherever possible I think you should move functionality from the language spec to a standard library. This way the library can continue to improve, while the language proper could stabilize early.

This is probably old ground - I assume lots of people have already commented on the GC. I looked for a thread on GC and could not find one. I for one would find D a lot more attractive if it did not have GC.
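The container-owned allocation pattern described above looks roughly like this in D - a minimal sketch; the Particle type and pool sizing are illustrative, not from the post:

    struct Particle { float x, y, dx, dy; }

    // One pool owns every Particle in the program; "new" never appears
    // anywhere else for this type. Allocation is an index bump, and
    // freeing everything at once is O(1).
    class ParticlePool
    {
        Particle[] storage;   // one large block, allocated up front
        int next;             // first unused slot

        this(int capacity) { storage = new Particle[capacity]; }

        Particle* alloc()
        {
            assert(next < storage.length, "pool exhausted");
            return &storage[next++];
        }

        void reset() { next = 0; }
    }

A pool like this also gives the cache-placement control mentioned above: all particles sit contiguously, so transforming them touches a predictable range of memory.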
Jan 12 2003
In article <avsfoq$6nc$1 digitaldaemon.com>, rafael baptista says...
> I for one would find D a lot more attractive if it did not have GC.

I disagree. For business applications, it is nice not to have to spend time thinking about memory management, b/c it isn't worth the gains at runtime. For simple scenarios, STL and GC are probably the same complexity for the programmer. When the variable goes out of scope within a method, it can be cleaned up. But for more complicated code, it helps that we do not think about memory management and more about the software. For example, when an object is created in class A and given to class B and class C, and either B or C will clean it up (it is not known ahead of time), a GC is much easier to write for.

It comes down to, "who is D for?". But I wonder if it is possible to create a language that allows for both GC and STL memory management.

disclaimer: I am familiar with STL, but I have never used it.
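The A/B/C scenario above, sketched in D (all names here are invented for illustration): A builds the object, B and C both hold it, and neither has to know who finishes last - the collector reclaims it once both drop their references.

    class Report { char[] text; }

    class B { Report r; void done() { r = null; } }
    class C { Report r; void done() { r = null; } }

    void example()
    {
        Report rpt = new Report();   // created by "A"
        B b = new B();
        C c = new C();
        b.r = rpt;
        c.r = rpt;
        b.done();   // B finishes first this time...
        c.done();   // ...and once C lets go too, the Report is garbage -
                    // no ownership protocol between B and C is needed.
    }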
Jan 12 2003
Clay wrote:
> In article <avsfoq$6nc$1 digitaldaemon.com>, rafael baptista says...
>> I for one would find D a lot more attractive if it did not have GC.
> I disagree. For business applications, it is nice not to have to spend
> time thinking about memory management, b/c it isn't worth the gains at
> runtime. [...]
> It comes down to, "who is D for?". But I wonder if it is possible to
> create a language that allows for both GC and STL memory management.

I know Mr. Baptista by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of). As he's a games programmer, he's very concerned with more or less realtime performance, and performance with small bits of memory to allocate. I've thought of many of the same issues, being in the same field, and I think that it's more a matter of problem domain than anything else. I was trying to think of a way to do GC on a console, and I've basically come to the conclusion that it's more trouble than it would be worth. While GC is great for people with less pressing time and space concerns, like yourself, seemingly, it lacks a lot of fine grained control, and regaining that control would almost require writing a special case GC for every game, which would certainly take as much time as all of the memory management and debugging on a project.

I think that his argument is similar to something that I've been saying, which is that memory management should be more cleanly modularized, so that one can pick and choose among strategies, without much language level interference.

The unittest comments are less well founded, since unittests go away in release builds. I imagine that it's easy enough to do testing with the systems that are in there without many problems. Kinda a non-issue for me.

I agree that some of the builtins should be moved into the library, but I don't really have the time right now to go very deeply into it.

Evan
Jan 13 2003
In article <avu432$224e$1 digitaldaemon.com>, Evan McClanahan says...
> I know Mr. Baptista by reputation, so I think that I can anticipate some
> of his concerns (if he's the same person I'm thinking of). As he's a
> games programmer, he's very concerned with more or less realtime
> performance, and performance with small bits of memory to allocate.

Yeah, that describes me pretty well. Wow. I had no idea I had a reputation! Evan, you and I are in perfect agreement on GC, I think - it may be good for some other projects but not for what we do.

> The unittest comments are less well founded, since unittests go away in
> release builds.

Yeah, I don't agree with you about that. I think unit tests are a good idea. Some people would implement them as a series of quick sanity checks on their libraries that would run at load time in debug builds. A proper unit test in my mind is a console app that does an exhaustive check on all of your code, tests all of your assumptions, and prints out a long log of all the things it tried. Every time you make changes to a lib, you run the unit test and then diff the log against a previous run of the log, and make sure that the results are the same - or that they only differ in expected ways. You then check the updated log into source code control. I have known companies that have implemented source code control in such a way that you cannot check your files back in until the system has regressed all the unit tests.

Running unit tests at startup time has many disadvantages:

1. Even in a debug build you have limits on how long you are willing to wait through a unit test. With an offline unit test you can make it do everything you want.

2. Unit tests are better as text oriented apps, and most modern applications are graphical. You would have to make the unit test spew to a log, and then go check the log. Similarly, it is a pain to try to run a gui app in a script. So if you want to make scripts that regress your libraries, you have to make dummy console apps that only run the unit test in a text mode.

3. Unit tests running at application startup time are not running when they should run. You want a unit test to run every time you change the code that it tests - not necessarily every time you start up your main application.

Making a unit test function for a class be a class member function has another disadvantage: you are running the unit test after the class constructor, and the unit test is oriented toward testing only that one instance. To unit test a class I prefer to make a non-member function that exercises the whole API in one function - including all the constructors.

Experienced programmers can disagree about the best way to do unit tests - and that is exactly why unit tests should not be built in to the language. The way that D supports unit tests is not the way most programmers who implement unit tests choose to implement them. So you are codifying into the language something which is not established as the best practice anyway.
Jan 13 2003
rafael b aptista wrote:
> Yeah, that describes me pretty well. Wow. I had no idea I had a
> reputation! Evan, you and I are in perfect agreement on GC, I think - it
> may be good for some other projects but not for what we do.

I applied to your company a long time ago (assuming that you're who I think that you are and not someone with the same name), which is how I know who you are (or might be). Game programming is a smallish world anyway. Some interesting stuff on your site. I assumed that your perspective would be coming from GBA development, and went from there. A GC would be disastrous in that environment, and I think that it would be bad for a lot of system level stuff as well. But then, I think that my perfect language would be almost nothing in the system core, a robust syntax extender, and a biggish library of standard extensions, so that might explain the slant in my perspective.

> Yeah, I don't agree with you about that. I think unit tests are a good
> idea. [...] Experienced programmers can disagree about the best way to do
> unit tests - and that is exactly why unit tests should not be built in to
> the language. The way that D supports unit tests is not the way most
> programmers who implement unit tests choose to implement them. So you are
> codifying into the language something which is not established as the
> best practice anyway.

Ok, I can see your point. I don't think, though, that they're insanely harmful, as you can still write traditional unit tests of the kind that you describe, and if people decide that the unittests don't work as a normal feature, then they'll go away eventually. I've used the other DBC features more, so you're likely right. Without a proper computer or internet connection at home, I haven't had a chance to program in D as much as I would like to, and the programs that I have had a chance to write were untested and undocumented hacks, so what do I know about testing? :)

Evan
Jan 13 2003
"rafael b aptista" <rafael_member pathlink.com> wrote in message news:avukdf$2f0g$1 digitaldaemon.com...In article <avu432$224e$1 digitaldaemon.com>, Evan McClanahan says...reputation!I know Mr. Baptisa by reputation, so I think that I can anticipate some of his concerns (if he's the same person I'm thinking of). As he's a games programmer, he's very concered with more or less realtime performance, and performance with small bits of memory to allocate.Yeah, that describes me pretty well. Wow. I had no idea I had aEvan, you and I are in perfect agreement on GC, I think - it may be goodforsome other projects but no for what we do.On a console game, it's always possible to just allocate the entire memory from the system and manage it yourself. That seems to be what everybody does in C++. In D, you can disable GC, and even if you don't, it doesn't run unless you run out of memory. The problem is that D doesn't have good RAII semantics, so implementing your own memory management is a painful process resembling C++ explicit new and delete. That method is bug ridden. In fact I am not sure D allows you to run ctors on arbitrary memory, so it wouldn't be possible to write your own 'new' function template.idea.The unittest comments are less well founded, since unittest go away in release builds.Yeah, I don't agree with you about that. I think unit tests are a goodSome people would implement them as a series of quick sanity checks ontheirlibraries that would run at load time in debug builds. A proper unit testin mymind is a console app that does an exhaustive check on all of your codeandtests all of your assumptions and prints out a long log of all the thingsittried. Every time you make changes to a lib, you run the unit test andthen diffthe log against a previous run of the log and make sure that the resultsare thesame - or that they only differ in expected ways. You then check intosourcecode control the updated diff log. I have known companies that haveimplementedsource code control in such a way that you cannot check your files back inuntilthe system has regressed all the unit tests. Running unit tests at startup time has many disadvantages: 1. Even in a debug build you have limits on how long you are willing towaitthrough a unit test. With an offline unit test you can make it doeverything youwant. 2. Unit tests are better as text oriented apps, and most modernapplications aregraphical. You would have to make the unit test spew to a log, and then gocheckthe log. Similarly it is a pain to try to run a gui app in a script. So ifyouwant to make scripts that regress your libraries you have to make dummyconsoleapps that only run the unit test in a text mode. 3. Unit tests running at application startup time are not running whentheyshould run. You want a unit test to run every time you change the codethat ittests - not necessarily every time you start up your main application. Making a unit test function for a class be a class member function hasanotherdisadvantage: You are running the unit test after the class constructor,and theunit test is oriented toward testing only that one instance. 
To unit testaclass I prefer to make a non-member function that exercises the whole APIin onefunction - including all the constructors.Unit tests can be global module-scope functions too.Experienced programmers can disagree about the best way to do unit tests -andthat is exactly why unit tests should not be built in to the language.Seems like it's more of a #ifdef UNITTESTS RunUnitTest(); #endif UNITTESTSThe way that D supports unit tests is not the way most programmers whoimplementunit test chose to implement them. So you are codifying into the language something which is not established as the best practice anyway.I kinda like integrating the code for the unit tests into the module it tests. But I would like more control over when they are run. Sean
Jan 13 2003
> On a console game, it's always possible to just allocate the entire
> memory from the system and manage it yourself. That seems to be what
> everybody does in C++. In D, you can disable GC, and even if you don't,
> it doesn't run unless you run out of memory.

and you can therefore create all your required objects up front, and manage them manually (as Java games / embedded programmers do)

> The problem is that D doesn't have good RAII semantics, so implementing
> your own memory management is a painful process resembling C++ explicit
> new and delete. That method is bug ridden. In fact, I am not sure D
> allows you to run ctors on arbitrary memory, so it wouldn't be possible
> to write your own 'new' function template.

agree, and D with placement constructors would be good. the effect on the GC (if not disabled) might be undesirable, and it would allow some nasty 'unions' to be created (construct two objects in the same space). not only would you need to inform the GC that a memory range was under your control, but any object in there would still need to be walked, as it could contain refs to GC managed objects. (one of the stumbling blocks with realtime Java)
Jan 13 2003
Some great comments! My replies are inline below.

"rafael b aptista" <rafael_member pathlink.com> wrote in message news:avukdf$2f0g$1 digitaldaemon.com...
> Running unit tests at startup time has many disadvantages:
> 1. Even in a debug build you have limits on how long you are willing to
> wait through a unit test. With an offline unit test you can make it do
> everything you want.

A unit test is not a complete replacement for a comprehensive test suite. It just tests the functionality of individual parts of the project. Unit tests can also be turned on or off on a module by module basis - once one section is debugged, you can turn off unit tests for it until it gets changed. My own experience with D unit testing has been that it has saved me from many, many embarrassingly stupid bugs. I don't mind waiting a couple seconds on startup for a debug build. If it takes a long time to run the unit tests, it's probably worth looking at the unit tests and pulling the time consuming bits out into an offline test. That's really what I do with the D compiler system itself. The basics are done with unit tests; the comprehensive, time consuming tests are done with a separate test suite. But the unit tests assure me a basic level of functionality every time, without needing to run the long test suite.

> 2. Unit tests are better as text oriented apps, and most modern
> applications are graphical. You would have to make the unit test spew to
> a log, and then go check the log. Similarly, it is a pain to try to run
> a gui app in a script. So if you want to make scripts that regress your
> libraries, you have to make dummy console apps that only run the unit
> test in a text mode.

Unit tests in D are designed to throw an exception when they fail. Catching exceptions and putting up a message box for them isn't a big problem. When I've worked on gui apps, I did use log files extensively. Instead of printf'ing to the console, I just printf'd to a log file.

> 3. Unit tests running at application startup time are not running when
> they should run. You want a unit test to run every time you change the
> code that it tests - not necessarily every time you start up your main
> application.

Doing it when the program starts is a reliable way of doing it. For example, if you have module A and module B, A depends on B, and the code for B changes, it's nice to get the unit test for A run too. Having some external mechanism decide which unit tests to run when particular modules change is difficult to get right, and is certainly wrong if it decides solely on the basis of source file changes.

> Making a unit test function for a class be a class member function has
> another disadvantage: you are running the unit test after the class
> constructor, and the unit test is oriented toward testing only that one
> instance. To unit test a class I prefer to make a non-member function
> that exercises the whole API in one function - including all the
> constructors.

You can do that in D with a module level unit test.

> Experienced programmers can disagree about the best way to do unit
> tests - and that is exactly why unit tests should not be built in to the
> language.

I suggest you could say that about every language feature - just look at this forum! In any case, in my experience, very very few programmers do unit tests. Why? Because it's a pain to do without language support. In D, unit tests are quick and easy to add, they are right next to the code being tested, and so are more likely to get written and be up to date. Running them at startup means they are more likely to be run.

> The way that D supports unit tests is not the way most programmers who
> implement unit tests choose to implement them. So you are codifying into
> the language something which is not established as the best practice
> anyway.

D does not prevent doing unit tests in a non-builtin manner. But (as far as I know) D is the very first language to build in the concept of unit tests, in a quick, convenient, and reliable manner. I suggest that this will encourage a much greater percentage of programmers taking the time to even write any unit tests, which will be all to the better. I won't argue that D is the best way possible to do unit tests; experience using the language over time may easily suggest improvements. But it's a great start, and unit tests are a badly needed language feature.
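For instance, a module-level unit test in D sits right next to the function it checks and throws on failure when tests are compiled in (the gcd function here is just an illustration, not from the post):

    int gcd(int a, int b)
    {
        while (b != 0)
        {
            int t = b;
            b = a % b;
            a = t;
        }
        return a;
    }

    unittest
    {
        // runs at program startup in a build with unit tests enabled;
        // a failing assert throws, pointing straight at the bug
        assert(gcd(12, 8) == 4);
        assert(gcd(9, 28) == 1);
        assert(gcd(0, 5) == 5);
    }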
Jan 23 2003
Hello.

There exists a GC algorithm which can run without stopping the execution at all, in parallel, and is thus okay for realtime systems.

Three Colour Marking GC

Each object has a marking field, which can hold one of three colors:
- white: objects to which no references were found. All objects start out white.
- grey: object is known to be reachable (referenced), but wasn't scanned for references yet;
- black: completely inspected. Shall not be revisited by a scanner, unless it changes color.

The only runtime modification is that whenever pointers are being modified, their containers have to be re-colored from black to grey. After there are no more grey objects, the white ones can be collected. That's how it should work in principle. You can imagine some sufficiently efficient implementations. This algorithm is also, in a modified form, used in OCaml - that's how come games are written in it. After starting out in the mid-seventies, it has spawned a number of related algorithms.

Hm. This leads again to the question of flexible code generation, so that it's possible to plug in one or another GC system, or possibly other memory managers. Or structural optimisations. A complex feature, not only by itself, but also for interface design. But on the other hand it could take some burden off the compiler writer and pass it to library writers.

-i.
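The mark phase described above, as a rough sketch in D (illustrative only - this is not D's collector, and the write barrier that re-greys mutated black objects is reduced to a comment):

    enum Colour { White, Grey, Black }

    class Node
    {
        Colour colour = Colour.White;  // all objects start out white
        Node[] refs;                   // outgoing references
    }

    // Whenever the mutator stores a pointer into a Black node, a write
    // barrier must re-colour that node Grey so it gets rescanned.
    void mark(Node[] roots)
    {
        Node[] greySet;
        foreach (r; roots)
        {
            r.colour = Colour.Grey;
            greySet ~= r;
        }
        while (greySet.length > 0)
        {
            Node n = greySet[greySet.length - 1];
            greySet.length = greySet.length - 1;
            foreach (child; n.refs)
            {
                if (child.colour == Colour.White)
                {
                    child.colour = Colour.Grey;
                    greySet ~= child;
                }
            }
            n.colour = Colour.Black;   // completely inspected
        }
        // no Grey nodes remain: every node still White is unreachable
        // and can be collected
    }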
Jan 28 2003
rafael baptista wrote:
> I just saw the D spec today... and it is quite wonderful. [...] And D has
> a bad side: it attempts to implement all kinds of features that are best
> left to the programmer, not the language spec: built-in string
> comparison, dynamic arrays, associative arrays - but the worst in my mind
> is the inclusion of garbage collection.

gc.disable();

In Walter's compiler (although other compilers will differ in the future, I suppose) the GC only runs when the memory allocation routines have run out of buffers that they've gotten from the OS. That is, the underlying routines for 'new' grab somewhat large blocks of virtual memory from the OS, and piece them up into individual allocations. When the allocator has used all of its memory, it runs the garbage collector BEFORE it asks the OS for more memory. It only talks to the OS if it cannot find any place in its current space to make the allocation.

If you disable the garbage collector, then it just doesn't do GC when the memory allocator runs out of memory. Just use 'new' and 'delete' as you would have in C++. You can also just disable the gc for a small time:

    gc.disable();
    DoTimeCriticalThing();
    gc.enable();
Jan 13 2003
Please, believe me: garbage collection (GC) is the future of game design. "Realtime programming" doesn't really include games and other things called "realtime", since they have to cope with disc access and such, the graphics card, other devices, which are NOT REALTIME! You send some command and wait for it to be completed, and this usually lasts much, much, much, much longer than a GC cycle.

You don't even have to believe me. Read the articles by TAD in the Hugi diskmag, i guess. (www.hugi.de) He proposes highly efficient structures for games, which are highly complex meshes of (concave) objects. Now, as I read this through i was fascinated. But as soon as i thought of implementation, i got sudden strong headaches. As later on i was learning about Hans Boehm's GC for C, then about functional programming in OCaml (a GC-ed language), it became evident to me that such structures as proposed by TAD can be very easily and efficiently implemented with GC, and only this way. Ref-counting ultimately fails in many cases.

To the question of efficiency. Though the current implementation of the D GC is said to be very simple-minded and thus not very efficient, there exists a very efficient implementation for C which is fairly popular. I've read of the experience of creating a simple IDE (Wedit) for LCC-Win32, a very simple C compiler. The author claims that plugging the Hans Boehm GC into the project reduced the time required to load a project to 1/8 of the original, because GC-malloc is simply better optimized than the normal one. The manual says that Boehm GC proves to be significantly faster for almost any application managing many tiny objects, but tends to be slower for applications working with large objects only.

The reason for the first claim is that it performs a number of optimisations (heap compaction), helping keep objects which are used together near each other, which drastically reduces cache misses. And as you might know, the cache is about as fast as the CPU, while RAM speed has been improving much more slowly than CPU/cache speed; the impact is already huge, and is likely to grow in the future. I guess the D GC doesn't do such things yet, but they are to come. These optimisations can only be efficiently implemented in conjunction with a GC - that is, if you do them, you've got a GC basically for free.

The second claim is based upon the fact that Boehm GC tends to spend too much time figuring out what is a pointer and what is not by content. You can limit such pointless search for large pointerless (pun intended) objects, but it doesn't solve the issue in general. The D GC uses type information and *never* searches large masses without use, so you can assume that it should be faster than Boehm GC for large objects. That is, the D GC should also work cleaner.

And if you're still not convinced, go on and turn it off. You're just not doing yourself a favor. The D GC does, once in a while, the same amount of work that ref-counting does, while ref-counting has to do it at almost every operation. You can also cause GC cycles by timer once every now and then, keeping memory usage constantly low and cycle times short, at the cost of making them longer in general. The way i'm currently aware of is to disable and then enable the GC; not sure it works.

regards,
-i./midiclub
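A sketch of the "GC cycles by timer" idea, using the gc disable/enable entry points mentioned in this thread plus a full-collection call as found in later Phobos (std.gc.fullCollect); treat the module and function names as assumptions, and the stubs as placeholders:

    import std.gc;

    bool gameRunning() { return false; }   // stubs for illustration
    void renderFrame() { }

    void runFrames(int framesPerCollect)
    {
        std.gc.disable();                  // nothing collects mid-frame
        int i = 0;
        while (gameRunning())
        {
            renderFrame();
            if (++i == framesPerCollect)   // e.g. once a second at 60 fps
            {
                i = 0;
                std.gc.enable();
                std.gc.fullCollect();      // the pause lands where we chose
                std.gc.disable();
            }
        }
    }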
Jan 13 2003
Ilya Minkov wrote:
> The second claim is based upon the fact that Boehm GC tends to spend too
> much time figuring out what is a pointer and what is not by content. You
> can limit such pointless search for large pointerless (pun intended)
> objects, but it doesn't solve the issue in general. The D GC uses type
> information and *never* searches large masses without use, so you can
> assume that it should be faster than Boehm GC for large objects. That
> is, the D GC should also work cleaner.

Type-based searching isn't implemented yet. DMD should produce a super-optimised scanning function for each allocated and public type in each module. D's GC could also do block allocation for small objects, or for those which have the proper characteristics in previous runs (good but fairly constant use, with high turnover of allocation and freeing). That makes allocation and freeing take just a few cycles and reduces overhead. The effect of this is incalculable at this time, as it gets into cache issues, but I've seen the content-to-pointers ratio go into the hundreds when dealing with mesh vertices (textures were uploaded and then freed, otherwise it would be one pointer for 700Kb of data). I'm very optimistic that I'll be able to continually run the GC on any game. But I'm even more sure that it's worth working for.

If the GC delay is still too great, then all games have sequence points. When the player changes a level or a discrete segment, or is dead, or is clearly AFK, or if the player is swapping out, loading, saving, switching from the menu, or is not being attacked - then is your time.
Jan 13 2003
In article <avv7ve$2tos$1 digitaldaemon.com>, Ilya Minkov says...
> Please, believe me: garbage collection (GC) is the future of game design.

I doubt it.

> "Realtime programming" doesn't really include games and other things
> called "realtime", since they have to cope with disc access and such, the
> graphics card, other devices, which are NOT REALTIME!

You are right. That is why game programmers avoid accessing slow devices like disks in the middle of a game. If you absolutely must get something off disk, you normally stream it in. You set up an interrupt driven process or something like that. So say I need a new model off the disk in a few seconds: I would place a request to the disk streaming subsystem to start getting the stuff off the disk. If you sit there like a fool waiting for your CD drive to spin up, then your game is very choppy.

> You don't even have to believe me.

I don't have to believe you. I write games for a living. ;)

> The author claims that plugging the Hans Boehm GC into the project
> reduced the time required to load a project to 1/8 of the original,
> because GC-malloc is simply better optimized than the normal one. The
> manual says that Boehm GC proves to be significantly faster for almost
> any application managing many tiny objects, but tends to be slower for
> applications working with large objects only.

I don't doubt it. Asking the operating system for a buffer is very very slow. On Windows it actually pauses your thread and does an OS context switch. If you are allocating tiny buffers straight from the OS all the time, you need to go back to programming school. Memory should be allocated from the OS in large chunks and then doled out inside your program as needed - using your own optimized allocators. This is what gets you cache coherency. This is what guarantees that all the vertex data for a model is in cache when you transform it, etc. It is not fair to compare the most highly optimized GC you have ever heard of against bad manual memory management practices.

> The reason for the first claim is that it performs a number of
> optimisations (heap compaction),

There is no heap compaction better than me placing my data structures in memory explicitly for cache coherency. Plus, there is no penalty for me having holes in the memory space that are bigger than a memory page anyway. If a whole page is unused, it will never be loaded into the cache and will never trouble me. Compacting it away is just wasting bandwidth on the bus that I could be using to load textures.

> And if you're still not convinced, go on and turn it off. You're just
> not doing yourself a favor. The D GC does, once in a while, the same
> amount of work that ref-counting does, while ref-counting has to do it at
> almost every operation.

The alternative to GC is not ref counting. Few objects have such an unpredictable life cycle that they cannot be managed any other way. GC essentially embeds the ref counting inside the GC subsystem and makes you do some kind of reference tracking for every memory buffer whether you need it or not. I ref count very few objects in a given project. And when I do, I'm typically ref counting entire objects, not every single buffer. Plus I can manipulate the ref counts only when it matters. I don't have a GC system internally twiddling pointer ref counts mindlessly when all that is happening is that a temporary pointer was made for a function call or some other limited scope.
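A minimal sketch of the "large chunks from the OS, doled out internally" scheme, in D with C's malloc (via std.c.stdlib in Walter's library; alignment and growth policy kept deliberately trivial):

    import std.c.stdlib;

    struct Arena
    {
        byte*  base;       // one big block from the OS
        size_t used;
        size_t capacity;
    }

    Arena arenaCreate(size_t bytes)
    {
        Arena a;
        a.base = cast(byte*) malloc(bytes);   // the only OS round trip
        a.used = 0;
        a.capacity = bytes;
        return a;
    }

    byte* arenaAlloc(Arena* a, size_t bytes)
    {
        bytes = (bytes + 15) & ~cast(size_t) 15;   // keep 16-byte alignment
        assert(a.used + bytes <= a.capacity, "arena exhausted");
        byte* p = a.base + a.used;
        a.used += bytes;                           // bump, no bookkeeping
        return p;
    }

Everything for one model (vertices, indices, per-frame scratch) can come from one arena, which is what keeps it together in cache.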
Jan 13 2003
rafael baptista wrote:
> I don't doubt it. Asking the operating system for a buffer is very very
> slow. [...] Memory should be allocated from the OS in large chunks and
> then doled out inside your program as needed - using your own optimized
> allocators. This is what gets you cache coherency. This is what
> guarantees that all the vertex data for a model is in cache when you
> transform it, etc.

Dude, what do you think Java (and most other GC-enabled environments) do? They allocate big buffers from the OS, then divide them into various categories of storage (in Java's case, various generations). Highly efficient strategies are then used to quickly allocate chunks of memory that are used and discarded quickly. More stable allocations are transferred into storage that has strategies appropriate to that.

If you're doing your own memory allocation code, you have to do a LOT of work, and write a LOT of bug free code, to get to where the Java allocator is. D needs GC. You're doing realtime work; stick with C, which is highly appropriate to that task!

RJ
Jan 13 2003
Ross Judson wrote:
> Dude, what do you think Java (and most other GC-enabled environments) do?
> They allocate big buffers from the OS, then divide them into various
> categories of storage (in Java's case, various generations). [...] D
> needs GC. You're doing realtime work; stick with C, which is highly
> appropriate to that task!

But D is nicer than C, and I'd rather use it and have it be a general usage language, rather than a rapid development language for apps that need more speed than VB. I like GC, don't get me wrong. But without hardware and OS support, GC is always going to have some problems with certain kinds of applications. Since that support doesn't seem to be forthcoming in the near future, D should be ready to accept those cases and have a simple method for *transparently* swapping out what happens when someone writes new or delete. I shouldn't have to be doing anything complicated to switch it all out.

Evan
Jan 14 2003
Hi. I'm new to this newsgroup, but not to language design.

D can't be all things to all people, so we might as well live with its focus. Garbage collection is already there, and that defines the scope of the D language about as much as any other feature. That said, here are my thoughts about GC in general:

GC in general buys very little. Anyone writing complicated programs already has to write recursive destructors to remove objects from relationships with other objects. The value of GC is simply that one extra line (delete myObject;) doesn't have to be typed. People argue that GC helps reduce dangling pointer bugs, but rather than a dangling pointer, you've got a memory leak. While GC doesn't really hurt a language like D for most applications, it trashes its use (as D's author points out) for real time programming (we're not going to see any kernels written in D any time soon). There's also another, weaker reason I dislike GC: synthesis of data structures into hardware really doesn't work in a GC based language.

> If you're doing your own memory allocation code, you have to do a LOT of
> work, and write a LOT of bug free code, to get to where the Java
> allocator is.

In any current language, you'd be right. However, highly optimized memory management can and should be generated by compilers. We do this now where I work. I type no extra code, but get automatically generated free-lists and bug-free recursive destructors. It's kind of nice.

Bill Cox
Jan 24 2003
Hello.

Mind what you say. Though recursive destruction is sufficient for a great number of problems, for some it is not. I have caught myself organising data nicely in a tree, then giving the pointer to some small branch somewhere and destroying the tree. Done. The program doesn't spend even a bit too much memory. Even less than required. Perfect, right? Furthermore, I've accidentally deleted an object which is still in use, but the holder doesn't get notified of it anyway. It just *has* to crash as soon as any access is done. Such domains lead to some necessity to keep track of how many users an object has, and this was *sometimes hard* in Delphi. And error-prone.

This manner of organising data in recursively-destroyed structures - having some central point to keep it all - is typical of experienced programmers, and is not required in most cases where GC is used. It simply requires a different management of data. Avoid common roots. Only actual users may have references to objects. Then they shall be collected if they're of no use and are in the way. Those people like you with their old habits may try GC someday on some program of theirs which was working well, and notice that GC has spoiled it, but the usual cause for that is not obeying these simple rules, which only simplify the design and don't complicate it.

For some uses though, a GC appears to be of no use. For these cases, i see a necessity for a separate heap for by-design non-GCed objects, since there still might be objects which require GC in a multi-module programme, like strings and such. This restricts nothing, and the "mark" phase time would shrink vastly if a significant portion is not GCed - not to mention it would already be very short as soon as mark-methods are implemented. A simple implementation would be that "non-GCed" objects have an empty or programmer-written "mark" method (that is, a method which is called by the GC, so that an object can mark the objects it's using), and should not be deleted even if they have not been marked by any other objects - simply by marking themselves, or by more sophisticated means. I admit that my idea, that every object which provides a corresponding method would be thus informed of GC-collected information, is not always implementable. In some implementations it is quite possible that the only thing it would get is whether it was marked at all or not, but not how often it has been done. But i'd wish to know how many references to "me" exist at the time of the mark phase, if it's possible.

-i.

PS. Embedded comments follow.

Bill Cox wrote:
> GC in general buys very little. Anyone writing complicated programs
> already has to write recursive destructors to remove objects from
> relationships with other objects. The value of GC is simply that one
> extra line (delete myObject;) doesn't have to be typed.

It is a vast help in simple cases. Don't tell me that no one writes simple programs. Or doesn't want to write a complex programme in a simple manner. Simple units can very well evolve into complex programs - in D, i hope, to a much further extent than in C++.

> People argue that GC helps reduce dangling pointer bugs, but rather than
> a dangling pointer, you've got a memory leak.

Only without a GC, or with non-GC-friendly structures, which in my experience tend not to appear by themselves, but rather on purpose with conventional memory management. And don't tell me you've never leaked memory... You may have leaked tiny objects and wouldn't notice it. I did notice sometimes, but only occasionally. A good language-supported GC (eg Caml) *can't* leak memory at a full mark cycle, because it collects all memory not being referenced. Simply NULL out your pointers and you're done. As soon as your application, or possibly some other application, needs memory, they would be collected. Do you really care what your "mem-meter" shows you if you don't intend to use it? Boehm GC can leak memory, but the reason is the absence of language support, which makes it unable to be sure of some things.

> While GC doesn't really hurt a language like D for most applications, it
> trashes its use (as D's author points out) for real time programming
> (we're not going to see any kernels written in D any time soon).

Why? D is not Java. GC is not forced. And else it is inherently not even a bit less efficient than C. Rather more, if Walter decides to get rid of the C calling convention for usual functions. :>

> There's also another, weaker reason I dislike GC: synthesis of data
> structures into hardware really doesn't work in a GC based language.

???

> In any current language, you'd be right. However, highly optimized
> memory management can and should be generated by compilers. We do this
> now where I work. I type no extra code, but get automatically generated
> free-lists and bug-free recursive destructors. It's kind of nice.

Bug-free recursive destruction is not a problem in my eyes. But keeping an eye on always recursively copying objects and not giving away references to them, or on manipulating the refcounter correctly, is tedious. GC would improve performance in certain cases. Copy-on-write? Then you need a ref-counter. Or garbage collection. A generational GC would do very well in such a case. Like Caml shows, which (almost) always uses copy-on-write, eg when you change something in a linked list.

I'd rather be for system-wide GC usage. Example: data passed to OpenGL is being copied. It is ridiculous; simpler would be to use copy-on-change (and OpenGL usually doesn't change data), and to GC the leftovers. It is very probable that the data doesn't have to be copied at all then, because applications rarely change most kinds of data while some frame is drawn. Well, OpenGL includes ways to make data transfers as rare as possible, but their use becomes tedious.
Jan 24 2003
Hi.

> Though recursive destruction is sufficient for a great number of
> problems, for some it is not. I have caught myself organising data
> nicely in a tree, then giving the pointer to some small branch somewhere
> and destroying the tree. Done. The program doesn't spend even a bit too
> much memory. Even less than required. Perfect, right? Furthermore, I've
> accidentally deleted an object which is still in use, but the holder
> doesn't get notified of it anyway. It just *has* to crash as soon as any
> access is done.

I didn't make myself clear. I don't write recursive destructors. Our data management CASE tool generates them for us, 100% automatically. It gets it right every time. We never have any unaccounted-for pointers. It's fast and productive. In fact, we don't even buy Purify, even though we program in C. We just don't need it. Seg-v's are mostly a thing of the past for us.

People using GC are still writing their own destructors, and still getting them wrong. A great way to improve productivity is to use CASE tools to do it for you. I really don't see the need for GC.

As CASE tools are often a sign that the language needs to be extended, I've recently written a language that eliminates the need for the CASE tools we use at work. You have to go further. One goal of the language was to work well with code generators. Every relationship (container class) generates code in both the parent and child classes, and updates the recursive destructor in each class. You can't do that with templates. You don't need a new language to do this, though. As I said, we do this at work in C with code generators.

>> There's also another, weaker reason I dislike GC: synthesis of data
>> structures into hardware really doesn't work in a GC based language.
> ???

Basically, GC makes it harder to implement a program on hardware. The field of reconfigurable computing is dedicated to doing this. GC complicates things.

GC in D is fine. You make your choices in language design, and obviously smart people often include GC. It just makes the language less attractive for the few of us who need to code on the bleeding edge of speed (we write chip design software).

Bill Cox
Jan 25 2003
Bill Cox wrote:
> Hi. I didn't make myself clear. I don't write recursive destructors.
> Our data management CASE tool generates them for us, 100% automatically.
> It gets it right every time. We never have any unaccounted-for pointers.
> It's fast and productive. In fact, we don't even buy Purify, even though
> we program in C. We just don't need it. Seg-v's are mostly a thing of
> the past for us.

What further work does it do? Recursive destruction is by far not the only problem. How is the rest of memory management dealt with? I mean, how to prevent destruction of useful objects, and how to track down objects not used anymore?

> People using GC are still writing their own destructors, and still
> getting them wrong. A great way to improve productivity is to use CASE
> tools to do it for you. I really don't see the need for GC.

Destructors are not required anymore, except for the cases where you need to destroy some hardware together with an object. :> Well, there are a few special cases. People make mistakes with any model. But IMO making no mistakes with GC is very easy. Although i'm an advocate of GC, i'm open to every other solution.

> As CASE tools are often a sign that the language needs to be extended,
> I've recently written a language that eliminates the need for the CASE
> tools we use at work. You have to go further. One goal of the language
> was to work well with code generators. Every relationship (container
> class) generates code in both the parent and child classes, and updates
> the recursive destructor in each class. You can't do that with
> templates. You don't need a new language to do this, though. As I said,
> we do this at work in C with code generators.

Describe the extensions you consider useful in complete detail, else it's of no use for Walter. Or give a link. I appreciate that you have *done* something (esp. something that works), but what you've written here so far gives me no idea of what it is and how it should work. I'm not well-informed about CASE in general, and i bet most of the newsgroup isn't as well.

There are currently multiple directions in which D should go. First, it has to become solid in its current form. Second, backends have to be written for different kinds of use, eg. a very portable VM. And then, i'll be making an effort to make writing compilers in D and language extensions to D easy, and i'll gladly take your proposal into account. Currently i'm looking at writing a COCOM-like library for D, enabling fast and flexible parsing and syntax tree transformation.

> Basically, GC makes it harder to implement a program on hardware. The
> field of reconfigurable computing is dedicated to doing this. GC
> complicates things.

Wouldn't you want to write a program in VHDL or such if you wanted to get hardware as output? Well, i can imagine that nowadays it is possible with higher-level languages. But isn't D overkill? Oh, you're working for an ASIC company. Then obviously "hard" management is better for *you* than a GC. But how should it work? A ref-counter can be a drop-in replacement for a GC (less reliable though), and how do you make the integration with CASE?

-i.
Jan 25 2003
Hi, Ilya.

Ilya Minkov wrote:
> What further work does it do? Recursive destruction is by far not the
> only problem. How is the rest of memory management dealt with? I mean,
> how to prevent destruction of useful objects, and how to track down
> objects not used anymore?

Ok. Here's a description of our CASE tool:

The CASE tool (DataDraw) is pretty simple. It's open-source and compiles under Visual C++ on Windows. I use it on Linux with wine (a Windows emulator). You draw your class schema much as you would with a UML editor, or a database entity-relationship schema editor. You add properties to classes by double clicking on them and adding the fields. Then, to add relationships (container classes) between classes, you select a relationship drawing tool and draw arrows between the classes. By double clicking on the relationship, you get to set its type: one-to-one, linked list, tail-linked list (a linked list with a pointer to the last element), doubly-linked list, static array, dynamic array, name-based hash table, or static inheritance (standard C++ style inheritance). You also get to specify whether or not the children should be deleted when the parent is deleted. There are a couple of other options: relationship names, and whether or not there is a back pointer to the parent embedded in the child.

After describing your data structures, it will generate code for you. There are several code generators to choose from, mostly variations of C coding styles, and one C++ style. We use one of the C styles because it gives us dynamic inheritance, which is a big deal for the kind of software we write (integrated chip design tools that run off of a common database). Static inheritance is great when you are constructing objects from scratch. Our objects already exist in the database, and we need to extend them on the fly. The code generator we use does this with no memory or speed penalty, since it represents all the class data in individual arrays of data. To extend all the objects of a given class, the memory manager just allocates more arrays. Instead of using C style pointers, the code generator typedefs an index as the reference type for the class. We get strong type checking in debug mode by typedefing the handles as pointers to dummy classes, and casting them to ints when accessed. That sounds ugly, but it's all buried under the hood by automatically generated access macros that hide it all.

The memory managers in two of the code generators are quite advanced. First, each class has options you can select for how to manage its memory. Objects of some classes are never individually deleted, only all at once. For those classes, the structures are put into a dynamic array. Allocating an object typically is just a compare and an increment (three machine instructions, I think). For most classes, however, you need both create and delete on the fly. The code generator that generates more normal C code allocates memory one page at a time, and then sub-allocates your objects of a given class out of the pages, which optimizes both paging and cache hits. Free lists are automatically generated and maintained. Structures are also automatically aligned, with fields sorted by size. The funkier code generator we use (the one that uses indexes into arrays instead of pointers to structures) uses dynamic arrays for all properties of all classes. We've tested the performance of our code with both generators, and generally see a 10-20% speed increase using the dynamic array representation instead of structures. This memory manager uses the trick of allocating 50% more memory each time a dynamic array needs more memory. Recursive destructors are also generated, as are some other useful functions.
The recursive destructors delete each child in any relationship marked as needing the children deleted. Other relationships are patched. For example, code to remove the object from linked lists is generated. Of particular interest to us is the very efficient code for our container classes (relationships). A linked list relationship generated by DataDraw embeds the next pointer in the child class, which is a far more efficient implementation, in terms of both speed and memory. In debug mode, all memory accesses are checked, including array indexing. Some of the code generators also generate binary load/save, and full database integrity self-checks. However, we've come to the conclusion that simple binary dumps make a poor save format, since it's hard to remain backwards compatible. Also, since our pointers almost never get trashed, the database integrity self-check is also no longer used. There are ways our database can be trashed. Users of fscanf run into it. We sometimes get seg-v's with too few parameters to printf. All in all, our coding is pretty safe from these kinds of bugs.

>> Hi. I didn't make myself clear. I don't write recursive destructors.
>> Our data management CASE tool generates them for us, 100%
>> automatically. It gets it right every time. We never have any
>> unaccounted-for pointers. It's fast and productive. In fact, we don't
>> even buy Purify, even though we program in C. We just don't need it.
>> Seg-v's are mostly a thing of the past for us.
> What further work does it do? Recursive destruction is by far not the
> only problem. How is the rest of memory management dealt with? I mean,
> how to prevent destruction of useful objects, and how to track down
> objects not used anymore.

I've written a simple code translator from a language that incorporates many of the features of the DataDraw CASE tool. To add the full power of code generators, you need to be able to compile and run code that generates code to be compiled. This is basically templates on steroids. I'm not familiar with the basic container class packages in D yet, so I'll just write out pseudo-code for an example code generator to add fields for a linked list to two classes.

generateRelFields(Relationship rel)
{
    Class parent = rel.parentClass;
    Class child = rel.childClass;

    if(rel.type == LINKED_LIST) {
        generate(parentName:parent.name, childName:child.name, relName:rel.name) {
            extend class <parentName> {
                struct <relName> {
                    <childName> first;
                }
                add(<childName> child) {
                    child.<relName>.next = this.<relName>.first;
                    this.<relName>.first = child;
                }
                remove(<childName> child) {
                    ...
                }
            }
            extend class <childName> {
                struct <relName> {
                    <childName> next;
                    <parentName> owner;
                }
            }
        }
        if(rel.deleteChildren) {
            generate(parentName:parent.name, childName:child.name, relName:rel.name) {
                extend class <parentName> {
                    extend destroy() {
                        <childName> child, nextChild;
                        for(child = this.<relName>.first; child != null; child = nextChild) {
                            nextChild = child.<relName>.next;
                            child.destroy();
                        }
                    }
                }
                extend class <childName> {
                    extend destroy() {
                        if(this.<relName>.owner != null) {
                            this.<relName>.owner.remove(this);
                        }
                    }
                }
            }
        }
    }
}

Note the addition of the generate statement, which is similar to templates. Also notice that classes, methods, and functions can all be extended. This is a code-generator friendly syntax.

>> As CASE tools are often a sign that the language needs to be extended,
>> I've recently written a language that eliminates the need for the CASE
>> tools we use at work. You have to go further. One goal of the language
>> was to work well with code generators.
>> Every relationship (container class) generates code in both the parent
>> and child classes, and updates the recursive destructor in each class.
>> You can't do that with templates. You don't need a new language to do
>> this, though. As I said, we do this at work in C with code generators.
> Describe the extensions you consider useful in complete detail, else
> it's no use for Walter. Or give a link.

>> Basically, GC makes it harder to implement a program on hardware. The
>> field of reconfigurable computing is dedicated to doing this. GC
>> complicates things.
> Wouldn't you want to write a program in VHDL or such if you wanted to
> get hardware as output? Well, i can imagine that nowadays it is
> possible with higher-level languages. But isn't D an overkill? Oh,
> you're working for an ASIC company. Then obviously "hard" management is
> better for *you* than a GC. But how should it work? A ref-counter can
> be a drop-in replacement for a GC (less reliable though), and how do
> you make the integration of CASE?

VHDL (or Verilog) is the standard way to describe hardware. Note that VHDL has a generate statement for generating hardware. IMO, it's VHDL's single most important advantage over Verilog. The reconfigurable computing guys are trying to compile normal programs into hardware. It shouldn't be a goal of D, but it is something I dabble in.

Bill Cox
Jan 25 2003
Hi.

Bill Cox wrote:
> The CASE tool (DataDraw) is pretty simple. It's open-source and
> compiles under Visual C++ on Windows.

Seems to be kept secret though. I was searching for it, and was unable to find it. The description is very interesting, though; i still don't seem to understand some of the mechanics of object deletion. This all seems to be about having a central object repository one way or another, and the scope of objects has to be known explicitly. That is, AFAIU it automates the normal C/C++/Delphi data management idioms?

I still don't understand how this simple (and for D vital) example is handled: string operations. Imagine some function has created a string, and returns it. Now, some operations are done on the string: copying, slicing, and so on. Copying and some cases of slicing (i guess) don't change the string, so it is not copied, for efficiency reasons. Now, the same piece of data has multiple users. Now some user makes a change in the data, desiring to keep the changes private. Then there are two cases:
- No one else owns the data. It can be changed in-place;
- The data has other users. It has to be copied.

This means it needs to be known whether the piece of data has further users. Delphi solves this for strings by maintaining a reference counter. GC simplifies it, by allowing for "late" copying of data; it then finds out what data needs to be removed. It was possible to prove that GC is more efficient than reference counting in the general case of tiny objects and strings.

Come to think of it: why did the X-Window servers constantly leak memory before they had a GC attached? Because they handle small structures, not centralized. And how do you get to know that you don't need a certain part in a centralized data storage?

Or explain to me what i get wrong.

-i.
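For readers unfamiliar with the Delphi scheme: a minimal, single-threaded copy-on-write string looks roughly like this (illustrative only - a real implementation needs thread safety and much more):

#include <cstring>
#include <cstddef>

class CowString {
    struct Rep { int refs; std::size_t len; char *text; };
    Rep *rep;

    void release() {
        if (--rep->refs == 0) { delete[] rep->text; delete rep; }
    }
    static Rep *makeRep(const char *s, std::size_t len) {
        Rep *r = new Rep;
        r->refs = 1;
        r->len = len;
        r->text = new char[len + 1];
        std::memcpy(r->text, s, len);
        r->text[len] = '\0';
        return r;
    }
public:
    CowString(const char *s) : rep(makeRep(s, std::strlen(s))) {}
    CowString(const CowString &o) : rep(o.rep) { ++rep->refs; }  // O(1) copy
    CowString &operator=(const CowString &o) {
        ++o.rep->refs; release(); rep = o.rep; return *this;
    }
    ~CowString() { release(); }

    char get(std::size_t i) const { return rep->text[i]; }
    void set(std::size_t i, char c) {
        if (rep->refs > 1) {               // shared: copy before writing
            Rep *mine = makeRep(rep->text, rep->len);
            --rep->refs;                   // detach from the shared rep
            rep = mine;
        }
        rep->text[i] = c;                  // now provably unshared
    }
};

int main() {
    CowString a("hello");
    CowString b = a;     // shares one buffer; refs == 2
    b.set(0, 'H');       // copies first; a still reads "hello"
}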
Jan 31 2003
This is why functional languages always copy data.

I would think that the correct semantics is to always copy on write unless the compiler can tell for absolute sure that there are no other users of the data, in which case it can optimize to do the operation in place. An array passed by reference would be exempt from this; however D currently has no way at all to pass arrays by value, so we're forced to deal with the "is it mine, is it yours, is it ours" mess on every front (user code, library code, system code).

Ownership is the kind of thing that can't be verified in a precondition or postcondition; there's no way to tell in general without knowing the entire program. That is a job better suited to a machine than a human programmer. That is why I would like the default to be pass by value, copy on write, *unless* either the programmer requests pass by reference or the compiler can tell for sure, due to scoping/data visibility/data flow analysis, that it's safe not to copy on write. Err on the side of safety.

Sean

"Ilya Minkov" <ilminkov planet-interkom.de> wrote in message news:b1ek0e$u42$1 digitaldaemon.com...
> Seems to be kept secret though. I was searching for it, and was unable
> to find it. The description is very interesting, though; i still don't
> seem to understand some of the mechanics of object deletion. This all
> seems to be about having a central object repository one way or
> another, and the scope of objects has to be known explicitly. That is,
> AFAIU it automates the normal C/C++/Delphi data management idioms?
>
> I still don't understand how this simple (and for D vital) example is
> handled: string operations. Imagine some function has created a string,
> and returns it. Now, some operations are done on the string: copying,
> slicing, and so on. Copying and some cases of slicing (i guess) don't
> change the string, so it is not copied, for efficiency reasons. Now,
> the same piece of data has multiple users. Now some user makes a change
> in the data, desiring to keep the changes private. Then there are two
> cases:
> - No one else owns the data. It can be changed in-place;
> - The data has other users. It has to be copied.
>
> This means it needs to be known whether the piece of data has further
> users. Delphi solves this for strings by maintaining a reference
> counter. GC simplifies it, by allowing for "late" copying of data; it
> then finds out what data needs to be removed. It was possible to prove
> that GC is more efficient than reference counting in the general case
> of tiny objects and strings.
>
> Come to think of it: why did the X-Window servers constantly leak
> memory before they had a GC attached? Because they handle small
> structures, not centralized. And how do you get to know that you don't
> need a certain part in a centralized data storage?
>
> Or explain to me what i get wrong.
>
> -i.
Feb 01 2003
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message news:b1ge7e$29kd$1 digitaldaemon.com...This is why functional languages always copy data. I would think that the correct semantics is to always copy on write unless the compiler can tell for absolute sure that there are no other users ofthedata, in which case it can optimize to do the operation in place. Anarraypassed by reference would be exempt from this; however D currently has no way at all to pass arrays by value, so we're forced to deal with the "isitmine, is it yours, is it ours" mess on every front (user code, librarycode,system code).it is, but there is an issue with threading, if the function has a big delay, say it signal one event and then waits on an other event, before reading or writing the item, which it expects is a copy, and another thread is waiting for that first signal modifies the original (which is was passed by ref or owns) and signals the second event, the value with be the changed and not the previous. this can be avoided by copying in the function prolog if a write is detected any where in the function, or programmer understanding the effects of lazy copy on write. arrays already have odd 'in' behaviour, so I guess its o.k. to perform a lazy copy as I doubt the condition would occur often and as long as there is a way to force a copy array.length += 0 ; // for instance (safe to do on a 0 length array unlike array[0] = array[0];) Mike.
Feb 01 2003
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message news:b1ge7e$29kd$1 digitaldaemon.com...This is why functional languages always copy data. I would think that the correct semantics is to always copy on write unless the compiler can tell for absolute sure that there are no other users ofthedata, in which case it can optimize to do the operation in place. Anarraypassed by reference would be exempt from this; however D currently has no way at all to pass arrays by value, so we're forced to deal with the "isitmine, is it yours, is it ours" mess on every front (user code, librarycode,system code).it is, but there is an issue with threading, if the function has a big delay, say it signal one event and then waits on an other event, before reading or writing the item, which it expects is a copy, and another thread is waiting for that first signal modifies the original (which is was passed by ref or owns) and signals the second event, the value with be the changed and not the previous. this can be avoided by copying in the function prolog if a write is detected any where in the function, or programmer understanding the effects of lazy copy on write. arrays already have odd 'in' behaviour, so I guess its o.k. to perform a lazy copy as I doubt the condition would occur often and as long as there is a way to force a copy array.length += 0 ; // for instance (safe to do on a 0 length array unlike array[0] = array[0];) Mike.
Feb 01 2003
"Sean L. Palmer" <seanpalmer directvinternet.com> wrote in message news:b1ge7e$29kd$1 digitaldaemon.com...I would think that the correct semantics is to always copy on write unless the compiler can tell for absolute sure that there are no other users ofthedata, in which case it can optimize to do the operation in place. Anarraypassed by reference would be exempt from this; however D currently has no way at all to pass arrays by value, so we're forced to deal with the "isitmine, is it yours, is it ours" mess on every front (user code, librarycode,system code).The best solution is, as you suggest, copy on write. In D, you don't really need to know who owns it. Just practice copy on write. Any extra copies will get cleaned up by the gc.
Feb 05 2003
Walter wrote:
> The best solution is, as you suggest, copy on write. In D, you don't
> really need to know who owns it. Just practice copy on write. Any extra
> copies will get cleaned up by the gc.

Granted the GC is on. If it's off, these all will pollute the memory. That's why i was arguing for 2 separate heaps, so that objects are created on the GCed heap by default, or can (if the programmer is confident) be created on a heap completely ignored by the GC. I know that being on the GC heap shouldn't be a performance hit, after the GC is improved - except for the cases when objects are very pointer-rich (trees, lists and such) - but in some cases it just doesn't make sense to scan objects, because the programmer has decided to follow his old C habits and they are all referenced from somewhere anyway. These should also be leak- and dangler-guarded in a debug compile, and completely ignored by the GC in a release compile.
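In C++ terms, the "second heap the collector ignores" idea can be sketched with class-level allocation operators (names invented for illustration; D would need language or library support for the same split):

#include <cstddef>
#include <cstdlib>
#include <new>

struct NonCollected {
    static void *operator new(std::size_t n) {
        void *p = std::malloc(n);       // plain malloc: a GC never sees it
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void *p) { std::free(p); }
};

// Node instances now live outside the collected heap and must be freed
// manually - the programmer has opted to be "confident" here.
struct Node : NonCollected {
    Node *next;
    int value;
};

int main() {
    Node *n = new Node();   // goes through NonCollected::operator new
    delete n;               // manual, deterministic reclamation
}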
Feb 05 2003
"Ilya Minkov" <midiclub 8ung.at> wrote in message news:b1rrn0$gps$1 digitaldaemon.com...Granted the GC is on. If it's off, these all will pollute the memory. That's why a was arguing for 2 separate heaps, so that objects are created on GCed heap by default, or can (if the programmer is confident) created on a heap completely ignored by a GC. I know that being on GC heap shouldn't be a performance hit, after GC is improved. Except for the cases when objects are very pointer-rich (trees, lists and such), but in some cases it just doesn't make sense to scan objects, because the programmer has decided to follow his old C habits and they are all referenced from somewhere anyway. These should also be leak- and dangler-guarded at debug compile, and completely ignored by GC at release compile.I hear you, but I really think it's worth just taking advantage of the GC and not worrying about it.
Feb 05 2003
Hi, Ilya.

Ilya Minkov wrote:
> Seems to be kept secret though. I was searching for it, and was unable
> to find it.

Try: www.viasic.com/download/datadraw.tar.gz

There isn't any active community working on it. The code dates back to 1991, is a bit dated, and has gotten pretty hacked up over the years. Tools like these generally don't get used much. I find it really odd. It seems to me that programmers aren't really focused on productivity. The entire field of CASE is full of good tools that no one ever uses.

> The description is very interesting, though; i still don't seem to
> understand some of the mechanics of object deletion. This all seems to
> be about having a central object repository one way or another, and the
> scope of objects has to be known explicitly. That is, AFAIU it
> automates the normal C/C++/Delphi data management idioms?

Most of the code generators generate simple custom object databases. There are ways, for example, of traversing all the objects of a given class. However, there's really no magic here. It just outputs code that you might be tempted to write, given the class definitions and relationships. The nice thing is that it automates writing code that you might not have time to get right every time, like allocating memory for objects of a given class in pages, and then sub-allocating individual objects out of each page, and maintaining free lists for the class.

> I still don't understand how this simple (and for D vital) example is
> handled: string operations. Imagine some function has created a string,
> and returns it. Now, some operations are done on the string: copying,
> slicing, and so on. Copying and some cases of slicing (i guess) don't
> change the string, so it is not copied, for efficiency reasons. Now,
> the same piece of data has multiple users. Now some user makes a change
> in the data, desiring to keep the changes private. Then there are two
> cases:
> - No one else owns the data. It can be changed in-place;
> - The data has other users. It has to be copied.
> This means it needs to be known whether the piece of data has further
> users. Delphi solves this for strings by maintaining a reference
> counter. GC simplifies it, by allowing for "late" copying of data; it
> then finds out what data needs to be removed. It was possible to prove
> that GC is more efficient than reference counting in the general case
> of tiny objects and strings.

Strings fall into a difficult space. They're not really quite full-blown objects that represent something concrete. DataDraw doesn't deal with them, other than to allow you to copy them in and out of objects in the database. Strings, vectors, and matrices are "value" classes, rather than "object" classes. "Values" tend to live on the stack, and go away at the end of scope. "Objects" tend to live on the heap and have complex relationships to other objects. The problem with strings is having to copy all that data around as you would a simple int value. Yuk.

> Come to think of it: why did the X-Window servers constantly leak
> memory before they had a GC attached? Because they handle small
> structures, not centralized. And how do you get to know that you don't
> need a certain part in a centralized data storage? Or explain to me
> what i get wrong.

Objects in the centralized storage go away when deleted.
It's really not much different than in a GC based system, since you still need to remove objects from their relationships to other objects when you want them to go away. The annoying value objects with associated memory are a problem, but not for the centralized storage, if you don't allow them there.

String manipulation is typically a significant focus for both language design and benchmarks. Strings are a tricky issue that I feel tends to distract us from other important language applications. D seems to have strings nailed.

Bill
Feb 02 2003
Ross Judson says...
> rafael baptista wrote:
>> Asking the operating system for a buffer is very very slow. On Windows
>> it actually pauses your thread, and does an OS context switch. If you
>> are allocating tiny buffers straight from the OS all the time, you
>> need to go back to programming school. Memory should be allocated from
>> the OS in large chunks and then doled out inside your program as
>> needed - using your own optimized allocators.
>
> Dude, what do you think Java (and most other GC-enabled environments)
> do? They allocate big buffers from the OS, then divide it into various
> categories of storage (in Java's case, various generations). Highly
> efficient strategies are then used to quickly allocate chunks of memory
> that are used and discarded quickly. More stable allocations are
> transferred into storage that has strategies appropriate to that. If
> you're doing your own memory allocation code, you have to do a LOT of
> work, and write a LOT of bug free code, to get to where the Java
> allocator is. D needs GC. You're doing realtime work; stick with C,
> which is highly appropriate to that task!
>
> RJ

That's right, a GC will normally allocate memory in large blocks, which is a good practice, and so will anyone doing proper memory management. So what I mean is just that you can't say GC is faster by comparing it to any number of bad practices in manual memory management. There is no doubt that you can make systems to automatically manage memory that are reasonably efficient in a given domain... which is why I want the freedom to write my own. It's not that hard to do and has lots of advantages.

When necessary I will write my own GC-like manager for certain classes with unpredictable life cycles (and only for those). I will write fixed-size allocators for other things. I may have tens of allocators going at once, guaranteeing that certain types of data will end up on the same page of memory - for cache reasons. Sometimes I want to replace my heap manager with a debug version that lets me track memory leaks, or lets me graph what subsystems are using up the most memory, or lets me see what subsystems are giving me the most cache misses, etc. I may want to receive a warning when certain systems are above their memory budget. I may want to know about any buffers I have overwritten. I want low-level control of the memory management all the time. I want maximum flexibility there.

I will accept a slow system-defined "printf" function, or a slow file open dialog, or a fat and slow directory listing function. Those are called infrequently. But I don't want anything between me and the processor, and I don't want anything between me and system memory.

Professional programmers know that their customers only care about performance. No one buys software because the code is reusable, or easy to maintain, or simple for newbie programmers to understand. Your customers buy what is the fastest, has the most features, is easiest to use and makes best use of their hardware. They don't care about anything that makes your life easier. I won't be satisfied with the middle of the road, lowest common denominator, useful-for-all-purposes libraries that ship with my tools. I don't want to cede even one cycle, one byte... nothing to my competitor. Fast, proper, custom-written memory management is not just for games.

Also, ever wonder why you never see any shrink wrapped software written in Java?
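The "large chunks doled out by your own allocator" approach described here is small in code. A minimal fixed-size pool with an intrusive free list, as an illustration (not anyone's production allocator):

#include <cstddef>
#include <new>

// Grabs pages of slots at once; dole-outs and frees are O(1) pointer
// pushes/pops on an intrusive free list threaded through unused slots.
template <class T, std::size_t SlotsPerPage = 1024>
class Pool {
    union Slot {
        Slot *next;                                   // while free: list link
        alignas(T) unsigned char storage[sizeof(T)];  // while live: the object
    };
    struct Page { Page *next; Slot slots[SlotsPerPage]; };
    Page *pages = nullptr;
    Slot *freeList = nullptr;

    void grow() {                       // one big allocation...
        Page *p = new Page;
        p->next = pages;
        pages = p;
        for (std::size_t i = 0; i < SlotsPerPage; ++i) {  // ...sliced into slots
            p->slots[i].next = freeList;
            freeList = &p->slots[i];
        }
    }
public:
    void *allocate() {
        if (!freeList) grow();
        Slot *s = freeList;
        freeList = s->next;
        return s;
    }
    void release(void *mem) {
        Slot *s = static_cast<Slot *>(mem);
        s->next = freeList;
        freeList = s;
    }
    ~Pool() {
        while (pages) { Page *n = pages->next; delete pages; pages = n; }
    }
};

struct Bullet { float x, y; };

int main() {
    Pool<Bullet> bullets;                 // one pool per hot class, so all
    void *mem = bullets.allocate();       // Bullets share pages (cache win)
    Bullet *b = new (mem) Bullet();       // construct in place
    b->~Bullet();
    bullets.release(b);
}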
Jan 14 2003
rafael baptista wrote:
> Professional programmers know that their customers only care about
> performance. No one buys software because the code is reusable, or easy
> to maintain, or simple for newbie programmers to understand. Your
> customers buy what is the fastest, has the most features, is easiest to
> use and makes best use of their hardware. They don't care about
> anything that makes your life easier. I won't be satisfied with the
> middle of the road, lowest common denominator, useful-for-all-purposes
> libraries that ship with my tools. I don't want to cede even one cycle,
> one byte... nothing to my competitor. Fast, proper, custom-written
> memory management is not just for games. Also, ever wonder why you
> never see any shrink wrapped software written in Java?

Two things come to mind. First, some customers care about performance. Most care about _correctness_. The more a programmer has to do himself, the more opportunity there is to blow the correctness goal. There's a reason we don't do much work in assembly any more.

I'd say that there are a decent number of Java client-side applications out there, but they're not all that well known. Writing good client-side Java is hard work. Check out http://www.jgoodies.com for some very polished work by a talented guy.

Ever wonder why you see less and less server-side software written in C? Most of it is done in Java these days... there's a reason for that!

RJ
Jan 14 2003
rafael baptista wrote:
> In article <avv7ve$2tos$1 digitaldaemon.com>, Ilya Minkov says...
>> Please, believe me garbage collection (GC) is the future of game
>> design.
> I doubt it.

OK. Let me see. I've got an OpenGL game somewhere which is written in OCaml. I'll try to compile it and tell you what it yields. I haven't seen it yet; it's not necessarily a complicated one where all that would matter. OCaml is a 100% GC language with mixed powerful functional and imperative features, combined with OO, and there exists a GCC-based native compiler for it.

>> and such, graphic card, other devices, which are NOT REALTIME!
> You are right. That is why game programmers avoid accessing slow
> devices like disks in the middle of a game. If you absolutely must get
> something off disk you normally stream it in. You set up an interrupt
> driven process or something like that. So say I need a new model off
> the disk in a few seconds, I would place a request to the disk
> streaming subsystem to start getting the stuff off the disk. If you sit
> there like a fool waiting for your CD drive to spin up, then your game
> is very choppy.

How do you know the OS doesn't want to access the disk for some purpose? It frequently does. :) BTW, i might be writing a game in a couple of erhh... years, and i'll try to remember what you said. ;)

> I don't have to believe you. I write games for a living. ;)

OK, OK, i respect Your opinion a lot, you old rusty... :>

>> The author claims, that plugging in the Hans Boehm GC into ...
> I don't doubt it. Asking the operating system for a buffer is very very
> slow. On windows it actually pauses your thread, and does an OS context
> switch. If you are allocating tiny buffers straight from the OS all the
> time, you need to go back to programming school. Memory should be
> allocated from the OS in large chunks and then doled out inside your
> program as needed - using your own optimized allocators. This is what
> gets you cache coherency. This is what guarantees that all the vertex
> data for a model is in cache when you transform it, etc.

Naturally, i won't be keeping vertices one by one. ;)

> It is not fair to compare using the most highly optimized GC you have
> ever heard of against bad manual memory management practices.

OK.

> There is no heap compaction better than me ...

Strong Ego. :>

> in memory explicitly for cache coherency. Plus there is no penalty for
> me having holes in the memory space that are bigger than a memory page
> anyway. If a whole page is unused it will never be loaded into the
> cache and will never trouble me. Compacting it away is just wasting
> bandwidth on the bus that I could be using to load textures.

It might just as well only compact the heap if it makes sense :) (probably late though). And it might avoid moving something if not really needed. It's more an implementation issue than one of principle.

> The alternative to GC is not ref counting. Few objects have such an
> unpredictable life cycle that they cannot be managed any other way.

You probably mean unmanaged. :) Better GCs avoid bothering with old objects anyway.

> GC essentially embeds the ref counting inside the GC subsystem and
> makes you do some kind of reference tracking for every memory buffer
> whether you need it or not. I ref count very few objects in a given
> project.

But it does it only once in a while.

> And when I do I'm typically ref counting entire objects, not every
> single buffer. Plus I can manipulate the ref counts only when it
> matters.

I know. I thought about it a lot. Of course even the structures i mentioned can be used with an efficient reference counting.
I even came to the conclusion that it's required, and the GC would be shut off unless either weak pointers or object notification is implemented. Under the last i mean that the GC could run a designated method of every object which provides it, to notify it that the GC cycle has just run, that it has so many "parents", and here's the handle to the rest of the GC statistics, and so on. So you could teach your object to commit suicide after it's had only one parent for a while. This parent would be the central object repository, the cache.

> I don't have a GC system internally twiddling pointer ref counts
> mindlessly when all that is happening is that a temporary pointer was
> made for a function call or some other limited scope.

I don't think it's *that* important. It's just worth an experiment i'll be sure to do someday.

-i.
Jan 14 2003
Ilya Minkov wrote:
> which are highly complex meshes of (concave) objects. Now, as I read
                                      ^^^^^^^
Sorry, it should read "convex". Please remind me if i ever make another mistake like that, and i'll blow my brain out. ;)

It has to do with the fact that a ray hits a forward surface in a convex object only once, so that back-surface culling is sufficient for correct display; no sorting or z-buffer required. You simply start with the block you're in and draw the neighbors clipped through the portals. Though many aspects are not important for modern hardware anymore, the overdraw gets completely eliminated, and similar techniques eliminate overdraw in the other cases.

-i.
Jan 14 2003
"rafael baptista" <rafael_member pathlink.com> wrote in message news:avsfoq$6nc$1 digitaldaemon.com...The GC is worse though - because its in there operating - arbitrarilystallingmy program whether I use it or not.In D, the GC *only* gets called when a call is made to allocate memory. It is not in a separate thread arbitrarilly and randomly running. To deal with time sensitive code, if you preallocate all the data you'll need for the time sensitive portion, you'll never run the risk of the GC running in the middle of it. Note that a similar problem exists if you use malloc - the library can arbitrarilly decide to call the operating system to allocate more memory to the process, which can take arbitrarilly long amounts of time. The only way to guarantee latency, even in C++, is to preallocate the data.
Jan 23 2003
C++ can be fully garbage-collected. Just make a GC super class of all garbage collected classes that upon construction adds the object to a global internal list, then clear the list from time to time. For example:

class GC {
public:
    GC();
    virtual ~GC();
    static void Clear();
};

static CArray<GC*> _gc;

GC::GC()
{
    _gc += this;
}

GC::~GC()
{
    _gc -= this;
}

void GC::Clear()
{
    for(int i = 0; i < _gc.GetSize(); i++) {
        GC *gc = _gc[i];
        _gc[i] = 0;
        delete gc;
    }
    _gc.SetSize(0);
}

class Foo : public GC {
};

int main()
{
    //foo is garbage collected
    Foo *foo = new Foo;

    //clear dangling references; will delete 'foo'
    GC::Clear();
}

C++ is so flexible!!! D should be that way also. Garbage collection should be an option of the programmer.

"rafael baptista" <rafael_member pathlink.com> wrote in message news:avsfoq$6nc$1 digitaldaemon.com...
> I just saw the D spec today... and it is quite wonderful. There are so
> many ways that C and C++ could be better, and D addresses them. But I
> think D is a little too ambitious, to its detriment. It has a good
> side: it fixes syntactic and semantic problems with C/C++. It
> eliminates all kinds of misfeatures and cruft in C++. And D has a bad
> side: it attempts to implement all kinds of features that are best left
> to the programmer - not the language spec: built in string comparison,
> dynamic arrays, associative arrays, but the worst in my mind is the
> inclusion of garbage collection.
>
> As is pointed out in the D overview, garbage collection makes D
> unusable for realtime programming - this makes it unusable for kernel
> programming, device drivers, computer games, any real time graphics,
> industrial control etc. etc. It limits the language in a way that is
> really not necessary. Why not just add in language features that make
> it easier to integrate GC without making it a part of the language.
>
> For my part I wouldn't use GC on any project in any case. My experience
> has been that GC is inferior to manually managing memory in every case
> anyway. Why?
>
> Garbage collection is slower: There is an argument in the garbage
> collection section of the D documents that GC is somehow faster than
> manually managing memory. Essentially the argument boils down to:
> totally bungling manual memory management is slower than GC. Ok, that
> is true. If I program with tons of redundant destructors, and have
> massive memory leaks, and spend lots of time manually allocating and
> deallocating small buffers, and constantly twiddling thousands of ref
> counts, then managing memory manually is a disaster. Most programs
> manage memory properly, and the overhead of memory management is not a
> problem.
>
> The argument that GC does not suffer from memory leaks is also invalid.
> In a GC'd system, if I leave around references to memory that I never
> intend to use again, I am leaking memory just as surely as if I did not
> call delete on an allocated buffer in C++.
>
> Finally there is the argument that manually managing memory is somehow
> difficult and error prone. In C++ managing memory is simple - you
> create a few templatized container classes at the start of a project.
> They will allocate and free *all* of the memory in your program. You
> verify that they work correctly and then write the rest of your program
> using these classes. If you are calling "new" all over your C++ code
> you are asking for trouble. The advantage is that you have full control
> over how memory is managed in your program. I can plan exactly how I
> will deal with the different types of memory, where my data structures
> are placed in memory. I can code explicitly to make sure that certain
> data structures will be in cache memory at the same time.
> I had a similar reaction to having the compiler automatically build me
> hash tables, string classes and dynamic arrays. I normally code these
> things myself with particular attention to how they are going to be
> used. The unit test stuff is also redundant. I can choose to implement
> that stuff whether the language supports it or not. I normally make
> unit test programs that generate output that I can regress against
> earlier builds - but I wouldn't want to saddle arbitrary programs with
> having to run my often quite extensive and time consuming unit tests at
> startup. Imaginary numbers and bit fields should similarly be left up
> to the programmer.
>
> I suppose I can just choose not to use these features if I don't want
> to - but it strikes me as unnecessary cruft - and I subscribe to the
> idea beautifully expressed in the D overview that removing cruft makes
> a language better. The GC is worse though - because it's in there
> operating - arbitrarily stalling my program whether I use it or not.
>
> I urge the implementor of D to think about it this way: you have lots
> of ideas. Many of them are really good - but it is likely that despite
> your best efforts not all of them are. It would make your project much
> more likely to succeed if you decoupled the ideas as much as possible,
> such that the good ones can succeed without being weighed down by the
> bad. So wherever possible I think you should move functionality from
> the language spec to a standard library. This way the library can
> continue to improve, while the language proper could stabilize early.
>
> This is probably old ground - I assume lots of people have probably
> already commented on the GC. I looked for a thread on GC and could not
> find one. I for one would find D a lot more attractive if it did not
> have GC.
Jan 30 2003
Hi, Achillefs.

Achillefs Margaritis wrote:
> C++ can be fully garbage-collected. Just make a GC super class of all
> garbage collected classes that upon construction adds the object to a
> global internal list, then clear the list from time to time. For
> example:
>
> class GC {
> public:
>     GC();
>     virtual ~GC();
>     static void Clear();
> };
>
> static CArray<GC*> _gc;
>
> GC::GC() { _gc += this; }
> GC::~GC() { _gc -= this; }
>
> void GC::Clear()
> {
>     for(int i = 0; i < _gc.GetSize(); i++) {
>         GC *gc = _gc[i];
>         _gc[i] = 0;
>         delete gc;
>     }
>     _gc.SetSize(0);
> }
>
> class Foo : public GC {
> };
>
> int main()
> {
>     //foo is garbage collected
>     Foo *foo = new Foo;
>
>     //clear dangling references; will delete 'foo'
>     GC::Clear();
> }
>
> C++ is so flexible!!! D should be that way also. Garbage collection
> should be an option of the programmer.

This looks like fairly useful code in some cases, but it's not real GC. It simply deletes ALL GC derived objects that have ever been allocated but not manually deleted.

To make C++ do real GC is both very slow and very hard. You can use handle classes that are actual class objects. They get added to the root set when they are declared as local variables. When they're destroyed, they are removed from the root set. The container classes need to use handles that aren't in the root set, but which can be traversed given a pointer to the containing GC-derived object. Yuk.

The more common way in C++ is reference counting, which can be done with handle classes easily, but at a significant performance cost. Also, it doesn't really work. It can GC tree structures, but it fails to delete anything with a loop in it, which most data structures have. Even a linked list typically has the child objects pointing back at the parent. It's a loop. If you want good GC, it needs to be supported directly in the language.

I just don't have any use for GC. As I've said before, it doesn't really buy you much. All it does is change a delete statement into a direct call to the destructor, which you have to write anyway, or you won't get the object out of its relationships with other objects. A better way is to automatically generate recursive destructors. That's how it's done in relational databases (well, almost -- they do it in an interpreted way). Just declare that a relationship deletes its children when the parent is deleted. It couldn't be easier.

I've written languages that use GC (a Scheme interpreter). I even used the cool in-place GC (not mark-and-sweep), which runs just as fast as stop-and-copy (faster if you take caching and paging into account). Sure GC can be fast. But why complicate the language? It just isn't worth it.

Bill
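The cycle argument, restated in modern C++ terms with std::shared_ptr (which postdates this thread) since it makes the count arithmetic visible:

#include <memory>

struct Child;

struct Parent {
    std::shared_ptr<Child> first;       // owning "down" pointer
};

struct Child {
    std::shared_ptr<Parent> owner;      // counted back pointer: the loop
};

int main() {
    std::shared_ptr<Parent> p = std::make_shared<Parent>();
    p->first = std::make_shared<Child>();
    p->first->owner = p;                // cycle closed: p -> child -> p
}   // p goes out of scope, but both counts are still 1, so nothing is
    // destroyed (a leak). Making `owner` a std::weak_ptr<Parent> breaks it.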
Jan 31 2003
Achillefs Margaritis wrote:
> C++ is so flexible!!! D should be that way also. Garbage collection
> should be an option of the programmer.

Walter considers C++ too flexible. And he doesn't rob you of C++, he gives you another language. And you can't implement an efficient GC in C. However, i can't see why such a C++ one can't be implemented in D - if you consider it better, of course.

A GC can profit a lot from modified code generation. A compiler implementation could choose one of several automatic memory management solutions, which would not require any changes in code - that is, they are equivalent to the programmer, but have different efficiency characteristics.

- A simple mark-and-sweep collector, like D is using now, is just the beginning - i have heard that it doesn't profit from code generation yet and hence keeps executable size small;
- The mark phase can be accelerated if the compiler automatically generates an optimised marking method for each class. This is going to have a small size overhead, but reduces the total cost of GC to almost zero;
- A coloring collector can work in the background, and has very good run-time characteristics: it doesn't stop the main process, doesn't necessarily cause wild paging, and keeps the memory footprint rather tight. It might slow down an application a bit, in part because it requires wrapping pointer writes in a GC method. It might also increase the code size;
- Smart reference counting: i have stumbled over a document describing a reference counting memory manager which can also deal with self-referencing structures. Its advantage is that the application uses exactly as much memory as it needs, but the slowdown would be much higher than with a coloring GC, pauses might be the case, and the code size would increase significantly.

I'd say an implementation should be free to provide these or other automatic memory management systems, as far as they are compliant with a GC interface. I suppose that the memory manager interface is minimal on purpose, since things i have proposed before, like "mark-notification", are not possible with any managers except for mark-and-sweep.

I've read yet another, and this time better done, comparison of "static" and "scripting" languages, and it turns out that the decisions of Perl and Python, to allow the programmer to do complex tasks in a simple way, usually shave off half (or more) of the time required for writing an application. It may not be the most optimal way, but it is generally good for programme correctness. There, hundreds of programmers wrote a simple application in C, C++, Perl, Python, Ruby, Java, and TCL. It shows that though C and C++ might lead to an optimal solution, most often they lead to a very bad one, since the programmer tries to accomplish a task in a limited time and simplifies the job. In that example, all entries done in languages with native, simple support for associative arrays were correct and ran reasonably fast, while the others led to a number of broken programmes. It shows again that Walter's decision makes sense, since it makes it possible to write utilities in D about as simply as in Perl, with more confidence, and with faster execution. I'm pretty much sure D would devastatingly win this comparison.

-i.
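As an illustration of the "wrapping pointer writes in a GC method" point in the coloring-collector bullet above: a software write barrier is typically just a recording step placed in front of every pointer store. A sketch with invented names:

#include <vector>

struct Object {
    Object *fields[4];                  // the pointer slots the GC traces
};

static std::vector<Object *> rememberedSet;   // mutated since the last mark

// every pointer store in generated code is funneled through this helper,
// so an incremental collector knows which objects to re-scan (re-grey)
inline void writeField(Object *obj, int slot, Object *value) {
    rememberedSet.push_back(obj);       // the barrier itself: one push
    obj->fields[slot] = value;
}

int main() {
    Object a = Object(), b = Object();
    writeField(&a, 0, &b);              // a is now in the remembered set
}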
Jan 31 2003
Ilya Minkov wrote:
> A GC can profit a lot from modified code generation. A compiler
> implementation could choose one of several automatic memory management
> solutions, which would not require any changes in code - that is, they
> are equivalent to the programmer, but have different efficiency
> characteristics.

Let's see what happens when type-based scanning is in first; nobody knows what effect it'll have on scanning, but in a small utility I'm making (fractal browsing sample for dig) scanning takes 60 milliseconds but is scanning... 802 pointers over 1120001 dwords of data, so 1396 times too much data. Once that is pared down I predict I'll be able to call a full collection every second without having to worry about delays in almost every program. The exception will be VERY heavily pointer-oriented programs. We're talking Lisp level, and programs where you have an "Int" class. I don't care about those.
Jan 31 2003
"Burton Radons" <loth users.sourceforge.net> wrote in message news:b1eo7q$15m6$1 digitaldaemon.com...Let's see what happens when type-based scanning is in first; nobody knows what effect it'll have on scanning, but in a small utility I'm making (fractal browsing sample for dig) scanning takes 60 milliseconds but is scanning... 802 pointers over 1120001 dwords of data, so 1396 times too much data. Once that is pared down I predict I'll be able to call a full collection every second without having to worry about delays in almost every program. The exception will be VERY heavily pointer-oriented programs. We're talking Lisp level and programs where you have an "Int" class. I don't care about those.I have a lot of ideas about improving GC, such as making it a copying generational collector, implementing hardware write barriers, and using type info to mark whole regions as "not containing any pointers", etc. There is a LOT that can be done to boost performance.
Feb 03 2003
Walter wrote:"Burton Radons" <loth users.sourceforge.net> wrote in message news:b1eo7q$15m6$1 digitaldaemon.com...Implementing hardware write barriers? I didn't think that the hardware was there on most memory controllers. EvanLet's see what happens when type-based scanning is in first; nobody knows what effect it'll have on scanning, but in a small utility I'm making (fractal browsing sample for dig) scanning takes 60 milliseconds but is scanning... 802 pointers over 1120001 dwords of data, so 1396 times too much data. Once that is pared down I predict I'll be able to call a full collection every second without having to worry about delays in almost every program. The exception will be VERY heavily pointer-oriented programs. We're talking Lisp level and programs where you have an "Int" class. I don't care about those.I have a lot of ideas about improving GC, such as making it a copying generational collector, implementing hardware write barriers, and using type info to mark whole regions as "not containing any pointers", etc. There is a LOT that can be done to boost performance.
Feb 09 2003
>> I have a lot of ideas about improving GC, such as making it a copying
>> generational collector, implementing hardware write barriers, and
>> using type info to mark whole regions as "not containing any
>> pointers", etc. There is a LOT that can be done to boost performance.
> Implementing hardware write barriers? I didn't think that the hardware
> was there on most memory controllers.

I took that to be: marking the page as read only, and catching the page fault.
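A sketch of that page-protection scheme on POSIX, for illustration only - a real collector must be far more careful (mprotect from a signal handler is not strictly async-signal-safe, and MAP_ANONYMOUS is spelled MAP_ANON on some systems):

#include <sys/mman.h>
#include <unistd.h>
#include <signal.h>
#include <stdint.h>

static char *heap;
static long pageSize;

// record the faulting page as dirty, then unprotect it so the store retries
static void onFault(int, siginfo_t *info, void *) {
    char *page = (char *)((uintptr_t)info->si_addr &
                          ~(uintptr_t)(pageSize - 1));
    // a collector would add `page` to its dirty set here
    mprotect(page, pageSize, PROT_READ | PROT_WRITE);
}

int main() {
    pageSize = sysconf(_SC_PAGESIZE);
    heap = (char *)mmap(0, 16 * pageSize, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    struct sigaction sa = {};
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = onFault;
    sigaction(SIGSEGV, &sa, 0);

    mprotect(heap, 16 * pageSize, PROT_READ);  // arm the write barrier
    heap[123] = 1;   // first store faults once, is recorded, then succeeds
}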
Feb 09 2003
Mike Wynn wrote:
>> Implementing hardware write barriers? I didn't think that the hardware
>> was there on most memory controllers.
> I took that to be: marking the page as read only, and catching the page
> fault.

From what I have read (can't find the link right now), that usually results in a decrease in performance, because the whole page needs to be scanned once the barrier is faulted. You only know that the page is dirty, not how dirty, and then have to scan 4MB for what is possibly only one allocated object. I for one would rather go with real software read and write barriers, which can then be replaced with hooks to real hardware read and write barriers once they finally start being reincluded in memory chips, which I figure they will be before too long, now that Microsoft is pressing on the whole garbage collected

Evan
Feb 09 2003
"Evan McClanahan" <evan dontSPAMaltarinteractive.com> wrote in message news:b25v5t$2bms$1 digitaldaemon.com...Mike Wynn wrote:on a pentium the page can be 4K or 4M and I believe there is a 2M page too (May be PPro) and I agree that it well designed software write barriers are most likely cheaper (performance, maintance and development) 3 colour non copying concurrent collectors only actually require write and return barriers the people I used to know who write GC's don't believe in h/w gc, partly because it restrict what you can do with the object layouts and partly because you can't change or upgrade the gc. h/w that would assist one type of GC might hinder another. I think it will be a while before hardware gc becomes standard, there are CPU's that don't even have basic MM most old consoles and pda's (although I think WinCE devices have to have MM, I know PalmOS does not require it). and I think the PS2 has MM 'cos it can run linux (not uLinux) and of course if your device has a JavaVM, why bother with hardware MM, when you can do it all in software and save on sillicon. or use Tao intent or similar system. I would expect a CLR bytecode cpu, like the Sun and Arm attempts with Java Bytecode aware CPU's before h/w GC Mike.From what I have read, can't find the link right now, that usually results in a decrease in performance, because the whole page needs to be scanned once the barrer is faulted. You only know that the page is dirty, not how dirty, and then have to scan 4MB for what is possibly only one allocated object. I for one would rather go with real software read and write barriers, which can then be replaced with hooks to real hardware read and write barriers once they finally start being reincluded in memory chips, which I figure that they will be before too long, now that microsoft is pressing on the whole garbage collectedImplementing hardware write barriers? I didn't think that the hardware was there on most memory controllers.I took that to be; marking the page as read only. and catching the page fault.
Feb 09 2003
Evan McClanahan wrote:
> Mike Wynn wrote:
>> I took that to be: marking the page as read only, and catching the
>> page fault.
> From what I have read (can't find the link right now), that usually
> results in a decrease in performance, because the whole page needs to
> be scanned once the barrier is faulted. You only know that the page is
> dirty, not how dirty, and then have to scan 4MB for what is possibly
> only one allocated object. I for one would rather go with real software
> read and write barriers, which can then be replaced with hooks to real
> hardware read and write barriers once they finally start being
> reincluded in memory chips.

Every time the page status is changed, the TLB cache would need to be flushed (an expensive operation); this change would also have to be reflected across multiple processors on an SMP system. All of this would probably require support from the operating system, at least with respect to hooks to preset/reset/get the read/write flag of the page. The scanning of the page could be limited by picking a smaller page size; the 80x86 supports 4Kb as well as 4Mb pages (and 2Mb pages with page addressing extensions - see CR4.5 & CPUID(2)/EDX.6).

C 2003/2/9
Feb 08 2003
"Achillefs Margaritis" <axilmar b-online.gr> wrote in message news:b1ce36$1ir3$1 digitaldaemon.com...C++ can be fully garbage-collected. Just make a GC super class of all garbage collected classes that upon construction adds the object to aglobalinternal list, then clear the list from time to time.I know, I use GC in my C++ programs. There are some problems, though, because the C++ compiler provides no type information to the runtime. The way a C++ compiler is implemented restricts the kind of GC algorithms that can be used.
Feb 03 2003