digitalmars.D - Dynamic Linking & Memory Management
- Benji Smith (65/65) Jan 26 2005 I've been interested to read some of the recent discussion about the D
- Kris (5/70) Jan 26 2005 The gods be praised! Good example, Benji.
- Matthew (13/94) Jan 26 2005 Indeed.
- Kris (14/114) Jan 26 2005 Encore!
- Matthew (18/140) Jan 26 2005 As I've said before, I want both models supportable, and I don't accept ...
- Matthew (3/6) Jan 26 2005 btw, Java-phobic hyperbole aside, I should point out that until D can ha...
- John Reimer (5/13) Jan 26 2005 Umm... have you updated your newsreader? Did you see Walter's recent
- Matthew (3/15) Jan 26 2005 Coolio. We can get back to telling big-W what a star we think he is, now...
- Matthew (8/74) Jan 26 2005 Hear, hear! (And thanks for spending the time to write such an eloquent
- Walter (3/17) Jan 26 2005 I agree. I'm working on it.
- Kris (4/22) Jan 26 2005 Hallelujah!
- John Reimer (7/17) Jan 26 2005 :-D
- Matthew (27/45) Jan 26 2005 Cool. I can shut my overbusy chops now! :-)
- Kris (5/28) Jan 26 2005 I didn't quite follow all of that, but here's something that Pragma sugg...
- Matthew (4/53) Jan 26 2005 I want to be able to write C-API DLLs in D, which I think would be the
- Kris (18/41) Jan 26 2005 I should point out all the tricky stuff you're talking about just goes a...
- Matthew (48/118) Jan 26 2005 True. But then one has the inordinate hassle of managing a Phobos.DLL
- Kris (15/28) Jan 26 2005 So how about this:
- Matthew (5/43) Jan 27 2005 What about an application, written in D and statically linked to a GC,
- Kris (38/41) Jan 27 2005 Right.
- Walter (14/21) Jan 28 2005 utilize that
- Kris (10/32) Jan 28 2005 We're misunderstanding each other, Walter. But there's nothing unusual a...
- Walter (8/13) Jan 28 2005 The way to do it is construct a pool of those soft references, so the gc
- Dave (40/54) Jan 28 2005
- pragma (43/51) Jan 28 2005 Hey, sure thing.
- Kris (21/74) Jan 28 2005 Good points, Pragma. Another thing to consider, regarding the explicit u...
- Benji Smith (6/7) Jan 27 2005 Fantastic. Thanks, Walter. I really appreciate how receptive you are
I've been interested to read some of the recent discussion about the D garbage collector, and I'd like to describe my current Java project to give some perspective on what I think is ideal memory management.

I'm working on a technical analysis and simulation application for historical stock market data. And despite the fact that I've written about ten thousand lines of code myself, I'm using third-party libraries for many aspects of the project. First of all, I'm using JDBC drivers for MySQL and MS SQL Server. The MySQL drivers consist of 3 different JAR files that I include in my application's classpath. The MS SQL Server drivers are contained in another 3 JAR files. I'm also using the Xerces XML parser from the Apache group (3 more JARs), the JFreeChart graphical charting components (5 more JARs), the JUnit testing framework (1 JAR), a GNU command-line parsing library (1 JAR), and a few other miscellaneous libraries. All told, I'm importing functionality from more than fifteen different libraries. And the application is still very small; by the time I've finished developing it, I'll probably be using twice as many libraries.

But when I write my code, I can write it as though I'm statically linking with each of those libraries. I don't need to use special export semantics when I need to call code from any of those third-party vendors. And with a few of those libraries (the command-line parsing library in particular), I may end up writing my own implementation. When I do, I won't have to change the semantics to reflect the fact that I'm no longer using a compiled library. The rest of my code can be completely agnostic to whether I'm linking with source files, compiled class files, class files bundled into a JAR package, classes generated at runtime through reflection hooks, or classes loaded dynamically using a custom classloader.

Since my application needs to support third-party plug-in development (users can load their own classes as custom charting indicators), dynamic runtime loading of classes is essential to my design. Ubiquitous static linking would not be an option for me with this application.

But there's another important issue, too: debugging. Last week I discovered a memory leak somewhere in my application. If I allowed some of the analysis code to run for a few hours--combing through all 18 million data points from the last 25 years of stock market data--the heap would grow from its initial allocation (8 MB) to its maximum allocation (256 MB).

Luckily for me, all of those allocations take place within a single virtual machine, which uses a single garbage collector to manage all of the memory from all of the libraries I'm using. That allows me to use a profiling application to monitor the allocations of all the objects in the heap and--much more importantly--to find out which objects are holding references to which other objects. Within moments, I could see that the JDBC allocations were getting cleaned up properly, but one of my custom collection classes was failing--about 2% of the time--to release object references it was no longer using. After an hour or so of tinkering with the profiler, I was able to track down and fix that memory leak. The application now uses a steady 12 MB of heap memory, no matter how long it runs.

If I were building the same application with D, there would be fifteen different garbage collectors operating in fifteen different heap-spaces, and the objects allocated within one heap space might be referenced by objects in another heap space, each managed by a different garbage collector. It would be much more difficult to develop a heap profiling tool that could successfully allow a developer to navigate through such a fragmented heap space, particularly if the developer needed to figure out which GC was supposed to collect each out-of-reach object. Tracking down and fixing that memory leak probably would have taken a lot longer than an hour and a half.

Consequently, I strongly support the development of a model within D that allows for a single GC instance per process. Any other scenario sounds like a development & debugging nightmare.
Jan 26 2005
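For readers who have not run into this kind of leak: the failure mode described in the post above (a custom collection class that keeps object references it no longer uses) can be sketched in Java roughly as follows. This is an illustrative sketch only, not the poster's actual class; `ArrayStack`, `popLeaky`, and `retainedReferences` are hypothetical names made up for the example.

```java
import java.util.Arrays;

/** Hypothetical array-backed stack; all names here are illustrative. */
final class ArrayStack<T> {
    private Object[] slots = new Object[8];
    private int size = 0;

    void push(T item) {
        if (size == slots.length) {
            // Grow the backing array when it is full.
            slots = Arrays.copyOf(slots, size * 2);
        }
        slots[size++] = item;
    }

    /** Leaky variant: the popped slot still references the object. */
    @SuppressWarnings("unchecked")
    T popLeaky() {
        if (size == 0) throw new IllegalStateException("empty stack");
        return (T) slots[--size];
    }

    /** Fixed variant: clear the slot so the collector can reclaim the object. */
    @SuppressWarnings("unchecked")
    T pop() {
        if (size == 0) throw new IllegalStateException("empty stack");
        T item = (T) slots[--size];
        slots[size] = null;
        return item;
    }

    /** Counts non-null slots, i.e. references the backing array still holds. */
    int retainedReferences() {
        int n = 0;
        for (Object o : slots) {
            if (o != null) n++;
        }
        return n;
    }

    int size() {
        return size;
    }
}
```

After `popLeaky()` the stack reports itself as empty, yet its backing array still points at every popped object, so none of them can be collected while the stack is alive. In a heap profiler, a stale slot like this would show up as the collection still holding a reference to an object the program considers gone.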
The gods be praised! Good example, Benji. This is why it's been noted that D would not pass muster in much of the all-important commercial field.

- Kris

(In reply to Benji Smith, article <ce4gv0dk6v1qgde6o94msecgi4huasdhga 4ax.com>.)
Jan 26 2005
Indeed.

I, for one, can live with D 1.0 not having dynamic class loading, but that would be a sine qua non for 2.0.

Furthermore, I think 1.0 must have a cooperative/unified GC architecture between link units; otherwise, again, we'll just be writing our DLLs in C, and all non-compiled-in code to an app will be via D extensions (which, while fun to write occasionally, will get *really* tiresome).

Or maybe I'm missing some deeper truth on the viability of D. If so, can someone please enlighten me so I can stop carping on like a harbinger of Doom.

The Dr .....

(In reply to "Kris" <Kris_member pathlink.com>, message news:ct95fa$2tv5$1 digitaldaemon.com.)
Jan 26 2005
Encore!

And lest we forget: D would happily support the all-important unified GC (per process) if the GC were simply moved to a shared-lib. Further, Sean has already invested the majority of effort in carefully extracting the GC from the runtime, such that this is a near reality.

Walter notes that he's had customer-support difficulties in the past over shared-libs, due to the vagaries of Win32. Unfortunately, that negative experience is being reflected directly in the range of valid programming models effectively supported by the D language.

We /really/ need to move forward on this issue. Perhaps we can start (yet again) by asking why Walter feels we're all so much better off with a proliferation of GC instances instead of just one, easily manageable, instance?

- Kris

(In reply to Matthew, article <ct968s$2uv3$1 digitaldaemon.com>.)
Jan 26 2005
> And lest we forget: D would happily support the all-important unified GC (per process) if the GC were simply moved to a shared-lib. Further, Sean has already invested the majority of effort in carefully extracting the GC from the runtime, such that this is a near reality.

As I've said before, I want both models supportable, and I don't accept that this is technically infeasible. Notwithstanding, if both are not, then we must go for a separation between the pure statically linked model and the dynamically linked GC model. If we stay with static linking only, D's a joke, isn't it?

> Walter notes that he's had customer-support difficulties in the past over shared-libs, due to the vagaries of Win32. Unfortunately, that negative experience is being reflected directly in the range of valid programming models effectively supported by the D language.

I agree. And I think this will kill D. As I've whinged and whined on, I can't understand how Walter thinks that D will be viable with the status quo. Alas, though Walter has huge amounts of valuable experience and insight (more than mine, I would hazard), I think he fails to recognise, or at least act on, two important facts:

1. he doesn't have *all* experience. None of us do. And, much more importantly, ...
2. many of us do not have any serious problems doing *very successful* (see below) work in C/C++. If D is not a quantum leap forward, _without_ new hassles, then why the hell is anyone ever going to use it? Because it's better than Java?!? Pah!

> We /really/ need to move forward on this issue. Perhaps we can start (yet again) by asking why Walter feels we're all so much better off with a proliferation of GC instances instead of just one, easily manageable, instance?

We're not better off with that. We're nowhere with that! Someone turn out the lights on their way out ...

The Dr .....

(below to be seen): I've worked on several highly commercially important projects over the last several years, most of which have been (primarily) implemented in C++. All the guff that people generally whinge on about as problems in C++ has proved either non-existent, irrelevant, or easily amenable to good practice. Some of these projects are still in production, 2, 4, 5 years later, and have never had a millisecond of downtime. So why do we need D, if it's going to be hassle-bundled?
Jan 26 2005
"Matthew" <admin.hat stlsoft.dot.org> wrote in message news:ct9gl9$9k4$1 digitaldaemon.com...

> 2. many of us do not have any serious problems doing *very successful* (see below) work in C/C++. If D is not a quantum leap forward, _without_ new hassles, then why the hell is anyone ever going to use it? Because it's better than Java?!? Pah!

btw, Java-phobic hyperbole aside, I should point out that until D can handle scenarios such as outlined in Benji's excellent post, D isn't even fit to kiss the bloated arse of Java. And that's a sad position to be in, to be sure ...
Jan 26 2005
On Thu, 27 Jan 2005 12:42:57 +1100, Matthew wrote:

> btw, Java-phobic hyperbole aside, I should point out that until D can handle scenarios such as outlined in Benji's excellent post, D isn't even fit to kiss the bloated arse of Java. And that's a sad position to be in, to be sure ...

Umm... have you updated your newsreader? Did you see Walter's recent response in this topic?

It seems the point has been taken. :-)

- John R.
Jan 26 2005
"John Reimer" <brk_6502 yahoo.com> wrote in message news:pan.2005.01.27.01.43.39.773589 yahoo.com...

> Umm... have you updated your newsreader? Did you see Walter's recent response in this topic?

Been having issues with my cable, and local net.

> It seems the point has been taken. :-)

Coolio. We can get back to telling big-W what a star we think he is, now. <CG>
Jan 26 2005
Hear, hear! (And thanks for spending the time to write such an eloquent and persuasive post.)

I not only agree with everything you say, I think we should overtly state that without support such as this, D is Doomed: Dead, Duck-like, save for small self-contained utility programs (for which C, never mind C++, suffices adequately, IMO).

"Benji Smith" <dlanguage xxagg.com> wrote in message news:ce4gv0dk6v1qgde6o94msecgi4huasdhga 4ax.com...
> I've been interested to read some of the recent discussion about the D garbage collector, and I'd like to describe my current Java project to give some perspective on what I think is ideal memory management.
>
> I'm working on a technical analysis and simulation application for historical stock market data. And despite the fact that I've written about ten thousand lines of code myself, I'm using third-party libraries for many aspects of the project.
>
> First of all, I'm using JDBC drivers for MySQL and MS SQL Server. The MySQL drivers consist of 3 different JAR files that I include in my application's classpath. The MS SQL Server drivers are contained in another 3 JAR files. I'm also using the Xerces XML parser from the Apache group (3 more JARs), the JFreeChart graphical charting components (5 more JARs), the JUnit testing framework (1 JAR), a GNU commandline parsing library (1 JAR), and a few other miscellaneous libraries.
>
> All told, I'm importing functionality from more than fifteen different libraries. And the application is still very small; by the time I've finished developing it, I'll probably be using twice as many libraries.
>
> But when I write my code, I can write it as though I'm statically linking with each of those libraries. I don't need to use special export semantics when I need to call code from any of those third party vendors.
>
> And with a few of those libraries (the commandline parsing library in particular), I may end up writing my own implementation. When I do, I won't have to change the semantics to reflect the fact that I'm no longer using a compiled library. The rest of my code can be completely agnostic to whether I'm linking with source files, compiled class files, class files bundled into a JAR package, classes generated at runtime through reflection hooks, or classes loaded dynamically using a custom classloader.
>
> Since my application needs to support third-party plug-in development (users can load their own classes as custom charting indicators), dynamic runtime loading of classes is essential to my design. Ubiquitous static-linking would not be an option for me with this application.
>
> But there's another important issue, too: debugging.
>
> Last week I discovered a memory leak somewhere in my application. If I allowed some of the analysis code to run for a few hours--combing through all 18 million data points from the last 25 years of stock market data--the heap would grow from its initial allocation (8 MB) to its maximum allocation (256 MB).
>
> Luckily for me, all of those allocations take place within a single virtual machine, which uses a single garbage collector to manage all of the memory from all of the libraries I'm using. That allows me to use a profiling application to monitor the allocations of all the objects in the heap and--much more importantly--to find out which objects are holding references to which other objects.
>
> Within moments, I could see that the JDBC allocations were getting cleaned up properly, but one of my custom collection classes was failing--about 2% of the time--to release object references it was no longer using.
>
> After an hour or so of tinkering with the profiler, I was able to track down and fix that memory leak. The application now uses a steady 12 MB of heap memory, no matter how long it runs.
>
> If I were building the same application with D, there would be fifteen different garbage collectors operating in fifteen different heap-spaces, and the objects allocated within one heap space might be referenced by objects in another heap space, each managed by a different garbage collector. It would be much more difficult to develop a heap profiling tool that could successfully allow a developer to navigate through such a fragmented heap space, particularly if the developer needed to figure out which GC was supposed to collect each out-of-reach object. Tracking down and fixing that memory leak probably would have taken a lot longer than an hour and a half.
>
> Consequently, I strongly support the development of a model within D that allows for a single GC instance per process. Any other scenario sounds like a development & debugging nightmare.
Jan 26 2005
"Benji Smith" <dlanguage xxagg.com> wrote in message news:ce4gv0dk6v1qgde6o94msecgi4huasdhga 4ax.com...
> If I were building the same application with D, there would be fifteen different garbage collectors operating in fifteen different heap-spaces, and the objects allocated within one heap space might be referenced by objects in another heap space, each managed by a different garbage collector. It would be much more difficult to develop a heap profiling tool that could successfully allow a developer to navigate through such a fragmented heap space, particularly if the developer needed to figure out which GC was supposed to collect each out-of-reach object. Tracking down and fixing that memory leak probably would have taken a lot longer than an hour and a half.
>
> Consequently, I strongly support the development of a model within D that allows for a single GC instance per process. Any other scenario sounds like a development & debugging nightmare.

I agree. I'm working on it.
Jan 26 2005
In article <ct9afo$2l8$1 digitaldaemon.com>, Walter says...
> "Benji Smith" <dlanguage xxagg.com> wrote in message news:ce4gv0dk6v1qgde6o94msecgi4huasdhga 4ax.com...
>> <snip>
>> Consequently, I strongly support the development of a model within D that allows for a single GC instance per process. Any other scenario sounds like a development & debugging nightmare.
> I agree. I'm working on it.

Hallelujah! I will now shutup about this for a while :-)

(can I get another Hallelujah?)
Jan 26 2005
On Thu, 27 Jan 2005 00:16:37 +0000, Kris wrote:
> In article <ct9afo$2l8$1 digitaldaemon.com>, Walter says...
>> I agree. I'm working on it.
> Hallelujah! I will now shutup about this for a while :-)
> (can I get another Hallelujah?)

:-D

This is indeed good news! And Walter, thanks for listening. Nice to see that you can be so resilient despite being pounded like a fence post (but it was for a good cause). :-)

- John R.
Jan 26 2005
"Walter" <newshound digitalmars.com> wrote in message news:ct9afo$2l8$1 digitaldaemon.com...
> "Benji Smith" <dlanguage xxagg.com> wrote in message news:ce4gv0dk6v1qgde6o94msecgi4huasdhga 4ax.com...
>> <snip>
>> Consequently, I strongly support the development of a model within D that allows for a single GC instance per process. Any other scenario sounds like a development & debugging nightmare.
> I agree. I'm working on it.

Cool. I can shut my overbusy chops now! :-)

Please may we have intelligent GCs that can detect each other, within a process space, and connect up in a sensible, and dynamic-lib-unloading-proof fashion?

Here's how I think it might work, roughly:

If the process is written in D, it'll have a GC built-in (statically linked). [I agree with Walter 100% that processes should have a GC built in, and not have to rely on a DLL. Painful for small process distribution, along with other DLL issues ...]

Now any subsequent D dynamic link-unit, whether that be a DDL (i.e. 'exposing' D classes in an analogous fashion to Java's JARs), or a DLL (i.e. exposing a C API/interface, implemented in D), when loaded, does not create its own GC but links to that of the process. Easy peasy, because it's axiomatic that the life of any loaded dynamic link-units cannot exceed the life of the host process.

If the process is *not* written in D, here's where it gets interesting. IMO, it should be possible to have the same GC-locating mechanism attach a subsequent D link-unit (whether DDL or DLL). The problem is, how do we keep the first D link-unit's code and data locked in memory after the application has unloaded it?

One answer would be to just increment the first link-unit's ref count (e.g. another call to dlopen()/LoadLibrary()) and then discard it. This would work fine, never crash, but would cause issues in long running servers that need to unload modules to pick up newer versions.

Another answer would be that each GC-dependent link-unit would add such a lock onto the first-one-in, and release it when they're released. There may be issues here, however, since one can get into trouble (un)loading libs during lib (un)loading (albeit that I've never run into this outside C++.NET).

In either case, a linked-to GC would have to expose a method for either locking it, or passing its instance name/handle to be locked.

Naturally, we have the issue of how a to-be-created GC detects and connects to a prior instance, and how this can be made thread-safe. That's a topic for discussion ...

The Dr .....
Jan 26 2005
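The second locking scheme in the post above (the first D link-unit in supplies the GC, every later one adds a lock on it and drops that lock when released) can be sketched as a small piece of bookkeeping. This is only an illustration in current D (D2), not code from the thread or from Phobos: `GcRegistry`, `attach` and `release` are hypothetical names, link-units are represented by plain strings, and a real version would pair each lock with an extra dlopen()/LoadLibrary() reference and would still need the thread-safety that the post leaves open.

```d
/// Hypothetical bookkeeping for "one GC per process".
/// None of these names exist in Phobos; they are assumptions for illustration.
struct GcRegistry
{
    private string provider; // link-unit whose GC everybody shares
    private size_t locks;    // link-units currently relying on that GC

    /// Called when a D link-unit loads; returns the unit whose GC it must use.
    string attach(string unit)
    {
        if (locks == 0)
            provider = unit; // first one in supplies the GC
        ++locks;
        return provider;
    }

    /// Called when a link-unit is released. Returns true once nobody depends
    /// on the provider any more, i.e. its code and data may be unloaded.
    bool release()
    {
        assert(locks > 0, "release() without a matching attach()");
        --locks;
        if (locks > 0)
            return false;
        provider = null;
        return true;
    }
}

unittest
{
    GcRegistry reg;
    assert(reg.attach("first.dll") == "first.dll");  // creates the GC
    assert(reg.attach("second.dll") == "first.dll"); // reuses it
    assert(!reg.release()); // host drops first.dll: still pinned by second.dll
    assert(reg.release());  // last dependent gone: safe to unload
}
```

Building the file with `dmd -unittest -main` runs the checks. The counter is deliberately not tied to which unit calls `release()`: the point of the scheme is that the provider stays resident until the last dependent goes, whoever that turns out to be.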
In article <ct9hki$afi$1 digitaldaemon.com>, Matthew says...
> Here's how I think it might work, roughly:
> <snip>
> Naturally, we have the issue of how a to-be-created GC detects and connects to a prior instance, and how this can be made thread-safe. That's a topic for discussion ...
>
> The Dr .....

I didn't quite follow all of that, but here's something that Pragma suggested many moons ago: if external link-units were managed by the "one & only" GC, it could happily reap them when, and only when, there are no more live references. I know that's a somewhat trivial statement, but it has powerful mojo;
Jan 26 2005
"Kris" <Kris_member pathlink.com> wrote in message news:ct9k70$cv5$1 digitaldaemon.com...
> In article <ct9hki$afi$1 digitaldaemon.com>, Matthew says...
>> Here's how I think it might work, roughly:
>> <snip>
>> The Dr .....
> I didn't quite follow all of that, but here's something that Pragma suggested many moons ago: if external link-units were managed by the "one & only" GC, it could happily reap them when, and only when, there are no more live references. I know that's a somewhat trivial statement, but it has powerful mojo;

I want to be able to write C-API DLLs in D, which I think would be the fly in that mojo.
Jan 26 2005
In article <ct9hki$afi$1 digitaldaemon.com>, Matthew says...
> Here's how I think it might work, roughly:
> <snip>
> The Dr .....

I should point out all the tricky stuff you're talking about just goes away if the GC is simply a shared-lib (DLL or DDL).

May I ask, Matthew: what is your discomfort with shared-libs? I ask because I just don't know why the Microsoft O/S-related version-issue can't be resolved (to an acceptable degree) for one specific instance (the GC).

On a regular basis, I build commercial frameworks that dynamically load/unload mobile code; part of any true solution has to allow for unloading said code, and the mobile code itself should not be bloated out with multiple instances of the GC implementation.

Nor, come to think of it, should it be carrying all the floating-point support that the damned Object.printf() brings in with it.

Both go against Walter's "no code bloat!" mantra and, frankly, are completely unnecessary.

Oh, and before anyone says something about disk-space; the latter is purely about transmission bandwidth & latency.

Still; I'll try not to speculate upon the outcome.

- Kris

I said I'd shutup; I guess I lied :-(
Jan 26 2005
"Kris" <Kris_member pathlink.com> wrote in message news:ct9ltc$es1$1 digitaldaemon.com...
> In article <ct9hki$afi$1 digitaldaemon.com>, Matthew says...
>> Here's how I think it might work, roughly:
>> <snip>
>> The Dr .....
> I should point out all the tricky stuff you're talking about just goes away if the GC is simply a shared-lib (DLL or DDL).

True. But then one has the inordinate hassle of managing a Phobos.DLL for any little utility program. Note: I use the word 'inordinate' there in context. Having to care about Phobos.DLL for a utility would, for me, rule out D as a language for implementing all the small tools and utilities (as in http://synesis.com.au/systools.html) that I write. That's a 100% certainty.

Hence, I agree with Walter, _in this regard_, that Phobos should not have to be dynamically linked.

> May I ask, Matthew: what is your discomfort with shared-libs? I ask because I just don't know why the Microsoft O/S-related version-issue can't be resolved (to an acceptable degree) for one specific instance (the GC).

I don't want to have to write installers for simple programs / plug-in DLLs (e.g. shell extensions; http://shellext.com/).

Note: I *absolutely* want/must have Phobos in a DLL for serious / large-scale work. To not have it as such would as surely rule out D for my consideration for any large scale project.

Naturally, these provide a contradiction, without a straightforward solution. Therefore, I believe that the only viable solution is that D does something smarter than C/C++, and support both. It requires a degree of sophistication in the runtime arbitration between multiple GC creation events, but I hardly think this is an insurmountable problem. I'd be very surprised if this is something that we cannot all work through, or that will confound big-W's programming skill.

In any case, I don't see that there's a (commercially) viable alternative.

> On a regular basis, I build commercial frameworks that dynamically load/unload mobile code; part of any true solution has to allow for unloading said code, and the mobile code itself should not be bloated out with multiple instances of the GC implementation.

Exactly.

> Nor, come to think of it, should it be carrying all the floating-point support that the damned Object.printf() brings in with it.

This is another issue, but one on which I completely agree.

> Both go against Walter's "no code bloat!" mantra and, frankly, are completely unnecessary.

Agreed.

> Oh, and before anyone says something about disk-space; the latter is purely about transmission bandwidth & latency.

Not so. There is a more fundamental objection: coupling should be minimised in all circumstances. I might offer something like "anyone exposed to enough software engineering on a commercial scale will come to believe this" but that'd get me involved in YABA, so I'll just say "it has been my unwavering experience, in a multitude of languages, technologies, application domains, that increasing coupling is bad, and decreasing coupling is good".

If someone were to look at D, then look at Phobos, then look at Object, then see the coupling between Object and printf() and the CRT, and follow their own experiential logical flow to come to the conclusion that "D is crap", I would wish to persuade them otherwise, but I could not fault their reasoning.

> Still; I'll try not to speculate upon the outcome.

I would think that the recent, singular (?), success of our carping and whining in getting a movement from big-W on the dynamic-linking issue offers all kinds of encouragement to those who are yet to hit their D sweet spot.

> - Kris
>
> I said I'd shutup; I guess I lied :-(

It's people like you, Kris, who have the knowledge, the general experience, and, perhaps most importantly, the significant real-world D experience, who provide the meat on the bones of the instinctual (some would say half-cocked) mutterings of the likes of me.

Please don't shut up.

The Dr .....
Jan 26 2005
In article <ct9mvn$g05$1 digitaldaemon.com>, Matthew says...
> I don't want to have to write installers for simple programs / plug-in DLLs (e.g. shell extensions; http://shellext.com/).
>
> Note: I *absolutely* want/must have Phobos in a DLL for serious / large-scale work. To not have it as such would as surely rule out D for my consideration for any large scale project.
>
> Naturally, these provide a contradiction, without a straightforward solution. Therefore, I believe that the only viable solution is that D does something smarter than C/C++, and support both. It requires a degree of sophistication in the runtime arbitration between multiple GC creation events, but I hardly think this is an insurmountable problem. I'd be very surprised if this is something that we cannot all work through, or that will confound big-W's programming skill.
>
> In any case, I don't see that there's a (commercially) viable alternative.

So how about this:

1) the GC lives within a library (as it does now)

2) there is an optional DLL, compiled with the library GC, and there is a separate library shim to bind said DLL instead

3) the developer makes a choice to either (a) link with the library GC directly or (b) link with the DLL GC shim

4) the choice is made by adding the DLL-shim library-name to the dmd command-line, causing the linker to select the DLL GC rather than the default statically-linked GC (the linker will not try to bind the static GC instance if those symbols have already been satisfied)

Does that cover all bases? There are probably several other ways to achieve a similar effect. Note that the default, and simple, behaviour is to statically link the GC ...

- Kris
Jan 26 2005
"Kris" <Kris_member pathlink.com> wrote in message news:ct9s2l$lo5$1 digitaldaemon.com...
> In article <ct9mvn$g05$1 digitaldaemon.com>, Matthew says...
>> <snip>
>> In any case, I don't see that there's a (commercially) viable alternative.
> So how about this:
> <snip>
> Note that the default, and simple, behaviour is to statically link the GC ...

What about an application, written in D and statically linked to a GC, that may or may not load a DDL to get some D classes, depending on its cmd-line params?
Jan 27 2005
In article <ctcbun$v1o$1 digitaldaemon.com>, Matthew says...
> What about an application, written in D and statically linked to a GC, that may or may not load a DDL to get some D classes, depending on its cmd-line params?

Right.

Well, I had assumed that any developer going to the trouble of supporting dynamically loadable modules (providing a 'container') would have statically linked to the DLL version GC, since all those lovely little loadable modules will be using said DLL anyway.

There are certain considerations that apply to containers, particularly those of a dynamic variety. As such, I don't think it's much of a stretch to note that such designs should use the DLL GC instead. In the end, that takes care of all those hideously complex issues you noted prior, in a robust manner, and it's simpler than /consistently/ following all those little details Walter added to the DMD doc :-)

Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.

IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.

That is; I believe containers will be simpler and probably more robust if they avoid trying to do some fancy internal sharing of multiple GC instances. Just going with a single, shared GC instance, managed by the O/S is the better option. That simplicity might hopefully lead to more people writing dynamically loadable code, such as D Servlets. It also makes it easier for others to write alternate GC implementations, without the added complexity of re-implementing and thoroughly testing all that GC-sharing 'stuff'!

That's just my opinion, but it is the manner in which I will personally awaken the two containers currently slumbering within Mango; along with the mobile-code to go with them :-)

Lastly, I should note that this is just for the dynamic 'containment' style of programming (the specific case we're talking about). Other types of programs would link the GC in whatever means was appropriate to them (where static linking of the static-library GC would be the default, static, behavior).

Thoughts, Matthew? And how many times can one legitimately say 'static' in a single sentence?

- Kris

p.s. Pragma is building a container also, so I'd like to get his perspective on this too.
Jan 27 2005
"Kris" <Kris_member pathlink.com> wrote in message news:ctcgoa$13ro$1 digitaldaemon.com...
> Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.
>
> IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.

Most of the time, all you need to do is cut & paste from the examples given. One reason the details are shown is because D is a systems programming language, and knowing the how & the why of the details means one is much more likely to use it successfully. It also enables one to modify it for special purposes.

I also agree that the OS should provide gc services. But I am not in a position to design an OS <g>, so we must work with what we have.
Jan 28 2005
In article <ctd1v8$2239$1 digitaldaemon.com>, Walter says...
> "Kris" <Kris_member pathlink.com> wrote in message news:ctcgoa$13ro$1 digitaldaemon.com...
>> Having said that, Walter has at least provided the bare-bones. I'll utilize that to provide a means of hiding the grubby details, such that both dynamic & static linking of DLLs will be both thoroughly transparent and painless.
>>
>> IMO, this kind of thing should ideally be left to the O/S; not re-invented by the language runtime (someone else had noted this, also). Sometimes one has to sidestep the O/S, but in this case I don't feel the complexity tradeoffs are reasonable.
> Most of the time, all you need to do is cut & paste from the examples given. One reason the details are shown is because D is a systems programming language, and knowing the how & the why of the details means one is much more likely to use it successfully. It also enables one to modify it for special purposes.

It's /great/ that you documented all the details!

> I also agree that the OS should provide gc services. But I am not in a position to design an OS <g>, so we must work with what we have.

We're misunderstanding each other, Walter. But there's nothing unusual about that :-)

Thanks for addressing the issue. Everyone has their own idea of how to skin the proverbial cat, but the end result is typically the same: one dead cat.

Can you perhaps enlighten us on how to construct robust "soft-references"? It appears that the GC disables all threads whilst reaping allocations, which could then lead to deadlock between the GC and a soft-reference manager. Are all threads (except the GC) halted when a destructor is invoked?
Jan 28 2005
"Kris" <Kris_member pathlink.com> wrote in message news:cte22g$97t$1 digitaldaemon.com...
> Can you perhaps enlighten us on how to construct robust "soft-references"?

The way to do it is to construct a pool of those soft references, so the gc won't reap them.

> It appears that the GC disables all threads whilst reaping allocations, which could then lead to deadlock between the GC and a soft-reference manager. Are all threads (except the GC) halted when a destructor is invoked?

All the threads that the gc knows about (via std.thread). If you create a thread directly, not using std.thread, the gc won't stop it, scan it, or know anything about it.
Jan 28 2005
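One possible reading of the pool suggestion above is that the pool itself holds ordinary references to the objects, so the collector always sees them as reachable, and it is the pool (not the GC) that decides when to let an object go. The sketch below assumes that reading and is written in current D (D2), not the 2005 runtime. `SoftPool`, `put`, `get` and `evict` are hypothetical names rather than a Phobos API, and the eviction policy (when to drop entries under memory pressure) is left out entirely.

```d
/// Hypothetical pool, not a Phobos API. It holds ordinary (strong)
/// references, so the collector sees every pooled object as reachable.
final class SoftPool(T) if (is(T == class))
{
    private T[size_t] entries; // ticket -> pooled object
    private size_t nextTicket;

    /// Park an object in the pool and get a ticket for it.
    size_t put(T obj)
    {
        entries[nextTicket] = obj;
        return nextTicket++;
    }

    /// Returns the object, or null if the pool has already let it go.
    T get(size_t ticket)
    {
        auto slot = ticket in entries;
        return slot ? *slot : null;
    }

    /// Drop one entry; only after this may the GC reclaim the object.
    void evict(size_t ticket)
    {
        entries.remove(ticket);
    }
}

unittest
{
    static class Quote
    {
        double price;
        this(double p) { price = p; }
    }

    auto pool = new SoftPool!Quote;
    auto ticket = pool.put(new Quote(42.0));
    assert(pool.get(ticket).price == 42.0); // still pooled, still alive
    pool.evict(ticket);
    assert(pool.get(ticket) is null);       // the pool has let it go
}
```

Callers keep the ticket rather than the object, which is what makes the reference "soft" from their side: every `get` has to cope with a null result. Because eviction is an ordinary call made by the pool's owner and never happens inside a collection, this arrangement avoids having the soft-reference manager run while the GC has the threads stopped.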
"Walter" <newshound digitalmars.com> wrote in message news:ctd1v8$2239$1 digitaldaemon.com..."Kris" <Kris_member pathlink.com> wrote in message news:ctcgoa$13ro$1 digitaldaemon.com...<snip>Having said that, Walter has at least provided the bare-bones. I'llutilize thatto provide a means of hiding the grubby details, such that both dynamic &staticlinking of DLLs will be both thoroughly transparent and painless.Most of the time, all you need to do is cut & paste from the examples given. One reason the details are shown is because D is a systems programming language, and knowing the how & the why of the details means one is much more likely to use it successfully. It also enables one to modify it for special purposes.<snip>Walter - thank you for the DLL/GC addition!! I gotta add my $0.02 on this though.. If the code inside DllMain, MyDLL_Initialize and MyDLL_Terminate can be handled by some sort of boiler-plate wrapper for 8 of 10 uses, I think it would be a /very/ good thing to provide it (while still allowing the developer to use the detailed version). This would be especially true if it would make shared library development more portable between Win and the 'nix's for the majority of cases where the code in MyDLL_Initialize and MyDLL_Terminate can be handled by an import and a few wrapper functions like: import std.gc; import std.slinit; // extern(C) { _minit(), etc... } version (Windows) { HINSTANCE g_hInst; extern (Windows) BOOL DllMain(HINSTANCE hInstance, ULONG ulReason, LPVOID pvReserved) { return SL_DllMain(hInstance,ulReason,g_hInst); } } // version (Windows) export void MySharedLib_Initialize(void* gc) { SL_Init(gc); } export void MySharedLib_Terminate() { SL_Term(); } I think it worth the effort just to minimize the code overhead (and learning curve and clutter) needed for most shared libs. 
But if it also turns out that the standard copy and paste code (of your example) needs to be different between Win and Linux, wrapper functions will make things that much more elegant for portable library development, IMHO. To me, this would coincide with the D philosophy of hiding the messy details for the general case while still providing for their use if needed. - Dave
Jan 28 2005
In article <ctcgoa$13ro$1 digitaldaemon.com>, Kris says...
> p.s. Pragma is building a container also, so I'd like to get his perspective on this too.

Hey, sure thing.

Not to dilute Kris' argument here, but I think that Walter has given us what was needed for GC management between dll's and processes. I haven't thought around all the corners of the problem space yet, but it looks more and more to me that using a separate dll for the GC may actually further complicate things. At first, I didn't think this was so. But the updated model now creates a 1-to-1 mapping between GC's and processes, irrespective of how many dll's are in use. To me, that seems a damn fine solution, if not a step in the right direction.

That aside, the bigger issue is class management across dll boundaries. Most applications do not need to worry about the validity of v-tables and delegates, since the dll is usually freed at program termination (this goes especially for static linking). It is a problem that is not covered by the GC at all, so it requires additional management; hence Kris' notion of "Containers".

For those not familiar with the problem, here's what can easily happen. Say I have an export from a dll that returns an object of class "Foobar". I then free the dll since it's no longer needed. Finally, I attempt to print the contents of Foobar.

    // given: mylibrary represents a dll
    void makeSegFault(){
        Foobar foo = mylibrary.getNewFoobar();
        mylibrary.unload();
        writeln(foo.toString());
    }

This will segfault since the vtable for 'foo' was a part of the dll. Thankfully, the recent GC enhancements allow us to at least keep foo's memory footprint intact, but the methods are history. Also, reloading the dll cannot be reliably used to 'magically' restore that vtable.

This pattern is easier to create than one would think, especially when one is cramming data into generic AA's and references become widely dispersed inside a large system.

For DSP, the solution I'm going to use involves a combination of object-proxies and reference counting of said proxies per dll. A dll reload will not break code, since the proxies can be prodded to re-constitute their dll-bound counterparts. This way, the proxies can be freely referenced throughout the application, save the dll they're interfacing with (feedback would be *bad*).

The only other airtight solution I can think of would be to apply the GC pattern to dll's. This means that a dll is not unloaded until the heap is free from all references into a dll's address space (lazy unloading via garbage collection). Adding a given dll's address space as a root to the GC should cover this. The only drawback here is that it's effectively the same as the present situation, given that you cannot force a dll unload without potentially breaking something; the real advantage of dll's is to load and unload at will.

Aside: does anyone know what happens if you touch a used .class file while a Java app is running? Can Java's ClassLoader be told to unload or reload a class file that's in use? I'm curious since I'd like to know how other platforms have handled this space.

- EricAnderton at yahoo
Jan 28 2005
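A minimal sketch of the per-dll proxy counting pragma describes above, under stated assumptions: `LibraryHandle`, `FoobarProxy`, `tryUnload` and `release` are hypothetical names, no real dll is loaded, and the code is not thread-safe. It only shows the bookkeeping that refuses an unload while proxies are still live; re-constituting proxies after a reload is left out.

```d
// Hypothetical names throughout; no real dll is loaded here.
final class LibraryHandle
{
    private size_t liveProxies;   // proxies currently referring to this dll
    private bool loaded = true;

    bool isLoaded() const { return loaded; }

    // Refuses to unload while any proxy still refers to the dll.
    bool tryUnload()
    {
        if (liveProxies != 0)
            return false;
        loaded = false;           // a real version would call the OS unload here
        return true;
    }
}

// Application code holds proxies, never raw objects created inside the dll.
final class FoobarProxy
{
    private LibraryHandle lib;

    this(LibraryHandle lib)
    {
        assert(lib.isLoaded, "cannot create a proxy for an unloaded dll");
        this.lib = lib;
        ++lib.liveProxies;
    }

    // Explicit release; doing this in a destructor would tie unloading to GC timing.
    void release()
    {
        if (lib !is null)
        {
            --lib.liveProxies;
            lib = null;
        }
    }
}
```

With this shape, the `makeSegFault` pattern turns into a refused unload instead of a crash: the unload only succeeds once every proxy has been released.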
In article <ctdqsl$30sc$1 digitaldaemon.com>, pragma says...
> In article <ctcgoa$13ro$1 digitaldaemon.com>, Kris says...
>> p.s. Pragma is building a container also, so I'd like to get his perspective on this too.
> Hey, sure thing.
> <snip>

Good points, Pragma.

Another thing to consider, regarding the explicit unloading of DLLs, is the 'version' issue. If one replaces an existing instance of some dynamically-loaded module with another, newer version, then the contract between the container and any existing (remote) clients has effectively been broken. I note this because each newer version should be loaded as such; as a distinct and separate instance in addition to any prior version instances. Doing so leads to long-term stability.

The upshot is that such a container would not have a regular need to /explicitly/ drop any particular (and previously loaded) module. Therefore, your approach of using the GC to manage module 'liveness' is rather suitable. Placing the GC within a DLL does not complicate this, as far as I can tell.

There's at least one tricky part there: how to know whether or not each dynamically-loaded module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-(

Perhaps Walter could enlighten us on how to construct robust soft-references?

Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.
Jan 28 2005
In article <cte18s$82o$1 digitaldaemon.com>, Kris says...
> Good points, Pragma. Another thing to consider, regarding the explicit unloading of DLLs, is the 'version' issue. If one replaces an existing instance of some dynamically-loaded module with another, newer version, then the contract between the container and any existing (remote) clients has effectively been broken.

Yep. This is why I've advocated that we all get into the habit of naming our dlls with the version number as a part of the name. It solves the majority of these problems. The other techniques I've proposed in the past may very well be suitable in an application-to-application manner.

Overall, this is an area where sufficient (and justified) pushback from Walter would have us forge an open standard for this kind of thing.

> The upshot is that such a container would not have a regular need to /explicitly/ drop any particular (and previously loaded) module. Therefore, your approach of using the GC to manage module 'liveness' is rather suitable.

I see where you're going with this. Assuming that the only reason for a reload is to grab a newer version, you don't need to unload the old one at all.

> Placing the GC within a DLL does not complicate this, as far as I can tell.

I'm confused. Did you mean "manage the dll with the GC" instead?

> There's at least one tricky part there: how to know whether or not each dynamically-loaded module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-( Perhaps Walter could enlighten us on how to construct robust soft-references?

You're talking about having a soft (weak?) reference to the library in question, correct?

Constructing weak-references in D should be as easy as writing a wrapper class that tells the GC to ignore the weak-pointer's address when checking for roots. Now, checking their validity is tough to solve, since the GC doesn't expose any way to check if a pointer is under its control (sure you could use win32, but it's not portable).

And as for deadlock: what if the call to unload a library is called on the GC's thread via a destructor? Would that fix the problem? I suppose if the dll held some kind of mutex inside of dllmain, that it would cause trouble. But this may come back to "Best Practices" for managing such a mechanism.

> Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.

I'll have to take your word for this. Perhaps you can furnish me with a more concrete example? Unless you're inside of dllMain, there shouldn't be any side effects that I'm aware of.

Also, the MSDN library has a slew of articles on what to do and not to do inside of dllMain. The gist of it all is that you should do the absolute minimum needed inside that routine so as to avoid problems just within win32 itself.

- EricAnderton at yahoo
Jan 28 2005
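A rough sketch of the wrapper pragma has in mind above, with one substitution stated plainly: rather than telling the GC to ignore an address (no such call is assumed here), it stores the address bit-inverted in an integer, so a collector that only follows values that look like heap addresses will not treat it as a reference. `WeakRef` and `unsafeGet` are hypothetical names, not Phobos types. The sketch deliberately leaves the hard part open: nothing here can tell whether the target has already been collected, which is exactly the validity problem raised in the post.

```d
// Hypothetical helper, not a library type.
struct WeakRef(T) if (is(T == class))
{
    // Bit-inverted address, so the stored value does not look like a pointer.
    // size_t.max is the inverted form of null.
    private size_t hidden = size_t.max;

    this(T obj)
    {
        hidden = ~cast(size_t) cast(void*) obj;
    }

    // Only meaningful while something else is known to keep the object alive;
    // after collection the returned reference dangles.
    T unsafeGet()
    {
        return cast(T) cast(void*) ~hidden;
    }
}
```

Usage would look like `auto weak = WeakRef!Object(obj);`, with `weak.unsafeGet()` returning the same reference for as long as a strong reference exists elsewhere.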
In article <cte4c0$c8n$1 digitaldaemon.com>, pragma says...
> In article <cte18s$82o$1 digitaldaemon.com>, Kris says...
>> Placing the GC within a DLL does not complicate this, as far as I can tell.
> I'm confused. Did you mean "manage the dll with the GC" instead?

Ahh; I was just referring to the earlier assertion that placing the GC itself within a separate DLL might actually increase complexity. I don't think it does, but I could be wrong.

>> There's at least one tricky part there: how to know whether or not each dynamically-loaded module is still actually loaded. I think soft-references would alleviate that problem, and there are some ways to do that in D, although there's a subtle danger of deadlock since it appears that the GC halts all other threads when it reaps the heap :-(
> And as for deadlock: what if the call to unload a library is called on the GC's thread via a destructor? Would that fix the problem? I suppose if the dll held some kind of mutex inside of dllmain, that it would cause trouble. But this may come back to "Best Practices" for managing such a mechanism.

Deadlock could occur if (a) the destructor is used to unload the module, (b) all threads are halted whilst the GC runs (and hence during the destructor call), and (c) the mutex protecting the "module is currently loaded" flag is held by one of the stalled threads; one which was 'concurrently' asking for a handle to that specific module. The GC thread would stall on that same mutex.

One way around this would be to utilize a mutex-free queue, to stack up destructor requests for unloading reaped module instances -- thereby decoupling the GC from the aforementioned mutex.

>> Thinking about this brings up another issue to consider; starting a thread from within a DLL will potentially cause the GC, and the process, to fail. Something to be careful of.
> I'll have to take your word for this. Perhaps you can furnish me with a more concrete example?

If one assumes that the GC has a valid reason for halting all threads during a sweep, then any thread it does not know about is a potential threat to stability. Since Phobos (and thus std.Thread) is still linked statically, all DLLs will have their own std.Thread instance, yet will be sharing a single GC. The single GC only knows about one instance of std.Thread, and subsequently can only halt those threads created via that particular instance. Any thread created via a DLL will be noted only within that DLL's std.Thread pool, and thus will not be stalled during a GC sweep. Therein lies trouble :-)

Full resolution is conceptually trivial, but apparently controversial.
Jan 28 2005
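Kris's idea above of decoupling the destructor from the mutex can be sketched roughly as follows. This is a simplification, not a true queue: a fixed-size table of slots claimed with compare-and-swap, using `cas` and `atomicLoad` from `core.atomic` in current D. The names `pendingUnloads`, `requestUnload`, `drainUnloads` and the capacity are assumptions made for illustration, and a handle value of 0 is reserved to mean "empty slot". The destructor side never blocks and never allocates; the actual unload runs later on an ordinary thread that is free to take the "module is loaded" mutex.

```d
import core.atomic : atomicLoad, cas;

enum size_t maxPending = 64;            // hypothetical capacity

// 0 means "empty slot"; any other value is a module handle awaiting unload.
shared size_t[maxPending] pendingUnloads;

// Called from a destructor: records the handle without taking any lock.
// Returns false if the table is full, in which case the request is dropped.
bool requestUnload(size_t handle)
{
    foreach (i; 0 .. maxPending)
    {
        size_t empty = 0;
        if (cas(&pendingUnloads[i], empty, handle))
            return true;
    }
    return false;
}

// Called from a normal thread, outside of any collection: claims each pending
// handle and passes it to `unload`. Returns the number of handles processed.
size_t drainUnloads(scope void delegate(size_t) unload)
{
    size_t count = 0;
    foreach (i; 0 .. maxPending)
    {
        size_t h = atomicLoad(pendingUnloads[i]);
        if (h != 0 && cas(&pendingUnloads[i], h, cast(size_t) 0))
        {
            unload(h);
            ++count;
        }
    }
    return count;
}
```

The design choice is that only `drainUnloads` ever touches the mutex-protected state, so a stalled thread holding that mutex can no longer block the collector.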
On Wed, 26 Jan 2005 14:19:04 -0800, "Walter" <newshound digitalmars.com> wrote:
> I agree. I'm working on it.

Fantastic. Thanks, Walter. I really appreciate how receptive you are to the whims (er...I mean, the intelligent and informed opinions) of the people in this ng.

--Benji
Jan 27 2005