
digitalmars.D - Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector

reply "Atila Neves" <atila.neves gmail.com> writes:
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Jan 08 2014
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 8 January 2014 at 11:35:21 UTC, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Thanks for sharing your experience. It matches my own experience moving enterprise server code from C++ to JVM/.NET land.

What people forget about C++ smart pointers versus the Objective-C/Rust/ParaSail kind is that, without compiler support, you simply spend too much time executing the reference-counting operations.

Over the holidays I spent some time researching the Mesa/Cedar system developed at Xerox PARC. Cedar was already a GC-enabled, strongly typed systems programming language. It is quite remarkable what that system could do as a GUI desktop workstation in the early 80's, and yet here we are in 2014, still fighting to get GC-enabled systems programming languages accepted in the mainstream.

-- Paulo
Jan 08 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
In this file:
https://github.com/atilaneves/mqtt/blob/master/mqttd/factory.d

Instead of code:

    switch(fixedHeader.type) {
    case MqttType.CONNECT:
        return cereal.value!MqttConnect(fixedHeader);
    case MqttType.CONNACK:
        ...

Perhaps you want code as:

    final switch(fixedHeader.type) with (MqttType) {
    case CONNECT:
        return cereal.value!MqttConnect(fixedHeader);
    case CONNACK:
        ...

Or even (modifying the enum):

    final switch(fixedHeader.type) with (MqttType) {
    case connect:
        return cereal.value!MqttConnect(fixedHeader);
    case connack:
        ...

Bye,
bearophile
Jan 08 2014
parent "Atila Neves" <atila.neves gmail.com> writes:
Thanks. I didn't think of using with, possibly because I've never 
used it before. It's one of those cool little features that I 
liked when I read about it but never remember about later.

I didn't use final switch on purpose; I normally would, but I 
didn't implement all the possible MQTT message types. If I ever 
do, it'll definitely be a final switch.

Atila

On Wednesday, 8 January 2014 at 12:35:02 UTC, bearophile wrote:
 Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
In this file: https://github.com/atilaneves/mqtt/blob/master/mqttd/factory.d Instead of code: switch(fixedHeader.type) { case MqttType.CONNECT: return cereal.value!MqttConnect(fixedHeader); case MqttType.CONNACK: Perhaps you want code as: final switch(fixedHeader.type) with (MqttType) { case CONNECT: return cereal.value!MqttConnect(fixedHeader); case CONNACK: ... Or even (modifying the enum): final switch(fixedHeader.type) with (MqttType) { case connect: return cereal.value!MqttConnect(fixedHeader); case connack: ... Bye, bearophile
Jan 08 2014
prev sibling next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit? Bye, bearophile
Jan 08 2014
parent reply "Atila Neves" <atila.neves gmail.com> writes:
I don't know if I have enough rep for it; I'd appreciate it if 
someone who does posts it there.

On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:
 Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit? Bye, bearophile
Jan 08 2014
parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 08.01.2014 19:31, schrieb Atila Neves:
 I don't know if I have enough rep for it, I'd appreciate it if someone
 who does posts it there.

 On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:
 Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit? Bye, bearophile
Done http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/ http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/ -- Paulo
Jan 08 2014
parent reply "Atila Neves" <atila.neves gmail.com> writes:
Thanks. Not many votes though, given all the downvotes. The 
comments manage to be even worse than on my first blog post.

For some reason they all assume I don't know C++ even though I 
know it way better than D, not to mention that they nearly all 
miss the point altogether. Sigh.

On Wednesday, 8 January 2014 at 18:59:45 UTC, Paulo Pinto wrote:
 Am 08.01.2014 19:31, schrieb Atila Neves:
 I don't know if I have enough rep for it, I'd appreciate it if 
 someone
 who does posts it there.

 On Wednesday, 8 January 2014 at 18:24:00 UTC, bearophile wrote:
 Atila Neves:

 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit? Bye, bearophile
Done http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/ http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/ -- Paulo
Jan 08 2014
parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
 Thanks. Not many votes though given all the downvotes. The 
 comments manage to be even worse than on my first blog post.

 For some reason they all assume I don't know C++ even though I 
 know it way better than D, not to mention that they nearly all 
 miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and improve your code, much like others did with the other languages you used.
Jan 09 2014
parent reply "Atila Neves" <atila.neves gmail.com> writes:
On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips wrote:
 On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
 Thanks. Not many votes though given all the downvotes. The 
 comments manage to be even worse than on my first blog post.

 For some reason they all assume I don't know C++ even though I 
 know it way better than D, not to mention that they nearly all 
 miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and improve your code, much like others did with the other languages you used.
I know C++. It's not that I can't finish it, it's that I can't be bothered to. That's the whole point of the post. Atila
Jan 10 2014
parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Friday, 10 January 2014 at 11:43:05 UTC, Atila Neves wrote:
 On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips 
 wrote:
 On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
 Thanks. Not many votes though given all the downvotes. The 
 comments manage to be even worse than on my first blog post.

 For some reason they all assume I don't know C++ even though 
 I know it way better than D, not to mention that they nearly 
 all miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and improve your code, much like others did with the other languages you used.
I know C++. It's not that I can't finish it, it's that I can't be bothered to. That's the whole point of the post. Atila
I know; that doesn't mean someone can't come in and fix what they see wrong with it. C++ programmers have less reason to prove their language, but I think most are in denial that their language is difficult and that this is a problem.
Jan 10 2014
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On 10.01.2014 17:21, Jesse Phillips wrote:
 On Friday, 10 January 2014 at 11:43:05 UTC, Atila Neves wrote:
 On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips wrote:
 On Thursday, 9 January 2014 at 00:37:27 UTC, Atila Neves wrote:
 Thanks. Not many votes though given all the downvotes. The comments
 manage to be even worse than on my first blog post.

 For some reason they all assume I don't know C++ even though I know
 it way better than D, not to mention that they nearly all miss the
 point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and improve your code, much like others did with the other languages you used.
I know C++. It's not that I can't finish it, it's that I can't be bothered to. That's the whole point of the post. Atila
I know; that doesn't mean someone can't come in and fix what they see wrong with it. C++ programmers have less reason to prove their language, but I think most are in denial that their language is difficult and that this is a problem.
It does not help that C and C++ are currently the only portable languages across mainstream OS vendors.

Currently I am using C++ for my Android hobby development, not because I don't like Java, but rather because it is the only common language across all mobile SDKs.

-- Paulo
Jan 10 2014
parent reply "Atila Neves" <atila.neves gmail.com> writes:
 It does not help that C and C++ are currently the only portable 
 languages across mainstream OS vendors.

 Currently I am using C++ for my Android hobby development, not 
 because I don't like Java, rather as it being the only common 
 language across all mobile SDKs.
I feel your pain. If I were to do a cross-platform app I'd probably do the same. At least the Android NDK has new gcc versions to use for C++11. I assume the same is true for iOS. Atila
Jan 10 2014
parent Jacob Carlborg <doob me.com> writes:
On 2014-01-10 19:52, Atila Neves wrote:

 I feel your pain. If I were to do a cross-platform app I'd probably do
 the same. At least the Android NDK has new gcc versions to use for
 C++11. I assume the same is true for iOS.
Yeah, iOS uses LLVM so that means C++11 as well. -- /Jacob Carlborg
Jan 10 2014
prev sibling parent "Atila Neves" <atila.neves gmail.com> writes:
 I wonder if someone who "knows" C++ is going to help you out 
 and improve your code, much like others did with the other 
 languages you used.
I know C++. It's not that I can't finish it, it's that I can't be bothered to. That's the whole point of the post. Atila
I know; that doesn't mean someone can't come in and fix what they see wrong with it. C++ programmers have less reason to prove their language, but I think most are in denial that their language is difficult and that this is a problem.
Ah right, I misunderstood what you meant. The denial is real, and I think the comments on reddit are proof of that. Who knows, maybe I'll do it myself.

The weirdest part of it for me is that my (broken but working) C++ implementation didn't even do badly performance-wise and people still complained.

Atila
Jan 10 2014
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
I have to say, this is also my experience with C++ after I learnt D. Writing C++ is just so painful, so time-consuming, and so unrewarding for the amount of effort you put into it, that I just can't bring myself to write C++ anymore when I have the choice. And manual memory management is a big part of that time sink. Which is why I believe that a lot of the GC-phobia among the C/C++ folk is misplaced.

I can sympathise, though, because coming from a C/C++ background myself, I was highly skeptical of GC'd languages, and didn't find the GC a particularly appealing aspect of D when I first started learning it. But as I learned D, I eventually got used to having the GC around, and discovered that not only did it reduce the number of memory bugs dramatically, it also increased my productivity dramatically. I never realized just how much time and effort it took to write code with manual memory management: you constantly have to think about how exactly you're going to store your objects, who they're going to get passed to, how to decide who's responsible for freeing them, and what's the best strategy for deciding who allocates and who frees. These considerations permeate every aspect of your code, because you need to know whether to pass/return an object* to someone, and whether this pointer implies transfer of ownership or not, since that determines who's responsible for freeing it, etc. Even with C++'s smart pointers, you still have to decide which one to use, and what pitfalls are associated with them (beware of cycles with refcounted pointers, passing auto_ptr to somebody might invalidate it after they return, etc.). It's like income tax: on just about every line of code you write, you have to pay the "memory management tax" of extra mental overhead and time spent fixing pointer bugs in order to not get the IRS (Invalid Reference Segfault :P) knocking on your shell prompt.

Manual memory management is a LOT of effort, and to be quite honest, unless you're writing an AAA 3D game engine, you don't *need* that last 5% performance improvement that manual memory management *might* give you. That is, if you get it right. Which most C/C++ coders don't.

Case in point: recently at work I had the dubious pleasure of encountering some C code with a particularly pathological memory mismanagement bug. To give a bit of context: in the past, this part of the code used to be completely manually managed, with malloc's and free's everywhere. Just like most C code that implements business logic, it worked well while the original people who wrote it maintained it. But life happens, and people leave and new people come, so over time the code degenerated into a sad mess riddled with memory leaks and pointer bugs everywhere. So the team lead finally put his foot down and replaced much of that old code with a ref-counted infrastructure. (This being C, installing a GC was too much work; plus, GC-phobia is pretty strong in these parts.) After all, ref-counting is the silver bullet to cure manual memory management troubles, right? Well...

Fast-forward a couple o' years, and here I am, helping a coworker figure out why the code was crashing. Long story short, we eventually found that it was keeping a ref-counted container that contains two (or more) ref-counted objects, each of which represented an async task spawned by the parent process. The idea behind this code was to run multiple computations on the same data and use the results from whichever finishes first. The remaining task(s) will simply be terminated.

So *somebody*, noting that we had a ref-counted system, decided to take advantage of that fact by setting it up so that when a task finishes, it will destroy the sub-object it's associated with, and the dtor of this object (which will be automatically invoked by the ref-counting system) will then walk the container and destruct every other object, which in turn will terminate their associated tasks. Anybody spot the problem yet?

The reasoning (as far as I can reconstruct it, anyway) goes: "In order for the dtor to destruct the remaining tasks, we just have to decrement the refcount on the container object; since there should only be 1 reference to it, this will cause it to dip to 0, and then the container's dtor will take care of cleaning up all the other tasks. But in order for the task, when it finishes, to trigger the dtor of its associated sub-object, the refcount of the sub-object must be 1, otherwise the dtor won't trigger and we'll get stuck. So either the container's reference to the sub-object shouldn't be counted, or the task's reference to the sub-object shouldn't be counted. ..." And it just goes downhill from there. So much for refcounting solving memory-management woes.

I'm becoming more and more convinced that most coders have no idea how to write manual memory management code properly. Or ref-counted code, for that matter. For all the time and effort it took to implement a ref-counting system in *C*, no less, and the time and effort it took to fix all the bugs associated with it, now somebody conveniently goes and subverts the ref-counting system, and we wonder why the code isn't working? And this isn't even performance-critical code; it's *business logic*, for crying out loud. Sighh...

When I code in D, I discover to my pleasant surprise how much extra time I have (and how much more spare mental capacity I have) now that I don't have to continuously think about memory management. Sure, some of the resulting code may not be squeezing every last drop of juice from my CPU, but 95% of the time it doesn't even matter anyway, 'cos it's not even the performance bottleneck.

One of the symptoms of C/C++ coders (myself included) is that we like to write code in a funny, cramped style that we've convinced ourselves is "optimal code". This includes an insistence on micro-managing memory allocations. However, most of this is premature optimization, which can be readily proved by running a profiler on your program, upon which you discover that *none* of your meticulously-coded, fine-tuned memory management code and carefully written (aka unreadable and unmaintainable) loops is even anywhere *near* the real performance bottleneck, which turns out to be a call to printf() that you forgot to comment out. Or a strlen() whose necessity was forced upon you because C/C++ is still suffering from that age-old mistake of conflating arrays with pointers. (Honestly, the necessity of using strlen() in inconvenient places easily overshadows 99% of the meticulously-crafted optimizations you spent 40 hours to write.)

The amount of headache (and time better spent thinking about more important things, like how to implement an O(n log n) algorithm in place of the current O(n^2) algorithm that will singlehandedly make *all* of your other premature optimizations moot) saved by having a GC is almost priceless. Unless you're writing an AAA 3D game engine. Which only 5% of us coders have the dubious pleasure of working on. :-P

Hooray for GC's, I say.

T

-- 
A tree is held up by its roots, and a man by his friends.
Jan 08 2014
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/8/14 11:15 AM, H. S. Teoh wrote:
 On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
[snip] You may want to paste all that as a reddit comment. Andrei
Jan 08 2014
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 08.01.2014 20:15, schrieb H. S. Teoh:
 On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
[snip]

Thanks very much for sharing your experience.

As I have shared a few times here, it was Oberon which opened my eyes to GC-enabled systems programming languages, around 1996, maybe. After that I was curious to learn about the other descendants of Oberon and Modula-3. Sadly none of them got any uptake outside ETHZ.

While researching for my Oberon article, I discovered the Cedar programming language, developed at Xerox PARC as part of their Mesa system: a strongly typed systems programming language with GC, as well as manual memory management, modules and functional programming features, done in 1981.

My initial thought was: what would today's systems look like if Xerox had had better connections to the outside world instead of AT&T?

-- Paulo
Jan 08 2014
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/01/14 21:22, Paulo Pinto wrote:
 As I shared a few times here, it was Oberon which opened my eyes
 to GC enabled systems programming languages, around 1996, maybe.
What was the GC design for Oberon, and how does that relate to what's in D (and what's in other GC'd languages)?
Jan 09 2014
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 23:02:57 UTC, Joseph Rushton 
Wakeling wrote:
 On 08/01/14 21:22, Paulo Pinto wrote:
 As I shared a few times here, it was Oberon which opened my 
 eyes
 to GC enabled systems programming languages, around 1996, 
 maybe.
What was the GC design for Oberon, and how does that relate to what's in D (and what's in other GC'd languages)?
The original Oberon used a simple mark-and-sweep collector, initially implemented in Assembly. In later versions it was coded in Oberon itself.

Original 1992/2005 edition:
http://www.inf.ethz.ch/personal/wirth/ProjectOberon1992.pdf

2013 edition with images of the workstations where Oberon ran:
http://www.inf.ethz.ch/personal/wirth/ProjectOberon/PO.System.pdf

EthOS used a mark-and-sweep GC with support for weak pointers and finalization, running when the system was idle or when not enough memory was available.
http://research.microsoft.com/en-us/um/people/cszypers/books/insight-ethos.pdf

The Active Oberon implementation used a mark-and-sweep collector with finalization support.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.5753&rep=rep1&type=pdf

Modula-3 used a compacting GC initially, with an optional background one.
https://modula3.elegosoft.com/cm3/doc/help/bib.html
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.6890

Cedar used a concurrent reference-counting collector, coupled with a mark-and-sweep one for cycle removal, with finalization support.
http://www.textfiles.com/bitsavers/pdf/xerox/parc/techReports/CSL-84-7_On_Adding_Garbage_Collection_and_Runtime_Types_to_a_Strongly-Typed_Statically-Checked_Concurrent_Language.pdf

The features are quite similar to D:

- GC
- Allocation of data structures statically, in global memory and on the stack
- Escape hatches to allocate memory manually when needed

I cannot say whether they also allow interior pointers like D does.

However, the main point about Oberon and the other languages wasn't only technical, but human. Funnily enough, that is also the subject of Andrew Koenig's latest post:
http://www.drdobbs.com/cpp/social-processes-and-the-design-of-progr/240165221

The people designing such systems believed that it was possible to write a workstation operating system from the ground up in a GC-enabled systems programming language, with minimal Assembly. They did succeed, and built workstations that were usable for normal office work, which were then used at ETHZ, Xerox and Olivetti for some time. For games, some more effort would have been required, I do acknowledge that.

However, the world at large ignored these efforts. As Andrew nicely puts it in his article, many times the social barrier is higher than the technical one. For many developers, merely hearing the words GC, safe coding or bounds checking is enough to make them run away as fast as they can.

-- Paulo
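(For illustration, a minimal D sketch of the three allocation styles listed above: GC allocation, static and stack allocation, and a manual escape hatch. The names are invented for the example.)

    import core.stdc.stdlib : malloc, free;

    struct Point { double x, y; }

    __gshared Point origin;                    // static allocation in global memory

    void example()
    {
        Point local;                           // stack allocation, never touches the GC

        auto gcPoints = new Point[](100);      // GC-managed allocation

        auto raw = cast(Point*) malloc(Point.sizeof * 100);  // manual escape hatch
        scope(exit) free(raw);                 // explicitly released, no GC involvement
    }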
Jan 10 2014
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 08.01.2014 20:15, schrieb H. S. Teoh:
 Manual memory management is a LOT of effort, and to be quite honest,
 unless you're writing an AAA 3D game engine, you don't *need* that last
 5% performance improvement that manual memory management *might* gives
 you. That is, if you get it right. Which most C/C++ coders don't.
The problem is that with the current D GC it's not 5%. It's 300%. See:
http://3d.benjamin-thaut.de/?p=20

And people who are currently using C++ use C++ for a reason. And usually this reason is performance. As long as D stays with its current GC, people will refuse to switch, given the 300% speed impact.

Additionally, programming with a GC often leads to a lot more allocations, with programmers being unaware of all those allocations and of the possibility that those allocations slow down the program and might even trash the cache. Programmers who properly learned manual memory management are often more aware of what's happening in the background and how to optimize algorithms for memory usage, which can lead to astonishing performance improvements on modern hardware.

Also, a GC is for automatic memory management. But memory is just a resource, and there are a lot of other resources besides memory. Having a GC does not free you from doing other manual resource management, which can still be annoying and can create the exact same issues as with memory; a codebase where almost everything implements the IDisposable interface doesn't really improve the situation. It would be a lot better if GCs focused on automatic resource management in general, so the user is freed from all such tedious tasks, and not just a portion of them.

Additionally, switching away from C++ is also not an option for other reasons, for example cross-platform compatibility. I don't know any language other than C/C++ which would actually work on all platforms we (my team at work) currently develop for. Not even D (mostly because of missing ports of druntime / phobos, maybe even a missing hardware architecture).

But I fully agree that if you do some non-performance-critical business logic or application logic, it's a lot more productive to use a garbage collected language, though most likely another one rather than D here, mostly because of better tooling and more mature libraries.

Kind Regards
Benjamin Thaut
Jan 08 2014
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 08, 2014 at 09:23:48PM +0100, Benjamin Thaut wrote:
 Am 08.01.2014 20:15, schrieb H. S. Teoh:
Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that
last 5% performance improvement that manual memory management *might*
gives you. That is, if you get it right. Which most C/C++ coders
don't.
The problem is, that with the current D-GC its not 5%. Its 300%. See: http://3d.benjamin-thaut.de/?p=20
Well, your experience was based on writing a 3D game engine. :) I didn't claim that GCs are best for that scenario. How many of us write 3D game engines for a living?
 And people who are currently using C++ use C++ for a reason. And
 usually this reason is performance. As long as D remains with its
 current GC people will refuse to switch, given the 300% speed
 impact.
I think your view is skewed by your bad experience with doing 3D in D. I've ported (well, more like re-written) compute-intensive code from C/C++ to D before, and my experience has been that the D version is either on par, or performs even better. Definitely nowhere near the 300% slowdown you quote. (Not to mention the >50% reduction in development time compared with writing it in C/C++!)

Like I said, if you're doing something that *needs* to squeeze every last bit of performance out of the machine, then the GC may not be for you.

In fact, from what I hear, most people doing 3D engine work don't even *use* memory allocation in the core engine -- everything is preallocated so no allocation / free (not even malloc/free) is done at all. You never know if a particular system's malloc/free relies on linear free lists, which may cause O(n) worst-case performance -- something you definitely want to avoid if you have only 20ms to render the next frame. If so, then it's no wonder you see a 300% slowdown if you start using the GC inside the 3D engine.
 Additionaly programming with a GC often leads to a lot more
 allocations, and programmers beeing unaware of all those allocations
 and the possibility that those allocations slow down the program and
 might even trash the cache. Programmers who properly learned manual
 memory management are often more aware of whats happening in the
 background and how to optmize algorithms for memory usage, which can
 lead to astonishing performance improvements on modern hardware.
But the same programmers who don't know how to allocate properly in a GC'd language will also write poorly-performing malloc/free code. Freeing the root of a large tree structure can potentially run with no fixed upper bound on time if the dtor recursively frees all child nodes, so it's not that much better than a GC collection cycle. People who know to avoid doing that will also know to write GC'd code in a way that doesn't cause bad GC performance.
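(A tiny D sketch of the tree-freeing pattern mentioned above, using the C heap: releasing the root visits every node, so the pause grows with the size of the tree, much like a collection cycle. The Node type is made up for the example.)

    import core.stdc.stdlib : malloc, free;

    struct Node
    {
        Node* left;
        Node* right;
    }

    // Recursively frees the whole subtree: O(number of nodes),
    // with no fixed upper bound on how long a single "free the root" takes.
    void freeTree(Node* n)
    {
        if (n is null) return;
        freeTree(n.left);
        freeTree(n.right);
        free(n);
    }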
 Also a GC is for automatic memory management. But memory is just a
 resource. And there are a lot other resources then just memory.
 Having a GC does not free you from doing other manual memory
 management, which still can be annoying and can create the exact

 codebase where almost everything implementes the IDisposeable
 interface doesn't really improve the situation. It would be a lot
 better if GCs would focus on automatic resource management in
 general, so the user is freed of all such tedious tasks, and not
 just a portion of it.
True, but having a GC for memory is still better than having nothing at all. Memory, after all, is the most commonly used resource, generically speaking.
 Additionaly switching away from C++ is also not a option because of
 other reasons. For example cross plattform compatibility. I don't
 know any language other then C/C++ which would actually work on all
 plattforms we (my team at work) currently develop for. Not even D
 (mostly because of missing ports of druntime / phobos. Maybe even a
 missing hardware architecture.)
That doesn't alleviate the painfulness of coding in C++.
 But I fully agree, that if you do some non performance critical
 business logic or application logic its a lot more productive to use a
 garbage collected language.
If you're doing performance-critical / realtime stuff, you probably want to be very careful about how you use malloc/free anyway, same goes for GC's.

 mostly because of better tooling and more mature libraries.
I find that the lack of strong metaprogramming capabilities in Java usually means either lots of duplicated code, or adding too many indirections, which hurts performance. For compute-intensive code, too many indirections can mean the difference between something finishing in 2 days instead of 2 hours.

T

-- 
Computers are like a jungle: they have monitor lizards, rams, mice, c-moss, binary trees... and bugs.
Jan 08 2014
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 08.01.2014 21:57, schrieb H. S. Teoh:
 On Wed, Jan 08, 2014 at 09:23:48PM +0100, Benjamin Thaut wrote:

 Well, your experience was based on writing a 3D game engine. :) I didn't
 claim that GCs are best for that scenario. How many of us write 3D game
 engines for a living?
No, this experience is not only based on that. I have observed multiple discussions on the newsgroup where turning off the GC would speed up the program by a factor of 3. The most recent one was parsing a text file and filling an associative array with the contents of that text file, which is not really 3D programming.

What I'm really trying to say is: I would be willing to use a GC in D too, but only if D actually had a state-of-the-art GC and not some primitive, old, "works without language support" GC.
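(A minimal D sketch of the kind of workload mentioned above, filling an associative array from a text file, with collections suspended for the duration of the parse via core.memory.GC.disable. The file name and the exact parsing are made up for the example.)

    import std.stdio : File;
    import std.array : split;
    import core.memory : GC;

    void main()
    {
        size_t[string] counts;

        GC.disable();              // no collection cycles while we parse
        scope(exit) GC.enable();

        foreach (line; File("words.txt").byLine)
            foreach (word; line.split)
                ++counts[word.idup];   // idup: key must be an immutable copy of the buffer
    }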
 In fact, from what I hear, most people doing 3D engine work don't even
 *use* memory allocation in the core engine -- everything is preallocated
 so no allocation / free (not even malloc/free) is done at all. You never
 know if a particular system's malloc/free relies on linear free lists,
 which may cause O(n) worst-case performance -- something you definitely
 want to avoid if you have only 20ms to render the next frame. If so,
 then it's no wonder you see a 300% slowdown if you start using the GC
 inside of the 3D engine.
That is a common misconception you can read very often on the internet. That doesn't make it true, however. I have seen lots of game and engine code in my life already, and it's far from preallocating everything. The aim is to keep allocations to a minimum, but they are not avoided at all costs. If they are necessary they are just done (for example when spawning a new object, like a particle effect). It is even common to use scripting languages like Lua for some tasks in game development, and Lua allocates quite a lot during execution.
 But the same programmers who don't know how to allocate properly on a
 GC'd language will also write poorly-performing malloc/free code.
 Freeing the root of a large tree structure can potentially run with no
 fixed upper bound on time if the dtor recursively frees all child nodes,
 so it's not that much better than a GC collection cycle. People who know
 to avoid doing that will also know to write GC'd code in a way that
 doesn't cause bad GC performance.
That is another common argument of pro-GC people which I have never seen in practice yet. Meaning, I have never seen a case where freeing a tree of objects caused a significant enough slowdown. I have, however, seen lots of cases where a garbage collection caused a significant slowdown.
 True, but having a GC for memory is still better than having nothing at
 all. Memory, after all, is the most commonly used resource, generically
 speaking.
Still it only solves half the problem.
 Additionaly switching away from C++ is also not a option because of
 other reasons. For example cross plattform compatibility. I don't
 know any language other then C/C++ which would actually work on all
 plattforms we (my team at work) currently develop for. Not even D
 (mostly because of missing ports of druntime / phobos. Maybe even a
 missing hardware architecture.)
That doesn't alleviate the painfulness of coding in C++.
It was never intended to. I just wanted to make the point that even if you want to, you can't avoid C++.
 But I fully agree, that if you do some non performance critical
 business logic or application logic its a lot more productive to use a
 garbage collected language.
If you're doing performance-critical / realtime stuff, you probably want to be very careful about how you use malloc/free anyway, same goes for GC's.
This statement has again been posted hundreds of times in the GC vs manual memory management discussion. And again, I have never seen the execution time of malloc or other self-written allocators be a problem in practice. I did, however, see the runtime of a GC allocation become a problem, to the point where it is avoided entirely.

With realtime I didn't really mean the "hard" realtime requirements of embedded systems and the like, more the "soft" realtime requirements where you want to avoid pause times as much as possible.
 I find the lack of strong metaprogramming capabilities in Java (never

 lots of duplicated code, or adding too many indirections that hurts
 performance.  For compute-intensive code, too many indirections can mean
 the difference between something finishing in 2 days instead of 2 hours.
I fully agree here. Still, when choosing a programming language you also have to pick one that all programmers on the team can and want to use. I fear that D's metaprogramming capabilities will scare off quite a few programmers because they seem too complicated to them. (It's really the same with C++ metaprogramming: it's syntactically ugly and verbose, but gets the job done, and is not so complicated if you are familiar with the most important concepts.)
Jan 08 2014
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 08/01/14 23:23, Benjamin Thaut wrote:
 No, this expierence is not only based of this. I observed multiple discussions
 on the newsgroup, where turning off the GC would speed up the program by factor
 3.
In my experience it seems to depend very much on the particular problem being solved and the circumstances in which memory is being allocated.

Example: I have some code where, at least in the source, dynamic arrays are being created via "new" in a (fairly) inner loop, and this can be run repeatedly apparently without the GC being triggered -- in fact, my suspicion is that the allocated space is just being repeatedly re-used and overwritten, so there are no new allocs or frees.

OTOH some other code I wrote recently had a situation where, as the data structure in question expanded, a new array was allocated, and an old one copied and then deallocated. This was fine up to a certain scale but above a certain size the GC would (often but not always) kick in, leading to a significant (but unpredictable) slowdown.

My impression was that below a certain level the GC is happy to either over-allocate (leaving lots of space for expansion) and/or avoid freeing memory (because there's plenty of memory still free), which avoids all the slowdown of alloc/free until there's a significant need for it.
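(A small D sketch of the growth pattern described above, together with the usual mitigation: reserving capacity up front so the array does not keep being reallocated and copied as it grows. The element count is arbitrary.)

    void growNaively()
    {
        int[] data;
        foreach (i; 0 .. 1_000_000)
            data ~= i;                 // may reallocate and copy as the array outgrows its block
    }

    void growWithReserve()
    {
        int[] data;
        data.reserve(1_000_000);       // one big allocation up front
        foreach (i; 0 .. 1_000_000)
            data ~= i;                 // appends into the reserved block, far less GC churn
    }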
Jan 08 2014
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 08, 2014 at 11:43:26PM +0100, Joseph Rushton Wakeling wrote:
 On 08/01/14 23:23, Benjamin Thaut wrote:
No, this expierence is not only based of this. I observed multiple
discussions on the newsgroup, where turning off the GC would speed up
the program by factor 3.
In my experience it seems to depend very much on the particular problem being solved and the circumstances in which memory is being allocated. Example: I have some code where, at least in the source, dynamic arrays are being created via "new" in a (fairly) inner loop, and this can be run repeatedly apparently without the GC being triggered -- in fact, my suspicion is that the allocated space is just being repeatedly re-used and overwritten, so there are no new allocs or frees. OTOH some other code I wrote recently had a situation where, as the data structure in question expanded, a new array was allocated, and an old one copied and then deallocated. This was fine up to a certain scale but above a certain size the GC would (often but not always) kick in, leading to a significant (but unpredictable) slowdown. My impression was that below a certain level the GC is happy to either over-allocate (leaving lots of space for expansion) and/or avoid freeing memory (because there's plenty of memory still free), which avoids all the slowdown of alloc/free until there's a significant need for it.
So this proves that the real situation with GC vs manual memory management isn't as simple as a binary "GC is better" or "GC is bad". It depends a lot on the exact use case.

And now that you mention it, there does seem to be some kind of threshold where something happens (I wasn't sure what it was before, but now I'm thinking maybe it's a change in GC behaviour) that causes a sudden change in program performance, which I've observed recently in one of my programs. I might have a look into it sometime -- though I was planning to redo that part of the code anyway, so I may or may not find out the real reason behind this.

T

-- 
The government pretends to pay us a salary, and we pretend to work.
Jan 08 2014
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 08, 2014 at 11:23:50PM +0100, Benjamin Thaut wrote:
 Am 08.01.2014 21:57, schrieb H. S. Teoh:
[...]
I find the lack of strong metaprogramming capabilities in Java (never

lots of duplicated code, or adding too many indirections that hurts
performance.  For compute-intensive code, too many indirections can
mean the difference between something finishing in 2 days instead of
2 hours.
I fully agree here. Still when choosing a programming language you also have to pick one that all programmers on the team can and want to use. I fear that the D metaprogramming capabilities will scare of quite a few programmers because it seems to complicated to them. (Its really the same with C++ metaprogramming. Its syntactically ugly and verbose, but gets the job done, and is not so complicated if you are familiar with the most important concepts).
Coming from a C++ background, I have to say that C++ metaprogramming, while possible, is only so in the most painful possible ways. My impression is that C++ gave template metaprogramming a bad name, because much of the metaprogramming aspects of templates were only discovered after the fact, so the original design was never intended to be used in the way it's used nowadays. As a result, people associate the design flaws in C++ templates with template programming and metaprogramming in general, whereas such flaws aren't an inherent feature of metaprogramming itself.

Unfortunately, this makes people go "ewww" when they hear about D's metaprogramming, whereas the real situation is that metaprogramming is actually a pleasant experience in D, and very powerful if you know how to take advantage of it.

One thing I really liked about TDPL is that Andrei sneakily introduces metaprogramming as "compile-time parameters" early on, so that by the time you get to the actual chapter on templates, you've already been using them comfortably for a long time, and no longer have an irrational fear of them.

T

-- 
Without geometry, life would be pointless. -- VS
Jan 08 2014
prev sibling parent "Atila Neves" <atila.neves gmail.com> writes:
 No, this expierence is not only based of this. I observed 
 multiple discussions on the newsgroup, where turning off the GC 
 would speed up the program by factor 3. The most recent one was
The GC doesn't even show up in the profiler for this/my use case. The one optimisation I did to avoid allocations increased performance by all of 5%. It really depends on the use case, and I don't think assuming a factor of 3 is advisable.
 That is another common argument of pro GC people I have never 
 seen in partice yet. Meaning, I never seen a case where freeing 
 a tree of objects would cause a significant enough slowdown. I 
 however saw lots of cases where a garbage collection caused a 
 significant slowdown.
Well, if I wasn't aware of allocation I wouldn't have done the optimisation mentioned above, so it's a good point. As far as slowdown happening with manual memory management, in certain cases cleaning up reference counted smart pointers can cause as much of a slowdown as a GC kicking in. This isn't my opinion though, there are data to that effect. Again, it depends on the use case.
 Still it only solves half the problem.
Maybe in Java. In D at least we have struct destructors for other resources.
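(A minimal D sketch of the point above: a struct destructor releasing a non-memory resource deterministically when the value goes out of scope, independently of the GC. The CFile wrapper is made up for the example; std.stdio.File does essentially this in Phobos.)

    import core.stdc.stdio : FILE, fopen, fclose;

    struct CFile
    {
        FILE* fp;

        this(const(char)* path, const(char)* mode) { fp = fopen(path, mode); }

        ~this()                        // runs when the struct goes out of scope
        {
            if (fp !is null) fclose(fp);
        }

        @disable this(this);           // no copies, so the handle is closed exactly once
    }

    void useIt()
    {
        auto f = CFile("data.txt", "r");
        // ... read through f.fp ...
    }   // f's destructor closes the file here; the GC plays no part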
 It was never intended to. I just wanted to make the point, that 
 even if you want, you can't avoid C++.
A fair point. I think what we're saying is not that we won't ever write C++ again, but that we won't write it again if given the choice and if another language (not necessarily D) is also a good fit. I'd be surprised if I wasn't still writing / refactoring / debugging C++ code a few decades from now. I don't want to write C again ever, but I know I'll have to.
 I fully agree here. Still when choosing a programming language 
 you also have to pick one that all programmers on the team can 
 and want to use. I fear that the D metaprogramming capabilities 
 will scare of quite a few programmers because it seems to 
 complicated to them. (Its really the same with C++ 
 metaprogramming. Its syntactically ugly and verbose, but gets 
 the job done, and is not so complicated if you are familiar 
 with the most important concepts).
I disagree wholeheartedly. It's a _lot_ more complicated in C++. D can also do more than C++, with far saner syntax.
Jan 08 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/8/2014 12:23 PM, Benjamin Thaut wrote:
 Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations, because you do not have to make extra copies just so it's clear who owns the allocations. For example, if you've got an array of char* pointers, in D some can be GC allocated, some can be malloc'd, some can be slices, some can be pointers to string literals. In C/C++, the array has to decide on an ownership policy, and all elements must conform. This means extra copies.
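(A small D sketch of the mixed-ownership array described above: a literal, a GC-allocated string, a slice and a chunk of C-allocated memory can all sit in the same array, because no element needs to carry an ownership policy. The contents are made up for the example.)

    import core.stdc.stdlib : malloc;
    import core.stdc.string : memcpy;

    void example()
    {
        string literal = "hello";             // points into the string literal segment
        string gcOwned = "hello".idup;        // GC-allocated copy
        string slice   = literal[1 .. 4];     // slice of existing data, no copy

        auto p = cast(char*) malloc(5);       // C-heap allocated...
        memcpy(p, "world".ptr, 5);
        string cOwned = cast(immutable) p[0 .. 5];  // ...viewed as an immutable slice

        // No ownership policy, and no extra copies, needed to mix them;
        // only the malloc'd block would ever need an explicit free.
        string[] all = [literal, gcOwned, slice, cOwned];
    }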
Jan 08 2014
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 9 January 2014 13:08, Walter Bright <newshound2 digitalmars.com> wrote:

 On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

 Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations, because you do not have to make extra copies just so it's clear who owns the allocations.
You're making a keen assumption here that C programmers use STL. And no sane programmer that I've ever worked with uses STL, precisely for this reason :P

Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a lot of work (see: time and money), so there is definitely value in factoring that problem away, but the existing GC is broken. Until it doesn't leak or stop the world, and/or can run incrementally, it remains no good for realtime usage. There were 2 presentations on improved GC's last year; why do we still have the lamest GC imaginable? I'm yet to hear any proposal on how this situation will ever significantly improve... *cough* ARC...

 For example, if you've got an array of char* pointers, in D some can be GC
 allocated, some can be malloc'd, some can be slices, some can be pointers
 to string literals. In C/C++, the array has to decide on an ownership
 policy, and all elements must conform.

 This means extra copies.
Jan 08 2014
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 06:11:58 UTC, Manu wrote:
 On 9 January 2014 13:08, Walter Bright 
 <newshound2 digitalmars.com> wrote:

 On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

 Additionaly programming with a GC often leads to a lot more 
 allocations,
I believe that this is incorrect. Using GC leads to fewer allocations, because you do not have to make extra copies just so it's clear who owns the allocations.
You're making a keen assumption here that C programmers use STL. And no sane programmer that I've ever worked with uses STL precisely for this reason :P Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a lot of work (see: time and money), so there is definitely value in factoring that problem away, but the existing GC is broken. Until it doesn't leak, stop the world, and/or can run incrementally, it remains no good for realtime usage. There were 2 presentations on improved GC's last year, why do we still have the lamest GC imaginable? I'm still yet to hear any proposal on how this situation will ever significantly improve... *cough* ARC...
For it to be done properly, RC needs to be compiler assisted, otherwise it is just too slow. -- Paulo
Jan 08 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/8/2014 10:11 PM, Manu wrote:
 On 9 January 2014 13:08, Walter Bright <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:

     On 1/8/2014 12:23 PM, Benjamin Thaut wrote:

         Additionaly programming with a GC often leads to a lot more
allocations,


     I believe that this is incorrect. Using GC leads to fewer allocations,
     because you do not have to make extra copies just so it's clear who owns
the
     allocations.


 You're making a keen assumption here that C programmers use STL.
My observation has nothing to do with the STL, nor does it have anything to do with how well the GC is implemented. Also, neither smart pointers nor ARC resolve the excessive copying problem as I described it.

I've been coding in C for 15-20 years before the STL, and the problem of excessive copying is a significant source of slowdown for C code. Consider this C code:

    char* cat(char* s1, char* s2) {
        size_t len1 = s1 ? strlen(s1) : 0;
        size_t len2 = s2 ? strlen(s2) : 0;
        char* s = (char*)malloc(len1 + len2 + 1);
        assert(s);
        memcpy(s, s1, len1);
        memcpy(s + len1, s2, len2);
        s[len1 + len2] = 0;
        return s;
    }

Now consider D code:

    string cat(string s1, string s2) {
        return s1 ~ s2;
    }

I can call cat with:

    cat("hello", null);

and it works without copying in D, it just returns s1. In C, I gotta copy, ALWAYS. (C's strings being 0 terminated also forces much extra copying, but that's another topic.)

The point is, no matter how slow the GC is relative to malloc, not allocating is faster than allocating, and a GC can greatly reduce the amount of alloc/copy going on.

The reason that Java does excessive amounts of allocation is because Java doesn't have value types, not because Java has a GC.
Jan 08 2014
next sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
 On 1/8/2014 10:11 PM, Manu wrote:
 On 9 January 2014 13:08, Walter Bright 
 <newshound2 digitalmars.com
 <mailto:newshound2 digitalmars.com>> wrote:
The reason that Java does excessive amounts of allocation is because Java doesn't have value types, not because Java has a GC.
That might change if IBM's extensions ever land in Java.

http://www.slideshare.net/rsciampacone/javaone-2013-introduction-to-packedobjects

Video presentation available here:
http://www.parleys.com/play/52504e5ee4b0a43ac121240b

Walter is right regarding D. All the other GC-enabled systems programming languages do have value objects and don't require everything to be on the heap, so the stress on the GC to clean memory is not as high as in Java and similar systems.

-- Paulo
Jan 09 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
 and it works without copying in D, it just returns s1. In C, I 
 gotta copy, ALWAYS.
Only if you write libraries; in an application you can set your own policies (invariants).
 (C's strings being 0 terminated also forces much extra copying, 
 but that's another topic.)
Not if you have your own allocator and split chopped strings (you can just overwrite the boundary character).
 The point is, no matter how slow the GC is relative to malloc, 
 not allocating is faster than allocating, and a GC can greatly 
 reduce the amount of alloc/copy going on.
But since malloc/free is tedious, C programmers tend to avoid it by embedding objects in large structs and putting a variable-sized object at the end of them... or by having their own pool (possibly on the stack at the location where it should be released).
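(A D sketch, for illustration, of the first habit mentioned above: a header struct with a variable-sized payload placed at the end of the same allocation, so a single malloc and a single free cover both. The Message type is invented for the example.)

    import core.stdc.stdlib : malloc, free;

    struct Message
    {
        int    id;
        size_t len;
        // `len` bytes of payload follow this header in the same block
    }

    Message* makeMessage(int id, const(ubyte)[] payload)
    {
        auto m = cast(Message*) malloc(Message.sizeof + payload.length);
        m.id  = id;
        m.len = payload.length;
        (cast(ubyte*)(m + 1))[0 .. payload.length] = payload[];
        return m;   // one free(m) later releases header and payload together
    }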
Jan 09 2014
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 08:40:30 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright 
 wrote:
 and it works without copying in D, it just returns s1. In C, I 
 gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your own policies (invariants).
 (C's strings being 0 terminated also forces much extra 
 copying, but that's another topic.)
Not if you have your own allocator and split chopped strings (you can just overwrite the boundary character).
 The point is, no matter how slow the GC is relative to malloc, 
 not allocating is faster than allocating, and a GC can greatly 
 reduce the amount of alloc/copy going on.
But since malloc/free is tedious c-programmers tend to avoid it by embedding objects in large structs and put a variable sized object at the end of it... Or have their own pool (possibly on the stack at the location where it should be released).
I have only seen those things work in small AAA class teams.
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:
 I have only seen those things work in small AAA class teams.
But you have probably seen C programs allocate a bunch of different small structs with a single malloc where it is known that they will be freed in the same location? A compiler needs whole-program analysis to do the same. So yes, C programs will have fewer allocs if the programmer cares.
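(A small D sketch of the pattern described above: several small structs carved out of one malloc'd block and released with a single free. The types and layout are made up for the example; the largest-aligned struct is placed first so everything stays properly aligned.)

    import core.stdc.stdlib : malloc, free;

    struct Body   { double x, y, z; }
    struct Header { int id; }
    struct Footer { uint checksum; }

    void example()
    {
        // One allocation for all three objects...
        enum total = Body.sizeof + Header.sizeof + Footer.sizeof;
        void* block = malloc(total);

        auto b = cast(Body*)   block;
        auto h = cast(Header*) (cast(ubyte*) block + Body.sizeof);
        auto f = cast(Footer*) (cast(ubyte*) block + Body.sizeof + Header.sizeof);

        // ... use b, h and f ...

        free(block);   // ...and one free releases them all together
    }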
Jan 09 2014
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:
 I have only seen those things work in small AAA class teams.
But you have probably seen c programs allocate a bunch of different small structs with a single malloc where it is known that they will be freed in the same location? A compiler needs whole program analysis to do the same. So yes, c programs will have fewer allocs if the programmer cared.
Yes, I did. Not much different from memory pools in Turbo Pascal and Objective-C, for that matter. And even stranger things, where the whole memory gets allocated at start, then some "handles" are used with mysterious macros to convert back and forth to real pointers.

I have also seen lots of other storage tricks that easily go out of control when the team either grows over a certain size, or management decides to outsource part of the development or to lower the expected skill set of new team members.

Then you watch the older guys playing fire brigade to track down issues of release X.Y.Z at a customer site, almost every week.

-- Paulo
Jan 09 2014
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 09:49:15AM +0000, Paulo Pinto wrote:
 On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grstad
 wrote:
On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:
I have only seen those things work in small AAA class teams.
But you have probably seen c programs allocate a bunch of different small structs with a single malloc where it is known that they will be freed in the same location? A compiler needs whole program analysis to do the same. So yes, c programs will have fewer allocs if the programmer cared.
Yes, I did. Not much different than memory pools in Turbo Pascal and Objective-C for that matter. And even more strange things, where the whole memory gets allocated at start, then some "handles" are used with mysterious macros to convert back and forth to real pointers. I have also seen lots of other storage tricks that go easily out of control when the team either grows over a certain size, or management decides to outsource part of the development or lowering the expected skill set of new team members. Then you watch the older guys playing fire brigade to track down issues of release X.Y.Z at customer site, almost every week.
[...]

Exactly!! All these tricks are "possible" in C, but that's what they essentially are: tricks, hacks around the language. You can only keep it up with a small, dedicated core team. As soon as the PTBs decide to hire new grads and move people around, you're screwed, 'cos the old guy who was in charge of the tricky macros is no longer on the team, nobody else understands how the macros work, and the new guys are under pressure to show contribution, so they barge in making assumptions about how things work -- which usually means naive C semantics, lots of strcpy's, direct pointer arithmetic, and "I don't use these weird macros 'cos I don't understand what they do". Result: fire brigade. :-)

This is why compiler-enforced type attributes ultimately trump any kind of coding convention: they force everyone to do the Right Thing. This is why strings (arrays) with built-in length are better: they allow slicing without needing to decide whether you should copy or modify in-place (*someone* will inevitably get it wrong).

C's superiority is keyed on the programmer being perfect -- the philosophy of the language is to trust the programmer, to believe that the programmer knows what he's doing. Theoretically speaking, this is a good thing, because the compiler won't stand in your way and annoy you when you're trying to do something clever. (This is also what made me like C in the first place -- I was 19 at the time, so it figures. :-P) Unfortunately, in practice, humans are fallible -- very much fallible and error-prone -- so this philosophy only leads to pain and more pain.

With a single-person project you can still somewhat maintain some semblance of order. But when you have a team of 15+ programmers (at my job we have up to 50), it's total chaos, and you start to code by paranoia, i.e., assume everyone else will screw up and add every possible safeguard you can think of in your part of the code, so that when things go wrong it's not your fault. Which means every string modification requires copying, which means performance is out the window. It means adding layers of indirection to shield your code from the outside world. Which means even more pointers to work with, which in turn means you start getting into pointer management problems, and start needing reference counting (which, as I described in an earlier post, people *still* screw up). At some point, you start wishing C had a GC to clean up the mess.

T

-- 
Public parking: euphemism for paid parking. -- Flora
Jan 09 2014
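A minimal D sketch of the slicing point above (the function name is invented; std.path.dirName does roughly this in Phobos). Because a D array carries its own length, the slice shares the original string's memory and nothing is copied:

import std.stdio;

// "dirPart" is a made-up helper: return the directory part of a path
// as a slice of the original string -- no copy, no new allocation.
string dirPart(string path)
{
    foreach_reverse (i, c; path)
        if (c == '/')
            return path[0 .. i];
    return ".";
}

void main()
{
    writeln(dirPart("/usr/local/bin/dmd")); // prints "/usr/local/bin"
}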
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 1:38 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 09:10:07 UTC, Paulo Pinto wrote:
 I have only seen those things work in small AAA class teams.
But you have probably seen C programs allocate a bunch of different small structs with a single malloc where it is known that they will be freed in the same location? A compiler needs whole program analysis to do the same. So yes, C programs will have fewer allocs if the programmer cared.
A GC does not prevent such techniques.
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 09:55:42 UTC, Walter Bright wrote:
 A GC does not prevent such techniques.
No, but programmers gravitate towards less work... If alloc is transparent and free is hidden... You gain a lot from not being explicit, but you get more allocations overall.
Jan 09 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 3:40 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 09:55:42 UTC, Walter Bright wrote:
 A GC does not prevent such techniques.
No, but programmers gravitate towards less work... If alloc is transparent and free is hidden... You gain a lot from not being explicit, but you get more allocations overall.
GC doesn't even make those techniques harder. I can't see any merit to the idea that GC makes for excessive allocation.
Jan 09 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 17:17:53 UTC, Walter Bright wrote:
 GC doesn't even make those techniques harder.

 I can't see any merit to the idea that GC makes for excessive 
 allocation.
People do what they are accustomed to and what is easy. Library writers are more likely to do allocation for you if they can forget about ownership. I am more likely to use several single-object "new" calls in C++, and more likely to do a "shared malloc" in C. C++ supports RAII, C doesn't. "shared malloc" is a cheap version of RAII.
Jan 09 2014
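A rough D rendering of the "several structs from one malloc" pattern being discussed (the struct names are invented for illustration; only the C allocator from core.stdc is used):

import core.stdc.stdlib : malloc, free;

struct Header  { size_t id; }          // invented example types
struct Payload { double x, y; }

void main()
{
    // One allocation carries both structs; one free releases both.
    // This is the "shared malloc as a poor man's RAII" idea from the post above.
    auto bytes = cast(ubyte*) malloc(Header.sizeof + Payload.sizeof);
    if (bytes is null) return;
    scope(exit) free(bytes);

    auto h = cast(Header*) bytes;
    auto p = cast(Payload*)(bytes + Header.sizeof); // lands 8-byte aligned after the size_t
    h.id = 42;
    p.x  = 1.0;
    p.y  = 2.0;
    // ... use h and p; both vanish with the single free above ...
}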
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 12:40 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
 and it works without copying in D, it just returns s1. In C, I gotta copy,
 ALWAYS.
Only if you write libraries, in an application you can set your own policies (invariants).
Please explain how this can work passing both string literals and allocated strings to cat().
 (C's strings being 0 terminated also forces much extra copying, but that's
 another topic.)
Not if you have your own allocator and split chopped strings (you can just overwrite the boundary character).
How do you return a string that is the path part of a path/filename? (The terminating 0 is not a problem solved by creating your own allocator.)
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 09:58:24 UTC, Walter Bright wrote:
 Please explain how this can work passing both string literals 
 and allocated strings to cat().
By having your own string allocator that tests for membership when you free (if you allow free and foreign strings in your cat)?
 How do you return a string that is the path part of a 
 path/filename? (The terminating 0 is not a problem solved by 
 creating your own allocator.)
If you discard the original you split at '/'. If you use your own string allocator you don't have to worry about free... You either let the garbage remain until the pool is released or have a separate allocation structure that allows internal splits (no private size info before first char).
Jan 09 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 2:46 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 09:58:24 UTC, Walter Bright wrote:
 Please explain how this can work passing both string literals and allocated
 strings to cat().
By having your own string allocator that tests for membership when you free (if you allow free and foreign strings in your cat)?
How does that work when you pass it "hello"? allocated with malloc()? basically any data that has mixed ancestry? Note that your code doesn't always have control over this - you may have written a library intended to be used by others, or you may be calling a library written by others.
 How do you return a string that is the path part of a path/filename? (The
 terminating 0 is not a problem solved by creating your own allocator.)
If you discard the original you split at '/'.
That doesn't work if you pass a string literal, or if you are not the owner of the data.
 If you use your own
string allocator you don't have to worry about free... You either let the
garbage
 remain until the pool is released or have a separate allocation structure that
 allows internal splits (no private size info before first char).
That doesn't work if you're passing strings with mixed ancestry.
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
 How does that work when you pass it "hello"? allocated with 
 malloc()? basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
 Note that your code doesn't always have control over this - you 
 may have written a library intended to be used by others, or 
 you may be calling a library written by others.
The typical C (and the old C++) way has been to roll your own to get what you want and only use very focused libraries (like zlib, fft, etc.), or only use one big framework that defines all its own stuff in an efficient and uniform manner with its own systems (Qt etc). But it becomes tedious when using more than one framework.
 That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use the old C way. The point is more: you can make your own and make it C-compatible, and reasonably efficient. Usually there are different representations that are more or less efficient or convenient based on what you want to do. Even for strings. For instance, you can have a high-speed ASCII MSB string representation that is 64-bit aligned and that sorts fine using 64-bit uints, and which is 0-terminated (padded to the 8-byte-aligned boundary).
Jan 09 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
 How does that work when you pass it "hello"? allocated with malloc()?
 basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work. BTW, it happens all the time when dealing with strings. For example, dealing with filenames, file extensions, and paths. Components can come from the command line, string literals, malloc, slices, etc., all mixed up together. Overloading doesn't work because a string literal and a string allocated by something else have the same type.
 That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use the old C way. The point is more: you can make your own and make it C-compatible, and reasonably efficient.
My point is you can't avoid making the extra copies without GC in any reasonable way.
Jan 09 2014
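A small sketch of the mixed-ancestry point (names invented, not from the thread): in D, a literal, a runtime concatenation, and a command-line argument all arrive as plain string slices that the GC owns, so one function accepts them without copying or ownership bookkeeping:

import std.stdio;

// Made-up example: callers never record who owns what.
string pickLonger(string a, string b)
{
    return a.length >= b.length ? a : b;
}

void main(string[] args)
{
    string lit  = "hello.d";                      // string literal
    string heap = "hello" ~ ".exe";               // allocated at run time
    string arg  = args.length > 1 ? args[1] : ""; // from the OS

    writeln(pickLonger(lit, heap));
    writeln(pickLonger(heap, arg));
}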
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 9 January 2014 at 18:34:58 UTC, Walter Bright wrote:
 On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
 Why would you do that? You would have to overload cat then.
So you agree that it won't work.
It will work for string literals or for malloc'ed strings, but not for both using the same function unless you start to depend on the data sections used for literals (memory range testing). Which is a dirty tool-dependent hack.
 Overloading doesn't work because a string literal and a string 
 allocated by something else have the same type.
Not if you return your own type, but have the same structure? You return a struct containing a variable-sized array of char, and overload on that? But I see your point regarding literal/malloc; const char* and char* is a shady area, you can basically get anything cast to const char*.
Jan 09 2014
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 07:08:42PM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Thursday, 9 January 2014 at 18:34:58 UTC, Walter Bright wrote:
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
Why would you do that? You would have to overload cat then.
So you agree that it won't work.
It will work for string literals or for malloc'ed strings, but not for both using the same function unless you start to depend on the data sections used for literals (memory range testing). Which is a dirty tool-dependent hack.
Overloading doesn't work because a string literal and a string
allocated by something else have the same type.
Not if you return your own type, but have the same structure? You return a struct containing a variable-sized array of char, and overload on that? But I see your point regarding literal/malloc; const char* and char* is a shady area, you can basically get anything cast to const char*.
And since it is C, people expect to pass char* and const char* around. So most likely what will happen is that if there's any way at all to get a char* or const char* out of your opaque struct, they will do it, and then pass it to strcat, strlen, and who knows what else. You can't really stop this except by convention, because the language doesn't enforce the encapsulation, and making it truly opaque (via void* with PIMPL) will require an extra layer of indirection and make it unusable with commonly-expected C APIs like printf.

But we all know what happens with programming by convention when the team grows bigger -- old people who know the Right Way of doing things leave, and new people come in ignorant of how things are Supposed To Be, falling back to const char*, so the code quickly degenerates into a horrible mess of mixed conventions and memory leaks / pointer bugs everywhere. Then you start strdup'ing everything Just In Case. Which was Walter's original point.

T

-- 
By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality. -- D. Knuth
Jan 09 2014
prev sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 09.01.2014 19:34, schrieb Walter Bright:
 On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
 <ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
 How does that work when you pass it "hello"? allocated with malloc()?
 basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work. BTW, it happens all the time when dealing with strings. For example, dealing with filenames, file extensions, and paths. Components can come from the command line, string literals, malloc, slices, etc., all mixed up together. Overloading doesn't work because a string literal and a string allocated by something else have the same type.
 That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use the old C way. The point is more: you can make your own and make it C-compatible, and reasonably efficient.
My point is you can't avoid making the extra copies without GC in any reasonable way.
Every time I see such discussions, it reminds me when I started coding in the mid-80s and the heresy of using languages like Pascal and C dialects for microcomputers, instead of coding everything in Assembly or Forth. :) -- Paulo
Jan 09 2014
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 08:16:12PM +0100, Paulo Pinto wrote:
 Am 09.01.2014 19:34, schrieb Walter Bright:
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
<ola.fosheim.grostad+dlang gmail.com>" wrote:
On Thursday, 9 January 2014 at 17:15:46 UTC, Walter Bright wrote:
How does that work when you pass it "hello"? allocated with
malloc()?  basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work. BTW, it happens all the time when dealing with strings. For example, dealing with filenames, file extensions, and paths. Components can come from the command line, string literals, malloc, slices, etc., all mixed up together. Overloading doesn't work because a string literal and a string allocated by something else have the same type.
That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use the old C way. The point is more: you can make your own and make it C-compatible, and reasonably efficient.
My point is you can't avoid making the extra copies without GC in any reasonable way.
Every time I see such discussions, it reminds me when I started coding in the mid-80s and the heresy of using languages like Pascal and C dialects for microcomputers, instead of coding everything in Assembly or Forth. :)
[...] Ah, the good ole 80's. I remember I was strongly pro-assembly in those days. Back then compiler / interpreter technology was still rather young, and the little that I saw of it didn't leave a good impression, so I regarded all high-level languages with suspicion. :) Especially languages that sport "nice" string operators, since back then many language implementations had rather naïve string implementations, which were really slow and inefficient.

T

-- 
Always remember that you are unique. Just like everybody else. -- despair.com
Jan 09 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 9 January 2014 at 19:16:10 UTC, Paulo Pinto wrote:
 Every time I see such discussions, it reminds me when I started 
 coding in the mid-80s and the heresy of using languages like 
 Pascal and C dialects for microcomputers, instead of coding 
 everything in Assembly or Forth
If you insist on bringing up heresy... Motorola 680xx is pretty nice compared to x86, although the AMD 64-bit mode is better than it was. 680xx feels almost like C, just better ;9, I think only MIPS is close in programmer friendliness.

Forth is nice too, very minimalistic and quite powerful for the simplistic implementation. I had a Forth64 module for my C64 to dabble with, a bit hard to create more than toy programs in Forth... Postscript is pretty close actually, and clean. But Forth is dense, so dense that you don't edit text files, you edit text screens... But don't diss assembly, try to get more than 8 sprites and move sprites into the screen border without assembly, can't be done! High level languages, my ass, BASIC can't do that!

But hey, I am not arguing in favour of Forth and C (although I would argue in favour of 680xx and MIPS). I am arguing in favour of smart compilers that allow you to go low level at the top of the call stack where it matters (inner loops) without having to resort to a different language. D is close to that, so it is a promising candidate.

And... I actually think D is too lax in some areas. I don't think you should be allowed to call C/C++ without nailing down the pre/postconditions, basically describing what happens in terms of optimization constraints. I also want the programmer to be able to assert facts that the compiler fails to prove, so that they can be used for optimization. Basically the ability to guide the optimizer so you don't have to resort to low level coding. I also think giving access to malloc is a bad idea. :-P

And well, I am not new to GC, I have actually used Simula quite a bit in classes/teaching newbies. Simula incidentally has exactly the same Garbage Collector that D has AFAIK. I remember we had a 1970s internal memo describing the garbage collector of Simula on the curriculum of the compiler course... So that is veeeery old news. Actually Simula kinda has the same kind of string type representation that D has too. And OO. And it has coroutines… While it doesn't have templates, it does actually have name parameters that have textual substitution semantics (in addition to ref and value). Now I also kinda like that it has ":-" for reference assignment and ":=" for value assignment, but I didn't like it back then. 45 years later D merges Simula semantics with C (and some more stuff). And that is an interesting thing, of course.

But hey, no point in pretending that other people don't know what programming a GC high level language entails. If I want low latency, I go to C/C++ and hopefully D. If I want high level productivity I use whatever fits the bill… all GC languages. But I don't think D should be the first option in any non-speed area yet, so the GC is of limited use for now IMO. (In clusters you might want that though, speed+convenience but no need for low latency.)

I think D could pick up more good stuff from Python, like the array closures that allow you to succinctly transform arrays. Makes large portions of Python's standard library pointless.

What I really like about D is that the front end code appears to be quite readable. Take a look at clang and you will see the difference. So, I guess anyone with C++ knowledge has the opportunity to tune both syntax and semantics to their own liking and share it with others. That's pretty sweet (I'd like to try that one day).
Jan 09 2014
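For reference, a minimal sketch of the D range counterpart of a Python comprehension such as [x * x for x in xs if x % 2 == 0], using std.algorithm (the variable names are invented):

import std.algorithm : filter, map;
import std.array : array;
import std.stdio;

void main()
{
    auto xs = [1, 2, 3, 4, 5, 6];

    // Keep the even numbers, square them, and materialize the result.
    auto squares = xs.filter!(x => x % 2 == 0)
                     .map!(x => x * x)
                     .array;

    writeln(squares); // [4, 16, 36]
}

The filter/map steps are lazy; only the final .array call allocates.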
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad
wrote:
 What I really like about D is that the front end code appears 
 to be quite readable. Take a look at clang and you will see the 
 difference. So, I guess anyone with C++ knowledge has the 
 opportunity to tune both syntax and semantics to their own 
 liking and share it with others. That's pretty sweet (I'd like 
 to try that one day).
This definitively convinced me that you must be very high on drugs.
Jan 09 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 22:15:18 UTC, deadalnix wrote:
 This definitively convinced me that you must be very high on
 drugs.
Why is that? I have browsed the repositories and had no problems figuring out what was going on from what I read. I don't understand all the interdependencies of course, but making small changes should not be a big deal from what I've seen.
Jan 09 2014
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 19:16:10 UTC, Paulo Pinto wrote:
 Every time I see such discussions, it reminds me when I 
 started coding in the mid-80s and the heresy of using 
 languages like Pascal and C dialects for microcomputers, 
 instead of coding everything in Assembly or Forth
[...]
Sorry if I hit a nerve; one never knows the experience of other people on the Internet. It is just that in the enterprise world I have been part of projects that ported C- and C++-based servers to JVM/.NET ones, always with comparable performance.

I do acknowledge that in game programming it might be different, however even AAA titles play with GC systems nowadays, even if they have some issues optimizing their behavior. For example, The Witcher 2 for the XBox 360. http://www.makinggames.de/index.php/magazin/2155_porting_the_witcher_2_on_xbox_360

-- 
Paulo
Jan 10 2014
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 08:40:29AM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright wrote:
and it works without copying in D, it just returns s1. In C, I
gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your own policies (invariants).
Yes, programming by convention, which falls flat as soon as you have a large team on the project, and people don't know your conventions (you'll be surprised how many "seasoned" programmers will just walk all over your code writing what they're used to writing, with no thought to read the code first and figure out how their code might fit in with the rest). I see lots of this at my job, and it inevitably leads to problems, because in C, people just *expect* the usual copying conventions. Sure, if you're a one-man project, then you can remove some of this copying, but rest assured that in a team project things will go haywire, and inevitably you'll end up dictating that everyone must copy everything because that's the only way to guarantee module X, which is written by team B, doesn't do something screwy with our data.
(C's strings being 0 terminated also forces much extra copying,
but that's another topic.)
Not if you have your own allocator and split chopped strings (you can just overwrite the boundary character).
You can't do this if the caller still wishes to retain the original string.
The point is, no matter how slow the GC is relative to malloc, not
allocating is faster than allocating, and a GC can greatly reduce
the amount of alloc/copy going on.
But since malloc/free is tedious c-programmers tend to avoid it by embedding objects in large structs and put a variable sized object at the end of it... Or have their own pool (possibly on the stack at the location where it should be released).
[...] One thing I miss in D is a nice way to allocate structs with a variable-length "static" array at the end. GCC supports this, probably as an extension (I don't remember if the C standard specifies this). I know I can just manually allocate this via core.gc and casts, but a built-in solution would be really nice. T -- Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
Jan 09 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
H. S. Teoh:

 One thing I miss in D is a nice way to allocate structs with a
 variable-length "static" array at the end. GCC supports this, 
 probably as an extension (I don't remember if the C standard
 specifies this). I know I can just manually allocate this via
 core.gc and casts, but a built-in solution would be really nice.
Since dmd 2.065 D supports this very well (it was supported in the past too, but less well). See: http://rosettacode.org/wiki/Sokoban#Faster_Version Bye, bearophile
Jan 09 2014
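One possible shape of such a struct in D -- not the Rosetta Code version bearophile links, just a hand-rolled sketch with invented names, where the variable-length payload lives in the same allocation right after the fixed fields:

import core.memory : GC;
import std.stdio;

struct Packet
{
    size_t length;

    // View of the trailing bytes stored in this same memory block.
    ubyte[] data()
    {
        return (cast(ubyte*)(&this + 1))[0 .. length];
    }

    // Allocate the fixed part and n trailing bytes in one GC block.
    static Packet* make(size_t n)
    {
        auto p = cast(Packet*) GC.malloc(Packet.sizeof + n);
        p.length = n;
        return p;
    }
}

void main()
{
    auto p = Packet.make(4);
    p.data[] = 0xAB;                 // fill the tail
    writeln(p.length, " ", p.data);
}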
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2014 08:07, schrieb Walter Bright:
 The point is, no matter how slow the GC is relative to malloc, not
 allocating is faster than allocating, and a GC can greatly reduce the
 amount of alloc/copy going on.
The points should be: whether D is going to stay with a GC, and if so, when we will actually get proper GC support so a state of the art GC can be implemented -- or whether we are going to replace the GC with ARC. This is a really important topic which shouldn't wait until the language is 20 years old.

I've been using D for almost 3 years, and the more I learn about Garbage Collectors and about D, the more obvious it becomes that D does not properly support garbage collection, and it will require quite some effort and spec changes to do so. And in all the time I've used D nothing has changed about the garbage collector. The only thing that happened was the RtInfo template in object.d. But it still isn't used and only solves a small portion of the precise scanning problem.

In my opinion D was designed with language features in mind that need a GC, but D was not designed to actually support a GC. And this needs to change. If requested I can make a list with all language features / decisions so far that prevent the implementation of a state of the art GC.

-- 
Kind Regards
Benjamin Thaut
Jan 09 2014
next sibling parent reply "Tobias Pankrath" <tobias pankrath.net> writes:
On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:
 If requested I can make a list with all language features / 
 decisions so far that prevent the implementation of a state of 
 the art GC.
At least I am interested in your observations.
Jan 09 2014
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2014 11:36, schrieb Tobias Pankrath:
 On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:
 If requested I can make a list with all language features / decisions
 so far that prevent the implementation of a state of the art GC.
At least I am interested in your observations.
Ok, I will put together a list. But as I'm currently swamped with end-of-semester stuff, you shouldn't expect it within the next 3 weeks. I will post it on my blog (www.benjamin-thaut.de) and I will post it in the "D.announce" newsgroup.

Kind Regards
Benjamin Thaut
Jan 09 2014
prev sibling parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut wrote:
 If requested I can make a list with all language features / 
 decisions so far that prevent the implementation of a state of 
 the art GC.
I am also interested in this, so that I can avoid those constructs. I am in general in agreement with you.

I think regular ownership combined with a segmented GC that only scans pointers to a signified GC type would not be such a big deal and could be a real bonus. With whole program analysis you could then reject a lot of the branches you otherwise have to follow, and you would not have to stop threads that cannot touch those GC types. Of course, you would then avoid using generic pointers. So, you might not need an advanced GC, just partition the GC scan better.

Scanning stacks could be really fast if you know the call order of stack frames (and you have that opportunity with whole program analysis): e.g.: the top frame is a(), but only b() and c() can call a(), and b() and c() have the same stack frame size and cannot hold pointers to GC objects => skip over a() and b/c() in one go.

It doesn't matter much if the GC takes even 20% of your efficiency away, as long as it doesn't lock you down for more than 1-2 milliseconds: that's <4 million cycles for a single core. If you need 25 cycles per pointer you can scan <80.000 pointers per core. So if the search space can be partitioned in a way that makes that possible by not following all pointers, then the GC would be fine. 100.000 cache lines = 3.2MB, which is not too horrible either.

I'd rather have 1000% less efficiency in the GC by having frequent GC calls than 400% more latency less frequently.
Jan 09 2014
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 13:44:10 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut 
 wrote:
 If requested I can make a list with all language features / 
 decisions so far that prevent the implementation of a state of 
 the art GC.
[...]
That could possibly be achieved with a generational parallel GC. -- Paulo
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 13:51:09 UTC, Paulo Pinto wrote:
 That could possibly be achieved with a generational parallel GC.
Isn't the basic assumption in a generational GC that most free'd objects have a short life span and happened since the last collection? Was there some assumption about the majority of inter-object pointers being within the same generation, too? So that you partition the objects in "train carts" and only have few pointers going between carts? I haven't looked at the original paper in a long time...

Anyway, if that is the assumption then it is generally not true for programs that are written for real time. Temporary objects are then allocated in pools or on the stack. Objects that are free'd tend to come from timers, events or because they have a lifespan (like enemies in a computer game).

I also dislike the idea of the GC locking cores down when it doesn't have to, so I don't think parallel is particularly useful. It will just put more pressure on the memory bus. I think it is sufficient to have a simple GC that only scans disjoint subsets (for that kind of application), so yes partitioned by type, or better: by reachability, but not by generation. If the GC behaviour is predictable then the application can be designed to not trigger bad behaviour from the get go.
Jan 09 2014
next sibling parent reply "Ola Fosheim Grøstad" writes:
And, if it isn't in D already I would very much like to have a 
weak pointer type that will be set to null if the object is only 
pointed to by weak pointers.

It is a PITA to have objects die and get them out of a bunch of 
event-queues etc.
Jan 09 2014
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2014 15:28, schrieb "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>":
 And, if it isn't in D already I would very much like to have a weak
 pointer type that will be set to null if the object is only pointed to
 by weak pointers.

 It is a PITA to have objects die and get them out of a bunch of
 event-queues etc.
Didn't Phobos get such a weak pointer type lately? I at least saw an implementation on the newsgroup very recently. It used core.memory.setAttr to store information in objects. Then you can overwrite the collectHandler in core.runtime to null the weak references upon destruction.

-- 
Kind Regards
Benjamin Thaut
Jan 09 2014
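For illustration, a rough sketch of a weak reference built on rt_attachDisposeEvent, the druntime hook that std.signals uses; this is not the newsgroup implementation Benjamin mentions, and WeakRef is a made-up name. It ignores synchronization and detaching, and only shows the mechanism:

alias DisposeEvt = void delegate(Object);
extern (C) void rt_attachDisposeEvent(Object obj, DisposeEvt evt);

class WeakRef(T : Object)
{
    private size_t hidden;   // kept as an integer so the GC does not see a reference

    this(T obj)
    {
        hidden = cast(size_t) cast(void*) obj;
        rt_attachDisposeEvent(obj, &onDispose);  // runtime calls this when obj is finalized
    }

    T get()
    {
        return cast(T) cast(void*) hidden;       // null once obj has been collected
    }

    private void onDispose(Object o)
    {
        hidden = 0;
    }
}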
prev sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 14:19:41 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 13:51:09 UTC, Paulo Pinto wrote:
 That could possibly be achieved with a generational parallel 
 GC.
Isn't the basic assumption in a generational GC that most free'd objects have a short life span and happened since the last collection? Was there some assumption about the majority of inter-object pointers being within the same generation, too? So that you partition the objects in "train carts" and only have few pointers going between carts? I haven't looked at the original paper in a long time...
That was just a suggestion. There are plenty of incremental GC algorithms to choose from.
 Anyway, if that is the assumption then it is generally not true 
 for programs that are written for real time. Temporary objects 
 are then allocated in pools or on the stack. Objects that are 
 free'd tend to come from timers, events or because they have a 
 lifespan (like enemies in a computer game).
There are real-time GCs controlling missile tracking systems. Personally I find them a bit more real time than computer games. On a game you might miss a few rendering frames; a GC-induced delay on a missile tracking system might turn out a bit ugly.
 I also dislike the idea of the GC locking cores down when it 
 doesn't have to, so I don't think parallel is particularly 
 useful. It will just put more pressure on the memory bus. I 
 think it is sufficient to have a simple GC that only scans 
 disjoint subsets (for that kind of application), so yes 
 partitioned by type, or better: by reachability, but not by 
 generation.

 If the GC behaviour is predictable then the application can be 
 designed to not trigger bad behaviour from the get go.
Sure, the GC usage should not hinder the application's performance. However, unless you target systems without an OS, you'll have the OS doing whatever it wants with the existing cores anyway. I never saw much control beyond setting affinities.

-- 
Paulo
Jan 09 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 14:40:16 UTC, Paulo Pinto wrote:
 On a game you might miss a few rendering frames, a GC induced
 delay on a missile tracking system might turn out a bit ugly.
You have GC in games, but you limit it to a small set of objects (<50000?) So you can have real time with GC with an upper-bound. Putting everything under GC is probably not a future proof concept, since memory capacity most likely will increase faster than CPU speed for technical reasons.
 However, unless you target systems without an OS, you'll have 
 anyway the OS making whatever it wants with the existing cores.
Yes, but you don't blame the application if the scheduler isn't real-time friendly. Linux has been kind of bad, because distributions have been focused on servers. But you find real-time friendly schedulers too.
Jan 09 2014
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 14:57:31 UTC, Ola Fosheim Grøstad 
wrote:
 On Thursday, 9 January 2014 at 14:40:16 UTC, Paulo Pinto wrote:
 On a game you might miss a few rendering frames, a GC induced
 delay on a missile tracking system might turn out a bit ugly.
You have GC in games, but you limit it to a small set of objects (<50000?) So you can have real time with GC with an upper-bound. Putting everything under GC is probably not a future proof concept, since memory capacity most likely will increase faster than CPU speed for technical reasons.
Sure. As I mentioned in another thread, the other GC enabled system programming languages I know, also allow for static, global and stack allocation. And you also have an escape hatch to do manual memory management if you really have to. Namely Oberon(-2), Component Pascal, Active Oberon, Modula-3, up. While those ended up never being adopted by the industry at large, we can draw lessons from the experience of their users. Positive features and related flaws. Currently I am digging up the Mesa/Cedar reports from Xerox PARC. I think D already has the necessary features, their performance just needs to be improved. -- Paulo
Jan 09 2014
prev sibling parent Klaim - Joël Lamotte <mjklaim gmail.com> writes:
On Thu, Jan 9, 2014 at 7:11 AM, Manu <turkeyman gmail.com> wrote:

 You're making a keen assumption here that C programmers use STL. And no
 sane programmer that I've ever worked with uses STL precisely for this
 reason :P
I think this sentence is misleading. I've made high-performance applications with no copying with the STL. Your "sane programmers" are just people who don't want to learn it. Sane programmers make sure they know the strengths and pitfalls of their tools. They don't avoid tools because they make incorrect assumptions, like you are doing here. Also, this has nothing to do with STL.
Jan 09 2014
prev sibling next sibling parent reply "NoUseForAName" <no spam.com> writes:
On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:
 On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Manual memory management is a LOT of effort
Not in my experience. It only gets ugly if you attempt to write Ruby/Java in C/C++. In C/C++ you do not wildly create short-lived objects all over the place. In embedded C there is often no object allocation at all after initialization. I have written C and C++ code for 15 years and the only real issue was memory safety but you do not need a GC to solve that problem.
 unless you're writing an AAA 3D game engine, you don't *need* 
 that last
 5% performance improvement that manual memory management 
 *might* gives
 you.
The performance issues of GC are not measured in percentages but in pause times. Those become problematic when - for example - your software must achieve a frame rate of at least 60 frames per second - every second. In the future this will get worse, because the trend seems to be going towards 120 Hz screens, which require a frame rate of at least 120 frames per second for the best experience. Try squeezing D's stop-the-world GC pause times in there. The D solution is to avoid the GC and fall back to C-style code.

That is why Rust creates so much more excitement among C/C++ programmers. You get high-level code, memory safety AND no pause times.
Jan 08 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Wednesday, 8 January 2014 at 23:08:43 UTC, NoUseForAName wrote:
 That is why Rust creates so much more excitement among C/C++ 
 programmers. You get high-level code, memory safety AND no 
 pause times.
let mut x = 4. Whyyy would anyone want to create such a syntax? I really want to like Rust, but I... just...
Jan 08 2014
parent reply "NoUseForAName" <no spam.com> writes:
On Wednesday, 8 January 2014 at 23:27:39 UTC, Ola Fosheim Grøstad 
wrote:
 let mut x = 4.

 Whyyy would anyone want to create such a syntax? I really want 
 to like Rust, but I... just...
Looks pretty boring/conventional to me. If you know many programming languages you immediately recognize "let" as a common keyword for assignment. That keyword is older than me and I am old (by Silicon Valley standards). That leaves only the funny sounding "mut" as slightly unusual. It is the result of making immutable the default which I think is a good decision. It is horribly abbreviated but the vast majority of programmers who know what a cache miss is seem to prefer such abbreviations (I am not part of that majority, though). I mean C gave us classics like "atoi".. still reminds me of "ahoi" every time I read it. And I will never get over C++'s "cout" and "cin". See? Rust makes C/C++ damaged people feel right at home even there ;P
Jan 08 2014
parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName wrote:
 Looks pretty boring/conventional to me. If you know many 
 programming languages you immediately recognize "let" as a 
 common keyword for assignment.
Yes, but I cannot think of a single one of them that I would like to use! ;-)
 That leaves only the funny sounding "mut" as slightly unusual. 
 It is the result of making immutable the default which I think 
 is a good decision.
Agree on the last point, immutable should be the default. Although I think they should have skipped both "let" and "mut" and used a different symbol for initial assignment instead.
 (I am not part of that majority, though). I mean C gave us 
 classics like "atoi".. still reminds me of "ahoi" every time I 
 read it. And I will never get over C++'s "cout" and "cin". See?
I don't mind cout, I hardly use cin, I try to avoid cerr, and I've never used clog… I mind how you configure iostreams though. It looks worse than printf, not sure how they managed that.
 Rust makes C/C++ damaged people feel right at home even there ;P
Well, I associate "let" with the functional-toy-languages we created/used at the university in the 90s so I kind of have problem taking Rust seriously. And the name? RUST? Decaying metal. Why? It gives me the eerie feeling that the designers are either brilliant, mad or both, or that it is a practical joke. I'm sure the compiler randomly tells you Aprils Fools! Or something.
Jan 08 2014
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 08, 2014 at 11:59:58PM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName wrote:
[...]
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin". See?
The absolute worst offender from the C days was creat(). I mean, seriously?? I'm actually a fan of abbreviated names myself, but that one simply takes it to a whole 'nother level of wrong.
 I don't mind cout, I hardly use cin, I try to avoid cerr, and I've
 never used clog… I mind how you configure iostreams though. It looks
 worse than printf, not sure how they managed that.
I hate iostream with a passion. The syntax is only the tip of the proverbial iceberg. Manipulators that change the global state of the output stream, pathologically verbose ways of controlling output format (cout << setprecision(5) << num; -- really?!) that *also* modify global state, a crazy choice of output operator with counter-intuitive operator precedence (cout << a&b doesn't do what you think it does), ... I have trouble finding what's there to like about iostream.

Even when I was still writing C++ a few years ago, I avoided iostream like the plague. For all of its flaws, C's stdio is still far better than iostream in terms of everyday usability. At least for me. YMMV.

T

-- 
Marketing: the art of convincing people to pay for what they didn't need before which you can't deliver after.
Jan 08 2014
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
 The absolute worst offender from the C days was creat().
That's unfair, that's unix, not C! http://linux.die.net/man/3/explain_creat_or_die
Jan 08 2014
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 01:06:01AM +0000, digitalmars-d-bounces puremagic.com
wrote:
 On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
The absolute worst offender from the C days was creat().
That's unfair, that's unix, not C! http://linux.die.net/man/3/explain_creat_or_die
That's why I said "from the C days", not "in C". :) Remember that C was created... um, creat-ed... in order to write Unix. T -- Gone Chopin. Bach in a minuet.
Jan 08 2014
parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 9 January 2014 at 01:26:27 UTC, H. S. Teoh wrote:
 That's why I said "from the C days", not "in C". :) Remember 
 that C was
 created... um, creat-ed... in order to write Unix.
Yes, but you have to take into consideration that there are over twice as many anagrams for "creat" than for "create", so "creat" is clearly more versatile. There are no anagrams for "unix".
Jan 08 2014
prev sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Thursday, 9 January 2014 at 01:06:03 UTC, Ola Fosheim Grøstad
wrote:
 On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
 The absolute worst offender from the C days was creat().
That's unfair, that's unix, not C! http://linux.die.net/man/3/explain_creat_or_die
But that just means the same people are responsible.
Jan 08 2014
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 9 January 2014 at 00:52:04 UTC, H. S. Teoh wrote:
 On Wed, Jan 08, 2014 at 11:59:58PM +0000, 
 digitalmars-d-bounces puremagic.com wrote:
 On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName 
 wrote:
[...]
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin". 
See?
The absolute worst offender from the C days was creat(). I mean, seriously?? I'm actually a fan of abbreviated names myself, but that one simply takes it to a whole 'nother level of wrong.
 I don't mind cout, I hardly use cin, I try to avoid cerr, and 
 I've
 never used clog… I mind how you configure iostreams though. It 
 looks
 worse than printf, not sure how they managed that.
[...] I hate iostream with a passion.
I am on the other side of the fence, enjoying iostream since 1994. :) -- Paulo
Jan 08 2014
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 8 January 2014 at 23:59:59 UTC, Ola Fosheim Grøstad 
wrote:
 On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName 
 wrote:
 Looks pretty boring/conventional to me. If you know many 
 programming languages you immediately recognize "let" as a 
 common keyword for assignment.
Yes, but I cannot think of a single one of them that I would like to use! ;-)
 That leaves only the funny sounding "mut" as slightly unusual. 
 It is the result of making immutable the default which I think 
 is a good decision.
Agree on the last point, immutable should be the default. Although I think they should have skipped both "let" and "mut" and used a different symbol for initial assignment instead.
 (I am not part of that majority, though). I mean C gave us 
 classics like "atoi".. still reminds me of "ahoi" every time I 
 read it. And I will never get over C++'s "cout" and "cin". See?
I don't mind cout, I hardly use cin, I try to avoid cerr, and I've never used clog… I mind how you configure iostreams though. It looks worse than printf, not sure how they managed that.
 Rust makes C/C++ damaged people feel right at home even there 
 ;P
Well, I associate "let" with the functional toy languages we created/used at the university in the 90s, so I kind of have a problem taking Rust seriously. And the name? RUST? Decaying metal. Why? It gives me the eerie feeling that the designers are either brilliant, mad or both, or that it is a practical joke. I'm sure the compiler randomly tells you April Fools! Or something.
You mean the toy languages that are slowly replacing C++ in the finance industry?
Jan 08 2014
prev sibling parent reply "Sean Kelly" <sean invisibleduck.org> writes:
On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:
 On Wed, Jan 08, 2014 at 11:35:19AM +0000, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
I have to say, this is also my experience with C++ after I learnt D. Writing C++ is just so painful, so time-consuming, and so not rewarding for the amount of effort you put into it, that I just can't bring myself to write C++ anymore when I have the choice. And manual memory management is a big part of that time sink. Which is why I believe that a lot of the GC-phobia among the C/C++ folk is misplaced.

I can sympathise, though, because coming from a C/C++ background myself, I was highly skeptical of GC'd languages, and didn't find it to be a particularly appealing aspect of D when I first started learning it. But as I learned D, I eventually got used to having the GC around, and discovered that not only did it reduce the number of memory bugs dramatically, it also increased my productivity dramatically: I never realized just how much time and effort it took to write code with manual memory management: you constantly have to think about how exactly you're going to be storing your objects, who it's going to get passed to, how to decide who's responsible for freeing it, what's the best strategy for deciding who allocates and who frees.

These considerations permeate every aspect of your code, because you need to know whether to pass/return an object* to someone, and whether this pointer implies transfer of ownership or not, since that determines who's responsible to free it, etc. Even with C++'s smart pointers, you still have to decide which one to use, and what pitfalls are associated with them (beware of cycles with refcounted pointers, passing auto_ptr to somebody might invalidate it after they return, etc.).

It's like income tax: on just about every line of code you write, you have to pay the "memory management tax" of extra mental overhead and time spent fixing pointer bugs in order to not get the IRS (Invalid Reference Segfault :P) knocking on your shell prompt.
This is what initially drew me to D from C++. Having a GC is a huge productivity gain.
 Manual memory management is a LOT of effort, and to be quite 
 honest, unless you're writing an AAA 3D game engine, you don't 
 *need* that last 5% performance improvement that manual memory 
 management *might* gives you. That is, if you get it right. 
 Which most C/C++ coders don't.
The other common case is server apps, since unpredictable delays can be quite undesirable as well. Java seems to mostly get around this by having very mature and capable GCs despite having a standard library that wants you to churn through memory like pies at an eating contest. The best you can do with D so far is mostly to just not allocate whenever possible, by slicing strings and such, since scanning can still be costly. I think there's still some work to do here, despite loving the GC as a general feature.
Jan 09 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 07:01:59PM +0000, Sean Kelly wrote:
 On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:
[...]
Manual memory management is a LOT of effort, and to be quite
honest, unless you're writing an AAA 3D game engine, you don't
*need* that last 5% performance improvement that manual memory
management *might* gives you. That is, if you get it right. Which
most C/C++ coders don't.
The other common case is server apps, since unpredictable delays can be quite undesirable as well. Java seems to mostly get around this by having very mature and capable GCs despite having a standard library that wants you to churn through memory like pies at an eating contest. The best you can do with D so far is mostly to just not allocate whenever possible, by slicing strings and such, since scanning can still be costly. I think there's still some work to do here, despite loving the GC as a general feature.
I think we all agree that D's GC in its current state needs a lot of improvement. While I have come to accept GCs as a good thing, that doesn't mean that D's current GC is *that* good. Yet. I wish I had the know-how (and the time!) to improve D's GC, because if D can get a GC that's on par with Java's, then D can totally beat Java flat, since the existence of value types greatly reduces the memory pressure on the GC, so the GC will have much less work to do compared to an equivalent Java program.

OTOH, even with D's suboptimal GC, I'm already seeing great productivity gains at only a low cost, so that's a big thumbs up for GC's. And the nice thing about being able to call malloc from D (which you can't in Java) is that you can still do manual memory management in critical code sections when you need to squeeze out some extra performance.

T

-- 
Turning your clock 15 minutes ahead won't cure lateness---you're just making time go faster!
Jan 09 2014
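A minimal sketch of that escape hatch (hotPath is a made-up name): the buffer below never touches the GC heap, so the collector neither scans it nor has to free it, and the release point is deterministic.

import core.stdc.stdlib : malloc, free;

void hotPath()
{
    enum n = 4096;
    auto buf = cast(ubyte*) malloc(n);   // C heap, invisible to the GC
    assert(buf !is null);
    scope(exit) free(buf);               // deterministic release

    buf[0 .. n] = 0;
    // ... do the performance-critical work on buf here ...
}

void main() { hotPath(); }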
next sibling parent Paulo Pinto <pjmlp progtools.org> writes:
Am 09.01.2014 20:40, schrieb H. S. Teoh:
 On Thu, Jan 09, 2014 at 07:01:59PM +0000, Sean Kelly wrote:
 On Wednesday, 8 January 2014 at 19:17:08 UTC, H. S. Teoh wrote:
[...]
 Manual memory management is a LOT of effort, and to be quite
 honest, unless you're writing an AAA 3D game engine, you don't
 *need* that last 5% performance improvement that manual memory
 management *might* gives you. That is, if you get it right. Which
 most C/C++ coders don't.
The other common case is server apps, since unpredictable delays can be quite undesirable as well. Java seems to mostly get around this by having very mature and capable GCs despite having a standard library that wants you to churn through memory like pies at an eating contest. The best you can do with D so far is mostly to just not allocate whenever possible, by slicing strings and such, since scanning can still be costly. I think there's still some work to do here, despite loving the GC as a general feature.
I think we all agree that D's GC in its current state needs a lot of improvement. While I have come to accept GCs as a good thing, that doesn't mean that D's current GC is *that* good. Yet. I wish I had the know-how (and the time!) to improve D's GC, because if D can get a GC that's on par with Java's, then D can totally beat Java flat, since the existence of value types greatly reduces the memory pressure on the GC, so the GC will have much less work to do compared to an equivalent Java program. OTOH, even with D's suboptimal GC, I'm already seeing great productivity gains at only a low cost, so that's a big thumbs up for GC's. And the nice thing about being able to call malloc from D (which you can't in Java) means you can still do manual memory management in critical code sections when you need to squeeze out some extra performance. T
Well, there are a few options to call malloc from Java:

- Do your own JNI wrapper
- Use Java Native Access
- Use Java Native Runtime
- Use NIO Buffers
- Use sun.misc.Unsafe.allocateMemory (sun.misc.Unsafe is planned to become a public API)

-- Paulo
Jan 09 2014
prev sibling parent reply "qznc" <qznc web.de> writes:
On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:
 because if D can get a GC
 that's on par with Java's, then D can totally beat Java flat, 
 since the
 existence of value types greatly reduces the memory pressure on 
 the GC,
 so the GC will have much less work to do compared to an 
 equivalent Java
 program.
Java will probably gain (something like) value types at some point. Google for "packed objects"; that feature provides gains similar to value types. Hopefully, D gets a better GC first.
Jan 09 2014
parent reply "Brian Rogoff" <brogoff gmail.com> writes:
On Thursday, 9 January 2014 at 21:35:45 UTC, qznc wrote:
 On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:
 because if D can get a GC
 that's on par with Java's, then D can totally beat Java flat, 
 since the
 existence of value types greatly reduces the memory pressure 
 on the GC,
 so the GC will have much less work to do compared to an 
 equivalent Java
 program.
Java will probably gain (something like) value types at some point. Google for "packed objects"; that feature provides gains similar to value types. Hopefully, D gets a better GC first.
What's the status of all that? There were interesting talks at DConf 2013 about precise and concurrent GCs, and it seemed that work was going on to fold all that into the compilers, and that Walter/Andrei were ready to make changes to the spec and runtime if needed to support precise GC. All very encouraging. Will DMD have a precise GC by the next DConf? -- Brian
Jan 09 2014
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 09, 2014 at 10:51:22PM +0000, Brian Rogoff wrote:
 On Thursday, 9 January 2014 at 21:35:45 UTC, qznc wrote:
On Thursday, 9 January 2014 at 19:41:43 UTC, H. S. Teoh wrote:
because if D can get a GC that's on par with Java's, then D can
totally beat Java flat, since the existence of value types greatly
reduces the memory pressure on the GC, so the GC will have much less
work to do compared to an equivalent Java program.
Java will probably gain (something like) value types at some point. Google for "packed objects"; that feature provides gains similar to value types. Hopefully, D gets a better GC first.
What's the status of all that? There were interesting talks at DConf 2013 about precise and concurrent GCs, and it seemed that work was going on to fold all that into the compilers, and that Walter/Andrei were ready to make changes to the spec and runtime if needed to support precise GC. All very encouraging. Will DMD have a precise GC by the next DConf?
[...] Has *anything* been done on the GC at all since the previous DConf? Not trying to be provocative, just genuinely curious if anything has been happening on that front, since I don't remember seeing any commits in that area all year. T -- "I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
Jan 09 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2014 3:29 PM, H. S. Teoh wrote:
 Has *anything* been done on the GC at all since the previous DConf? Not
 trying to be provocative, just genuinely curious if anything has been
 happening on that front, since I don't remember seeing any commits in
 that area all year.
Not much.
Jan 09 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/8/14 3:35 AM, Atila Neves wrote:
 http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/?already_submitted=true Andrei
Jan 08 2014