
digitalmars.D - One area where D has the edge

reply "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
The author talks about C++ performance, but D can match it whilst 
bringing scripting-language-style programmer productivity and 
arguably higher-quality code (because you can understand the code 
base as a coherent whole). Integration with C++ libraries is 
really the last missing piece, and it seems clear enough that it 
is on the way to being solved, at which point one of the principal 
relative advantages of Java (and Cython / NumPy, although the 
latter is less applicable to map-reduce) goes away.

I heard a talk from the chap behind Common Crawl (previously at 
Napster and LinkedIn, and a hacker by temperament). He observed 
that vast data sets put massive strain on code and tend to find 
obscure bugs and strangeness sooner or later. He talked about the 
non-determinism (for practical purposes) of JIT compilation 
creating strange bugs that were very difficult to reproduce, 
since the compiler would generate different machine code on each 
run.

And beyond the slower execution speed of Java, the memory bloat 
makes a big difference given how cloud pricing works (it's peanuts 
to get a machine with a gig of RAM, but 64 gigs is not so cheap, 
and quickly gets very expensive - and one may need hundreds of 
machines).


http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html

Over the years, there have been many contentious arguments about 
the performance of C++ versus Java. Oddly, every one I found 
addressed only one kind of performance (work/time). I can't find 
any benchmarking of something at least as important in today's 
massive-scale-computing environments: work/watt. A dirty little 
secret about JIT technologies like Java is that they throw a lot 
more CPU resources at the problem, trying to get up to par with 
native C++ code. JITs use more memory, and periodically run 
background optimizer tasks. These overheads are somewhat offset 
in work/time performance by extra optimizations which can be 
performed with more dynamic information. But it results in a 
hungrier appetite for watts. Another dirty little secret about 
Java vs C++ benchmarks is that they compare single workloads. Try 
running 100 VMs, each with a Java and C++ benchmark in it, and 
Java's hungrier appetite for resources (MHz, cache, RAM) will 
show. But of course, Java folks don't mention that.

But let's say, for the sake of (non-)argument, that Java can 
achieve 1:1 work/time performance relative to C++ for a single 
program. If Java consumes 15% more power doing it, does it matter 
on a PC? Most people don't care. Does it matter for small-scale 
server environments? Maybe not. Does it matter when you deploy 
Hadoop on a 10,000-node cluster, and the holistic inefficiency 
(multiple things running concurrently) goes to 30%? Ask the 
people who sign the checks for the power bill. Unfortunately, 
inefficiency scales really well.
Jan 25 2015
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Sunday, 25 January 2015 at 21:50:53 UTC, Laeeth Isharc wrote:
 And beyond slower execution speed  of Java, the memory bloat 
 makes a big difference given how cloud pricing works (its 
 peanuts to get a machine with a gig of ram, but 64 gig is not 
 so cheap, and quickly gets very expensive - and one may need 
 hundreds of machines).
Yes, but memory bloat is not D's strength either until manual memory management is addressed in a satisfactory manner. Rust is way ahead.
Jan 26 2015
parent reply "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
On Monday, 26 January 2015 at 18:53:45 UTC, Ola Fosheim Grøstad 
wrote:
 On Sunday, 25 January 2015 at 21:50:53 UTC, Laeeth Isharc wrote:
 And beyond slower execution speed  of Java, the memory bloat 
 makes a big difference given how cloud pricing works (its 
 peanuts to get a machine with a gig of ram, but 64 gig is not 
 so cheap, and quickly gets very expensive - and one may need 
 hundreds of machines).
Yes, but memory bloat is not D's strength either until manual memory management is addressed in a satisfactory manner. Rust is way ahead.
It seems to me (as a newcomer) that often with D it is the gap between what the language wants to be and the present reality that upsets people, whereas pragmatically it remains much better than the alternatives even if you have to do a bit of extra work to allocate manually. It is like seeing a beautiful woman marred by a flaw that you just can't seem to ignore until you get to know her as a person. (No apologies for sexism here.)

The problems of garbage collection in D seem different to those of GC (and memory bloat) in Java, whereas people hear "GC" and "not quite perfect" and instantly slot it into their mental category of GC-collected languages, which means Java.

Does Rust have the productivity of D? And it doesn't have the maturity, as I understand it.
Jan 26 2015
parent reply "Wyatt" <wyatt.epp gmail.com> writes:
On Monday, 26 January 2015 at 20:19:09 UTC, Laeeth Isharc wrote:
 Does Rust have the productivity of D?  And it doesn't have the 
 maturity, as I understand it.
This brings up something that's been bugging me. D has a pitch for users of a lot of crappy languages, but what do we say when the competition isn't a total slouch? This exchange (names changed) is what started this train of thought:

<chum> though i don't understand what the point of D is either because once you've already accepted a gc there are better languages you could use
<chum> and if you refuse to accept one, then, well, you either have c++11 or you wait for rust to be usable
<otherguy> chum: what is better than D once youre willing to have managed mem?
<chum> it's functional, but the complaint all the gamedev folks have about fp langs is that their implementations are usually garbage collected and they can't accept gc pauses

[...] interesting language, even if it's kind of ugly to look at and CIL-ly).

Thoughts?
-Wyatt
Jan 26 2015
next sibling parent reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
On Monday, 26 January 2015 at 20:55:14 UTC, Wyatt wrote:
 On Monday, 26 January 2015 at 20:19:09 UTC, Laeeth Isharc wrote:
 Does Rust have the productivity of D?  And it doesn't have the 
 maturity, as I understand it.
This brings up something that's been bugging me. D has a pitch for users of a lot of crappy languages, but what do we say when the competition isn't a total slouch? This exchange (names changed) is what started this train of thought:

<chum> though i don't understand what the point of D is either because once you've already accepted a gc there are better languages you could use
<chum> and if you refuse to accept one, then, well, you either have c++11 or you wait for rust to be usable
<otherguy> chum: what is better than D once youre willing to have managed mem?
<chum> it's functional, but the complaint all the gamedev folks have about fp langs is that their implementations are usually garbage collected and they can't accept gc pauses

[...] pretty interesting language, even if it's kind of ugly to look at and CIL-ly).

Thoughts?
-Wyatt
competition to D consists of crappy languages - there are some very smart and creative people with large resources working on them (putting aside the question of the tone one should adopt in public towards peers). This kind of categorical thinking is a mistake.

I am not certain, but it strikes me that outside of realtime, the kind of problems one may have with the GC (or with avoiding the use of the GC) in D are really quite different to, say, Java, whereas people lump everything together. I have no experience with realtime applications, so can't comment there.

It's not for me to say, but D isn't a product like toothpaste where you are trying to elbow aside the competition, but one where it needs to be the best 'D' it can be, and communicate that well to people and make it easy for them to take advantage of what it has to offer.
Jan 26 2015
parent reply "Wyatt" <wyatt.epp gmail.com> writes:
On Monday, 26 January 2015 at 22:05:55 UTC, Laeeth Isharc wrote:

 competition to D consists of crappy languages - there are some 
 very smart and creative people with large resources working on 
 them (putting aside the question of the tone one should adopt 
 in public towards peers).
That's exactly what I'm saying. Against C or C++, D looks fantastic. But those aren't great languages. So what's the argument for D beyond that? How can people using non-awful languages be persuaded to even have interest?
 It's not for me to say, but D isn't a product like toothpaste 
 where you are trying to elbow aside the competition, but one 
 where it needs to be the best 'D' it can be, and communicate 
 that well to people and make it easy for them to take advantage 
 of what it has to offer.
And that's what bugs me; that even if D is good and has a lot to offer, the pitch doesn't communicate it well. The important part of that exchange that I hoped people would fixate on was this:

"I don't understand what the point of D is either because once you've already accepted a GC there are better languages you could use."

This indicates to me that there's a problem of messaging.

On Tuesday, 27 January 2015 at 02:39:03 UTC, bachmeier wrote:
 Which language today does something that's not done by any 
 other language?
INTERCAL has politeness. But what are you actually trying to say with this statement? -Wyatt
Jan 27 2015
next sibling parent "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
On Tuesday, 27 January 2015 at 13:02:06 UTC, Wyatt wrote:
 On Monday, 26 January 2015 at 22:05:55 UTC, Laeeth Isharc wrote:

 competition to D consists of crappy languages - there are some 
 very smart and creative people with large resources working on 
 them (putting aside the question of the tone one should adopt 
 in public towards peers).
That's exactly what I'm saying. Against C or C++, D looks fantastic. But those aren't great languages. But what's the argument for D beyond that? How can people using non-awful languages be persuaded to even have interest?
You will rarely hear me use the argument from popularity, but to recognise the triteness of saying "they must be doing something right" is not to say that people switching from the C family cannot provide a nice source of fuel for D, given the size of their language base and the size of ours.

A language doesn't need to be all things to all people, just to be the best version of what it is meant to be, and to communicate that.
 And that's what bugs me; that even if D is good and has a lot 
 to offer, the pitch doesn't communicate it well.  The important 
 part of that exchange that I hoped people would fixate on was 
 this:
I fully agree. But a teenager is not as poised as the young woman she grows into, and it may be more important at one stage in life to focus on homework than popularity. The more complex a creature is, the longer it may take to reach full maturity.

It would have been bad for D to be 50x as popular at this stage, because that would have hurt its quality. Too much noise, politics, and lowest-common-denominator stuff comes with being prematurely popular - also, often, self-satisfaction and complacency.

You can't say that there has not been a frenzy of emphasis on presentation and cleanup in the past months. A long way to go, to be sure, but the journey has started. One sinks a lot of work into something before seeing results - make a note in your diary for 2018 and look back and tell me the situation is not radically better...
 "I don't understand what the point of D is either because once 
 you've already accepted a GC there are better languages you 
 could use."

 This indicates to me that there's a problem of messaging.
Especially wrt GC and memory management, where it is a highly technical topic that depends on the use case, and where it is hard to know the situation before having experience of the language. So FUD is very effective, and people love to explain why the kid on the fringe deserves his status, because it makes them feel better about themselves.

And there is this tribal sense of GC as a shibboleth: I am a native programmer, and I don't use GC. (As Walter said, he used to think GC was for loser programmers who couldn't manage memory like a man - my paraphrase.)

If one reads the threads on GC (and I have been doing so the past days), one hears mostly from gaming programmers. That's an important market deserving of respect, and handy as a stress case for the language. But it's a small part of the total potential use-case domain, and I would love to hear more accounts of how people are managing fine with the GC as it is. There were some insightful posts by Adam Ruppe, and we should pull them out into a GC wiki FAQ.

Laeeth.
Jan 27 2015
prev sibling parent "bachmeier" <no spam.com> writes:
On Tuesday, 27 January 2015 at 13:02:06 UTC, Wyatt wrote:

 On Tuesday, 27 January 2015 at 02:39:03 UTC, bachmeier wrote:
 Which language today does something that's not done by any 
 other language?
INTERCAL has politeness. But what are you actually trying to say with this statement? -Wyatt
Looks like I didn't post my full message. The point I intended to make is that it's hard to make a pitch like that. It used to be that for Lisp or some other language you could sell it by talking about features that aren't easily found in other languages. Today, every feature is implemented by numerous other languages, so you really have to try a language in order to know how it compares. In short, you can't convince someone who's discussing languages at that level.
Jan 27 2015
prev sibling next sibling parent "weaselcat" <weaselcat gmail.com> writes:
On Monday, 26 January 2015 at 20:55:14 UTC, Wyatt wrote:
 On Monday, 26 January 2015 at 20:19:09 UTC, Laeeth Isharc wrote:
 Does Rust have the productivity of D?  And it doesn't have the 
 maturity, as I understand it.
This brings up something that's been bugging me. D has a pitch for users of a lot of crappy languages, but what do we say when the competition isn't a total slouch? This exchange (names changed) is what started this train of thought:

<chum> though i don't understand what the point of D is either because once you've already accepted a gc there are better languages you could use
<chum> and if you refuse to accept one, then, well, you either have c++11 or you wait for rust to be usable
<otherguy> chum: what is better than D once youre willing to have managed mem?
<chum> it's functional, but the complaint all the gamedev folks have about fp langs is that their implementations are usually garbage collected and they can't accept gc pauses

[...] pretty interesting language, even if it's kind of ugly to look at and CIL-ly).

Thoughts?
-Wyatt
The D GC collects far too often for games; I believe the GC is more configurable in 2.067 in this regard. I currently have to disable the GC and run it manually. A single larger(ish) pause >>> many, many small pauses.
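A minimal sketch of that pattern, using the real `core.memory.GC` API (the frame loop and collection point are illustrative, not from the post): suppress automatic collections during latency-sensitive work, then trigger one explicit collection at a boundary you choose.

```d
import core.memory : GC;

void main()
{
    GC.disable();   // no automatic collections while frames are running

    foreach (frame; 0 .. 3)
    {
        // Per-frame work that may allocate; allocation still works with
        // the GC disabled, it just won't trigger a collection pause.
        auto scratch = new ubyte[](1024);
        scratch[] = cast(ubyte) frame;
    }

    // One larger, explicitly scheduled pause at a safe point
    // (e.g. a level change or loading screen).
    GC.collect();
    GC.enable();
}
```

The trade-off is exactly the one described above: memory use grows between the explicit collections, in exchange for control over when the pause happens.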
Jan 26 2015
prev sibling parent "bachmeier" <no spam.net> writes:
On Monday, 26 January 2015 at 20:55:14 UTC, Wyatt wrote:
 On Monday, 26 January 2015 at 20:19:09 UTC, Laeeth Isharc wrote:
 Does Rust have the productivity of D?  And it doesn't have the 
 maturity, as I understand it.
This brings up something that's been bugging me. D has a pitch for users of a lot of crappy languages, but what do we say when the competition isn't a total slouch? This exchange (names changed) is what started this train of thought:

<chum> though i don't understand what the point of D is either because once you've already accepted a gc there are better languages you could use
<chum> and if you refuse to accept one, then, well, you either have c++11 or you wait for rust to be usable
<otherguy> chum: what is better than D once youre willing to have managed mem?
<chum> it's functional, but the complaint all the gamedev folks have about fp langs is that their implementations are usually garbage collected and they can't accept gc pauses

[...] pretty interesting language, even if it's kind of ugly to look at and CIL-ly).

Thoughts?
-Wyatt
Which language today does something that's not done by any other language?
Jan 26 2015
prev sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Sunday, 25 January 2015 at 21:50:53 UTC, Laeeth Isharc wrote:
 The author talks about C++ performance but D can match it 
 whilst bringing scripting language style programmer 
 productivity, and arguably higher quality code (because you can 
 understand the code base as a coherent whole).   Integration 
 with C++ libraries is really the last missing piece, and it 
 seems clear enough that is on the way to being solved at which 
 point one of the principal relative advantages of Java (and 
 cython / numpy - although latter less applicable to map reduce) 
 goes away.

 I heard a talk from the chap behind commoncrawl (previously at 
 Napster and LinkedIn and seemed a hacker by temperament).  He 
 observed that vast data sets put massive strain on code and 
 would tend to find obscure bugs and strangeness sooner or 
 later.  He talked about the non determinacy (for practical 
 purposes) of JIT creating strange bugs that were very difficult 
 to reproduce since the compiler would generate different 
 machine code on each run.

 And beyond slower execution speed  of Java, the memory bloat 
 makes a big difference given how cloud pricing works (its 
 peanuts to get a machine with a gig of ram, but 64 gig is not 
 so cheap, and quickly gets very expensive - and one may need 
 hundreds of machines).


 http://www.trendcaller.com/2009/05/hadoop-should-target-cllvm-not-java.html

 Over the years, there have been many contentious arguments 
 about the performance of C++ versus Java. Oddly, every one I 
 found addressed only one kind of performance (work/time). I 
 can't find any benchmarking of something at least as important 
 in today's massive-scale-computing environments, work/watt. A 
 dirty little secret about JIT technologies like Java, is that 
 they throw a lot more CPU resources at the problem, trying to 
 get up to par with native C++ code. JITs use more memory, and 
 periodically run background optimizer tasks. These overheads 
 are somewhat offset in work/time performance, by extra 
 optimizations which can be performed with more dynamic 
 information. But it results in a hungrier appetite for watts. 
 Another dirty little secret about Java vs C++ benchmarks is 
 that they compare single-workloads. Try running 100 VMs, each 
 with a Java and C++ benchmark in it and Java's hungrier 
 appetite for resources (MHz, cache, RAM) will show. But of 
 course, Java folks don't mention that.

 But let's say for the sake of (non-)argument, that Java can 
 achieve a 1:1 work/time performance relative to C++, for a 
 single program. If Java consumes 15% more power doing it, does 
 it matter on a PC? Most people don't dare. Does it matter for 
 small-scale server environments? Maybe not. Does it matter when 
 you deploy Hadoop on a 10,000 node cluster, and the holistic 
 inefficiency (multiple things running concurrently) goes to 
 30%? Ask the people who sign the checks for the power bill. 
 Unfortunately, inefficiency scales really well.
No, Java does not necessarily consume 15% more power doing it, because there isn't just one implementation of Java compilers.

Most commercial JVMs do offer the capability of ahead-of-time native code compilation or JIT caches. So when those 15% really matter, enterprises do shell out the money for such JVMs.

Oracle's commercial JVM and the OpenJDK are just the reference implementation.

--
Paulo
Jan 26 2015
parent reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
" If Java consumes 15% more power doing it, does
 it matter on a PC? Most people don't dare. Does it matter for 
 small-scale server environments? Maybe not. Does it matter 
 when you deploy Hadoop on a 10,000 node cluster, and the 
 holistic inefficiency (multiple things running concurrently) 
 goes to 30%? Ask the people who sign the checks for the power 
 bill. Unfortunately, inefficiency scales really well.
No, Java does not consume 15% doing it, because there isn't just one implementation of Java compilers. Most comercial JVMs do offer the capability of ahead of time native code compilation or JIT caches. So when those 15% really matter, enterprises do shell out the money for such JVMs. Oracle commercial JVM and the OpenJDK are just the reference implementation.
Thanks for the colour. (For clarity, the content from the link wasn't by me, and I meant the general gist rather than the details.) How do commercial JVMs rate in terms of memory usage against thoughtful native (D) code implementations? Is the basic point mistaken?
Jan 26 2015
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Monday, 26 January 2015 at 22:12:24 UTC, Laeeth Isharc wrote:
 " If Java consumes 15% more power doing it, does
 it matter on a PC? Most people don't dare. Does it matter for 
 small-scale server environments? Maybe not. Does it matter 
 when you deploy Hadoop on a 10,000 node cluster, and the 
 holistic inefficiency (multiple things running concurrently) 
 goes to 30%? Ask the people who sign the checks for the power 
 bill. Unfortunately, inefficiency scales really well.
No, Java does not consume 15% doing it, because there isn't just one implementation of Java compilers. Most comercial JVMs do offer the capability of ahead of time native code compilation or JIT caches. So when those 15% really matter, enterprises do shell out the money for such JVMs. Oracle commercial JVM and the OpenJDK are just the reference implementation.
Thanks for the colour. (For clarity, the content from the link wasn't by me, and I meant the general gist rather than the details). How do commercial JVMs rate in terms of memory usage against thoughtful native (D) code implementations? Is the basic point mistaken?
So far I have just dabbled in D, because our customers choose the platforms, not we.

However, these are the kind of tools you get to analyse performance in commercial JVMs:

http://www.oracle.com/technetwork/java/javaseproducts/mission-control/java-mission-control-1998576.html
http://www.oracle.com/technetwork/server-storage/solarisstudio/features/performance-analyzer-2292312.html

I am just providing the examples from Oracle; other vendors have similar tools. With them, you can drill down into the whole JVM and its interactions at the OS level, and find performance bottlenecks all the way down to the generated assembly code.

As for memory usage, Atego JVMs run on quite memory-constrained devices. Here is the tiniest of them:

http://www.atego.com/products/atego-perc-ultra/

--
Paulo
Jan 26 2015
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
Yup, most people like to shit on Java, but quite frankly, the 
ecosystem is way ahead of what exists on most platforms.

It is even fairly common to get a Java program up and running, 
plus tweaking of the JVM, in less time and with better performance 
than what you would have in C++.

Obviously, given enough time and resources, you can still optimize 
the C++ version to death to the point it beats Java, but unless 
you are willing to spend an order of magnitude more for 
performance, Java is a better choice than C++.
Jan 26 2015
prev sibling parent reply "uri" <uri.grill gmail.com> writes:
On Monday, 26 January 2015 at 22:53:15 UTC, Paulo Pinto wrote:
 On Monday, 26 January 2015 at 22:12:24 UTC, Laeeth Isharc wrote:
 " If Java consumes 15% more power doing it, does
 it matter on a PC? Most people don't dare. Does it matter 
 for small-scale server environments? Maybe not. Does it 
 matter when you deploy Hadoop on a 10,000 node cluster, and 
 the holistic inefficiency (multiple things running 
 concurrently) goes to 30%? Ask the people who sign the 
 checks for the power bill. Unfortunately, inefficiency 
 scales really well.
No, Java does not consume 15% doing it, because there isn't just one implementation of Java compilers. Most comercial JVMs do offer the capability of ahead of time native code compilation or JIT caches. So when those 15% really matter, enterprises do shell out the money for such JVMs. Oracle commercial JVM and the OpenJDK are just the reference implementation.
Thanks for the colour. (For clarity, the content from the link wasn't by me, and I meant the general gist rather than the details). How do commercial JVMs rate in terms of memory usage against thoughtful native (D) code implementations? Is the basic point mistaken?
So far I just dabbled in D, because our customers choose the platforms, not we. However, these are the kind of tools you get to analyse performance in commercial JVMs, http://www.oracle.com/technetwork/java/javaseproducts/mission-control/java-mission-control-1998576.html http://www.oracle.com/technetwork/server-storage/solarisstudio/features/performance-analyzer-2292312.html Just providing the examples from Oracle, other vendors have similar tools. With them, you can drill down the whole JVM and interactions at the OS level and find performance bottlecks all the way down to generated Assembly code. As for memory usage, Atego JVMs run in quite memory constrained devices. Here is the tiniest of them, http://www.atego.com/products/atego-perc-ultra/ -- Paulo
There was also this one from 1998 that was very small:

http://www.javaworld.com/article/2076641/learn-java/an-introduction-to-the-java-ring.html

Java has some history running on small devices.

Cheers,
uri
Jan 26 2015
next sibling parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tue, 2015-01-27 at 04:50 +0000, uri via Digitalmars-d wrote:

 Java has some history running on small devices.

And, after all, Java (née Oak) was invented for programming white goods' operating systems. Also set-top boxes. The first tablet, Star7, appeared long before the iPad.

FTR, JavaCard has almost, but not quite, nothing to do with Java. A bit like JavaScript.

--
Russel.
Jan 26 2015
prev sibling parent reply "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
 There was also this one from 1998 that was very small

 http://www.javaworld.com/article/2076641/learn-java/an-introduction-to-the-java-ring.html

 Java has some history running on small devices.

 Cheers,
 uri
Indeed, and I remember that well. However, I was less interested in embedded devices and what Java could do under conditions of small memory, and more interested in its memory efficiency on servers managing much larger data sets. It seems to me we are still early in the unfolding of current trends, and what is true today mostly for Google and Facebook may be more widely true for others tomorrow.

I do appreciate that Java is comparable in execution speed to native code in many cases. Is its memory footprint really comparable? And if you have a small team, and not much time, how does that change things - D vs Java? I don't think the GC matters so much for D here, as it is not realtime and you can easily preallocate buffers.

But don't let me stop you talking about small devices.
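For what it's worth, the "preallocate buffers" point can be sketched in D like this (a hypothetical example; the function name, buffer size, and assumed batch-size bound are mine, not from the thread): allocate once up front, then reuse the buffer across iterations so the GC has nothing new to track in the hot loop.

```d
// Reuse one preallocated buffer across a batch-processing loop
// instead of allocating per iteration; the GC then has almost
// nothing to do while the loop runs.
double sumScaled(const(double[])[] batches)
{
    auto buf = new double[](4096);   // one up-front allocation
    double total = 0;
    foreach (batch; batches)
    {
        assert(batch.length <= buf.length);   // assumed size bound
        auto slice = buf[0 .. batch.length];  // reuse: no allocation here
        slice[] = batch[] * 2.0;              // array op, in place
        foreach (x; slice)
            total += x;
    }
    return total;
}
```

Each iteration works in a slice of the same buffer, so no garbage accumulates between collections even for large data sets.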
Jan 26 2015
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 27 January 2015 at 06:08:34 UTC, Laeeth Isharc wrote:
 There was also this one from 1998 that was very small

 http://www.javaworld.com/article/2076641/learn-java/an-introduction-to-the-java-ring.html

 Java has some history running on small devices.

 Cheers,
 uri
Indeed, and I remember that well. However I was less interested in embedded devices and what java could do under conditions of small memory, and more interested in its memory efficiency on servers in managing much larger data sets. Since it seems to me we are still early in the unfolding of current trends, and what is true today mostly for google and Facebook may be more widely true for others tomorrow. I do appreciate that java is comparable in execution speed to native code in many cases. Is its memory footprint really comparable? And if you have a small team, and not much time, how does that change things - D vs Java? I don't think for D GC matters so much as not real time and you can easily preallocate buffers. But don't let me stop you talking about small devices.
I cannot speak about small-team experiences. Our projects usually take around 30+ developers.

In terms of server applications, yes, when the applications are deployed the memory usage might not be optimal. However, that is why profiling and language knowledge are required.

Fine-tuning a Java application is no different than in other compiled languages; it just requires knowing which knobs to turn. For example, foreach allocates and a simple for does not, so choose wisely how to iterate. If scratch arrays are required multiple times, just allocate them once for the complete lifetime of the class. Always use StringBuilder for string manipulations. And many other tricks.

The thing that Java still loses on in memory management, even in commercial JVMs, is the lack of value types, since escape-analysis algorithms are not very aggressive; but support is being designed and will land partially in the Java 9 and Java 10 time frame. So by Java 10, according to the planned features, there will be value types and even the open-source JVM will have some form of JIT cache - on top of the large amount of available libraries.

As for D, it surely has its place and I am looking forward to language adoption.

--
Paulo
Jan 27 2015
parent reply "Laeeth Isharc" <Laeeth.nospam nospam-laeeth.com> writes:
  I cannot speak about small team experiences. Our projects 
 usually take around 30+ developers.
That is a decent-sized team to have to coordinate, and it puts emphasis on very different questions. The context I am thinking of is much leaner - more like special forces than the regular army (I mean in terms of flexibility and the need to respond quickly to changing circumstances) - although the sums at stake are likely comparable to larger teams (the area is hedge fund portfolio management).
 In terms of server applications, yes when the applications are 
 deployed usually the memory usage might not be optimal.
For you that is less important, and I suppose that comes from the intrinsic nature of the situation. You have beefy machines serving many users, I suppose? I am thinking of a purpose where there are only a handful of users, but the data sets may be larger than we are used to working with, requiring more work than just plain map-reduce, and where rapid iteration and prototyping are important - also minimizing cognitive overload and complexity.

A friend has written an article on big data in finance for Alpha magazine, and I will post a link here once it has been published. One problem in economics is that you have to forecast the present, because the numbers are published with a lag and themselves reflect decisions taken months previously. But markets are forward-looking and discount a future that may be imagined but cannot be understood based only on hard facts. So we need all the help we can get, particularly during an epoch where things change all the time (the EURCHF fx rate moved forty percent in a day...).

Bridgewater have taken the work of Hal Varian and applied it, using media and web analytics to get a good live cut of economic activity and inflation. Although it is not a tough problem theoretically, people don't actually do as much as they could yet - I think finance is behind tech companies, but they are catching up. Another fund that my friend writes about uses employee sentiment to pick stocks to be long and short of - they manage a few billion and have done quite well.
 However that is why profiling and language knowledge is 
 required.
Yes, I can imagine, and it sounds like not just that Java is the best option for you, but perhaps the only viable one. I am curious though - what do you think the memory footprint is as a ratio to C++ before and after fine tuning? And what proportion of development time does this take?
 Fine tuning a Java application is no different than other 
 compiled languages, it just requires knowing which knobs to 
 turn.
I liked a quote from a certain C++ guru, talking about the experience of Facebook, to the effect that a sensible first draft written in C++ would perform decently, whereas this was not always true of other languages. Now their trade-off between programmer productivity and execution efficiency is extreme, but is this merely an edge case of little relevance for the rest of us, or is it a Gibsonian case of the future being already here, just not evenly distributed? I am no expert, but I wonder if the latter may be more true than generally appreciated.
 For example, foreach allocates and a simple for does not, so 
 choose wisely how to iterate.
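(A hedged aside to make the point above concrete - in Java, a foreach over a Collection desugars to an Iterator allocation, while an indexed for avoids it. The class and numbers below are purely illustrative, not from the post:)

```java
import java.util.ArrayList;
import java.util.List;

public class IterationDemo {
    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) xs.add(i);

        // foreach over a Collection desugars to xs.iterator(), allocating
        // an Iterator object (and unboxing each Integer) on the way through:
        long foreachSum = 0;
        for (int x : xs) foreachSum += x;

        // An indexed for avoids the Iterator allocation, at the cost of a
        // bounds-checked get() per element:
        long indexedSum = 0;
        for (int i = 0; i < xs.size(); i++) indexedSum += xs.get(i);

        // Same result, different allocation profile.
        System.out.println(foreachSum == indexedSum); // prints true
    }
}
```

Note that foreach over a plain array compiles down to an indexed loop and does not allocate; the Iterator cost applies to Collections, and escape analysis can sometimes eliminate it anyway.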
Would love to hear any other insights you have on this topic. There ought to be a FAQ on getting along with the GC for fun and profit.
 The thing that Java still loses in memory management, even in 
 commercial JVMs, is lack of value types since escape analysis 
 algorithms are not very aggressive, but support is being 
 designed and will land partially in Java 9 and Java 10 time 
 frame.
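(Another hedged sketch, not from the post above: what the missing value types cost in practice is a per-element object header and pointer indirection, which is why performance-sensitive Java code often flattens small objects into parallel primitive arrays. The class and sizes here are illustrative only:)

```java
// A small "value-like" class: without value types, each instance is a
// separately heap-allocated object (header plus two doubles), and an
// array of them is an array of references to scattered objects.
final class Point {
    final double x, y;
    Point(double x, double y) { this.x = x; this.y = y; }
}

public class ValueTypeDemo {
    public static void main(String[] args) {
        int n = 100_000;

        // n references plus n individually allocated heap objects:
        Point[] boxed = new Point[n];
        for (int i = 0; i < n; i++) boxed[i] = new Point(i, -i);

        // The common workaround: flatten into primitive arrays, which is
        // roughly the contiguous layout value types would give for free:
        double[] xs = new double[n];
        double[] ys = new double[n];
        for (int i = 0; i < n; i++) { xs[i] = i; ys[i] = -i; }

        // Same data, very different memory footprint and cache behaviour.
        System.out.println(boxed[42].x == xs[42]); // prints true
    }
}
```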
That was my real point - that it really does matter in some areas, and that these areas are growing very quickly. (I appreciate that someone reading my post quickly would see the 15% power figure and focus on that, which was not so much my point - although I am still suspicious of the idea that Java will always keep up with native code without doing a lot of extra work.) People on the Slashdot thread were asking what the point of D is. But the way I saw it, where is the real competition for my use case? I can't be extravagant with memory, but I still need rapid development and productivity.

We all have a tendency to think that what we know from experience and reading is the full picture, but the world is a big place, and something needs to appeal to someone to grow - not necessarily to oneself personally. The C++ integration is the remaining piece. Otherwise it is like the old Soviet Union - this is the factory that makes the steel that builds the factory that makes the steel that... so that Vladimir may have a new car. I.e. one spends too much time in roundabout investment before one actually reaps the benefit of higher productivity.
 So by Java 10 according to the planned features, there will be 
 value types and even the open source JVM will have some form of 
 JIT cache - on top of the large number of available libraries.

 As for D, it surely has its place and I am looking forward to 
 language adoption.
Out of interest, what do you see as characterising this place (abstracting away from things that are not perfect now, but will probably be fixed in time)? And in an enterprise environment, what would you use D for today? Laeeth.
Jan 27 2015
next sibling parent reply "Ola Fosheim Grøstad" writes:
On Tuesday, 27 January 2015 at 15:09:36 UTC, Laeeth Isharc wrote:
 I cannot speak about small team experiences. Our projects 
 usually take around 30+ developers.
That is a decent-sized team to have to coordinate, and it puts emphasis on very different questions. The context I am thinking of is much leaner - more like special forces than the regular army (I mean in terms of flexibility and the need to respond quickly to changing circumstances) - although the sums at stake are likely comparable to those of larger teams (the area is hedge fund portfolio management).
Out of curiosity, what is lacking in the current commercial offerings for hedge fund management? Why not use an existing engine? Also, why D? Why not use a language or platform designed for scalability and distributed computing like http://chapel.cray.com/ ?
Jan 27 2015
parent reply "Laeeth Isharc" <laeethnospam nospamlaeeth.com> writes:
 Out of curiosity, what is lacking in the current commercial 
 offerings for hedge fund management? Why not use an existing 
 engine?
In the general sense, lots is lacking across the board. I started a macro fund in 2012 with a former colleague from Citadel, in partnership with another company, with the idea that they would provide infrastructure as they had experience in this domain. I should not say more, but let's say that I was not so happy with my choice of corporate partner. This experience made me think more carefully about the extent to which one needs to understand and control technology in my business.

One of the things that was striking was the very limited set of choices available for a portfolio management system. Macro involves trading potentially any liquid product in any developed (and sometimes less developed) market, so it doesn't fit well with product offerings that have a product silo mentality. One uses a portfolio management system very intensively, so user interface matters. But very few of the offerings available seemed even to be passable. We ended up going with these guys, who have a decent system because it was spun out of a hedge fund, but if you asked me about passable alternatives, I do not know if there are any. http://www.tfgsystems.com/

There are of course specific challenges for macro and for startup funds that may not be generally true of the domain - it is a big area and what people need may be different. Larger funds use a combination of third-party technologies and their own bits, but I am not sure that everyone is perfectly happy with what they have. I formerly jointly ran fixed income in London for Citadel, a big US fund, so I have some background in the area. Things have changed a lot since then, and I certainly wouldn't want to speak for Citadel.

It's a funny domain, because the numbers are more like those of a large business, but there are not all that many people involved. People on the investment side don't necessarily have a technology background, or the time and attention to spare to hone their specification of exactly how they want things to work. 
So one can have the strange experience of, on paper, being in a situation where one ought to have one's pick of systems, but in practice feeling starved of resources and control. This is one of the reasons I decided to spend time refreshing my technology skills, even though by conventional wisdom the basic tenets of opportunity cost and division of labour would suggest there is no point. Things have changed a lot in the past twenty years, and the only way to keep up is to get one's hands dirty now and then.

Again on the resources front - given what happened in 2008, there has been an understandable focus on reporting, compliance, and the like. It's a surprisingly brittle business, because your costs are fixed, whereas revenues depend on performance and assets; investment strategies tend intrinsically to experience an ebb and flow, whilst it is human nature to extrapolate performance, and investors, being human, tend to chase returns. So it's not today necessarily the fashion to have a large group of people developing ideas and tools that might pay off but where it is hard to demonstrate beforehand that they will. There has been a cultural change in the industry accompanying its institutionalisation, so it's today much more 'corporate' in mindset than it once was, and this shift has not only positive aspects.

In many cases, you can kind of do what you want in theory using Bloomberg. The problem is that it is closed, and has a restrictive API, so if you want to refine your analysis, that becomes limiting. But because you can do a lot that way (and it is presented very attractively), it's not so easy to justify rebuilding some functionality from scratch in order to have control. To take an almost trivial example, Bloomberg offers the ability to receive an alert by email when markets hit various price conditions (or certain very basic technical analysis indicators are triggered). 
That's valuable, but not enough, for various reasons: one needs to maintain the alerts by hand (last I checked); I don't trust email for delivery of something important; and I want to be able to consider more complex conditions. One could do this in a spreadsheet, but that's not in my opinion the way to run a business. Python is fine for this kind of thing, but I would rather engineer the whole thing in a better way, since the analytics are shared between functions.

Or to take another example, charting and data management platforms for institutional traders remain unsatisfactory. It's not easy to pull data into Bloomberg, and to do so in an automated way where your data series are organised. One wants to have all the data in one place and be able to run analyses on it, and I am not aware of a satisfactory platform for this. Quite honestly, the retail solutions are much more impressive - it's just that they don't cover what one needs as a professional.

By building it oneself, one has control and can work towards excellence. The combination of incremental improvements, small in themselves, is underestimated in our world today as a contribution to success.
 Also, why D? Why not use a language or platform designed for 
 scalability and distributed computing like 
 http://chapel.cray.com/ ?
Pragmatically, I am an old C programmer, and there is a limit to how much I can learn in the time available. It seems to me I can do everything I need in D in a way that is scalable for practical purposes. Some of what I want to do is totally straightforward scripting, and some is more ambitious. It is nice to be able to use a single language, if it's the right tool for the job (and if not, then interoperability matters). If Sociomantic (and that advertising company linked to in the blog post from a while back about using D for big data) can do what they do, I can't imagine it will be limiting for me for a while. I will check it out, but there is a beauty to starting with the smallest useful version, and knowing that you can scale if you need to.

I recognize this reply is meandering a bit, since the major topic is the use of D for big data in finance, whereas I am touching on a whole host of applications where I see it being rather useful. Laeeth.
Jan 27 2015
parent "Ola Fosheim Grøstad" writes:
On Tuesday, 27 January 2015 at 19:27:43 UTC, Laeeth Isharc wrote:
 One of the things that was striking was the very limited set of 
 choices available for a portfolio management system.  Macro 
 involves trading potentially any liquid product in any 
 developed (and sometimes less developed) market, so it doesn't 
 fit well with product offerings that have a product silo 
 mentality.  One uses a portfolio management system very 
 intensively, so user interface matters.
I have to admit that I know very little about hedge funds, so this is all quite new and intriguing for me (and therefore piques my interest! ;^). I am using Google Cloud for creating App Engine web apps, and I have wanted to experiment with Google's other cloud computing offerings for a while. Do you think that Compute Engine and BigQuery would be suitable for your needs? Or is it required that you have all your data on site locally? Google has pretty good stability (SLA), though I believe they were very slow for a few hours during the Olympics a couple of years ago (some load balancing mechanism that went bananas).
 There are of course specific challenges for macro and for 
 startup funds that may not be generally true of the domain - it 
 is a big area and what people need may be different.  Larger 
 funds use a combination of third party technologies and their 
 own bits, but I am not sure that everyone is perfectly happy 
 with what they have.
So, basically, there might be a market for tailoring solutions so that clients can gain strategic benefits?
 that they will beforehand.  There has been a cultural change in 
 the industry accompanying its institutionalisation, so it's today 
 much more 'corporate' in mindset than it once was, and this 
 shift has not only positive aspects.
Ah, I sense you are going against the stream by getting your hands dirty in a DIY way. Good! :)
 becomes limiting.  But because you can do a lot that way (and 
 it is presented very attractively) it's not so easy to justify 
 rebuilding some functionality from scratch in order to have 
 control.
So Bloomberg have basically commoditized the existing practice, in a way, thus reinforcing a particular mindset of how things ought to be done, perhaps? And maybe you see some opportunities in doing things differently? :-)
 are triggered).  That's valuable, but not enough for various 
 reasons: one needs to maintain the alerts by hand (last I 
 checked);
For legal reasons?
 in my opinion the way to run a business.  Python is fine for 
 this kind of thing, but I would rather engineer the whole thing 
 in a better way, since the analytics are shared between 
 functions.
Not sure what you mean by the "functions", do you mean technical computations or people (like different functional roles)?
 automated way where your data series are organized.  One wants 
 to have all the data in one place and be able to run analyses 
 on them, and I am not aware of a satisfactory platform 
 available for this.
How large are the datasets?
 needs as a professional.  By building it oneself, one has 
 control and can work towards excellence.  The combination of 
 incremental improvements, small in themselves, is 
 underestimated in our world today as a contribution to success.
Yes, and you can also tailor the interface to the user, so professionals can eventually get more done or be less frustrated, by getting rid of the clutter. In some cases I try to make the interface so simple that no learning (and therefore no confusion) is necessary, which is important for functions that are used seldom. But it sounds like you are creating tools for yourself, so that might not apply in your case?
 Pragmatically, I am an old C programmer, and there is a limit 
 to how much I can learn in the time available.  It seems to me
Sounds like D might be a good starting point for you - an incremental upgrade from C.
 I can do everything I need in D in a way that is scalable for 
 practical purposes.  Some of what I want to do is totally 
 straightforward scripting, and some is more ambitious.  It is 
 nice to be able to use a single language, if it's the right 
 tool for the job (and if not, then interoperability matters).  
 If sociomantic (and that advertising company linked to in the 
 blog post from a while back about using D for big data) can do 
 what they do, I can't imagine it will be limiting for me for a 
 while.  I will check it out, but there is a beauty to starting 
 with the smallest useful version, and knowing that you can 
 scale if you need to.
If you need very high performance on a single CPU then you probably need a compiler that will generate good SIMD code for you, but I suppose you could try out a tool like Intel's experimental vectorizing compiler https://ispc.github.io/ or something else that can vectorize and link it in if D is too slow for you.
 I recognize this reply is meandering a bit - since the major 
 topic is use of D for big data in finance, whereas I am 
 touching on a whole host of applications where I see it being 
 rather useful.
You might find D a bit lacking on the SIMD side - with AVX you can boost performance by a factor of 5-6x compared to non-vectorized code - but maybe D will benefit from the auto-vectorizing support that is being added to LLVM for Clang. How do you plan to do the user interface? HTML5?
Jan 27 2015
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Tuesday, 27 January 2015 at 15:09:36 UTC, Laeeth Isharc wrote:
 I cannot speak about small team experiences. Our projects 
 usually take around 30+ developers.
That is a decent-sized team to have to coordinate, and it puts emphasis on very different questions. The context I am thinking of is much leaner - more like special forces than the regular army (I mean in terms of flexibility and the need to respond quickly to changing circumstances) - although the sums at stake are likely comparable to those of larger teams (the area is hedge fund portfolio management).
 In terms of server applications, yes - when the applications are 
 deployed, the memory usage is usually not optimal.
For you that is less important, and I suppose that comes from the intrinsic nature of the situation. You have beefy machines serving many users, I suppose? I am thinking of a purpose where there are only a handful of users, but the data sets may be larger than we are used to working with, requiring more work than just plain map reduce, and where rapid iteration and prototyping are important - as is minimising cognitive overload and complexity. A friend has written an article on big data in finance for Alpha magazine, and I will post a link here once it has been published.

One problem in economics is that you have to forecast the present, because the numbers are published with a lag and themselves reflect decisions taken months previously. But markets are forward looking and discount a future that may be imagined but cannot be understood based only on hard facts. So we need all the help we can get, particularly during an epoch where things change all the time (the EURCHF FX rate moved forty percent in a day...).

Bridgewater have taken the work of Hal Varian and applied it, using media and web analytics to get a good live read of economic activity and inflation. Although it is not a tough problem theoretically, people don't actually do as much as they could yet - I think finance is behind the tech companies, but they are catching up. Another fund that my friend writes about uses employee sentiment to pick stocks to be long and short of - they manage a few billion and have done quite well.
 However that is why profiling and language knowledge is 
 required.
Yes, I can imagine, and it sounds like not just that Java is the best option for you, but perhaps the only viable one. I am curious though - what do you think the memory footprint is as a ratio to C++ before and after fine tuning? And what proportion of development time does this take?
Actually we use more than just Java: JVM languages, .NET languages, C++ (only in the realm of JNI/PInvoke/COM), JavaScript. Memory optimizations only take place if really requested by the customer, from their acceptance tests, which is seldom the case. Usually one to two sprints might be spent.
 Fine tuning a Java application is no different than other 
 compiled languages, it just requires knowing which knobs to 
 turn.
I liked a quote from a certain C++ guru, talking about the experience of Facebook, to the effect that a sensible first draft written in C++ would perform decently, whereas this was not always true of other languages. Now their trade-off between programmer productivity and execution efficiency is extreme, but is this merely an edge case of little relevance for the rest of us, or is it a Gibsonian case of the future being already here, just not evenly distributed? I am no expert, but I wonder if the latter may be more true than generally appreciated.
 For example, foreach allocates and a simple for does not, so 
 choose wisely how to iterate.
Would love to hear any other insights you have on this topic. There ought to be a FAQ on getting along with the GC for fun and profit.
JavaOne, Skills Matter and Microsoft BUILD have performance talks every now and then. Then there is the Mechanical Sympathy blog and mailing list. http://mechanical-sympathy.blogspot.de/
 The thing that Java still loses in memory management, even in 
 commercial JVMs, is lack of value types since escape analysis 
 algorithms are not very aggressive, but support is being 
 designed and will land partially in Java 9 and Java 10 time 
 frame.
That was my real point - that it really does matter in some areas, and that these areas are growing very quickly. (I appreciate that someone reading my post quickly would see the 15% power figure and focus on that, which was not so much my point - although I am still suspicious of the idea that Java will always keep up with native code without doing a lot of extra work.) People on the Slashdot thread were asking what the point of D is. But the way I saw it, where is the real competition for my use case? I can't be extravagant with memory, but I still need rapid development and productivity.

We all have a tendency to think that what we know from experience and reading is the full picture, but the world is a big place, and something needs to appeal to someone to grow - not necessarily to oneself personally. The C++ integration is the remaining piece. Otherwise it is like the old Soviet Union - this is the factory that makes the steel that builds the factory that makes the steel that... so that Vladimir may have a new car. I.e. one spends too much time in roundabout investment before one actually reaps the benefit of higher productivity.
 So by Java 10 according to the planned features, there will be 
 value types and even the open source JVM will have some form 
 of JIT cache - on top of the large number of available 
 libraries.

 As for D, it surely has its place and I am looking forward to 
 language adoption.
Out of interest, what do you see as characterising this place (abstracting away from things that are not perfect now, but will probably be fixed in time)? And in an enterprise environment, what would you use D for today? Laeeth.
To be honest, I don't see a place for D in the enterprise for my type of work. As a language geek, I hang around in multiple language forums, and I like D because I got to appreciate systems programming with memory-safe programming languages back in the 90's.

Our projects are usually based on distributed computing using JVM/.NET stacks, using mainly Oracle and MS SQL Server, web services, Hadoop, and Akka(.NET). Devops like to be able to use the respective management consoles on the servers across the network. Desktop applications tend to be built on top of Eclipse RCP/Netbeans, WPF or plain Web. The mobile space is covered with Web applications, Cordova or Xamarin.

This is my little world, but I imagine D being usable in startups or companies not constrained by other language tech stacks.

-- Paulo
Jan 27 2015