digitalmars.D - dmd-x64
- alkor (1/1) Dec 21 2009 anybody see the 64-bit version of dmd compiler?
- bearophile (4/5) Dec 21 2009 I can't see it. It must be absent.
- alkor (4/4) Dec 22 2009 it's bad
- Travis Boucher (3/9) Dec 22 2009 Look up gdc and ldc, both can target x86_64. gdc tends to be lagging
- Matt (6/17) Dec 22 2009 GDC is being maintained again. See
- Travis Boucher (18/37) Dec 22 2009 gdc is still lagging quite a bit, I've been following the goshawk
- bearophile (4/7) Dec 22 2009 Can you explain better what do you mean?
- Travis Boucher (16/25) Dec 22 2009 llvm has been designed for use for code analyzers, compiler development,...
- bearophile (9/13) Dec 22 2009 I have already done hundred of tests and benchmarks with LDC and llvm-gc...
- dsimcha (8/11) Dec 22 2009 situations it's about as good or even better (I can show a large amount ...
- bearophile (5/7) Dec 23 2009 Mostly C++/Fortran.
- Travis Boucher (10/29) Dec 22 2009 I am not trying to get into the benchmark game, for every example of gcc...
- Leandro Lucarella (19/25) Dec 23 2009 I don't know if that are accurate numbers, but 5-25% looks like a *lot* ...
- bearophile (10/29) Dec 23 2009 Vectorization can improve 2X or 3X+ the performance of certain code (typ...
- Leandro Lucarella (17/45) Dec 23 2009 Well, you are talking about a single user, but for servers, if you have ...
- retard (9/33) Dec 23 2009 Aren't able to appreciate? Where are those numbers pulled from?
- Pelle Månsson (6/39) Dec 23 2009 I think you miss the point, he said vectorization was a big deal. The
- bearophile (4/5) Dec 23 2009 You are right. It's not easy to give average numbers for any kind of C o...
- Walter Bright (3/9) Dec 23 2009 Small benchmarks tend to have a high 'beta', or variance from the norm.
- retard (8/19) Dec 23 2009 It's difficult to measure performance improvements overall in
- Walter Bright (4/11) Dec 23 2009 I find that benchmarks are useful in figuring out new ways to optimize
- alkor (7/19) Dec 24 2009 oh ... i stirred up a holy war, sorry
- Walter Bright (2/6) Dec 24 2009 D already has TLS. What exactly do you need?
- alkor (6/7) Dec 24 2009 hmm ... i don't think so.
- Pelle Månsson (4/11) Dec 24 2009 int i;
- Denis Koroskin (6/13) Dec 24 2009 "Shared data" is something which is *shared* between threads. That's exa...
- alkor (25/38) Dec 23 2009 i've tested g++, gdc & dmd on an ordinary task - processing compressed d...
- Travis Boucher (3/15) Dec 23 2009 If you can't get gdc to generate optimized code, then you are using it
- alkor (17/17) Dec 23 2009 maybe, i do something wrong, but for example:
- Jérôme M. Berger (11/29) Dec 23 2009 Because the dmd-built executable is stripped. Try to add "-s" to
- alkor (53/87) Dec 23 2009 oh no - both files aren't stripped
- Travis Boucher (4/9) Dec 23 2009 Add -frelease to gdc (if you want a fair comparison), and look at the
- alkor (2/13) Dec 23 2009
- Travis Boucher (3/5) Dec 23 2009 Thats because -frelease removes certain array bounds checking code,
alkor:
> anybody see the 64-bit version of dmd compiler?

I can't see it. It must be absent.

Bye,
bearophile
Dec 21 2009
it's bad

d's good enough to make real projects, but the compiler MUST support linux x64 as a target platform

believe, it's time to add 64-bit code generation

is it possible to take the back-end (i.e. code generation) from gcc, or is that too complicated?
Dec 22 2009
alkor wrote:
> d's good enough to make real projects, but the compiler MUST support linux x64 as a target platform
> believe, it's time to add 64-bit code generation
> is it possible to take the back-end (i.e. code generation) from gcc, or is that too complicated?

Look up gdc and ldc; both can target x86_64. gdc tends to be lagging behind (a lot) in the dmd front end, ldc not as much.
Dec 22 2009
On 12/22/09 2:34 AM, Travis Boucher wrote:
> Look up gdc and ldc; both can target x86_64. gdc tends to be lagging behind (a lot) in the dmd front end, ldc not as much.

GDC is being maintained again. See http://bitbucket.org/goshawk/gdc/wiki/Home

They are up to DMD 1.043 and there has been significant activity recently. It could take a while for them to get fully caught up, but they are making good progress.
Dec 22 2009
Matt wrote:
> GDC is being maintained again. See http://bitbucket.org/goshawk/gdc/wiki/Home
> They are up to DMD 1.043 and there has been significant activity recently. It could take a while for them to get fully caught up, but they are making good progress.

gdc is still lagging quite a bit; I've been following the goshawk branch. The problem here is he has to deal with both the major DMD changes (in 2 different D versions) and the big changes in GCC, so maintaining gdc itself would be an annoying process, since there isn't a bit of support on either end of the bridge. (DM does what's best for DM; gcc won't accept a language like D, even though it has more similarities to C/C++ than java/fortran/ada do.)

ldc on the other hand has a great structure which promotes using it as a backend for a different front end; however, it doesn't (yet) generate code nearly as good as gcc. dmd's focus seems to be more about being a reference compiler than a flexible compiler that generates great code.

Personally, I still use an old-ass gdc based on GCC 4.1.3 and DMD 1.020, because it happens to be the one that best supports my platform (FreeBSD/amd64). The only real issues I run into are a few problems with CTFE and dsss/rebuild's handling of a few compiler errors (e.g. writefln("..."; results in rebuild exploding).
Dec 22 2009
Travis Boucher:
> ldc on the other hand has a great structure which promotes using it as a backend for a different front end; however, it doesn't (yet) generate code nearly as good as gcc.

Can you explain better what you mean?

Bye,
bearophile
Dec 22 2009
bearophile wrote:
> Can you explain better what you mean?

llvm has been designed for use by code analyzers, compiler development, IDEs, etc. The APIs are well documented and well thought out, as is its IR (which is an assembler-like language itself). It is easy to use small parts of llvm due to its modular structure. Although its design promotes all sorts of optimization techniques, it's still pretty young (compared to gcc) and just doesn't have all of the optimization stuff gcc has.

gcc has evolved over a long time, and contains a lot of legacy cruft. Its IR changes on a (somewhat) regular basis, and its internals are a big hairy intertwined mess. Trying to learn one small part of how GCC works often involves learning how a lot of other unrelated things work. However, since it is so mature, many different optimization techniques have been developed, and continue to be developed as the underlying hardware changes. It also supports generating code for a huge number of targets.

When I say 'ldc' above, I really mean 'llvm' in general.
Dec 22 2009
Travis Boucher:
> Although its design promotes all sorts of optimization techniques, it's still pretty young (compared to gcc) and just doesn't have all of the optimization stuff gcc has.

I have already done hundreds of tests and benchmarks with LDC and llvm-gcc, and I'm starting to understand its optimizations. I am mostly ignorant of LLVM still, but I'm giving a bit of help tuning it; this improvement was motivated by me:
http://blog.llvm.org/2009/12/advanced-topics-in-redundant-load.html

Compared to GCC, LLVM lacks vectorization (this can be important for certain heavy numerical computing code) and profile-guided optimization (this is usually less important; it's uncommon for it to give more than a 5-25% performance improvement), but it has link-time optimization, which gcc lacks (about as important as profile-guided optimization, or a little more). LLVM still produces bad X86 floating point code, but its int/FP SSE code is about as good as GCC's or better (though it's not vectorized, so far). GCC is older and knows a few extra small/tiny optimization tricks, but in most situations they don't create a large difference in performance; they are often quite specific.

So overall LLVM may sometimes produce a little slower code, but in many situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply. If performance differences of that size are important to you, then you may want to use the Intel compiler instead of GCC, because it's sometimes better than GCC.

Bye,
bearophile
Dec 22 2009
== Quote from bearophile (bearophileHUGS lycos.com)'s article
> So overall LLVM may sometimes produce a little slower code, but in many situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply. If performance differences of that size are important to you, then you may want to use the Intel compiler instead of GCC, because it's sometimes better than GCC.

Does Intel even make compilers for any language outside the horribly crufty legacy language category (C, C++, Fortran)?
Dec 22 2009
dsimcha:
> Does Intel even make compilers for any language outside the horribly crufty legacy language category (C, C++, Fortran)?

Mostly C++/Fortran.

The problem is, those crufty legacy languages probably aren't going away in the next 20 years :-)

Bye,
bearophile
Dec 23 2009
bearophile wrote:
> So overall LLVM may sometimes produce a little slower code, but in many situations it's about as good or even better (I can show a large number of cases where LLVM is better). So the asm quality difference is smaller than you seem to imply.

I am not trying to get into the benchmark game; for every example of gcc generating better code than llvm, there could be an example of llvm generating better code than gcc. What I was trying to state is the overall differences between the two:

- ldc supports newer versions of the dmd front end than gdc.
- gdc tends to generate better code than ldc (in many cases)
- gdc supports more targets (the code generator, not the runtime)

I personally use an old-ass gdc because it works for what I need. I'd like to switch to ldc, but there is limited support for my target platform.
Dec 22 2009
bearophile, on December 23 at 00:13, you wrote:
> Compared to GCC, LLVM lacks vectorization (this can be important for certain heavy numerical computing code) and profile-guided optimization (this is usually less important; it's uncommon for it to give more than a 5-25% performance improvement)

I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.

> but it has link-time optimization, which gcc lacks (about as important as profile-guided optimization, or a little more).

And GCC has LTO too, see:
http://gcc.gnu.org/wiki/LinkTimeOptimization

I'm not arguing that GCC is way better than LLVM; I just wanted to add some missing information to this thread. I really think they are very close (sometimes one is better, sometimes the other is better), but LLVM is very young compared to GCC, so it's very promising that they are so close to GCC in so little time (and using less memory and CPU time).

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
If you want to be alone, just be alone
If you want to watch the sea, just watch the sea
But do it now, timing is the answer, do it now
Timing is the answer to success
Dec 23 2009
Leandro Lucarella:
> I don't know if those are accurate numbers, but 5-25% looks like a *lot* to me.

Vectorization can improve the performance of certain code 2X or 3X+ (typical example: matrix multiplication done right). Performance differences start to matter in practice when they are 2X or more. In most situations users aren't able to appreciate a 20% performance improvement in an application. (But small improvements are important for compiler devs because they are cumulative, so many small improvements may eventually add up to a significant difference.) Regarding the accuracy of those numbers, it's not easy to tell how accurate they are, because they are quite sensitive to the details of the code.

> And GCC has LTO too, see:
> http://gcc.gnu.org/wiki/LinkTimeOptimization

Oh, nice, I have not tried this yet. Is this going into GCC 4.5? The LTO of LLVM is pretty good; I don't know if GCC implements it equally well (I fear the answer is negative).

> I'm not arguing that GCC is way better than LLVM; I just wanted to add some missing information to this thread.

Thank you.

> I really think they are very close (sometimes one is better, sometimes the other is better), but LLVM is very young compared to GCC, so it's very promising that they are so close to GCC in so little time (and using less memory and CPU time).

LLVM devs are also very nice people; they help me when I have a problem, and they even implement large changes I ask for, often in a short enough time. Helping them is fun. This means that the compiler will probably keep improving for some more time, because in open source projects the quality of the community is important.

Bye,
bearophile
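[Editor's note: the "matrix multiplication done right" claim comes down to loop ordering. A minimal C sketch, where the function names are made up for illustration: the i-k-j ordering makes the inner loop walk memory with stride 1, which is what lets gcc's autovectorizer (e.g. -O3 -ftree-vectorize) use SSE on it.]

```c
#include <string.h>

#define N 64

/* Naive i-j-k order: the inner loop reads B by column (stride N),
   which defeats autovectorization. */
void matmul_naive(const double A[N][N], const double B[N][N], double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double s = 0.0;
            for (int k = 0; k < N; k++)
                s += A[i][k] * B[k][j];
            C[i][j] = s;
        }
}

/* i-k-j order: the inner loop reads B and writes C with stride 1,
   so the compiler can vectorize it. Same result, same number of
   multiply-adds, just a cache- and SIMD-friendly traversal. */
void matmul_fast(const double A[N][N], const double B[N][N], double C[N][N]) {
    memset(C, 0, sizeof(double) * N * N);
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++) {
            double a = A[i][k];
            for (int j = 0; j < N; j++)
                C[i][j] += a * B[k][j];
        }
}
```

Both versions accumulate the products over k in the same order, so they produce bit-identical results; only the memory traversal differs.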
Dec 23 2009
bearophile, on December 23 at 12:02, you wrote:
> In most situations users aren't able to appreciate a 20% performance improvement in an application.

Well, you are talking about a single user, but for servers, if you have to provide a minimum quality of service, a 20% difference means you can serve 20% more people, for example (not that people would have to wait 0.2 secs more, because that is not an option).

> Oh, nice, I have not tried this yet. Is this going into GCC 4.5?

I think it will be in GCC 4.5, but I don't know the details.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
Dec 23 2009
Wed, 23 Dec 2009 12:02:53 -0500, bearophile wrote:
> In most situations users aren't able to appreciate a 20% performance improvement in an application.

Aren't able to appreciate? Where are those numbers pulled from? Autovectorization mostly deals with expression optimizations in loops. You can easily calculate how much faster some code runs when it uses e.g. SSE2 instructions instead of plain old x86 instructions.

> LLVM devs are also very nice people; they help me when I have a problem, and they even implement large changes I ask for, often in a short enough time. Helping them is fun. This means that the compiler will probably keep improving for some more time, because in open source projects the quality of the community is important.

And GCC devs aren't nice people? They won't help you if you have a problem? Helping them isn't fun? GCC won't keep improving, even though it's open source? You make no sense. How much do the LLVM devs pay you for advertising them?
Dec 23 2009
On 12/23/2009 10:40 PM, retard wrote:
> Aren't able to appreciate? Where are those numbers pulled from?

I think you miss the point; he said vectorization was a big deal. The numbers on profile-guided optimization seem a bit odd, though.

> And GCC devs aren't nice people? They won't help you if you have a problem?

LLVM is way younger than GCC. In my experiments, I get mostly better performance out of clang than out of gcc. Working with LLVM seems like more fun to me.
Dec 23 2009
Pelle Månsson:
> The numbers on profile-guided optimization seem a bit odd, though.

You are right. It's not easy to give average numbers for any kind of C or C++ software. In benchmark-like code I've seen up to 20-25% improvements, but I assume that in much larger programs the situation is different. Probably if you try to compute a true average, the average percentage of improvement is lower, like 5% or less. It's a feature useful for the hot spots of the code.

Bye,
bearophile
Dec 23 2009
bearophile wrote:
> It's not easy to give average numbers for any kind of C or C++ software. In benchmark-like code I've seen up to 20-25% improvements, but I assume that in much larger programs the situation is different. Probably if you try to compute a true average, the average percentage of improvement is lower, like 5% or less.

Small benchmarks tend to have a high 'beta', or variance from the norm. The results in actual applications tend to be much closer together.
Dec 23 2009
Wed, 23 Dec 2009 17:04:49 -0800, Walter Bright wrote:
> Small benchmarks tend to have a high 'beta', or variance from the norm. The results in actual applications tend to be much closer together.

It's difficult to measure overall performance improvements in applications like image manipulation software or sound wave editors. E.g. if a complex effect now processes in 2 seconds instead of 4 hours, but all GUI event processing is 100% slower, over the workday the application might only work 10% faster overall. The user spends much more time in the interactive part of the code. From what I've read, bearophile mostly uses synthetic tests.
Dec 23 2009
retard wrote:
> It's difficult to measure overall performance improvements in applications like image manipulation software or sound wave editors. [...] From what I've read, bearophile mostly uses synthetic tests.

I find that benchmarks are useful in figuring out new ways to optimize code, but not very useful in predicting the performance of a compiler on any of my apps.
Dec 23 2009
oh ... i stirred up a holy war, sorry

each lang has weak & strong features; e.g. need high performance? - use asm and pay for it in development time

but i'm looking for a new-generation lang (not c++ - it's too "dirty") w real objects & templates and powerful multi-threading features, e.g. thread-local storage (TLS) and some concurrency features from c++0x

so, Walter, is it possible to expand the set of D's multi-threading features?
Dec 24 2009
alkor wrote:
> but i'm looking for a new-generation lang (not c++ - it's too "dirty") w real objects & templates and powerful multi-threading features, e.g. thread-local storage (TLS) and some concurrency features from c++0x
> so, Walter, is it possible to expand the set of D's multi-threading features?

D already has TLS. What exactly do you need?
Dec 24 2009
> D already has TLS. What exactly do you need?

hmm ... i don't think so. i've looked through the following info:

http://www.digitalmars.com/d/2.0/cpp0x.html#local-classes
http://www.digitalmars.com/d/2.0/migrate-to-shared.html

but "shared data" is not TLS, or do i misunderstand something? could you give a TLS example?
Dec 24 2009
On 12/24/2009 11:44 AM, alkor wrote:
> but "shared data" is not TLS, or do i misunderstand something? could you give a TLS example?

int i;

void main() {
}

compile with -vtls. :)
Dec 24 2009
On Thu, 24 Dec 2009 13:44:41 +0300, alkor <alkor au.ru> wrote:
> but "shared data" is not TLS, or do i misunderstand something? could you give a TLS example?

"Shared data" is something which is *shared* between threads. That's the exact opposite of TLS (thread-*local* storage). In D2, everything is thread-local by default:

int foo;        // thread-local
shared int bar; // shared among threads
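[Editor's note: for readers more at home in C, the same local-vs-shared distinction can be sketched with C11's `_Thread_local` and pthreads. This is a hedged illustration, not D-specific; the variable names `tls_counter` and `shared_counter` are made up.]

```c
#include <assert.h>
#include <pthread.h>

_Thread_local int tls_counter = 0;  /* each thread gets its own copy (like D2's default) */
int shared_counter = 0;             /* one copy for everyone (like D2's `shared`) */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    tls_counter++;                  /* touches this thread's private copy only */
    pthread_mutex_lock(&lock);
    shared_counter++;               /* visible to every thread, so it needs the lock */
    pthread_mutex_unlock(&lock);
    assert(tls_counter == 1);       /* unaffected by the other thread's increment */
    return 0;
}

int run_demo(void) {
    pthread_t a, b;
    pthread_create(&a, 0, worker, 0);
    pthread_create(&b, 0, worker, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    return shared_counter;          /* both increments landed on the one shared copy */
}
```

After `run_demo()`, the shared counter is 2 while the main thread's `tls_counter` is still 0, which is exactly the thread-local-by-default behavior being described for D2.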
Dec 24 2009
i've tested g++, gdc & dmd on an ordinary task - processing compressed data w using zlib

all compilers're made from sources, target - gentoo x32 i686

c++ & d codes are simplest & alike, but dmd makes a faster result than g++, and gdc loses to g++ 'cause gdc doesn't have any optimization options

gdc makes slower code than dmd and doesn't support d 2.0, so it's useless

so ... i'm waiting for dmd x64

Dec 23 2009

alkor wrote:
> gdc makes slower code than dmd and doesn't support d 2.0, so it's useless

If you can't get gdc to generate optimized code, then you are using it wrong.

Dec 23 2009

maybe i do something wrong, but for example:

$ cat main.d
int main () {
    return 0;
}

$ dmd -O -release -ofmain-dmd main.d
$ gdc -O3 main.d -o main-gdc
$ ls -l main-dmd main-gdc
-rwxr-xr-x 1 alkor alkor 123439 Dec 23 14:06 main-dmd
-rwxr-xr-x 1 alkor alkor 609363 Dec 23 14:06 main-gdc

why is main-gdc 5 times bigger than main-dmd?

any test shows dmd's superiority over gdc (and gcc)

dmd rules :)

Dec 23 2009

alkor wrote:
> why is main-gdc 5 times bigger than main-dmd?

Because the dmd-built executable is stripped. Try to add "-s" to the gdc command line or use gdmd with the same options as dmd. Moreover, since you are trying to optimize for space rather than performance, you should use -Os (or at least -O2) rather than -O3.

> any test shows dmd's superiority over gdc (and gcc)
> dmd rules :)

dmd doesn't even work on my computer. End of story :)

Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

Dec 23 2009

oh no - both files aren't stripped; after strip the difference is 2.3 times

$ strip main-gdc main-dmd
$ ls -l main-dmd main-gdc
-rwxr-xr-x 1 alkor alkor  65088 Dec 23 16:44 main-dmd
-rwxr-xr-x 1 alkor alkor 155784 Dec 23 16:44 main-gdc

and main-gdc requires libgcc_s.so.1

$ ldd main-gdc
        linux-gate.so.1 => (0xffffe000)
        libm.so.6 => /lib/libm.so.6 (0xb7ee7000)
        libgcc_s.so.1 => /usr/local/lib/libgcc_s.so.1 (0xb7edc000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xb7ec4000)
        libc.so.6 => /lib/libc.so.6 (0xb7d89000)
        /lib/ld-linux.so.2 (0xb7f2d000)

--- the next test - math performance ---

module test.performance;

import std.stdio, std.random;

const int MAX = 10000000;

int main () {
    int[] a = new int[MAX];
    int[] b = new int[MAX];
    double[] c = new double[MAX];

    for (auto i = 0; i < MAX; i++) {
        a[i] = i | 0xa1c0;
        b[i] = i | 0xbadbad;
    }

    for (auto i = 0; i < MAX; i++) {
        c[i] = (a[i] & 0x10) ?
            cast(double)a[i] * b[i] * b[i] :
            cast(double)a[i] * a[i] * b[i];
    }

    writefln("init a[9555000] 0x%08X", a[9555000]);
    writefln("init b[9555000] 0x%08X", b[9555000]);
    writefln("init c[9555000] %f", c[9555000]);
    return 0;
}

$ dmd -O -release -oftest-dmd test-performance.d && strip test-dmd
$ time ./test-dmd
init a[9555000] 0x0091EDF8
init b[9555000] 0x00BBDFBD
init c[9555000] 1449827528761239666688.000000

real    0m0.722s
user    0m0.552s
sys     0m0.168s

$ gdc -O3 test-performance.d -o test-gdc && strip test-gdc
$ time ./test-gdc
init a[9555000] 0x0091EDF8
init b[9555000] 0x00BBDFBD
init c[9555000] 1449827528761239666688.000000

real    0m0.786s
user    0m0.628s
sys     0m0.152s

so, dmd's code optimization rules

Walter made a nice lang & a good compiler - it's true

Dec 23 2009

alkor wrote:
> $ dmd -O -release -oftest-dmd test-performance.d && strip test-dmd
> $ gdc -O3 test-performance.d -o test-gdc && strip test-gdc
> so, dmd's code optimization rules

Add -frelease to gdc (if you want a fair comparison), and look at the code generated rather than running a micro benchmark on something that takes a fraction of a second to run.

Dec 23 2009

thanks, w -frelease gdc makes a good result - faster than dmd's one & a normal size

Dec 23 2009

alkor wrote:
> thanks, w -frelease gdc makes a good result - faster than dmd's one & a normal size

Thats because -frelease removes certain array bounds checking code, assertion testing, and I think a few other things.

Dec 23 2009