digitalmars.D.ldc - printf performance problem
- bearophile (74/74) Mar 17 2014 I have reduced another performance problem, Windows 32 bit.
- David Nadlinger (11/13) Mar 18 2014 LDC/Win32 uses the MinGW output/formatting functions, as e.g. the
- bearophile (41/50) Mar 18 2014 I am compiling the C code with the same gcc that ldc2 is using on
- David Nadlinger (25/33) Mar 19 2014 Doing this would be easy, in fact that's how it was before we started
- bearophile (17/32) Mar 19 2014 I sometimes use printf when I have to print lot of data because
I have reduced another performance problem, Windows 32 bit.

--------------------------

A test C program:

    #include <stdio.h>
    int main() {
        for (double i = 0; i < 200000; i++)
            printf("%f\n", i);
        return 0;
    }

--------------------------

A similar D program:

    import core.stdc.stdio: printf;
    int main() {
        for (double i = 0; i < 200000; i++)
            printf("%f\n", i);
        return 0;
    }

--------------------------

I compile with:

    gcc -std=gnu99 -Ofast -flto -s test1.c -o test1
    gcc 4.8.0

    ldmd2 -wi -O -release -inline -noboundscheck test2.d
    ldc2 0.13.0-alpha1

If I redirect the output to file, the run-times for me are about 0.30 seconds for the C version, and about 1.12 seconds for the D version.

--------------------------

GCC asm:

    _main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $32, %esp
        call    ___main
        fldz
        .p2align 4,,7
    L4:
        fstl    4(%esp)
        movl    $LC1, (%esp)
        fstpl   24(%esp)
        call    _printf
        fldl    24(%esp)
        fadds   LC2
        flds    LC3
        fcomip  %st(1), %st
        ja      L4
        fstp    %st(0)
        xorl    %eax, %eax
        leave
        ret

--------------------------

LDC2 asm:

    __Dmain:
        pushl   %esi
        subl    $24, %esp
        xorps   %xmm0, %xmm0
        movl    $200000, %esi
        .align  16, 0x90
    LBB0_1:
        movsd   %xmm0, 16(%esp)
        movsd   %xmm0, 4(%esp)
        movl    $_.str, (%esp)
        calll   ___mingw_printf
        movsd   16(%esp), %xmm0
        addsd   LCPI0_0, %xmm0
        decl    %esi
        jne     LBB0_1
        xorl    %eax, %eax
        addl    $24, %esp
        popl    %esi
        ret

--------------------------

Bye,
bearophile
Mar 17 2014
On Tue 18 Mar 2014 04:31:30 AM CET, bearophile wrote:
> I have reduced another performance problem, Windows 32 bit.
>
> A test C program: […]

LDC/Win32 uses the MinGW output/formatting functions, as e.g. the printf() from the MSCRT can't handle reals. I don't really see a reason why the LDC-generated code should be that much slower otherwise (you can verify this by just calling __mingw_printf directly or I think also by passing the -posix/-ansi flags to GCC).

As to why the MinGW printf() is actually slower, no idea. Probably just a question of an optimized-to-hell version against a simple hack to make the C99 format specifiers work.

David
Mar 18 2014
David Nadlinger:
> LDC/Win32 uses the MinGW output/formatting functions, as e.g. the printf() from the MSCRT can't handle reals.

I am compiling the C code with the same gcc that ldc2 is using on default on Windows, as explained in the ldc2 installation procedure.

> I don't really see a reason why the LDC-generated code should be that much slower otherwise

Nor I.

> (you can verify this by just calling __mingw_printf directly or I think also by passing the -posix/-ansi flags to GCC).

OK. If I compile this C code:

    #include <stdio.h>
    int main() {
        double i;
        for (i = 0; i < 200000; i++)
            printf("%f\n", i);
        return 0;
    }

With (no optimizations):

    gcc -ansi test2.c -o test2

The run-time is about the same (about 0.31 seconds). But if I compile this code:

    #include <stdio.h>
    int main() {
        double i;
        for (i = 0; i < 200000; i++)
            __mingw_printf("%f\n", i);
        return 0;
    }

The run-time is about 1.13 seconds, that is the same as the D version. If I compile this version that uses __mingw_printf with:

    gcc -Ofast -flto -s -ansi test2.c -o test2

The run-time is still about 1.13 seconds. So the experiment you have suggested has given an interesting answer :-)

> As to why the MinGW printf() is actually slower, no idea. Probably just a question of an optimized-to-hell version against a simple hack to make the C99 format specifiers work.

I don't fully understand this part of your answer. And I don't understand how to fix the D code to make it about four times faster. Can you fix ldc2 to use the same printing function as used on default by C code compiled by GCC? When I have to write a lot of floating point values this could be a significant difference in run-time.

Bye,
bearophile
Mar 18 2014
On 19 Mar 2014, at 1:30, bearophile wrote:
> David Nadlinger:
>> LDC/Win32 uses the MinGW output/formatting functions, as e.g. the printf() from the MSCRT can't handle reals.
>
> I am compiling the C code with the same gcc that ldc2 is using on default on Windows, as explained in the ldc2 installation procedure. […] Can you fix ldc2 to use the same printing function as used on default by C code compiled by GCC?

Doing this would be easy, in fact that's how it was before we started serious work on LDC/MinGW. However, as mentioned in my last message, GCC by default uses the printf() function from the Microsoft C runtime, which can't handle reals (i.e. C long doubles) and some of the C99 format specifiers. Thus, using the MSCRT functions will cause serious problems for many D programs (and the DMD/Phobos test suites), as all the floating point formatting and printing functions in Phobos depend on the C runtime functions like snprintf().

long doubles and C99 format strings are not as widespread in C code, thus you explicitly have to define __USE_MINGW_ANSI_STDIO to get printf() and friends mapped to the MinGW functions in C code (I would have thought that some command line switches also affect that, but apparently I misremembered).

If you are actually using C printf() directly in your program (and not the Phobos formatting functions) and the Microsoft runtime covers the format specifiers you need, then you can just manually write the function declarations in question ("extern(C) int printf(const char*, …)"). core.stdc.stdio merely contains an alias from __mingw_printf() to printf().

If the Phobos string formatting performance is not good enough for you, then the best thing to do would be to write a D floating point formatting implementation and finally ditch the C formatting functions.

David
Mar 19 2014
David Nadlinger:
> However, as mentioned in my last message, GCC by default uses the printf() function from the Microsoft C runtime, which can't handle reals (i.e. C long doubles) and some of the C99 format specifiers.

I missed that part of your compressed answer :-)

> If you are actually using C printf() directly in your program (and not the Phobos formatting functions)

I sometimes use printf when I have to print a lot of data because writeln is usually quite a bit slower.

> and the Microsoft runtime covers the format specifiers you need, then you can just manually write the function declarations in question ("extern(C) int printf(const char*, …)"). core.stdc.stdio merely contains an alias from __mingw_printf() to printf().

Good, this D code runs in 0.31 seconds, about the same as the C version:

    extern(C) nothrow int printf(const char*, ...);
    int main() {
        for (double i = 0; i < 200000; i++)
            printf("%f\n", i);
        return 0;
    }

This is enough for my purposes, thank you (I don't need to print large amounts of reals).

> If the Phobos string formatting performance is not good enough for you, then the best thing to do would be to write a D floating point formatting implementation and finally ditch the C formatting functions.

Printing floating point values correctly and quickly is a very complex project :-)

Bye,
bearophile
Mar 19 2014