c++.windows.32-bits - console apps slowdown
- Laurentiu Pancescu (17/17) May 26 2002 Hello!
- Walter (8/25) May 26 2002 It could be a stack alignment issue. Check that the stack is aligned to ...
- Laurentiu Pancescu (57/61) May 26 2002 Thanks, Walter! The difference seems unrelated to stack alignment, but ...
- Walter (7/34) May 26 2002 Ok, I'll check it out. -Walter
- Laurentiu Pancescu (16/63) May 27 2002 Hello Walter!
- Laurentiu Pancescu (7/8) Jun 03 2002 Gee... I hope you didn't get angry with me about this! At least, does i...
- Walter (6/13) Jun 03 2002 I'm annoyed with myself, not you, for there being a bug in the alignment ...
- Laurentiu Pancescu (16/30) Jun 03 2002 You got me scared for a moment! <g>. I didn't create those small demos ...
- Walter (7/16) Jun 03 2002 There aren't any docs other than the library source code.
Hello! I've written a floating-point test as a console application. It displays nothing during the calculations, only the result and the elapsed time after it completes. The big surprise is that if I run it in a WinME console window, it's 50% slower than if I run the same EXE under "rxvt for Win32" from the MSYS package (www.mingw.org). And this happens with any console-mode Win32 application, not only the ones compiled with DMC. Now the weird part: DOS-extended applications run at the same speed under both shells; there's no additional slowdown in the Win32 console. I've also tried Cygwin bash, and the slowdown is the same, so I assume it's related to the console emulation and not to the fact that COMMAND.COM is a 16-bit DOS application. Switching to full screen doesn't change anything, either. Did anyone else notice this problem? Does it also happen on the WinNT family (Win2000/XP)? Regards, Laurentiu
May 26 2002
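A test of the kind described in the post above, silent during the calculation and printing only the result and the elapsed time at the end, could look roughly like the sketch below. The workload (midpoint-rule integration of 1/x over [1, 2]) and the names `integrate_reciprocal` and `timed_run` are assumptions for illustration; the actual test program is not shown in the thread. The sketch uses `std::chrono::steady_clock`, which is C++11 and therefore newer than the compilers discussed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical workload: midpoint-rule approximation of the integral of 1/x
// over [1, 2]. Nothing is printed while the loop runs.
double integrate_reciprocal(long steps)
{
    assert(steps > 0);
    const double h = 1.0 / static_cast<double>(steps);
    double sum = 0.0;
    for (long i = 0; i < steps; ++i) {
        const double x = 1.0 + (static_cast<double>(i) + 0.5) * h;
        sum += 1.0 / x;
    }
    return sum * h;
}

// Runs the workload once, then prints the result and the elapsed wall-clock
// time. All console output happens after the timed region.
double timed_run(long steps)
{
    const auto start = std::chrono::steady_clock::now();
    const double result = integrate_reciprocal(steps);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    std::printf("Result is %g\nElapsed time: %.3f s\n", result, elapsed.count());
    return result;
}
```

Calling `timed_run` from `main()` with a large step count, once from each shell, would give the kind of comparison the post describes. Because the timed region contains no console I/O, any difference between shells would have to come from something other than the program's own output.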
It could be a stack alignment issue. Check that the stack is aligned to 16 bytes on both. -Walter
May 26 2002
Thanks, Walter! The difference seems unrelated to stack alignment, but I noticed that the DMC stack alignment problem isn't completely solved (DMC tries to keep 8-byte alignment, AFAIK). Please try the enclosed demo: "sc -o -6 -ff int1 main" (or int2, or int3, it doesn't matter; all of them are affected).

Here's the output of int1.exe:

    Stack of main() 0x64fde8
    Stack of integrate: 0x64fda4 --> misalignment generated by main()
    Result is 6.93147e+06
    Elapsed time: 1.719 s

If I compile for X32 (sc -o -6 -ff -mx int1 main x32.lib), I get:

    Stack of main() 0xfff98fdc --> note misalignment, maybe we should ask Doug for para alignment?
    Stack of integrate: 0xfff98f90 --> next misalignment makes things right :)
    Result is 6.93147e+006
    Elapsed time: 1.32 s

It seems optimizing main() does something bad: everything is fine if you compile main.cpp with no optimizations.

Best regards, Laurentiu

[Attachment: laur_align.zip (uuencoded data omitted)]
May 26 2002
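The addresses quoted in the post above can be checked for alignment by masking off their low bits. The following is a minimal sketch, not the code from the attached demo: `misalignment` and `report_stack` are hypothetical names, and the printed format only loosely imitates the demo's output (the exact text produced by `%p` is implementation-defined).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Number of bytes by which `address` exceeds the previous multiple of
// `boundary`; 0 means the address is aligned. `boundary` must be a power
// of two so that the low bits can be isolated with a mask.
std::size_t misalignment(std::uintptr_t address, std::size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return static_cast<std::size_t>(address & (boundary - 1));
}

// Hypothetical helper: prints the address of a local variable and how far
// it is from an 8-byte boundary.
void report_stack(const char* label, const void* local)
{
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(local);
    std::printf("Stack of %s: %p (off by %u from 8-byte alignment)\n",
                label, const_cast<void*>(local),
                static_cast<unsigned>(misalignment(address, 8)));
}
```

Applied to the values in the post, `misalignment(0x64fde8, 8)` is 0 and `misalignment(0x64fda4, 8)` is 4 for the Win32 build, while for the X32 build `misalignment(0xfff98fdc, 8)` is 4 and `misalignment(0xfff98f90, 16)` is 0, which matches the annotations next to the output. In a program, `report_stack("main()", &x)` would be called with the address of a local `double x`.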
Ok, I'll check it out. -Walter
May 26 2002
Hello Walter! I forgot to tell you my DMC version: 8.28; I downloaded it yesterday (CD update). It seems stack alignment works fine if I put everything in a single CPP file. In this example, the performance improvement from proper alignment isn't that big, as you can see from the X32 program's output, probably because everything fits in the microprocessor's cache and misaligned memory accesses are not performed inside the calculation loop. Just a guess... :) Regards, Laurentiu
May 27 2002
Gee... I hope you didn't get angry with me about this! At least, does it also happen on your machine when compiling my test programs? I've noticed that the stack is correctly aligned in many more programs (Win32 only; X32 is something different), so I'm not sure this wasn't just a special case. Laurentiu
Jun 03 2002
I'm annoyed with myself, not you, for there being a bug in the alignment process. I'm just glad you took the time to point it out and prepare a test case for me.
Jun 03 2002
You got me scared for a moment! <g>. I didn't create those small demos for testing alignment; I just read a nice article at www.oonumerics.org about different strategies for achieving performance in C++ similar to that of FORTRAN. So I wrote those 3 small programs and compared the results (also between different compilers). I hope you'll find that bug... Oh, related to X32: do you have some docs on what's required from an X32 drop-in replacement? It seems the extender is no longer maintained, and maybe we could come up with an open-source, state-of-the-art one. Regards, Laurentiu
Jun 03 2002
There aren't any docs other than the library source code.
Jun 03 2002