digitalmars.D - Stack Alignment for Numerics
- dsimcha (8/8) Mar 03 2010 Does the freezing of the D2 language spec and the publication of TDPL pr...
- Don (19/28) Mar 03 2010 I agree. See bugzilla 2278. Something that's changed since this issue
- dsimcha (4/32) Mar 03 2010 Possibly stupid question: Would aligning each stack frame on 8-byte bou...
- bearophile (6/9) Mar 03 2010 Isn't the default alignment on osx 16 bytes?
Does the freezing of the D2 language spec and the publication of TDPL preclude fixing low level ABI issues like stack alignment? I have some numerics code that is taking a massive performance hit because the stack keeps ending up aligned such that none of my doubles are aligned on 8-byte boundaries, resulting in something like a 2x performance hit. If not, this is a pretty serious performance problem. Is there a "standard" solution to the stack alignment problem that will allow consistently good performance on numerics code that uses double-precision floats?
Mar 03 2010
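One way to confirm the symptom described in the post above is to print the addresses of a few stack-allocated doubles and test them against an 8-byte boundary. This is only a minimal sketch: the helper name `isAligned` is hypothetical, not something from the thread or from Phobos, and the result will vary with compiler, flags, and platform.

```d
import std.stdio : writefln;

// Hypothetical helper: true when the address is a multiple of `boundary`,
// which is assumed to be a power of two.
bool isAligned(const(void)* p, size_t boundary = 8)
{
    return (cast(size_t) p & (boundary - 1)) == 0;
}

void main()
{
    // Two stack-allocated doubles; where they land depends on how the
    // compiler lays out this frame and how the caller left the stack.
    double a = 1.0;
    double b = 2.0;
    writefln("a at %s, 8-byte aligned: %s", &a, isAligned(&a));
    writefln("b at %s, 8-byte aligned: %s", &b, isAligned(&b));
}
```

Running this from a few different call depths should show whether the alignment really does change from call to call.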
dsimcha wrote:
> Does the freezing of the D2 language spec and the publication of TDPL preclude fixing low level ABI issues like stack alignment? I have some numerics code that is taking a massive performance hit because the stack keeps ending up aligned such that none of my doubles are aligned on 8-byte boundaries, resulting in something like a 2x performance hit. If not, this is a pretty serious performance problem. Is there a "standard" solution to the stack alignment problem that will allow consistently good performance on numerics code that uses double-precision floats?

I agree. See bugzilla 2278. Something that has changed since this issue was last raised is that the DMD backend now has 8-byte stack alignment for the Mac compiler. So the hard work has already been done. All that would be required to support it on Windows and Linux as well is to enable it, and to align the stack to 8 bytes around every extern(C) call.

As a workaround, I've been doing things like:

// Align the stack to a multiple of 64 bytes
void main()
{
    asm
    {
        naked;
        mov EBP, ESP;
        and ESP, 0xFFFF_FFC0;
        call alignedmain;
        mov ESP, EBP;
        ret;
    }
}
Mar 03 2010
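The `and ESP, 0xFFFF_FFC0` step in the workaround above clears the low six bits of the stack pointer, which rounds it down to a multiple of 64. Since the x86 stack grows downward, rounding down only moves it into unused space. The arithmetic can be checked in plain D; `alignDown` and the sample addresses are made up for illustration.

```d
// Hypothetical helper: round `addr` down to a multiple of `boundary`,
// which is assumed to be a power of two. For boundary = 64 this is the
// same masking that `and ESP, 0xFFFF_FFC0` performs on a 32-bit ESP.
uint alignDown(uint addr, uint boundary)
{
    return addr & ~(boundary - 1);
}

void main()
{
    // The mask for a 64-byte boundary is exactly the constant in the asm.
    assert(~(64u - 1) == 0xFFFF_FFC0);

    // An arbitrary, made-up stack pointer value gets rounded down...
    assert(alignDown(0x0012_FF7C, 64) == 0x0012_FF40);

    // ...and an already aligned value is left unchanged.
    assert(alignDown(0x0012_FF40, 64) == 0x0012_FF40);
}
```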
== Quote from Don (nospam nospam.com)'s article
> dsimcha wrote:
>> Does the freezing of the D2 language spec and the publication of TDPL preclude fixing low level ABI issues like stack alignment? I have some numerics code that is taking a massive performance hit because the stack keeps ending up aligned such that none of my doubles are aligned on 8-byte boundaries, resulting in something like a 2x performance hit. If not, this is a pretty serious performance problem. Is there a "standard" solution to the stack alignment problem that will allow consistently good performance on numerics code that uses double-precision floats?
> I agree. See bugzilla 2278. Something that's changed since this issue was last raised, is that the DMD backend now has 8-byte stack alignment for the Mac compiler. So the hard work has already been done. All that would be required to support it on Windows and Linux as well, is to enable it, and to align the stack to 8 bytes around every extern(C) call.
> As a workaround, I've been doing things like:
> // Align the stack to a multiple of 64 bytes
> void main()
> {
>     asm
>     {
>         naked;
>         mov EBP, ESP;
>         and ESP, 0xFFFF_FFC0;
>         call alignedmain;
>         mov ESP, EBP;
>         ret;
>     }
> }

Possibly stupid question: Would aligning each stack frame on 8-byte boundaries be enough to ensure that each individual stack-allocated double is aligned on 8-byte boundaries?
Mar 03 2010
Don:
> Something that's changed since this issue was last raised, is that the DMD backend now has 8-byte stack alignment for the Mac compiler.

Isn't the default alignment on OS X 16 bytes?
http://blogs.embarcadero.com/eboling/2009/05/20/5607

I think it would be good to run some experiments and benchmarks (on Linux or Windows) to compare a few alternative implementation ideas, for example using LLVM.

Bye,
bearophile
Mar 03 2010
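A rough starting point for the kind of experiment suggested above might look like the following. It is a sketch under several assumptions: it targets x86, where misaligned loads are permitted; it uses a heap buffer rather than the stack so the misalignment can be forced deliberately; and `sumAt`, the buffer size, and the repetition count are all made up for illustration. The timings depend heavily on CPU, compiler, and optimization flags.

```d
import core.stdc.stdlib : free, malloc;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

// Hypothetical kernel: sums `n` doubles starting at `p`, `reps` times over.
double sumAt(const(double)* p, size_t n, size_t reps)
{
    double total = 0;
    foreach (r; 0 .. reps)
        foreach (i; 0 .. n)
            total += p[i];
    return total;
}

void main()
{
    enum size_t n = 1024;
    enum size_t reps = 100_000;

    // Over-allocate so both an aligned and a shifted view fit in the buffer.
    auto raw = cast(ubyte*) malloc(n * double.sizeof + 16);
    assert(raw !is null);
    scope (exit) free(raw);

    // Round the start up to an 8-byte boundary, then shift by 4 bytes
    // to get a view where no double sits on an 8-byte boundary.
    auto base = cast(ubyte*)((cast(size_t) raw + 7) & ~cast(size_t) 7);
    auto aligned = cast(double*) base;
    auto misaligned = cast(double*)(base + 4);

    // The two views overlap, so each is filled just before it is timed.
    foreach (i; 0 .. n) aligned[i] = i;
    auto sw = StopWatch(AutoStart.yes);
    immutable s1 = sumAt(aligned, n, reps);
    immutable t1 = sw.peek.total!"msecs";

    foreach (i; 0 .. n) misaligned[i] = i;
    sw.reset();
    immutable s2 = sumAt(misaligned, n, reps);
    immutable t2 = sw.peek.total!"msecs";

    writefln("aligned:    %s ms (sum %s)", t1, s1);
    writefln("misaligned: %s ms (sum %s)", t2, s2);
}
```

Printing the sums keeps the compiler from discarding the loops. A more careful comparison would repeat each measurement several times and try different compilers and flags.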