D - align doesn't work
- Sean L. Palmer (19/19) Apr 07 2003 Locals should be able to be aligned to the specified requirements. This...
- Helmut Leitner (26/58) Apr 07 2003 I'll add some weird facts to the topic alignment.
- Sean L. Palmer (15/40) Apr 07 2003 On x86 architecture, branch targets do considerably better when aligned ...
Locals should be able to be aligned to the specified requirements. This is vital once we start dealing with types that have hard alignment requirements (such as structs that contain 128-bit xmmwords that must be 16-byte aligned so that inline asm that references them won't get alignment faults).

That cent/ucent type would sure be handy too. ;)

Sean

    align(16) struct foo
    {
        uint x,y;
    };

    void main ()
    {
        foo f;
        uint x;
        foo f2;
        printf("foo.y.offset = %d, foo.size = %d\n", foo.y.offset, foo.size); // this is good
        printf("f = %p, f2 = %p\n", &f, &f2); // these should both be aligned to 16 bytes
        // align(16) foo f3; // syntax error, I don't understand the reasoning why.
    }
Apr 07 2003
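Until the compiler honours align() on locals, one possible workaround is to over-allocate a byte buffer on the stack and round a pointer into it up to the next 16-byte boundary. The sketch below is only an illustration under stated assumptions: it uses present-day D spellings (`.sizeof`, `.offsetof`, `std.stdio`) rather than the 2003-era `.size` and `.offset` from the post, and `alignUp` and `storage` are hypothetical names that do not come from the thread.

```d
import std.stdio : writefln;

align(16) struct Foo
{
    uint x, y;
}

// Hypothetical helper: round a pointer up to the next multiple of
// `alignment`, which is assumed to be a power of two.
void* alignUp(void* p, size_t alignment)
{
    auto addr = cast(size_t) p;
    return cast(void*) ((addr + alignment - 1) & ~(alignment - 1));
}

void main()
{
    // Over-allocate by 15 bytes so that a 16-byte-aligned Foo is
    // guaranteed to fit somewhere inside the buffer.
    ubyte[Foo.sizeof + 15] storage;
    Foo* f = cast(Foo*) alignUp(storage.ptr, 16);
    f.x = 1;
    f.y = 2;

    writefln("Foo.y.offsetof = %s, Foo.sizeof = %s", Foo.y.offsetof, Foo.sizeof);
    writefln("f = %s, 16-byte aligned: %s", f, ((cast(size_t) f) % 16) == 0);
}
```

The cost is up to 15 wasted bytes per object and access through a pointer, which is why direct language support for aligned locals would still be preferable.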
"Sean L. Palmer" wrote:
> Locals should be able to be aligned to the specified requirements. This is vital once we start dealing with types that have hard alignment requirements (such as structs that contain 128-bit xmmwords that must be 16-byte aligned so that inline asm that references them won't get alignment faults).
>
> That cent/ucent type would sure be handy too. ;)
>
> Sean
>
>     align(16) struct foo
>     {
>         uint x,y;
>     };
>
>     void main ()
>     {
>         foo f;
>         uint x;
>         foo f2;
>         printf("foo.y.offset = %d, foo.size = %d\n", foo.y.offset, foo.size); // this is good
>         printf("f = %p, f2 = %p\n", &f, &f2); // these should both be aligned to 16 bytes
>         // align(16) foo f3; // syntax error, I don't understand the reasoning why.
>     }

I'll add some weird facts to the topic of alignment.

While doing precision benchmarks I found that delegates and functions are extremely sensitive to alignment. The same functions

    void TestLoop1000A ()
    {
        for(int i=0; i<1000; i++) {
            // empty
        }
    }

    void TestLoop1000B ()
    {
        for(int i=0; i<1000; i++) {
            // empty
        }
    }

will perform quite differently (by about 20%) depending on their starting offset within a 16-byte frame (at least that is what the benchmarks seem to prove). The measurement error is below 1% (reproducibility).

I don't understand it. I'm not a hardware man. It may be CPU-dependent (I used an Athlon 750 for this).

Exactly the same effect can be seen when benchmarking the same code using closures.

--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
Apr 07 2003
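A rough way to look for the effect described above is to print each function's starting offset within a 16-byte frame next to its timing. This is a sketch only, not the benchmark that was actually run: `offsetInFrame`, `timeCalls` and the repetition count are hypothetical, it assumes a current D compiler with `std.datetime.stopwatch`, and an optimizing build may remove the empty loops altogether, so it should be built without optimization.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

void testLoop1000A() { for (int i = 0; i < 1000; i++) { /* empty */ } }
void testLoop1000B() { for (int i = 0; i < 1000; i++) { /* empty */ } }

// Hypothetical helper: offset of an address within a frame of `frame` bytes.
size_t offsetInFrame(void* address, size_t frame = 16)
{
    return (cast(size_t) address) % frame;
}

// Hypothetical helper: total microseconds spent calling `fn` `repetitions` times.
long timeCalls(void function() fn, int repetitions)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (_; 0 .. repetitions)
        fn();
    return sw.peek.total!"usecs";
}

void main()
{
    enum repetitions = 100_000; // arbitrary choice for illustration

    writefln("A: offset %s in frame, %s usecs",
             offsetInFrame(cast(void*) &testLoop1000A),
             timeCalls(&testLoop1000A, repetitions));
    writefln("B: offset %s in frame, %s usecs",
             offsetInFrame(cast(void*) &testLoop1000B),
             timeCalls(&testLoop1000B, repetitions));
}
```

Whether the two timings differ at all will depend on the CPU, the compiler and where the linker happens to place each function.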
On x86 architecture, branch targets do considerably better when aligned to at least 4-byte alignment (8 is better for modern CPUs, I think). It's even better to have your entire inner loop fit into as few cache lines as possible.

This is something the compiler should deal with internally when you specify -O; the programmer should not have to concern themselves with such petty implementation details. It's part of the standard size vs. speed tradeoff.

Or were you driving at the need for some directive to control code alignment manually?

Sean

"Helmut Leitner" <leitner hls.via.at> wrote in message news:3E913C6A.CBF417D1 hls.via.at...
> I'll add some weird facts to the topic of alignment.
>
> While doing precision benchmarks I found that delegates and functions are extremely sensitive to alignment. The same functions
>
>     void TestLoop1000A ()
>     {
>         for(int i=0; i<1000; i++) {
>             // empty
>         }
>     }
>
>     void TestLoop1000B ()
>     {
>         for(int i=0; i<1000; i++) {
>             // empty
>         }
>     }
>
> will perform quite differently (by about 20%) depending on their starting offset within a 16-byte frame (at least that is what the benchmarks seem to prove). The measurement error is below 1% (reproducibility).
>
> I don't understand it. I'm not a hardware man. It may be CPU-dependent (I used an Athlon 750 for this).
>
> Exactly the same effect can be seen when benchmarking the same code using closures.
>
> --
> Helmut Leitner leitner hls.via.at
> Graz, Austria www.hls-software.com
Apr 07 2003