digitalmars.D - Aligning data in memory
- Peter Alexander (17/17) Sep 17 2011 I posted this in d.learn, and also on stackoverflow.com with no
- Adam D. Ruppe (9/9) Sep 17 2011 Perhaps:
- Peter Alexander (3/12) Sep 17 2011 If I am correct, that only aligns it within the struct, it doesn't align...
- Rory McGuire (4/13) Sep 19 2011 surely you would have to use
- Peter Alexander (3/17) Sep 20 2011 v has offset 0 in the struct, so &v.v == &v, which is all the inline asm...
- Rory McGuire (4/32) Sep 21 2011 Would that even be true in the case where you specify an alignment (keep...
- Peter Alexander (5/29) Sep 21 2011 I could be wrong, but I think so.
- Trass3r (6/15) Sep 19 2011 That align directive is fucked up anyways.
- Robert Jacques (4/21) Sep 17 2011 It depends. OS X requires 16-byte alignment, which DMD complies with. So...
I posted this in d.learn, and also on stackoverflow.com, with no satisfactory answer. Can anyone help me with this?

http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

---

Is there a way to align data on the stack? In particular, I want to create a 16-byte aligned array of floats to load into XMM registers using movaps, which is significantly faster than movups.

e.g.

    void foo()
    {
        float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
        asm
        {
            movaps XMM0, v; // v must be 16-byte aligned for this to work.
            ...
        }
    }
Sep 17 2011
Perhaps:

    void foo()
    {
        struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
        V v;
        asm
        {
            movaps XMM0, v;
        }
    }

It compiles, but I'm not sure if it's actually correct.
Sep 17 2011
On 17/09/11 7:11 PM, Adam D. Ruppe wrote:
> Perhaps:
>
>     void foo()
>     {
>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>         V v;
>         asm
>         {
>             movaps XMM0, v;
>         }
>     }
>
> It compiles, but I'm not sure if it's actually correct.

If I am correct, that only aligns it within the struct; it doesn't align the struct itself.
Sep 17 2011
Surely you would have to use

    movaps XMM0, v.v;

because the alignment would only happen inside the struct?

On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:
> Perhaps:
>
>     void foo()
>     {
>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>         V v;
>         asm
>         {
>             movaps XMM0, v;
>         }
>     }
>
> It compiles, but I'm not sure if it's actually correct.
Sep 19 2011
On 19/09/11 9:17 AM, Rory McGuire wrote:
> Surely you would have to use
>
>     movaps XMM0, v.v;
>
> because the alignment would only happen inside the struct?
>
> On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:
>> Perhaps:
>>
>>     void foo()
>>     {
>>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>>         V v;
>>         asm
>>         {
>>             movaps XMM0, v;
>>         }
>>     }
>>
>> It compiles, but I'm not sure if it's actually correct.

v has offset 0 in the struct, so &v.v == &v, which is all the inline asm cares about.
Sep 20 2011
Would that even be true in the case where you specify an alignment (keeping in mind that the alignment is for that specific variable)?

On Tue, Sep 20, 2011 at 7:25 PM, Peter Alexander <peter.alexander.au gmail.com> wrote:
> On 19/09/11 9:17 AM, Rory McGuire wrote:
>> Surely you would have to use
>>
>>     movaps XMM0, v.v;
>>
>> because the alignment would only happen inside the struct?
>>
>> On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:
>>> Perhaps:
>>>
>>>     void foo()
>>>     {
>>>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>>>         V v;
>>>         asm
>>>         {
>>>             movaps XMM0, v;
>>>         }
>>>     }
>>>
>>> It compiles, but I'm not sure if it's actually correct.
>
> v has offset 0 in the struct, so &v.v == &v, which is all the inline asm cares about.
Sep 21 2011
I could be wrong, but I think so.

As I understand it, align(N) only aligns the member *within the structure*. If you are at offset 0, you are already aligned for every N, so I don't see why it would add padding before the first member of a struct.

On 21/09/11 11:22 AM, Rory McGuire wrote:
> Would that even be true in the case where you specify an alignment (keeping in mind that the alignment is for that specific variable)?
>
> On Tue, Sep 20, 2011 at 7:25 PM, Peter Alexander <peter.alexander.au gmail.com> wrote:
>> On 19/09/11 9:17 AM, Rory McGuire wrote:
>>> Surely you would have to use
>>>
>>>     movaps XMM0, v.v;
>>>
>>> because the alignment would only happen inside the struct?
>>>
>>> On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:
>>>> Perhaps:
>>>>
>>>>     void foo()
>>>>     {
>>>>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>>>>         V v;
>>>>         asm
>>>>         {
>>>>             movaps XMM0, v;
>>>>         }
>>>>     }
>>>>
>>>> It compiles, but I'm not sure if it's actually correct.
>>
>> v has offset 0 in the struct, so &v.v == &v, which is all the inline asm cares about.
Sep 21 2011
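The offset argument in the post above can be checked directly. What follows is a minimal sketch, not code posted in the thread: it reuses the struct V from the quoted example and asserts that the member sits at offset 0, so the struct and its first member share an address. It makes no assumption about whether the compiler places the struct itself on a 16-byte boundary on the stack; the last two lines only print what the compiler and platform in use actually do.

```d
import std.stdio;

// Same struct as in the example quoted above.
struct V
{
    align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
}

void main()
{
    V s;

    // The first (and only) member starts at offset 0 ...
    static assert(V.v.offsetof == 0);
    // ... so the struct and the member have the same address.
    assert(cast(void*) &s == cast(void*) &s.v);

    // Report what this compiler and platform do; no particular
    // result is assumed here.
    writeln("V.alignof = ", V.alignof);
    writeln("16-byte aligned on the stack: ", (cast(size_t) &s & 15) == 0);
}
```

If the second line prints false, movaps on that variable would fault, which is the concern raised earlier in the thread.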
On 17.09.2011, 20:11, Adam D. Ruppe <destructionator gmail.com> wrote:
> Perhaps:
>
>     void foo()
>     {
>         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
>         V v;
>         asm
>         {
>             movaps XMM0, v;
>         }
>     }
>
> It compiles, but I'm not sure if it's actually correct.

That align directive is fucked up anyways. Why does it even exist if the value you specify doesn't change anything?

I can't make sense out of the description:
http://www.d-programming-language.org/attribute.html#align
Sep 19 2011
On Sat, 17 Sep 2011 14:01:19 -0400, Peter Alexander <peter.alexander.au gmail.com> wrote:
> I posted this in d.learn, and also on stackoverflow.com, with no satisfactory answer. Can anyone help me with this?
>
> http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d
>
> ---
>
> Is there a way to align data on the stack? In particular, I want to create a 16-byte aligned array of floats to load into XMM registers using movaps, which is significantly faster than movups.
>
> e.g.
>
>     void foo()
>     {
>         float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
>         asm
>         {
>             movaps XMM0, v; // v must be 16-byte aligned for this to work.
>             ...
>         }
>     }

It depends. OS X requires 16-byte alignment, which DMD complies with. So on Mac the above code is okay. However, on PC, the only way to get aligned memory is to a) use the heap or b) request extra stack space and align it yourself (i.e. declare a float[7] and then slice it appropriately).

The other option is to just use movups. movups on aligned data had (IIRC) the same speed as movaps on my CPU (Core 2), and I'd really be surprised if this weren't true on any modern architecture. (That said, movups does slow down on unaligned memory.)

Also, you could use alloca or a region allocator to get aligned memory.
Sep 17 2011
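Option (b) from the post above, declaring a float[7] and slicing it, might look like the following. This is a minimal sketch, not code from the thread: the helper name alignedStart is made up for illustration, and it assumes float is 4 bytes with 4-byte natural alignment, so that a 16-byte boundary always falls on one of the first four elements of the buffer.

```d
import std.stdio;

// Hypothetical helper (not from the thread): given a pointer to the first
// element of a float buffer, return the index of the first element that
// lies on a 16-byte boundary.
size_t alignedStart(const(float)* p)
{
    auto addr = cast(size_t) p;
    // Bytes to skip to reach the next 16-byte boundary (0 if already there).
    size_t skipBytes = (16 - (addr & 15)) & 15;
    return skipBytes / float.sizeof;
}

void main()
{
    // 7 floats: up to 3 elements of padding plus the 4 we actually need.
    float[7] buf;

    auto first = alignedStart(buf.ptr);
    float[] v = buf[first .. first + 4];
    v[] = [1.0f, 2.0f, 3.0f, 4.0f];

    // The slice now starts on a 16-byte boundary.
    assert((cast(size_t) v.ptr & 15) == 0);
    writeln(v);
}
```

The pointer v.ptr can then be handed to movaps. How to reference it from inline asm depends on the compiler and target, so that part is left out here.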