
digitalmars.D - Aligning data in memory

reply Peter Alexander <peter.alexander.au gmail.com> writes:
I posted this in d.learn, and also on stackoverflow.com with no 
satisfactory answer. Can anyone help me with this?

http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

---

Is there a way to align data on the stack? In particular, I want to 
create a 16-byte aligned array of floats to load into XMM registers 
using movaps, which is significantly faster than movups.

e.g.

void foo()
{
     float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
     asm
     {
         movaps XMM0, v; // v must be 16-byte aligned for this to work.
         ...
     }
}
Sep 17 2011
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
Perhaps:

void foo() {
        struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
	V v;
	asm {
		movaps XMM0, v;
	}
}


It compiles, but I'm not sure if it's actually correct.
Sep 17 2011
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 17/09/11 7:11 PM, Adam D. Ruppe wrote:
 Perhaps:

 void foo() {
          struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
 	V v;
 	asm {
 		movaps XMM0, v;
 	}
 }


 It compiles, but I'm not sure if it's actually correct.
If I am correct, that only aligns the member within the struct; it doesn't align the struct itself.
Sep 17 2011
prev sibling next sibling parent reply Rory McGuire <rjmcguire gmail.com> writes:
surely you would have to use
 movaps XMM0, v.v;

 because the alignment would only happen inside the struct?


Sep 19 2011
parent reply Peter Alexander <peter.alexander.au gmail.com> writes:
On 19/09/11 9:17 AM, Rory McGuire wrote:
 surely you would have to use
   movaps XMM0, v.v;

   because the alignment would only happen inside the struct?


 On Sat, Sep 17, 2011 at 8:11 PM, Adam D. Ruppe <destructionator gmail.com> wrote:

     Perhaps:

     void foo() {
             struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
             V v;
             asm {
                     movaps XMM0, v;
             }
     }


     It compiles, but I'm not sure if it's actually correct.
v has offset 0 in the struct, so &v.v == &v, which is all the inline asm cares about.
Sep 20 2011
parent reply Rory McGuire <rjmcguire gmail.com> writes:
Would that even be true in the case where you specify an alignment (keeping
in mind that the alignment is for that specific variable)?



Sep 21 2011
parent Peter Alexander <peter.alexander.au gmail.com> writes:
I could be wrong, but I think so.

As I understand it, align(N) only aligns the member *within the structure*.

If you are at 0 offset, you are aligned on all N already, so I don't see 
why it would add padding before the first member of a struct.


Sep 21 2011
prev sibling parent Trass3r <un known.com> writes:
Am 17.09.2011, 20:11 Uhr, schrieb Adam D. Ruppe  
<destructionator gmail.com>:

 Perhaps:

 void foo() {
         struct V { align(16) float[4] v = [1.0f, 2.0f, 3.0f, 4.0f]; }
 	V v;
 	asm {
 		movaps XMM0, v;
 	}
 }

 It compiles, but I'm not sure if it's actually correct.
That align directive is fucked up anyways. Why does it even exist if the value you specify doesn't change anything? I can't make sense out of the description: http://www.d-programming-language.org/attribute.html#align
Sep 19 2011
prev sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Sat, 17 Sep 2011 14:01:19 -0400, Peter Alexander
<peter.alexander.au gmail.com> wrote:
 I posted this in d.learn, and also on stackoverflow.com with no
 satisfactory answer. Can anyone help me with this?

 http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

 ---

 Is there a way to align data on the stack? In particular, I want to
 create a 16-byte aligned array of floats to load into XMM registers
 using movaps, which is significantly faster than movups.

 e.g.

 void foo()
 {
      float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
      asm
      {
          movaps XMM0, v; // v must be 16-byte aligned for this to work.
          ...
      }
 }
It depends. OS X requires 16-byte alignment, which DMD complies with, so on Mac the above code is okay. On PC, however, the only ways to get aligned memory are to (a) use the heap or (b) request extra stack space and align it yourself (i.e. declare a float[7] and then slice it appropriately). You could also use alloca or a region allocator to get aligned memory.

The other option is to just use movups. IIRC, movups on aligned data ran at the same speed as movaps on my CPU (Core 2), and I'd be really surprised if that weren't true on any modern architecture. (That said, movups does slow down on unaligned memory.)
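
Option (b) could look roughly like the sketch below. The helper name floatsToSkip and the variable names are made up for illustration, and it assumes the buffer is at least 4-byte aligned, which holds for a float array. With 7 floats there is always room for 4 aligned ones, because at most 3 floats (12 bytes) have to be skipped to reach the next 16-byte boundary.

```d
import std.stdio;

// Hypothetical helper: number of floats to skip from p to reach the
// next 16-byte boundary (0 if p is already aligned).
// Assumes p is 4-byte aligned, so the result is always 0..3.
size_t floatsToSkip(const(float)* p)
{
    auto addr = cast(size_t) p;
    size_t padBytes = (16 - (addr & 15)) & 15;
    return padBytes / float.sizeof;
}

void main()
{
    float[7] buf = 0;                    // 3 spare floats of slack
    size_t skip = floatsToSkip(buf.ptr);
    float[] v = buf[skip .. skip + 4];   // 16-byte aligned window
    v[] = [1.0f, 2.0f, 3.0f, 4.0f];

    assert((cast(size_t) v.ptr & 15) == 0);
    writeln("skipped ", skip, " floats, v = ", v);
}
```

v.ptr is then the address to hand to movaps. The cost is 12 wasted bytes of stack per array.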
Sep 17 2011