digitalmars.D.learn - SSE, Inline assembler, Structs, ...
- Audun Wilhelmsen (23/23) Apr 01 2008 I want to use SSE to create a fast vector/matrix library (if anyone has ...
- Sascha Katzner (6/14) Apr 01 2008 It seems that this is a data alignment problem, IIRC the "a" in movAps
- Audun Wilhelmsen (2/18) Apr 03 2008 Well I've tried align, align(4) and align(16) in front of struct Vec4.. ...
- Jarrett Billingsley (5/25) Apr 03 2008 I think align(n) only works on data alignment within the struct, and not...
- Sascha Katzner (6/10) Apr 04 2008 So, you have to put your structs in the static data segment, structs on
I want to use SSE to create a fast vector/matrix library (if anyone has done this already, I'd like to know). There seems to be quite a bit of overhead with operator overloading, so I'd probably want to write some of the algorithms in my final app in assembly, but I'd still like to have optimized operators too. But I'm having some problems. I can't get this to work, for instance:

    align struct Vec4
    {
        float x, y, z, w;
        ....
        Vec4 opAdd(Vec4 v)
        {
            Vec4 res;
            asm
            {
                movaps XMM0, [this];
                addps XMM0, v[EBP];
                movaps res[EBP], XMM0;
            }
            return res;
        }
    }

If I add Vec4 *me = this and replace "this" with "me", it compiles, but it crashes. Also, this confuses me:

    Vec4 v1 = Vec4(1,2,3,4);
    // Vec4* p = &v1;
    asm
    {
        movaps XMM1, v1[EBP];
    }

If I remove the comment, the program crashes.
Apr 01 2008
Audun Wilhelmsen wrote:
> Also, this confuses me:
>
>     Vec4 v1 = Vec4(1,2,3,4);
>     // Vec4* p = &v1;
>     asm { movaps XMM1, v1[EBP]; }
>
> If I remove the comment, the program crashes.

It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.

LLAP,
Sascha
Apr 01 2008
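To make "aligned" concrete: movaps needs its memory operand on a 16-byte boundary, and that can be checked from plain D before touching any assembler. The following is only a minimal sketch; the helper name isAligned16 is made up for illustration and does not come from the thread.

```d
import std.stdio;

struct Vec4 { float x, y, z, w; }

// Hypothetical helper: true when p lies on a 16-byte boundary,
// which is what movaps needs for a memory operand.
bool isAligned16(const(void)* p)
{
    return (cast(size_t) p & 15) == 0;
}

void main()
{
    Vec4 v1 = Vec4(1, 2, 3, 4);
    // Whether a stack variable ends up aligned depends on the compiler
    // and on what else lives in the frame, so the result is only printed.
    writeln("v1 is 16-byte aligned: ", isAligned16(&v1));
}
```

If this prints false for a given variable, movaps on it would fault while movups would still work. It may also explain why declaring one extra local changes the behaviour: the additional variable can shift where v1 lands in the stack frame.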
Sascha Katzner wrote:
> Audun Wilhelmsen wrote:
>> Also, this confuses me:
>>
>>     Vec4 v1 = Vec4(1,2,3,4);
>>     // Vec4* p = &v1;
>>     asm { movaps XMM1, v1[EBP]; }
>>
>> If I remove the comment, the program crashes.
>
> It seems that this is a data alignment problem. IIRC the "a" in movAps stands for "aligned", which means the instruction expects aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data.
>
> LLAP,
> Sascha

Well, I've tried align, align(4) and align(16) in front of struct Vec4. Isn't that supposed to align the data?
Apr 03 2008
"Audun Wilhelmsen" <seronor gmail.com> wrote in message news:ft3bdl$1enj$1 digitalmars.com...Sascha Katzner Wrote:I think align(n) only works on data alignment within the struct, and not the alignment of the struct itself in memory. I _think_.Audun Wilhelmsen wrote:Well I've tried align, align(4) and align(16) in front of struct Vec4.. Isn't that supposed to align the data?Also, this confuses me: Vec4 v1 = Vec4(1,2,3,4); // Vec4* p = &v1; asm { movaps XMM1, v1[EBP]; } if I remove the comment, the program crashes.It seems that this is an data alignment problem, IIRC the "a" in movAps stands for "align" that means the command expects aligned data. You can use movups (unaligned), but that is not as fast. Or you can align your data. LLAP, Sascha
Apr 03 2008
Audun Wilhelmsen wrote:
> Well, I've tried align, align(4) and align(16) in front of struct Vec4. Isn't that supposed to align the data?

Since 1.023...

> Data items in static data segment >= 16 bytes in size are now paragraph aligned.

So, you have to put your structs in the static data segment; structs on the stack are not properly aligned as far as I know.

LLAP,
Sascha
Apr 04 2008
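For the stack case, one common workaround that nobody in the thread mentions is to over-allocate a raw buffer and round its address up to the next multiple of 16. The following is a sketch under that assumption, not something taken from the posts above; alignUp16 is a hypothetical helper name.

```d
import std.stdio;

struct Vec4 { float x, y, z, w; }

// Hypothetical helper: round an address up to the next multiple of 16.
size_t alignUp16(size_t addr)
{
    return (addr + 15) & ~cast(size_t) 15;
}

void main()
{
    // 15 spare bytes guarantee that a 16-byte boundary with room for
    // one Vec4 exists somewhere inside the buffer.
    ubyte[Vec4.sizeof + 15] raw;
    Vec4* v = cast(Vec4*) alignUp16(cast(size_t) raw.ptr);
    *v = Vec4(1, 2, 3, 4);

    // The pointer is now safe to hand to code that requires alignment.
    assert((cast(size_t) v & 15) == 0);
    writeln(v.x, " ", v.y, " ", v.z, " ", v.w);
}
```

The cost is 15 wasted bytes per buffer and a pointer indirection, so this fits temporaries inside hot functions better than it fits every Vec4 in a program.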