digitalmars.D - Built-in vector types
- Simon Hobbs (24/24) May 15 2004 If you want D to gain a great advantage (at least for games) over modern...
- Billy Zelsnack (8/43) May 15 2004 +1
- Walter (7/10) May 15 2004 The language already supports vector operations on arrays of floats,
- Ben Hinkle (5/16) May 15 2004 Walter, do you know much about Cg? I was just poking around the nvidia s...
- Walter (3/8) May 23 2004 I looked at it briefly a couple years back, I think.
- Knud Sørensen (37/40) May 16 2004 As far as I can see from http://developer.nvidia.com/attach/6043
- J Anderson (15/26) May 16 2004 I would argue that having matrix multiplication and such will bloat the
- Andy Friesen (5/12) May 16 2004 If Phobos included some types and functions for these sorts of
- J Anderson (4/17) May 16 2004 Exactly!
- Ben Hinkle (7/40) May 15 2004 Does float4 have value or reference semantics?
- Billy Zelsnack (11/17) May 15 2004 I would like to not care when passing it around and trust the compiler
- Knud Sørensen (12/12) May 15 2004 Hi
- Ben Hinkle (17/38) May 15 2004 Reference or value semantics makes a difference with code like
- Simon Hobbs (15/21) May 16 2004 Sorry, I'm not explaining myself properly.
- hellcatv hotmail.com (13/36) May 16 2004 ideally having a struct with vector-like math ops should vectorize
- Simon Hobbs (5/13) May 16 2004 Well, in the future when X86 and PowerPC 'go away' it would still be tri...
- Ben Hinkle (22/50) May 16 2004 OK. My first thought was to use the inline assembler but now that you sa...
- hellcatv hotmail.com (17/67) May 16 2004 Actually My research project involves doing things like BLAS on the GPU
- hellcatv hotmail.com (10/34) May 15 2004 I have implemented Cg's float2 float3 and float4 classes in D
- hellcatv hotmail.com (37/83) May 15 2004 First of all I'd like to say that if walter wishes to integrate my vec.d...
- Ben Hinkle (48/55) May 15 2004 very nifty. I feel your pain implementing all those swizzle operators. I...
- J Anderson (6/43) May 15 2004 Nice. Perhaps you should check out / take some ideas from burtons
If you want D to gain a great advantage (at least for games) over modern c++ compilers, I believe it would be a smart move to add a built-in type for float4 (and I guess double2 for completeness). CPU support for these types is getting to be pretty ubiquitous, and making them built-in has all the advantages of c++ intrinsics and far more besides:

1. consistency across implementations
2. native support for constants
3. debug and release code won't have the huge (order of magnitude) speed disparity that they do in the c++ method. In c++ it is really a prerequisite to make a class wrapper for the intrinsic functions because they are entirely unusable in their native form. This works fine, but a vector add, for example, calls a 12-instruction function in a debug build and in a release build is a single vector instruction. Trying to debug a 60fps game at 10fps is a royal pain in the arse.
4. the possibility of adding built-in support for vector swizzling/write masking/element access (a la Cg/HLSL), although I guess this is rather contentious

I would be inclined to make dot3, dot4, cross, etc. into intrinsic functions rather than trying to invent dodgy operators for them.

Another issue that arises is the ability to keep temporary single scalar results in a vector register so that they don't keep being transferred to and from FPU registers. Would the optimizer be able to factor this problem away, or would an explicit float1 (or whatever) type be better?

Si
May 15 2004
+1

D already has very clean access to OpenGL and other C libraries. Having a vector type along with some good standard vector/matrix libraries will draw a lot of game developers.

Game developers are an interesting bunch. We typically have terrible time constraints to construct bleeding edge technology that is supposed to run real-time on a wide performance range of computers.

Simon Hobbs wrote:
> If you want D to gain a great advantage (at least for games) over modern c++ compilers I believe it would be a smart move to add a built in type for float4 (and I guess double2 for completeness.)
> [...]
May 15 2004
The language already supports vector operations on arrays of floats, doubles, or anything else. Currently, however, it is not implemented in the compiler.

"Simon Hobbs" <Simon_member pathlink.com> wrote in message news:c851kq$v01$1 digitaldaemon.com...
> If you want D to gain a great advantage (at least for games) over modern c++ compilers I believe it would be a smart move to add a built in type for float4 (and I guess double2 for completeness.)
May 15 2004
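For readers following along: the array-operation syntax Walter refers to was not implemented when this was posted, but it is available in present-day D compilers. A minimal sketch of what it looks like, assuming D2 (where fixed-size arrays are passed and returned by value); the function names addVec and scaleVec are made up for this illustration:

```d
// Element-wise sum of two fixed-size arrays using array-operation syntax.
float[4] addVec(float[4] a, float[4] b)
{
    float[4] result;
    result[] = a[] + b[];   // one statement, applied to every element
    return result;
}

// Multiply every element by a scalar.
float[4] scaleVec(float[4] a, float s)
{
    float[4] result;
    result[] = a[] * s;
    return result;
}
```

Whether a given compiler turns these statements into SIMD instructions is an optimization detail; the syntax only expresses the element-wise intent.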
Walter wrote:
> The language already supports vector operations on arrays of floats, doubles, or anything else. Currently, however, it is not implemented in the compiler.
>
> "Simon Hobbs" <Simon_member pathlink.com> wrote in message news:c851kq$v01$1 digitaldaemon.com...
>> If you want D to gain a great advantage (at least for games) over modern c++ compilers I believe it would be a smart move to add a built in type for float4 (and I guess double2 for completeness.)

Walter, do you know much about Cg? I was just poking around the nvidia site reading the spec and it looks like they have some interesting ideas to avoid aliasing (inout is copy-in-copy-out, no pointers, etc) that could be nifty to pull into D.
May 15 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:c868ng$2muf$1 digitaldaemon.com...
> Walter, do you know much about Cg?

I looked at it briefly a couple years back, I think.

> I was just poking around the nvidia site reading the spec and it looks like they have some interesting ideas to avoid aliasing (inout is copy-in-copy-out, no pointers, etc) that could be nifty to pull into D.
May 23 2004
On Sat, 15 May 2004 11:36:40 -0700, Walter wrote:
> The language already supports vector operations on arrays of floats, doubles, or anything else. Currently, however, it is not implemented in the compiler.

As far as I can see from http://developer.nvidia.com/attach/6043, what Cg has and D is missing is matrix multiplication, vector swizzling and write masking.

I suggest the following syntax for D, using the example from the Cg link.

    float[4] vec1 = {4.0,-2.0,5.0,3.0};
    float[2] vec2 = vec1[1,0];   // vec2 = {-2.0,4.0}
    float scalar = vec1[3];      // scalar = 3.0
    float[3] vec3 = scalar;      // vec3 = {3.0,3.0,3.0}

Write masking:

    vec1[0,3] = vec3;            // vec1 = {3.0,-2.0,5.0,3.0}

Something that has been bothering me about D is that a slice 0..4 means 0,1,2,3 and not 0,1,2,3,4. If you choose to use the comma notation for masking, I think it would be better to have 0..4 as shorthand for 0,1,2,3,4 instead of 0,1,2,3.

For matrix multiplication I think it is better to use the more general Einstein summation, which would allow a very short notation for vector calculations. In this notation an affine transformation (a*v+b) on a vector would be written like this:

    double[4] vec1, vec2, b;
    double[4][4] a;
    vec2[i=0..3] = a[i][j=0..3]*vec1[j] + b[i];

but an array of 100 vectors could be transformed with

    double[4][100] arr1, arr2;
    arr2[i=0..3][k=0..99] = a[i][j=0..3]*arr1[j][k] + b[i];

The advantage of having this implemented in the core language is the ability to exploit the processor's vector unit (MMX), a graphics processor (GPU) or a math unit like http://www.clearspeed.com/ without rewriting the program in assembler for the specific hardware.

Maybe it would be a good idea to have compiler modules for different types of hardware.

Knud
May 16 2004
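To make the Einstein-summation proposal above concrete, here is what the affine example would expand to with explicit loops in present-day D. This is only a sketch of the intended meaning under the usual convention (an index repeated on the right-hand side is summed, the remaining index selects the output element); the function name affine is made up here, and note that D's own `0 .. 4` range excludes 4, unlike the inclusive `0..3` in the proposal.

```d
// Explicit-loop reading of the proposed
//   vec2[i=0..3] = a[i][j=0..3]*vec1[j] + b[i];
// j appears twice on the right, so it is summed; i is the free index.
double[4] affine(const double[4][4] a, const double[4] v, const double[4] b)
{
    double[4] result;
    foreach (i; 0 .. 4)
    {
        double sum = 0;
        foreach (j; 0 .. 4)
            sum += a[i][j] * v[j];
        result[i] = sum + b[i];
    }
    return result;
}
```

The 100-vector variant would add one more outer loop over k around the same body.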
Knud Sørensen wrote:
> the advantage of having this implemented in the core language is to exploit the processors vector unit (MMX) ,a graphic processors (GPU) or a math unit like http://www.clearspeed.com/ without rewriting the program in assembler for the specific hardware.

I would argue that having matrix multiplication and such will bloat the language. It should be a library feature. I see no problem with writing it in assembler (as long as I don't have to write it <g>). It would be better to have this as part of the standard library. The language won't be able to provide much additional speed by hard-wiring things like MMX into the language. Remember MMX and the like are designed to work well as language extensions in the first place.

Why not include, in the language, every useful hardware data structure under the sun? Data structures should only be put into the language when they make sense and can be done much more cleanly than with libraries.

> Maybe it would be a good idea to have compiler modules for different types of hardware.

The language shouldn't be tied to the hardware. It's the job of library vendors to make porting hell, not the language.

> Knud

--
-Anderson: http://badmama.com.au/~anderson/
May 16 2004
J Anderson wrote:
> I would argue that having matrix multiplication and such will bloat the language. It should be a library feature. I see no problem with writing it in assembler (as long as I don't have to write it <g>). It would be better to have this as part of the standard library. That language won't be able to provide much additional speed by hard-wiring things like MMX into the language. Remember MMX and the like are designed to work well as language extensions in the first place.

If Phobos included some types and functions for these sorts of operations, compiler vendors would hypothetically be able to implement those operations as intrinsics.

-- andy
May 16 2004
Andy Friesen wrote:
> J Anderson wrote:
>> I would argue that having matrix multiplication and such will bloat the language. It should be a library feature. I see no problem with writing it in assembler (as long as I don't have to write it <g>). It would be better to have this as part of the standard library. That language won't be able to provide much additional speed by hard-wiring things like MMX into the language. Remember MMX and the like are designed to work well as language extensions in the first place.
>
> If Phobos included some types and functions for these sorts of operations, compiler vendors would hypothetically be able to implement those operations as intrinsics.
>
> -- andy

Exactly!

--
-Anderson: http://badmama.com.au/~anderson/
May 16 2004
Simon Hobbs wrote:
> If you want D to gain a great advantage (at least for games) over modern c++ compilers I believe it would be a smart move to add a built in type for float4 (and I guess double2 for completeness.)

Does float4 have value or reference semantics? I don't think I'd use a float4 myself since I'm not a game programmer, but the idea of using (shortish) arrays with value semantics has come up repeatedly. Some generic "static array with value-semantics" would be cool. Right now to get something like it I'm using a struct with the type and length as template parameters. It works fine but is verbose.

> CPU support for these types is getting to be pretty ubiquitous and making them built-in has all the advantages of c++ intrinsics and far more besides:
> [...]
May 15 2004
> Does float4 have value or reference semantics? I don't think I'd use a float4 myself since I'm not a game programmer but it has repeatedly come up about using (shortish) arrays with value semantics. Some generic "static array with value-semantics" would be cool. Right now to get something like it I'm using a struct with the type and length as template parameters. It works fine but is verbose.

I would like to not care when passing it around and trust the compiler to make the fastest decision for me. I have tons of c++ code that looks something like this:

    void doSomething(const float3& vecA)

I pass by const reference because I am assuming it will be faster, but in some cases it just might not be. Who knows, and I don't really care how it is passed as long as it is the fastest way possible.

As for (shortish), I regularly use float2, float3, float4, and float16. float16 is for a 4x4 matrix, but that could just be 4 float4 and be just as efficient. So I think 2,3,4 lengths would give you 99% of the value for vector types (as far as game development is concerned).
May 15 2004
Hi

Did you read my post on Einstein notation?
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/288
Would that be useful for game developers?

I have been thinking that you could drop the index[] notation in my first post and just define the slice at first contact. Like

    v2[i=0..3] = m[i][k=0..5]*v1[k];

for 4x6 matrix multiplication, or

    det = M[0][i=0..3]*M[1][j=0..3]*M[2][k=0..3]*M[3][l=0..3]*P(i,j,k,l);

to compute the determinant of a 4x4 matrix.
May 15 2004
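The determinant formula above can be read as a four-fold sum over i, j, k and l. The post does not define P; the sketch below assumes it is the permutation sign (0 when any index repeats, otherwise +1 or -1 for an even or odd permutation), which is the standard way such a formula yields a determinant. The names permSign and det4 are made up for this illustration, written in present-day D.

```d
// Assumed meaning of P(i,j,k,l): 0 if any index repeats, otherwise
// +1 or -1 for an even or odd permutation of (0,1,2,3).
int permSign(int i, int j, int k, int l)
{
    int[4] p = [i, j, k, l];
    int sign = 1;
    foreach (a; 0 .. 4)
        foreach (b; a + 1 .. 4)
        {
            if (p[a] == p[b])
                return 0;
            if (p[a] > p[b])
                sign = -sign;   // each inversion flips the sign
        }
    return sign;
}

// det = M[0][i]*M[1][j]*M[2][k]*M[3][l]*P(i,j,k,l), summed over all indices.
double det4(const double[4][4] m)
{
    double det = 0;
    foreach (i; 0 .. 4)
        foreach (j; 0 .. 4)
            foreach (k; 0 .. 4)
                foreach (l; 0 .. 4)
                    det += m[0][i] * m[1][j] * m[2][k] * m[3][l]
                         * permSign(i, j, k, l);
    return det;
}
```

Written this way the sum visits all 256 index combinations, of which only 24 contribute; a compiler implementing the notation would presumably want to skip the zero terms.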
Billy Zelsnack wrote:
>> Does float4 have value or reference semantics?
>> [...]
> I would like to not care when passing it around and trust the compiler to make the fastest decision for me.
> [...]

Reference or value semantics makes a difference with code like

    float4 x,y;
    ...
    x[0] = 2.0; // or whatever the syntax is for an element of x
    y = x;
    y[0] = 1.0;

What is x[0]? Reference semantics says 1.0, value semantics says 2.0. Similarly, with reference semantics you need to be careful with

    float4 doSomething() {
        float4 res;
        ...
        return res;
    }

unless the memory for "res" is either passed in as an input or allocated from the heap or something like that.

-Ben
May 15 2004
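The distinction in the x/y example above can be reproduced in present-day D, where a fixed-size array (float[4]) is copied on assignment while a dynamic array (float[]) is a slice that shares its elements. This is a sketch under that D2 assumption; float4 itself is hypothetical in the thread, so plain arrays stand in for it, and the function names are made up here.

```d
// Value semantics: assigning a float[4] copies all four elements.
float afterValueCopy()
{
    float[4] x = [2.0f, 0, 0, 0];
    float[4] y = x;     // y is an independent copy
    y[0] = 1.0f;
    return x[0];        // x is unchanged
}

// Reference semantics: assigning a float[] copies only the slice,
// so both names refer to the same storage.
float afterReferenceCopy()
{
    float[] x = [2.0f, 0, 0, 0];
    float[] y = x;      // y aliases x's elements
    y[0] = 1.0f;
    return x[0];        // x sees the write made through y
}
```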
In article <c85srd$268c$1 digitaldaemon.com>, Ben Hinkle says...
> Does float4 have value or reference semantics? I don't think I'd use a float4 myself since I'm not a game programmer but it has repeatedly come up about using (shortish) arrays with value semantics. Some generic "static array with value-semantics" would be cool. Right now to get something like it I'm using a struct with the type and length as template parameters. It works fine but is verbose.

Sorry, I'm not explaining myself properly.

float4 would have value semantics and would represent a vector hardware register (e.g. an SSE register in X86) and use SIMD instructions to perform add/sub/mul/div/etc. It would be subject to all of the optimizations that the compiler can currently do on floats and ints.

Modern C++ compilers support the use of these registers/instructions through intrinsics (or Dylan Cuthbert's extensions to GCC on PS2) and unless D can at least match them, then as a professional games programmer I'll never be able to justify the use of D - even in spite of all the other great language features :(

But I'd actually like to see D go one step further by supporting these types in the language and leapfrogging C++ in an important way in the process.

I'm only interested in extreme performance, so making a struct that looks like a Cg type is pointless, although interesting :)

Si
May 16 2004
Ideally, having a struct with vector-like math ops should vectorize, but I don't think any compiler we have right now matches that definition of ideal.

In some ways it would be best to concentrate on figuring out how to enable fast optimizations on structures that happen to do vector ops, rather than programming specific types for architectures that have SIMD instructions that may go away in the very next generation of hardware (what if they have scalar units instead next time around?).

I would be curious what the hit would be of using my Cg struct as opposed to doing the raw math on 3 local vars. I suspect it's a lot. In C++ it certainly is: Visual Studio makes it about a 50% speed hit, but with gcc it's more like a 75% speed hit.

In article <c87bp5$186f$1 digitaldaemon.com>, Simon Hobbs says...
> Sorry, I'm not explaining myself properly. float4 would have value semantics and would represent a vector hardware register (e.g. an SSE register in X86.) and use SIMD instructions to perform add/sub/mul/div/etc... It would be subject to all of the optimizations that the compiler can currently do on floats and ints.
> [...]
> I'm only interested in extreme performance, so making a struct that looks like a Cg type is pointless, although interesting :)
> Si
May 16 2004
In article <c87g2f$1ea5$1 digitaldaemon.com>, hellcatv hotmail.com says...
> ideally having a struct with vector-like math ops should vectorize but I don't think any compiler right now we have matches that definition of ideal
> in some ways it would be best to concentrate on figuring out how to enable fast optimizations on structures that happen to do vector ops rather than programming specific types for architectures that have SIMD instructions that may go away in the very next gen of hardware (what if they have scalar units instead next time around)

Well, in the future when X86 and PowerPC 'go away' it would still be trivial for the compiler to implement a vector add using scalar units. As you point out, it is doing the opposite that proves problematic.

Si
May 16 2004
Simon Hobbs wrote:
> Sorry, I'm not explaining myself properly. float4 would have value semantics and would represent a vector hardware register (e.g. an SSE register in X86.) and use SIMD instructions to perform add/sub/mul/div/etc... It would be subject to all of the optimizations that the compiler can currently do on floats and ints.

OK. My first thought was to use the inline assembler, but now that you say it uses SSE registers I guess even with asm blocks you'd have to make sure the right registers are filled when you call add/sub/etc. That would mess up the data-flow optimizations. Still, it is an option. It is kind of like bringing back the "register" storage attribute from C (shudder).

> Modern C++ compilers support the use of these registers/instructions through intrinsics (or Dylan Cuthbert's extensions to GCC on PS2) and unless D can at least match them, then as a professional games programmer I'll never be able to justify the use of D - even in spite of all the other great language features :(

If these GCC extensions work on the x86 then gdc could pick them up. DMD would take longer though.

> But I'd actually like to see D go one step further by supporting these types in the language and leapfrogging C++ in an important way in the process. I'm only interested in extreme performance, so making a struct that looks like a Cg type is pointless, although interesting :)

D is young so the performance will certainly improve somewhat, maybe not to the extreme you are looking for. When I read about D it struck me as a slightly lower level version of Java/Csharp. I never expected it to have the extreme performance of something like Fortran or Cg or even all the customizability of C++. So for me that's all bonus :-) Still, some vectorized support could benefit both the game and scientific computing worlds.

I was just googling to see if anyone has tried putting BLAS on the GPU and sure enough people are looking into it. See for example
http://wwwcg.in.tum.de/Research/data/Publications/sig03.pdf
That would mean numerical algorithms run on the GPU instead of the CPU for the vectorized ops. There are probably tons of problems with getting the data there and back but it's a neat possibility. Who says a graphics card is just for graphics? ;-)
May 16 2004
Actually, my research project involves doing things like BLAS on the GPU:
http://graphics.stanford.edu/projects/brookgpu/

In fact we benchmarked a lot of the BLAS stuff (the code for the benchmarks is available on the above website and in CVS). On the matrix-vector operations the performance was quite impressive (SAXPY and Dot). However, matrix-matrix multiply sucks on the GPU: we get the full bandwidth out of the cache, but full bandwidth out of the GPU cache is half or a quarter of the full bandwidth out of the CPU cache, so there's no chance you win on matrix-matrix.

Anyhow, feel free to download our Brook platform and try writing some GPU programs yourself (I recommend getting the CVS version right now; the released version is falling behind in features), and feel free to chat with me about what kinds of apps will work well on the GPU. The answer is apps that reuse their data a finite number of times; things that get huge cache performance on the CPU are not likely candidates.

--Daniel

In article <c87rg0$1ugl$1 digitaldaemon.com>, Ben Hinkle says...
> [...]
> I was just googling to see if anyone has tried putting BLAS on the GPU and sure enough people are looking into it. See for example http://wwwcg.in.tum.de/Research/data/Publications/sig03.pdf That would mean numerical algorithms run on the GPU instead of the CPU for the vectorized ops. There are probably tons of problems with getting the data there and back but it's a neat possibility. Who says a graphics card is just for graphics? ;-)
May 16 2004
I have implemented Cg's float2, float3 and float4 classes in D. They're exactly like Cg. I suggest people just use the standard pioneered by Microsoft and Nvidia for the vector format :-) If we can build it into the compiler, great.

Download it here (it's GPL right now, but as the author I'm willing to relicense it at your request):
http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/vec.d
http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/matrix.d

In article <c851kq$v01$1 digitaldaemon.com>, Simon Hobbs says...
> If you want D to gain a great advantage (at least for games) over modern c++ compilers I believe it would be a smart move to add a built in type for float4 (and I guess double2 for completeness.)
> [...]
May 15 2004
First of all I'd like to say that if Walter wishes to integrate my vec.d into his language he can have it under the BSD license, or another license if he wants to talk to me about it. Other users must talk to me about changing the license, but I'm quite flexible.

I was a bit brief about how my float2, float3 and float4 work in
http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/vec.d
but it's almost exactly like the Cg spec except for a few caveats:

a) as you can see in my posts before, I was complaining that the opCmp operator must only return an int, hence I can't do the 4-way < and > and == comparisons... so I use the dot product to get a partial ordering

b) you can still do component-wise compares using opLess and opGreater and opLEqual and so forth.

c) you can assign using the swizzle operators

    float4 myvar=float4(1,2,3,4);
    float3 mytmp.yzx = myvar.zyx;
    ..

d) you cannot repeat letters in the swizzle operators unless you are on gdc, because digital mars' link.exe has a bug that crashes if too many functions are defined in a single file

    float3 mytmp.xyz = myvar.zyz; <-- only works if you define Swizzle as a compiler flag

otherwise the alternative is

    float3 mytmp.xyz = myvar.swizzle(2,1,2);

it's just as powerful a syntax and only necessary if you repeat components.

e) I have provided real2, real3 and real4 and double2, double3 and double4 vectors.

f) I welcome contributions to the lib... and especially benchmarking of it.

g) I have provided all the intrinsic functions within Cg (cos, lerp, etc); they also work on the intrinsic float, double and real types.

h) this lib really pushes the digital mars compiler to its limit: adding one or two functions causes the linker to crash under windows. Of course gdc is golden.

i) The lib is created using a single template class and a *lot* of instantiations of that class (so the user doesn't have to type vec!(real,4); of course that syntax works as well)

--Daniel

In article <c86boa$2r88$1 digitaldaemon.com>, hellcatv hotmail.com says...
> I have implemented Cg's float2 float3 and float4 classes in D they're exactly like Cg I suggest people just use the standard pioneered by Microsoft and Nvidia for the vector format :-) if we can build it into the compiler, great
> download it here (it's GPL right now, but as the author I'm willing to relicense it at your request)
> http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/vec.d
> http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/matrix.d
> [...]
May 15 2004
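For readers following caveats c) and d) above, here is a minimal sketch of how read-only swizzles, including ones with repeated letters, might be generated from a single template instead of one hand-written function per letter combination. It is not taken from vec.d: the Vec struct and the helpers componentIndex and validSwizzle are hypothetical names, and it assumes opDispatch, a feature of later versions of D that was not available when this thread was written. Write swizzles such as mytmp.yzx = ... are not covered.

```d
import std.stdio;

// Hypothetical minimal vector type for illustration; this is not vec.d.
struct Vec(T, size_t N)
{
    T[N] data;

    // Index of a component letter, or size_t.max if it is not one.
    private static size_t componentIndex(char c)
    {
        switch (c)
        {
            case 'x': return 0;
            case 'y': return 1;
            case 'z': return 2;
            case 'w': return 3;
            default:  return size_t.max;
        }
    }

    // True if s names 2 to 4 components that all exist in this vector.
    private static bool validSwizzle(string s)
    {
        if (s.length < 2 || s.length > 4)
            return false;
        foreach (c; s)
            if (componentIndex(c) >= N)
                return false;
        return true;
    }

    // Read-only swizzle: v.zyx, v.zyz, v.xxxx, ...
    // The member name arrives as the compile-time string s.
    auto opDispatch(string s)() const
        if (validSwizzle(s))
    {
        Vec!(T, s.length) r;
        foreach (i, c; s)
            r.data[i] = data[componentIndex(c)];
        return r;
    }
}

alias float3 = Vec!(float, 3);
alias float4 = Vec!(float, 4);

void main()
{
    auto myvar = float4([1f, 2f, 3f, 4f]);
    float3 a = myvar.zyx;   // components 3, 2, 1
    float3 b = myvar.zyz;   // repeated letter: components 3, 2, 3
    writeln(a.data, " ", b.data);
}
```

Because each swizzle is only instantiated when it is actually used, the number of generated functions follows the calling code rather than the number of possible letter combinations. Whether that would have avoided the link.exe crash described in d) is not something this sketch can show.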
hellcatv hotmail.com wrote:

First of all I'd like to say that if Walter wishes to integrate my vec.d into his language he can have it under the BSD license, or another license if he wants to talk to me about it. Other users must talk to me about changing the license, but I'm quite flexible. I was a bit brief about how my float2, float3 and float4 work in http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/vec.d

Very nifty. I feel your pain implementing all those swizzle operators. It feels like Lisp again with cdr, cadr, cadadr etc etc :-) How many are there - it looks like >50. Yikes.

But what's with the pretty wacky idea that for nsize < 3 the z() property should return x()? Does Cg really do that? It seems pretty pervasive that asking for higher dimension information just picks some valid dimension and uses that. Seems random to me. But then again maybe there's a reason.

Otherwise it is very cool to have vectorized math operations and such. I definitely wouldn't mind seeing a simplified version of vec getting included in Phobos somewhere. All those swizzles make my head ... well... spin. What I've been using is just:

    // helper to make an array "literal" with value semantics
    // Example:
    //   uintn!(3)(100,200,300)
    struct InlineArray(T,int N) {
        T[N] array;
        static .InlineArray!(T,N) opCall(T x0,...) {
            .InlineArray!(T,N) res;
            // copies N values starting at the address of x0; this relies on
            // the variadic arguments being laid out contiguously after x0
            res.array[] = (&x0)[0..N][];
            return res;
        }
        static .InlineArray!(T,N) opCall(T[N] x) {
            .InlineArray!(T,N) res;
            res.array[] = x[];
            return res;
        }
        T opIndex(int i) { return array[i]; }
        void opIndex(int i, T val) { array[i] = val; }
        // todo: arithmetic, cmp, etc
    }
    template uintn(int N) { alias InlineArray!(uint,N) uintn; }
    template intn(int N) { alias InlineArray!(int,N) intn; }
    template floatn(int N) { alias InlineArray!(float,N) floatn; }
    template doublen(int N) { alias InlineArray!(double,N) doublen; }
May 15 2004
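As a rough illustration of the value semantics the InlineArray helper above is after, the following sketch restates the idea in present-day D. It is an assumption-laden variant, not Ben's code: it swaps the (&x0)[0..N] trick for a typesafe variadic constructor, uses opIndexAssign for element writes, and fills in a hypothetical element-wise opBinary for the "todo: arithmetic" line.

```d
import std.stdio;

// Hypothetical modern-D variant of the InlineArray idea; not the original code.
struct InlineArray(T, size_t N)
{
    T[N] array;

    // Typesafe variadic constructor: takes exactly N values (or one T[N]).
    this(T[N] values...)
    {
        array = values;
    }

    T opIndex(size_t i) const { return array[i]; }
    void opIndexAssign(T val, size_t i) { array[i] = val; }

    // Element-wise +, -, *, / between two arrays of the same type and length.
    InlineArray opBinary(string op)(InlineArray rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        InlineArray res;
        foreach (i; 0 .. N)
            res.array[i] = mixin("array[i] " ~ op ~ " rhs.array[i]");
        return res;
    }
}

alias uintn(size_t N) = InlineArray!(uint, N);
alias floatn(size_t N) = InlineArray!(float, N);

void main()
{
    auto a = uintn!3(100, 200, 300);
    auto b = a;      // copies all three elements
    b[0] = 1;        // a[0] is still 100
    auto c = a + b;  // element-wise sum
    writeln(a.array, " ", b.array, " ", c.array);
}
```

Since the struct wraps a static array, assignment and parameter passing copy the elements, which is the value-versus-reference distinction raised earlier in the thread.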
hellcatv hotmail.com wrote:

First of all I'd like to say that if Walter wishes to integrate my vec.d into his language he can have it under the BSD license, or another license if he wants to talk to me about it. Other users must talk to me about changing the license, but I'm quite flexible.

I was a bit brief about how my float2, float3 and float4 work in http://cvs.sourceforge.net/viewcvs.py/deliria/deliria/vec.d but it's almost exactly like the Cg spec except for a few caveats:

a) As you can see in my earlier posts, I was complaining that the opCmp operator can only return an int, hence I can't do the 4-way <, > and == comparisons... so I use the dot product to get a partial ordering.
b) You can still do component-wise compares using opLess, opGreater, opLEqual and so forth.
c) You can assign using the swizzle operators:
    float4 myvar=float4(1,2,3,4);
    float3 mytmp.yzx = myvar.zyx;
    ..
d) You cannot repeat letters in the swizzle operators unless you are on gdc, because Digital Mars' link.exe has a bug that crashes if too many functions are defined in a single file:
    float3 mytmp.xyz = myvar.zyz; <-- only works if you define Swizzle as a compiler flag
   Otherwise the alternative is
    float3 mytmp.xyz = myvar.swizzle(2,1,2);
   It's just as powerful a syntax and only necessary if you repeat components.
e) I have provided real2, real3 and real4 and double2, double3 and double4 vectors.
f) I welcome contributions to the lib... and especially benchmarking of it.
g) I have provided all the intrinsic functions within Cg (cos, lerp, etc)--they also work on the intrinsic float, double and real types.
h) This lib really pushes the Digital Mars compiler to its limit--adding one or two functions causes the linker to crash under Windows. Of course gdc is golden.
i) The lib is created using a single template class and a *lot* of instantiations of that class (so the user needn't type vec!(real,4); of course that syntax works as well).

--Daniel

Nice. Perhaps you should check out / take some ideas from Burton's math.d class (in undig). It seemed pretty complete and had some nifty ideas.

--
-Anderson: http://badmama.com.au/~anderson/
May 15 2004