digitalmars.D - BLADE 0.2Alpha: Vector operations with mixins, expression templates,
- Don Clugston (48/48) Apr 03 2007 I have been trying to come up with a convincing use case for the new
- Walter Bright (3/7) Apr 03 2007 This is pretty incredible.
- Don Clugston (17/25) Apr 04 2007 I thought you'd like it. The sheer power of D these days...
- Georg Wrede (8/9) Apr 04 2007 I've some experience proofreading and commenting similar articles. Only
- Walter Bright (4/22) Apr 04 2007 Good question. I've already fixed #1028. But even if it looks stone age
- David B. Held (9/18) Apr 05 2007 Well, hopefully, if Walter breaks your code again, it will be to make it...
- Walter Bright (5/23) Apr 05 2007 I'm pretty sure DDJ would publish it, and it would reach a much wider
- Dan (5/5) Apr 05 2007 Current compilers tend to dump IR's to a buffer and then adjust them acc...
- janderson (15/23) Apr 07 2007 That would be nice and would help however I think it'll take more then
- Don Clugston (12/38) Apr 05 2007 Ooooooh....
- Georg Wrede (6/18) Apr 05 2007 Gees, Don,
- Georg Wrede (29/51) Apr 05 2007 Later, when you start writing the article itself:
- Walter Bright (7/19) Apr 05 2007 Just wait 'till you get AST macros!
- Dan (3/11) Apr 05 2007 Wow. That's the most excited I've seen Walter. : )
- Georg Wrede (2/15) Apr 05 2007 LOL, have to admit I agree!!!
- Bill Baxter (5/21) Apr 05 2007 It really is very cool and all, but did I just miss the benchmarks? I
- Don Clugston (11/34) Apr 06 2007 You're right, but it wasn't posted as a finished product!
- Walter Bright (4/7) Apr 03 2007 The hard part is doing register allocation. Generating code for the
- Pragma (10/66) Apr 03 2007 This is a work of art Don - it's practically a compiler extension. Nice...
- Bruno Medeiros (6/78) Apr 07 2007 Hum, a minor question, is a string representation of the expressions
- Pragma (11/90) Apr 09 2007 I know Don wrote this lib, but I think I can answer this.
- Dan (4/12) Apr 09 2007 Well said. Once the AST is reflected at compile time, I'm sure the code...
- Dave (6/21) Apr 09 2007 Hmmm - I wonder if a lot of what Don's been doing is serving as a "proof...
- Pragma (9/23) Apr 09 2007 Thanks. But evaluating the AST will only take things so far. There is ...
- Paul Findlay (3/8) Apr 09 2007 And its dawning on me that some text processing can take advantage of do...
- Dan (3/10) Apr 10 2007 Yeah, pretty much any large (>1kb) buffer copy algorithm runs fastest th...
- Don Clugston (16/113) Apr 09 2007 Actually I don't know how to make a tree nicely with tuples.
- Pragma (80/197) Apr 09 2007 Here's an example (for grins):
- Don Clugston (11/185) Apr 10 2007 The problem I was referring to, is: how to store both values, and
- Walter Bright (2/5) Apr 10 2007 You guys gotta publish!
- Don Clugston (6/12) Apr 10 2007 I've begun a draft. A historical question for you --
- Walter Bright (8/14) Apr 10 2007 Depends on how you look at it. g++ was first 'released' in December,
- Pragma (4/10) Apr 10 2007 Baby steps Walter, baby steps... ;)
- Pragma (21/46) Apr 10 2007 Oh, my bad. Yea, it would probably be overkill for BLADE. :)
- KlausO (15/238) Apr 10 2007 Hey pragma,
- Pragma (11/54) Apr 10 2007 Wow, that's one heck of an endorsement. Thanks, but don't throw anythin...
- KlausO (17/81) Apr 10 2007 Another issue I ran into is circular template dependencies. You could
- Pragma (6/66) Apr 10 2007 Yep, that one bit me too; just use the CTFE trick I mentioned. Wrapping...
- BCS (5/21) Apr 10 2007 The way my dparse sidesteps this (both the id length limit and forward
- Walter Bright (2/8) Apr 10 2007 One of the motivations for CTFE was to address that exact problem.
- Witold Baryluk (9/9) Apr 03 2007 Hello,
- Don Clugston (21/28) Apr 04 2007 Do you intend to implement BLAS2 and BLAS3-like functionality? I feel
- Davidl (16/64) Apr 03 2007 i think compilers have feeled a great challenge from compile-time back-e...
- Jascha Wetzel (20/76) Apr 05 2007 This is really neat.
- Mikola Lysenko (16/16) Apr 05 2007 Excellent work! I have to say that I've had my misgivings about mixin
- torhu (3/11) Apr 05 2007 Oh my GAD? Get it? ;)
- Bill Baxter (18/37) Apr 05 2007 Wow, that looks really sweet. Last I heard on GA was that it was good
- Mikola Lysenko (21/62) Apr 05 2007 As I understand it, GAIGEN uses an external program to generate the
- Dan (3/11) Apr 06 2007 In essence, it's a *demonstration* that with all of D's features combine...
- Don Clugston (14/26) Apr 07 2007 It might be because no-one has really tried. Agner told me that he has
- Dan (5/21) Apr 10 2007 Oh! DotNet? Another runtime hosted microsoft environment? Brilliant. ...
I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at:
http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d

It generates near-optimal x87 asm code for BLAS1-style basic vector operations. 32, 64 and 80 bit vectors are all supported. Compile with -version=BladeDebug to see the asm code which is generated.

Typical usage:

void main()
{
    auto p = Vec([1.0L, 2, 18]);      // a vector of 80-bit reals.
    auto q = Vec([3.5L, 1.1, 3.8]);   // ditto
    auto r = Vec([17.0f, 28.25, 1]);  // a vector of 32-bit floats
    auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
    real d = dot(r, p+r+r);
    ireal e = dot(r, z);
    q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
    d = dot(r, p+r+r);
}

Notice that mixed-length operations (real[] + float[] - double[]) are supported.

Like the C++ Blitz++ library, expression templates are used to convert vector expressions into efficient element-wise operations. Unlike that library, however, there is no reliance on the compiler's optimiser. Instead, the expression template is manipulated as text, converted into postfix, and then passed to a simple CTFE compile-time assembler, which creates highly efficient asm code which is used as a mixin.

To understand the later parts of the code, you need some knowledge of x87 assembler. In fact, you probably need to have read Agner Fog's superb Pentium optimisation manual (www.agner.org).

Some observations:
* I was amazed at how simple the expression template code is (it is somewhat cluttered by the code to check for real/imaginary type mismatch errors).
* I've often read that the x87 floating-point stack is notoriously difficult for compilers to write code for, but it works quite well in this case.
* The major workarounds are:
  - inability to define operators for built-in arrays (hence the use of 'Vec' wrappers).
  - inability to index through a tuple in a CTFE function (solved by converting types into a string).
* There have been mutterings about how unhygienic/dangerous the new mixins are. In this case, the mixin forms the _entire_ body of the function. This is an interesting situation which I think a language purist will find more palatable.

Enjoy.
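To make the "converted into postfix" step concrete, here is a minimal sketch -- invented code with invented names, not BLADE's actual routine -- of a shunting-yard pass that a CTFE-capable compiler can run at compile time, assuming well-formed input with single-letter operands:

// operator precedence; 0 for anything that is not an operator
int prec(char c)
{
    if (c == '+' || c == '-') return 1;
    if (c == '*' || c == '/') return 2;
    return 0;
}

// shunting-yard pass: "(a+b*c)+d" -> "abc*+d+"
char[] toPostfix(char[] expr)
{
    char[] output;
    char[] stack;
    foreach (c; expr)
    {
        if (c >= 'a' && c <= 'z')
            output ~= c;                 // operands pass straight through
        else if (c == '(')
            stack ~= c;
        else if (c == ')')
        {   // flush operators back to the matching '('
            while (stack[$ - 1] != '(')
            {
                output ~= stack[$ - 1];
                stack = stack[0 .. $ - 1];
            }
            stack = stack[0 .. $ - 1];   // discard the '('
        }
        else
        {   // operator: flush anything of equal or higher precedence
            while (stack.length != 0 && prec(stack[$ - 1]) >= prec(c))
            {
                output ~= stack[$ - 1];
                stack = stack[0 .. $ - 1];
            }
            stack ~= c;
        }
    }
    while (stack.length != 0)
    {
        output ~= stack[$ - 1];
        stack = stack[0 .. $ - 1];
    }
    return output;
}

static assert(toPostfix("(a+b*c)+d") == "abc*+d+");

The static assert doubles as a usage example: the whole conversion happens during compilation, so the resulting string is available for further compile-time processing.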
Apr 03 2007
Don Clugston wrote:
> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at: http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d

This is pretty incredible. Can I beg you to write an article about it?
Apr 03 2007
Walter Bright wrote:
> Don Clugston wrote:
>> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). [...]
> This is pretty incredible.

I thought you'd like it. The sheer power of D these days... "Flies like a rocket, steers like a bicycle."

> Can I beg you to write an article about it?

OK. Are you thinking somewhere like DDJ? It's nicely focused (central idea being something like "D's metaprogramming is so powerful, that compiler back-end optimisation can easily be performed in compile-time libraries") -- but it would require loads of explanation of D's coolest features. Could be a challenge to make it easily readable.

But with the proposed changes to 'const', macros, and previous discussion about whether tuples should be automatically flattened, I have doubts over how stable that part of the language is. I wouldn't want to write something that didn't compile any more, or looked hopelessly obsolete by the time it was published. I've been bitten before -- CTFE blasted almost all of my previous metaprogramming code into the stone age <g>. What's the shelf life of this stuff?
Apr 04 2007
Don Clugston wrote:
> Could be a challenge to make it easily readable.

I've some experience proofreading and commenting similar articles. Only too happy to oblige!

Also, DDJ readers may not be the CUJ crowd, but still, they can handle advanced stuff. Besides, some similar magazines help you by rewriting the article anyway, so that it has the same overall "feel" as their other articles. Articles in SciAm, for example, feel the same no matter who, or from what continent, is presented as the author.
Apr 04 2007
Don Clugston wrote:
> Walter Bright wrote:
>> Can I beg you to write an article about it?
> OK. Are you thinking somewhere like DDJ?

Anywhere!

> It's nicely focused (central idea being something like "D's metaprogramming is so powerful, that compiler back-end optimisation can easily be performed in compile-time libraries") [...] I've been bitten before -- CTFE blasted almost all of my previous metaprogramming code into the stone age <g>. What's the shelf life of this stuff?

6 months from now, it's still a decade ahead of any other language.
Apr 04 2007
Don Clugston wrote:
> [...] But with the proposed changes to 'const', macros, and previous discussion about whether tuples should be automatically flattened, I have doubts over how stable that part of the language is. I wouldn't want to write something that didn't compile any more, or looked hopelessly obsolete by the time it was published. I've been bitten before -- CTFE blasted almost all of my previous metaprogramming code into the stone age <g>. What's the shelf life of this stuff?

Well, hopefully, if Walter breaks your code again, it will be to make it even more elegant (although, that will be a bit of a trick, I admit). Eventually, the language will be so powerful that you can't make your code any more compact, and then you won't have to worry about it getting broken.

To be honest, I'd like to see an article about this in CUJ. I don't know if they would publish it or not, but if they did, I guarantee it would make more than a few people take a hard look at D.

Dave
Apr 05 2007
David B. Held wrote:
> [...] To be honest, I'd like to see an article about this in CUJ. I don't know if they would publish it or not, but if they did, I guarantee it would make more than a few people take a hard look at D.

I'm pretty sure DDJ would publish it, and it would reach a much wider audience than CUJ, which has disappeared <g>. Anyhow, Don, this kind of stuff needs to reach a much wider audience than a random posting in this n.g. would get.
Apr 05 2007
Current compilers tend to dump IRs to a buffer and then adjust them according to parser rules. The wrong approach. I totally want to see this kind of programming get implemented in the compiler itself.

I know it's too much to ask, but if DMD could generate optimal floating point math and cache pipelines, it would wrest the gaming industry from C++'s hands REAL quick like. A C++ programmer can say: "oh yeah, it's a minor upgrade made for my convenience, but I'm already conveniently comfortable with C++"... until a decision maker sees that D performs 20% better, dramatically reduces debugging time, cuts source code volume, and has better version control.
Apr 05 2007
Dan wrote:
> [...] if DMD could generate optimal floating point math and cache pipelines, it would wrest the gaming industry from C++'s hands REAL quick like. [...]

That would be nice and would help, however I think it'll take more than that to convince a game company to switch to D, particularly ones using XNA. It would have to be a startup company that is not working on the 360 -- perhaps Nintendo, Sony, Linux or perhaps Windows. I say perhaps, because if a company is going to switch to anything it's probably XNA. Now if we had D for .NET, maybe the story would be slightly different.

I do like the idea of using D for games, I just see it as being a hard sell. Scripting and tools are probably the main gateway for D into the game programming world. Once a few game companies start using it in their main project and can demonstrate productivity/speed improvements, then the wall may indeed start to crumble.

-Joel
Apr 07 2007
Walter Bright wrote:
> [...] Anyhow, Don, this kind of stuff needs to reach a much wider audience than a random posting in this n.g. would get.

Ooooooh.... I just updated the code, adding a code generator for D code, for when x87 is unavailable. I could not believe how easy it was. It was about 30 lines of trivial code. Yet directly generating a 'for' loop is still better than what Blitz++ does. I reckon it'd take less than a week to implement everything those libraries do.

The language synergy of tuples, mixins, CTFE, builtin strings, the ~ operator, constant folding, array slicing, template string parameters, type deduction, defined inline asm, static if, is() expressions... is astonishing. *All* of them are required in order for this to work.
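Schematically, a postfix-to-D-source generator really can be that small: one stack of strings, pasted back together and mixed in as a fused element-wise loop. A rough sketch (invented names such as postfixToD and result -- not the actual BLADE code):

// "abc*+d+" -> "for (...) result[i] = ((a[i]+(b[i]*c[i]))+d[i]);"
char[] postfixToD(char[] postfix)
{
    char[][] stack;
    foreach (c; postfix)
    {
        if (c >= 'a' && c <= 'z')
            stack ~= "" ~ c ~ "[i]";     // leaf: element i of array 'c'
        else
        {   // binary operator: pop two subexpressions, push the combination
            char[] rhs = stack[$ - 1];
            char[] lhs = stack[$ - 2];
            stack = stack[0 .. $ - 2];
            stack ~= "(" ~ lhs ~ c ~ rhs ~ ")";
        }
    }
    return "for (int i = 0; i < result.length; ++i)\n"
         ~ "    result[i] = " ~ stack[0] ~ ";";
}

Mixing the returned string into a function body yields a single loop over all the operands -- no intermediate temporaries, which is the whole point of the exercise.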
Apr 05 2007
Don Clugston wrote:
> Ooooooh.... I just updated the code, adding a code generator for D code, for when x87 is unavailable. I could not believe how easy it was. [...] The language synergy of tuples, mixins, CTFE, builtin strings, the ~ operator, constant folding, array slicing, template string parameters, type deduction, defined inline asm, static if, is() expressions... is astonishing. *All* of them are required in order for this to work.

Gees, Don, the above quote would be the perfect ingress (or at least a "quote box") in a DDJ article!!!!

I'm clenching the chair so I wouldn't buy the tickets to come to your place and write it together with you, right now!
Apr 05 2007
Quoting from the page:

> If you have an article idea, you should first send us an article proposal. This is a short statement (about one page) that says what your article will be about. You can mail it to our office or e-mail it to editors@ddj.com. Be sure to give us several ways to contact you (such as phone number, e-mail address, fax number, and mailing address). We'll look at your idea and get back to you, usually within a month.
> ...
> Although many people simply write their entire article and send it to us, there are several *advantages* to sending us a proposal. We can look at your idea and possibly suggest a particular slant you hadn't considered. We can also help you to organize the article.

Later, when you start writing the article itself:

> Write as if you are giving a brief, informal talk to a group of coworkers. You want to be concise and well organized because you don't want to waste their time, but you don't want to be too stuffy or formal because you know them.

Ah, and to give you more of the "feel" of being an article author, here's another quote from the same page:

> Illustrations and screen captures offer an opportunity to lure anyone looking through an issue into reading your article. They can also provide succinct summaries of complex ideas. Since we redraw most figures anyway, to conform to our size and style requirements, you should not worry about making your images very high quality. Many authors prefer to sketch them by hand and fax or mail them.

Heh, especially the last sentence. The same goes for the text itself.

The point being, from the magazine's point of view, most stuff sent to them is by teenagers, idiots, mentals, over-the-hill old-timers -- and only a handful is stuff they'd be interested in. Now, for those few, they'd kill, lick your boots, or even sharpen your pencil for you.

As to whether this would interest them, all you have to send them is the above-mentioned ingress. Let them take over from that. Simply naming Walter as a reference should get you a definite YES/NO from them. And once you get the YES, they'll hold your hand through and through.

---

For you personally, having your article in DDJ is of course "nice". But the real benefit is that then you've done it, been there. So, by the time D advances yet another bit, and your "template jujitsu" turns into "template nuke" (as perceived from the audience -- you'll never notice the change yourself!), it's gonna be a lot easier to jot up something for them again. Or for the next magazine. (Ahem, I'm ashamed to say, but at least as I experienced, my first article was a bit like "doing it for the first time with a woman". Once you get over it, the next one is a piece of cake.)

Thusly, both now and in the future, you're in a position to do D a /lot/ of good -- with effort that's actually less than some of your better D forays to date.

PS, if they get back to you within /less/ than 4 weeks, they're drooling. Unless they've read this post, of course. ;-)
Apr 05 2007
Don Clugston wrote:
> The language synergy of tuples, mixins, CTFE, builtin strings, the ~ operator, constant folding, array slicing, template string parameters, type deduction, defined inline asm, static if, is() expressions... is astonishing. *All* of them are required in order for this to work.

Just wait 'till you get AST macros!

But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 months. This needs to get out now -- I suggest www.artima.com, followed up with dzone, slashdot, and digg. It would also be a real treat to have you present this at Amazon's D conference in August.
Apr 05 2007
Walter Bright Wrote:
> Just wait 'till you get AST macros!
> But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 months. This needs to get out now -- I suggest www.artima.com, followed up with dzone, slashdot, and digg. It would also be a real treat to have you present this at Amazon's D conference in August.

Wow. That's the most excited I've seen Walter. : )

I'm going to have to read and understand your code. I can tell.
Apr 05 2007
Dan wrote:
> Walter Bright Wrote:
>> Just wait 'till you get AST macros! [...]
> Wow. That's the most excited I've seen Walter. : )

LOL, have to admit I agree!!!
Apr 05 2007
Georg Wrede wrote:
> Dan wrote:
>> Wow. That's the most excited I've seen Walter. : )
> LOL, have to admit I agree!!!

It really is very cool and all, but did I just miss the benchmarks? I need numbers, man! You can generate all the assembly you want, but at the end of the day it's the numbers that people care about.

--bb
Apr 05 2007
Bill Baxter wrote:
> It really is very cool and all, but did I just miss the benchmarks? I need numbers, man! You can generate all the assembly you want, but at the end of the day it's the numbers that people care about.

You're right, but it wasn't posted as a finished product! I post this sort of thing every now and again, largely to give Walter some feedback as to what the language is doing at the bleeding edge. In this case, I thought it would be relevant to AST macro development. Also, I posted this to get an idea if there was much interest in this sort of thing -- I don't have a great deal of free time, so I want to make good use of it.

I only got the code working a few days ago. The speed of the code will likely be affected by how well DMD optimises the tuple copying. I haven't even looked at what code it's actually generating.
Apr 06 2007
Don Clugston wrote:
> * I've often read that the x87 floating-point stack is notoriously difficult for compilers to write code for, but it works quite well in this case.

The hard part is doing register allocation. Generating code for the stack machine is trivial. There are some complications for dealing with x87 stack overflows.
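That triviality is visible in miniature here: postfix order maps one-for-one onto the FP stack, with each operand becoming a load and each operator popping two entries and pushing one. A sketch of such an emitter (invented names and schematic operand references; a real emitter must also spill when an expression needs more than the eight x87 slots):

// "abc*+d+" -> a sequence of x87 instructions for an asm mixin
char[] postfixToX87(char[] postfix)
{
    char[] code;
    foreach (c; postfix)
    {
        if (c >= 'a' && c <= 'z')
            code ~= "fld " ~ c ~ ";\n";      // push operand (schematic)
        else if (c == '+')
            code ~= "faddp ST(1), ST;\n";    // pop two, push their sum
        else if (c == '-')
            code ~= "fsubp ST(1), ST;\n";    // pop two, push their difference
        else if (c == '*')
            code ~= "fmulp ST(1), ST;\n";    // pop two, push their product
    }
    return code ~ "fstp result;\n";          // pop the final value (schematic)
}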
Apr 03 2007
Don Clugston wrote:
> I have been trying to come up with a convincing use case for the new mixins (and for metaprogramming in general). My best effort to date can be found at: http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d [...]

This is a work of art Don - it's practically a compiler extension. Nice job. :)

For others that are interested in how this actually gets the job done, I found this in your documentation:

 * THEORY:
 * Expression templates are used to create an expression string of the form "(a+b*c)+d"
 * and a tuple, the entries of which correspond to a, b, c, d, ...
 * This string is converted to postfix. The postfix string is converted to
 * a string containing x87 asm, which is then mixed into a function which
 * accepts the tuple.

-- 
- EricAnderton at yahoo
Apr 03 2007
Pragma wrote:
> This is a work of art Don - it's practically a compiler extension. Nice job. :) [...]

Hum, a minor question: is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?

-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 07 2007
Bruno Medeiros wrote:
> Hum, a minor question: is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?

I know Don wrote this lib, but I think I can answer this.

In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which have drawbacks of their own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler. Strings are nowhere near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data. With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).

-- 
- EricAnderton at yahoo
Apr 09 2007
Pragma Wrote:
> In short: no. But it's really an unintentionally loaded question: [...] With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).

Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.

Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64. Most compilers to date don't properly use these functionalities. Having them integrated into D, Agner Fog-optimally, would hugely displace a lot of C++ gaming, video, rendering and graphics engine code. *bigsmile
Apr 09 2007
Dan wrote:
> Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D. [...]

Hmmm - I wonder if a lot of what Don's been doing is serving as a "proof of concept" for the D 2.0 compiler, to be written in D of course <g>

(It just seems like a lot of the new D features fit so well with compiler development. Especially for things like "proving constness", etc.)

- Dave
Apr 09 2007
Dan wrote:
> Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.

Thanks. But evaluating the AST will only take things so far. There is a huge gulf of functionality that stands between where we are now (at compile time) and what D would be like as a full-on scripting language (not that I'm advocating that it be pushed that far). Much of what sits in the middle (namely structs, multi-dimensional arrays and maps) can go a long way towards *analyzing* that tree for code generation and more.

> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64. [...]

Anyway, I'm grinning too. I can't wait to see what comes up next. So far, converting Enki to a compile-time generator is proving to be a huge challenge, but workable.

-- 
- EricAnderton at yahoo
Apr 09 2007
Dan wrote:
> Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64. Most compilers to date don't properly use these functionalities. Having them integrated into D, Agner Fog-optimally, would hugely displace a lot of C++ gaming, video, rendering and graphics engine code.

And it's dawning on me that some text processing can take advantage of doing 64-bit/128-bit chunks at a time

 - Paul
Apr 09 2007
Paul Findlay Wrote:
> And it's dawning on me that some text processing can take advantage of doing 64-bit/128-bit chunks at a time

Yeah, pretty much any large (>1kb) buffer copy algorithm runs fastest through SSE2 ever since 128-bit came out. The raw power overcomes the need for tweaking the loop instead of just using rep movsd. I think the asm guys have the best code we could hope for already written up on a few pages: Agner Fog, Paul Hsieh et al. If we could leech their strategy (if meaning legal) then I can see a good use for such a thing.

I guess we just ought to wait for AST reflection.
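For flavour, a minimal SSE2 block copy in DMD inline asm might look like the sketch below. Assumptions (mine, not from the thread): both pointers are 16-byte aligned and nbytes is a positive multiple of 16; a production routine adds tail handling, unrolling and prefetching, which is exactly the tuning the Agner Fog / Paul Hsieh material covers.

void copy16(void* dst, void* src, size_t nbytes)
{
    asm
    {
        mov ESI, src;
        mov EDI, dst;
        mov ECX, nbytes;
        shr ECX, 4;             // move 16 bytes per iteration
    blockloop:
        movdqa XMM0, [ESI];     // aligned 128-bit load
        movdqa [EDI], XMM0;     // aligned 128-bit store
        add ESI, 16;
        add EDI, 16;
        dec ECX;
        jnz blockloop;
    }
}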
Apr 10 2007
Pragma wrote:
> [...] You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.

Actually I don't know how to make a tree nicely with tuples.

> Strings are nowhere near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data. With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).

There's a couple of minor things, though:
(1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string, than a tree.
(4) It's more natural for D coders to think in terms of arrays than lists (and there is much more language support).
(5) A string representation can easily express things like c=a+3*(a+b); where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it in many cases.)

So -- if trees were available, I'm not sure if I'd use them, or not.
Apr 09 2007
Don Clugston wrote:
> Actually I don't know how to make a tree nicely with tuples.

Here's an example (for grins):

import std.stdio;

template Node(char[] Data, Nodes...){
    alias Data data;
    alias Nodes nodes;
}

template PrintData(char[] parent){
    const char[] PrintData = "";
}

template PrintData(char[] parent, alias Node){
    static if(parent == ""){
        const char[] PrintData =
            Node.data ~ "\n" ~ PrintData!(Node.data, Node.nodes);
    }
    else{
        const char[] PrintData =
            parent ~ "." ~ Node.data ~ "\n" ~
            PrintData!(parent ~ "." ~ Node.data, Node.nodes);
    }
}

template PrintData(char[] parent, alias Node, V...){
    const char[] PrintData = PrintData!(parent, Node) ~ PrintData!(parent, V);
}

// example "tree" structure
alias Node!("1",
    Node!("1"),
    Node!("2",
        Node!("1"),
        Node!("2")
    )
) dataset;

void main(){
    writefln("%s", PrintData!("", dataset));
}

> So -- if trees were available, I'm not sure if I'd use them, or not.

They're really handy. I've found some uses for them already:

/* ZeroOrMoreExpr
    = generate.ZeroOrMoreExpr()
    ::= "[" "{" Expression:expr "}" "]" [RangeSetExpr:ranges]
        [Binding:binding] ["*" SubExpressionTerm:delim] [SubExpressionTerm:term];
*/
bool ZeroOrMoreExpr(inout char[] value, inout char[] bindings, inout char[] tokens, inout char[] err){
    return Rule!(
        generate.ZeroOrMoreExpr,
        And!(
            Literal!("["),
            Literal!("{"),
            Bind!(Expression, "expr"),
            Literal!("}"),
            Literal!("]"),
            Optional!(
                Bind!(RangeSetExpr, "ranges")
            ),
            Optional!(
                Bind!(Binding, "binding")
            ),
            Optional!(
                And!(
                    Literal!("*"),
                    Bind!(SubExpressionTerm, "delim")
                )
            ),
            Bind!(SubExpressionTerm, "term")
        )
    )(value, bindings, tokens, err);
}

There's a *lot* going on in this sample. Basically what you see here is a chunk of the as-of-yet-experimental compile-time Enki parser. This piece parses the zero-or-more expression part of the EBNF variant that Enki supports. The templates evaluate to CTFEs that in turn make up the parser when it's executed. So there's several layers of compile-time evaluation going on here.

Also, what's not obvious is that "char[] tokens" is actually an *array of strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the appropriate number of chars. The Bind!() expressions also map parsed-out data to a key/value set which is stored in a similar fashion (char[] bindings).

The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition; the only limit now is placed on the length of any given rule. I haven't found a way to unify the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something very similar to the above by providing a CT and RT version of the template library. At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

-- 
- EricAnderton at yahoo
Apr 09 2007
Pragma wrote:
> Don Clugston wrote:
>> Actually I don't know how to make a tree nicely with tuples.
> Here's an example (for grins): [...]

The problem I was referring to is: how to store both values, and functions/operators, inside the tree. It seems to get messy very quickly.

> They're really handy. I've found some uses for them already:

I meant for this application. There's no doubt they're indispensable in other contexts.

> There's a *lot* going on in this sample. [...] The Bind!() expressions also map parsed-out data to a key/value set which is stored in a similar fashion (char[] bindings).

Seriously cool! Seems like you're generating a tree of nested mixins? Anyway, I suspect this will really benefit from any compiler improvements, eg CTFE support for AAs or nested functions. And obviously AST macros.

> The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition. [...] At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.
Apr 10 2007
Don Clugston wrote:I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.You guys gotta publish!
Apr 10 2007
Walter Bright wrote:Don Clugston wrote:I've begun a draft. A historical question for you -- On this page, http://www.artima.com/cppsource/top_cpp_software.html Scott Meyers says that g++ was the first compiler to generate native code. But I thought Zortech was older than g++. Is that correct?I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.You guys gotta publish!
Apr 10 2007
Don Clugston wrote:I've begun a draft. A historical question for you -- On this page, http://www.artima.com/cppsource/top_cpp_software.html Scott Meyers says that g++ was the first compiler to generate native code. But I thought Zortech was older than g++. Is that correct?Depends on how you look at it. g++ was first 'released' in December, 1987. But the release notes for it say: "The GNU C++ Compiler is still in test release, and is NOT ready for everyday use" so I don't consider that a real release. Oregon Software released their native C++ compiler (not for the PC) in Jan or Feb 1988, nobody seems to recall exactly. Zortech's first release was in Jun 1988. Michael Tiemann, author of g++, calls the Sep 1988 release the "first really stable version."
Apr 10 2007
Walter Bright wrote:Don Clugston wrote:Baby steps Walter, baby steps... ;) -- - EricAnderton at yahooI really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.You guys gotta publish!
Apr 10 2007
Don Clugston wrote:The problem I was referring to is: how to store both values, and functions/operators, inside the tree. It seems to get messy very quickly.It can, especially with the additional code you need to treat all those templates as specialized data types.I meant for this application. There's no doubt they're indispensable in other contexts.Oh, my bad. Yeah, it would probably be overkill for BLADE. :)Almost. The generate.ZeroOrMoreExpr returns an arbitrary string that may be another "map" or a chunk of runtime code; the value is placed in the "value" string that's passed in. The *rootmost* generator is what compiles all this stuff into an actual chunk of mixin-able code. It's analogous to how the current rendition of Enki works. It seems overkill, but it's needed so I can do some basic semantic analysis, and do things like declare binding vars. I'm still looking for ways to simplify the process.<snip>Seriously cool! Seems like you're generating a tree of nested mixins?Anyway, I suspect this will really benefit from any compiler improvements, eg CTFE support for AAs or nested functions. And obviously AST macros.Definitely for AAs, but not so much for AST macros - I get the impression that AST manipulation will only be useful for D code, and not for completely arbitrary grammars like EBNF. Getting compile-time AA support would cut Enki's code size down by almost a third: const char[] hackedArray = "\x00\x00\x00\x05hello\x00\x00\x00\x05cruel\x00\x00\x00\x05world"; Under the hood, this is what most of my data looks like. Dropping the support routines for manipulating such structures would help make things a lot less cumbersome. :(I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.Agreed. CTFE frees us from many limitations imposed by templates, and the added flexibility of mixin() pretty much gives us the same strength as JavaScript's eval() at compile time. It's a huge deal. It's possible that we might find a handful of interested devs in their camp, but I think we're still back to the same old problem as with the rest of C++: momentum. I sincerely doubt we'll see a mass exodus from one camp to this one regardless of how good a job is done here. Of course, I'll be happy to be wrong about that. :) -- - EricAnderton at yahoo
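For illustration, here is a pair of hypothetical helpers for that packed-string format (packStr and nextStr are invented names, not Enki's actual routines). Each entry is a four-byte big-endian length followed by the payload, matching the hackedArray layout above:

// Append one length-prefixed string to the packed buffer.
char[] packStr(char[] buf, char[] s)
{
    uint n = s.length;
    buf ~= cast(char)(n >> 24);
    buf ~= cast(char)(n >> 16);
    buf ~= cast(char)(n >> 8);
    buf ~= cast(char)(n);
    return buf ~ s;
}

// Read the first entry into s; return the rest of the buffer.
char[] nextStr(char[] buf, out char[] s)
{
    uint n = (cast(uint)buf[0] << 24) | (cast(uint)buf[1] << 16)
           | (cast(uint)buf[2] << 8)  |  cast(uint)buf[3];
    s = buf[4 .. 4+n];
    return buf[4+n .. $];
}

// Reproduces the layout of hackedArray, checked at compile time:
static assert(packStr(packStr("", "hello"), "cruel")
    == "\x00\x00\x00\x05hello\x00\x00\x00\x05cruel");

// usage at runtime:
//   char[] s;
//   char[] rest = nextStr(buf, s);  // s == "hello"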
Apr 10 2007
Pragma schrieb:Don Clugston wrote:<snip>Hey pragma, really cool, I've come up with a similar structure while experimenting with a PEG parser in D (see attachment) after I read this article series on CodeProject: http://www.codeproject.com/cpp/crafting_interpreter_p1.asp http://www.codeproject.com/cpp/crafting_interpreter_p2.asp http://www.codeproject.com/cpp/crafting_interpreter_p3.asp The template system of D does an awesome job of keeping templated PEG grammars readable. BTW: If you turn Enki into a PEG-style parser I'll definitely throw my attempts into the dustbin :-) Greets KlausO

Pragma wrote:Here's an example (for grins):

import std.stdio;

template Node(char[] Data,Nodes...){
    alias Data data;
    alias Nodes nodes;
}

template PrintData(char[] parent){
    const char[] PrintData = "";
}

template PrintData(char[] parent,alias Node){
    static if(parent == ""){
        const char[] PrintData = Node.data ~ "\n"
            ~ PrintData!(Node.data,Node.nodes);
    }
    else{
        const char[] PrintData = parent ~ "." ~ Node.data ~ "\n"
            ~ PrintData!(parent ~ "." ~ Node.data,Node.nodes);
    }
}

template PrintData(char[] parent,alias Node,V...){
    const char[] PrintData = PrintData!(parent,Node) ~ PrintData!(parent,V);
}

// example "tree" structure
alias Node!("1",
    Node!("1"),
    Node!("2",
        Node!("1"),
        Node!("2")
    )
) dataset;

void main(){
    writefln("%s",PrintData!("",dataset));
}

<snip>Strings are nowhere near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data. With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).There's a couple of minor things, though:
(1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string than a tree.
(4) It's more natural for D coders to think in terms of arrays than lists (and there is much more language support).
(5) A string representation can easily express things like c=a+3*(a+b); where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it in many cases.)
So -- if trees were available, I'm not sure if I'd use them, or not.
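For contrast, here is roughly what the tuple-tree alternative looks like once operators enter the picture -- a minimal invented sketch (OpNode, Leaf and Eval are not from any of the libraries discussed here), which illustrates the messiness Don mentions:

// Interior nodes are tagged with an operator string; children are a tuple.
template OpNode(char[] op, Children...)
{
    const char[] operator = op;
    alias Children children;
}

// Leaves wrap a compile-time value.
template Leaf(int val)
{
    const int value = val;
}

// Recursive evaluation: first test "is it a leaf?", then dispatch on the
// operator tag (binary operators assumed).
template Eval(alias N)
{
    static if (is(typeof(N.value)))
        const int Eval = N.value;
    else static if (N.operator == "+")
        const int Eval = Eval!(N.children[0]) + Eval!(N.children[1]);
    else static if (N.operator == "*")
        const int Eval = Eval!(N.children[0]) * Eval!(N.children[1]);
}

// (1+2)*3 as a tuple tree:
alias OpNode!("*", OpNode!("+", Leaf!(1), Leaf!(2)), Leaf!(3)) expr;
static assert(Eval!(expr) == 9);

Every new operator, and every distinction between values and functions, adds another static-if branch and another template -- with a string, the same information is just more characters.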
Apr 10 2007
KlausO wrote:Hey pragma, really cool, I've come up with a similar structure while experimenting with a PEG parser in D (see attachment) after I read this article series on CodeProject: http://www.codeproject.com/cpp/crafting_interpreter_p1.asp http://www.codeproject.com/cpp/crafting_interpreter_p2.asp http://www.codeproject.com/cpp/crafting_interpreter_p3.asp The template system of D does an awesome job of keeping templated PEG grammars readable.Yes it does! Thanks for posting this - I didn't even know that article was there.BTW: If you turn Enki into a PEG-style parser I'll definitely throw my attempts into the dustbin :-) GreetsWow, that's one heck of an endorsement. Thanks, but don't throw anything out yet. This rendition of Enki is still a ways off though. FYI I plan on keeping Enki's internals as human-readable as possible by keeping it self-hosting. So there'll be two ways to utilize the toolkit: EBNF coding and "by hand".

alias Action!(
    And!(
        PlusRepeat!(EmailChar),
        Char!(' '),
        PlusRepeat!(
            And!(
                Or!(
                    In!(
                        Range!('a', 'z'), Range!('A', 'Z'), Range!('0', '9'),
                        Char!('_'), Char!('%'), Char!('-')
                    ),
                    Char!('.')
                ),
                Not!(EmailSuffix)
            )
        ),
        Char!('.'),
        EmailSuffix
    ),
    delegate void(char[] email) { writefln("<email:", email, ">"); }
) Email;

I had an earlier cut that looked a lot like this. :) But there's a very subtle problem lurking in there. By making your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*. As a result it failed when I tried to transcribe Enki's EBNF definition. That's why I wrap each rule as a CTFE-style function, as it side-steps that issue rather nicely, without generating too many warts. -- - EricAnderton at yahoo
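A toy illustration of that wrapping trick (all names invented; this is not Enki code): the combinators stay templates, but each rule becomes an ordinary function, so the mangled symbol never grows beyond a single rule, and rules can refer to one another freely:

// Toy combinators.
template Char(char c)
{
    bool Char(inout char[] s)
    {
        if (s.length && s[0] == c) { s = s[1..$]; return true; }
        return false;
    }
}

template And(Rules...)
{
    bool And(inout char[] s)
    {
        char[] save = s;
        foreach (r; Rules)
            if (!r(s)) { s = save; return false; }
        return true;
    }
}

template Optional(alias R)
{
    bool Optional(inout char[] s)
    {
        R(s);          // succeed whether or not R matched
        return true;
    }
}

// Each rule is a plain function: the mangled symbol is just the rule's
// name, no matter how big the grammar grows -- and a rule can refer to
// itself (or to rules declared later) without trouble.
bool Digit(inout char[] s) { return Char!('1')(s); } // toy: matches only '1'
bool Number(inout char[] s) { return And!(Digit, Optional!(Number))(s); }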
Apr 10 2007
Pragma schrieb:<snip>Nice to hear that I hit your taste :-)Another issue I ran into is circular template dependencies. You could get nasty error messages like dpeg.d(282): Error: forward reference to 'And!(Alternative,OptRepeat!(And!(WS,Char!('/'),WS,Alternative)),WS)' Any idea how they could be resolved? FYI: Other good PEG references: http://pdos.csail.mit.edu/~baford/packrat/ http://www.codeproject.com/csharp/cat.asp Very interesting has been http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks C++ templated parsers: http://www.codeproject.com/cpp/yard-xml-parser.asp http://www.codeproject.com/cpp/biscuit.asp
Apr 10 2007
KlausO wrote:<snip>Yep, that one bit me too; just use the CTFE trick I mentioned. Wrapping each rule fixes this, since DMD can resolve forward references to functions, but not to template instances.FYI: Other good PEG references: <snip>Thanks. I can always use a little more research to lean on. -- - EricAnderton at yahoo
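To see the failure and the fix side by side, reusing toy combinators like the ones sketched earlier (again, all invented code, not dpeg or Enki):

template Or(Rules...)
{
    bool Or(inout char[] s)
    {
        foreach (r; Rules)
            if (r(s)) return true;
        return false;
    }
}

// The alias form fails: Expr isn't declared yet when the instance for
// Primary is built, giving exactly the forward-reference error above.
//   alias Or!( Digit, And!(Char!('('), Expr, Char!(')')) ) Primary;
//   alias And!( Primary, Optional!(And!(Char!('+'), Expr)) ) Expr;

// Wrapped as functions, the same mutually recursive grammar compiles,
// because function bodies are analysed after all declarations are seen.
bool Primary(inout char[] s)
{
    return Or!( Digit, And!(Char!('('), Expr, Char!(')')) )(s);
}
bool Expr(inout char[] s)
{
    return And!( Primary, Optional!(And!(Char!('+'), Expr)) )(s);
}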
Apr 10 2007
KlausO wrote:<snip>The way my dparse sidesteps this (both the id-length limit and forward references) is to have the parser functions specialized only on the name of the reduction; the grammar is carried in an outer scope (the mixed-in template that is never used as an identifier). This also allows the inner template to specialize on anything that is defined.
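Loosely after that description -- dparse's real internals surely differ, and every name here is invented -- the idea might look like this: the grammar lives inside a mixed-in outer template, and each inner parse template specializes only on the rule's name, so instance names stay short:

template Grammar()
{
    template parse(char[] rule : "Digit")
    {
        bool parse(inout char[] s)
        {
            if (s.length && s[0] >= '0' && s[0] <= '9')
            {   s = s[1..$]; return true; }
            return false;
        }
    }

    template parse(char[] rule : "Number")
    {
        bool parse(inout char[] s)
        {
            if (!parse!("Digit")(s)) return false;  // one digit required
            while (parse!("Digit")(s)) {}           // then any number more
            return true;
        }
    }
}
mixin Grammar!();  // the mixed-in name itself never appears in a symbol

// usage:
//   char[] s = "42x".dup;
//   assert(parse!("Number")(s) && s == "x");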
Apr 10 2007
Pragma wrote:By making your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*. As a result it failed when I tried to transcribe Enki's EBNF definition. That's why I wrap each rule as a CTFE-style function, as it side-steps that issue rather nicely, without generating too many warts.One of the motivations for CTFE was to address that exact problem.
Apr 10 2007
Hello, Very good work. I'm impressed. Actually I was developing a very similar program, but yours is practically all I need. I'm developing a linear algebra package in D, and now I'm optimising it for vector machines, so your BLAS1-like package will be helpful. :) Thx. -- Witold Baryluk
Apr 03 2007
Witold Baryluk wrote:Hello, Very good work. I'm impressed. Actually I was developing a very similar program, but yours is practically all I need. I'm developing a linear algebra package in D, and now I'm optimising it for vector machines, so your BLAS1-like package will be helpful. :)Do you intend to implement BLAS2 and BLAS3-like functionality? I feel that I don't know enough about cache-efficient matrix blocking techniques to be confident in presenting code. I don't know how sophisticated the techniques are in libraries like ATLAS, but the ones used in Blitz++ should be easy to compete with. Blade is still mostly proof-of-concept rather than being industrial-strength -- there are so many possibilities for improvement, it's not well tested, and I haven't even put any effort into making the code look nice. But as Davidl said, it shows that hard-core back-end optimisation can now be done in a library at compile time. The code is quite modularised, so it would be straightforward to add a check for 'is it SSE-able' (ie, are all the vectors floats) and 'is it SSE2-able' (are all the vectors doubles), and if so, send them off into specialised assemblers, otherwise send them to the x87 one. The postfix-ing step will involve searching for *+ combinations where fma can be used. The ability to know the vector length in many cases can be a huge advantage for SSE code, where loop unrolling by a factor of 2 or 4 is always necessary. I haven't got that far yet. I'm pretty sure that this type of technique can blow almost everything else out of the water. <g>
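For illustration, the two checks Don describes could start out something like this (invented names; none of this is in BLADE yet). The SSE test is a walk over the operand-type tuple, and the fma search is a substring scan over the postfix string:

// True when every operand is a float[] -- i.e. the expression is SSE-able.
template AllFloats(T...)
{
    static if (T.length == 0)
        const bool AllFloats = true;
    else
        const bool AllFloats = is(T[0] : float[])
            && AllFloats!(T[1 .. T.length]);
}

// Likewise for double[] and SSE2.
template AllDoubles(T...)
{
    static if (T.length == 0)
        const bool AllDoubles = true;
    else
        const bool AllDoubles = is(T[0] : double[])
            && AllDoubles!(T[1 .. T.length]);
}

static assert( AllFloats!(float[], float[]) );
static assert( !AllFloats!(float[], real[]) );

// A '*' immediately followed by '+' in the postfix string marks a
// fused-multiply-add candidate.
bool hasFMA(char[] postfix)
{
    for (int i = 0; i + 1 < postfix.length; ++i)
        if (postfix[i] == '*' && postfix[i+1] == '+')
            return true;
    return false;
}

static assert( hasFMA("abc*+d+") );

Dispatch would then be a static if over AllFloats/AllDoubles choosing which assembler string-generator to mix in.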
Apr 04 2007
I think compilers have felt a great challenge from compile-time back-end power like this.

Don Clugston wrote:<snip>
Apr 03 2007
This is really neat. I like that this basically is a compiler extension and it's still a) well readable b) produces simple error messages during compilation (if any) c) introduces non-artificial syntax for the task d) will actually depend more on the versatility of operator overloading than the CTFE stuff. a) and b) are so much better than what C++ template magic gave us (as can be observed in Boost). Now I think there's only one requirement missing to make it a perfect compiler extension, and that is to produce debuggable code. Actually in this case, as the extension is supposed to generate mostly single-line expression statements, it might not be that important. But if we consider multi-line DSL statements, a way to mark generated code as belonging to one of the DSL source lines would be great. I think a way for a CTFE function to receive the file+line where it was called from would do the trick. In BLADE that information could be added to the expression string, passed through to the postfix notation, and finally be inserted into the ASM code with #line directives. Don Clugston wrote:<snip>
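That suggestion should already be workable in principle: mixed-in strings are lexed like ordinary source, so a #line directive embedded in the generated string ought to be honoured. A hypothetical sketch (makeKernel and itoa are invented names; BLADE does not do this today):

// Minimal int-to-string conversion, usable from CTFE.
char[] itoa(int n)
{
    char[] s = "";
    if (n == 0) return "0";
    while (n > 0) { s = cast(char)('0' + n % 10) ~ s; n /= 10; }
    return s;
}

// Stamp the caller's file/line onto the generated code, so the debugger
// attributes the mixed-in statements to the DSL expression that made them.
char[] makeKernel(char[] code, char[] file, int line)
{
    return "#line " ~ itoa(line) ~ " \"" ~ file ~ "\"\n" ~ code;
}

// usage at the call site, passing the location explicitly:
//   mixin(makeKernel("x[] = y[];", __FILE__, __LINE__));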
Apr 05 2007
Excellent work! I have to say that I've had my misgivings about mixin statements, but I'm glad to see that BLADE and Don's hard work have proven me wrong. Compile-time vector expressions are desperately needed, and this library goes a long way to sorting things out. As far as meta-type tricks go, this is by far the coolest I've ever seen in any programming language. One very interesting application I would like to see with this technique would be a compile-time geometric algebra library. If no one writes one by this summer, I will probably do it myself. (Actually, when I heard the name BLADE, my heart skipped a beat thinking it was a GA library, but alas it is not...) Maybe it is my unrestrained exuberance, but I would guess that a strong D implementation should easily crush the leading C++ GA lib, GAIGEN, in both performance and ease of use. For reference, here is the GAIGEN homepage: http://www.science.uva.nl/ga/gaigen/content_gaigen.html -Mikola Lysenko
Apr 05 2007
Mikola Lysenko wrote: <snip>One very interesting application I would like to see with this technique would be a compile time geometric algebra library. If no one writes one by this summer, I will probably do it myself. (Actually, when I heard the name BLADE, my heart skipped a beat thinking it was a GA library, but alas it is not...) Maybe it is my unrestrained exuberance, I would guess that a strong D implementation should easily crush the leading C++ GA lib, GAIGEN in both performance and ease of use.Oh my GAD? Get it? ;)
Apr 05 2007
Mikola Lysenko wrote:<snip>Wow, that looks really sweet. Last I heard on GA was that it was good for working out the math, but too slow for real use. Now it's fast too, apparently! Now I've really got to learn GA. And a D compile-time fu version would be neat. One question: from the GAIGEN page it sounds almost like GAIGEN generates off-line a set of primitive operations that are efficient for the chosen specific GA domain (3D euclidean or whatever). If so, then that sounds pretty good to me. What's wrong with using an off-line approach? Are there some expressions that are just too difficult to optimize ahead of time without seeing the whole expression as it will actually be used? Or is it more like GAIGEN creates the primitive operations for a particular GA, like dot product, cross product, and matrix multiplies? If so, then it may still leave a lot of optimization opportunities on the floor. Like evaluating A*B*C*vec as ((A*B)*C)*vec vs. (A*(B*(C*vec))) can make a big difference if A, B, C are big matrices and vec is a small vector. --bb
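To put rough numbers on Bill's associativity point: with n-by-n matrices and an n-element vector, a matrix-matrix product costs about n^3 multiplies while a matrix-vector product costs about n^2. So ((A*B)*C)*vec performs roughly 2n^3 + n^2 multiplies, whereas (A*(B*(C*vec))) performs roughly 3n^2. At n = 100 that is about 2,010,000 versus 30,000 -- nearly a 70x difference purely from where the parentheses fall.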
Apr 05 2007
Bill Baxter wrote:<snip>As I understand it, GAIGEN uses an external program to generate the implementation, and the implementation uses expression templates to actually perform the optimizations. The process seems a bit contrived to me, but I'm sure they have their reasons. Also if you look at the performance statistics on that site, you will see that the geometric algebra implementation is actually SLOWER than the standard linear algebra (but not by much.) The main advantage of GA when programming is that the code is shorter and simpler. Recently, GA has been getting some more widespread acceptance and commercial success. Take Geomerics for example; they're a small company that specializes in graphics and physics software based on GA-inspired algorithms. Here is their website: http://www.geomerics.com They've got a very impressive realtime radiosity engine that seems to have gotten some press in the last few weeks. There's also some interesting stuff in the technology section on their website. Writing a GA library for D is definitely on my todo list (right after I get done with this semester of classes.) I will probably work on it this summer unless someone beats me to it. -Mik
Apr 05 2007
Don Clugston Wrote:this case, I thought it would be relevant to AST macro development. Also I posted this to get an idea if there was much interest in this sort of thing -- I don't have a great deal of free time, so I want to make good use of it. I only got the code working a few days ago. The speed of the code will likely be affected by how well DMD optimises the tuple copying. I even haven't looked at what code it's actually generating.In essence, it's a *demonstration* that with all of D's features combined, one can write a fwicked-optimized compiler at compile time in D. He's also working towards it being Agner Fog's ideal of optimal; and is taking the right steps to do it. Agner Fog is quite respected for his work in the x86 family of assembly; and previously I haven't seen any program that can *generate* code sufficiently well to approach his optimal assembly source.
Apr 06 2007
Dan wrote:Don Clugston Wrote:Exactly.this case, I thought it would be relevant to AST macro development. Also I posted this to get an idea if there was much interest in this sort of thing -- I don't have a great deal of free time, so I want to make good use of it. I only got the code working a few days ago. The speed of the code will likely be affected by how well DMD optimises the tuple copying. I even haven't looked at what code it's actually generating.In essence, it's a *demonstration* that with all of D's features combined, one can write a fwicked-optimized compiler at compile time in D.He's also working towards it being Agner Fog's ideal of optimal; and is taking the right steps to do it. Agner Fog is quite respected for his work in the x86 family of assembly; and previously I haven't seen any program that can *generate* code sufficiently well to approach his optimal assembly source.It might be because no-one has really tried. Agner told me that he has *never* been emailed by a compiler writer, even though he's tried to get their attention; I find that astonishing. Someone recently posted a link to an article by Michael Abrash about his software DirectX 3D renderer. There are some similarities to this: he's developed a basic, efficient code structure, and his renderer takes little fragments of code and inserts them into it. Working out what the code structure should be (which is what compilers generally try to do) is a fantastically difficult problem, but splicing fragments together is pretty easy. OTOH it must be pretty hard to identify patterns like this after the initial code generation step.
Apr 07 2007
janderson Wrote:That would be nice and would help, however I think it'll take more than that to convince a game company to switch to D, particularly ones using XNA.Umm... A runtime-hosted Microsoft product? For anything high performance, that's suicide! No wonder the games of the last couple of years have come out bloated to 8Gb.It would have to be a startup company that is not working on the 360 - perhaps Nintendo, Sony, Linux or perhaps Windows. I say perhaps, because if a company is going to switch to anything it's probably XNA. Now if we had D for .NET, maybe the story would be slightly different.Oh! DotNet? Another runtime-hosted Microsoft environment? Brilliant. I would hope you've suggested this *only* because programmers would want to target the XBox360. Anything running on Linux would suffer *hugely* in performance running on DotNet compared to D; as would anything on the x86-64 platform - the performance hit would still be considerable on Windows, but nonetheless tolerable on modern computers.I do like the idea of using D for games, I just see it as being a hard sell. Scripting and tools are probably the main gateway for D into the game programming world.Well, with Walnut 2.x en route and Blade slowly evolving... I'd say that within the next year we ought to see a stupidly easy-to-edit scripting engine along with incredibly fast vector and floating point math. If someone eventually provides runtime AST reflection, we may see SSE2 and provable security follow shortly thereafter.Once a few game companies start using it in their main project and can demonstrate productivity/speed improvements then the wall may indeed start to crumble. -JoelThat's probably a fair assessment.
Apr 10 2007