digitalmars.D - Tokenizing D at compile time?
- dsimcha (16/16) Aug 25 2011 I'm working on a parallel array ops implementation for
- Timon Gehr (9/25) Aug 25 2011 That is not real tokenization, can you go with
- Rainer Schuetze (32/48) Aug 26 2011 The lexer used by Visual D is also CTFE capable:
I'm working on a parallel array ops implementation for std.parallel_algorithm. (For the latest work in progress see https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d ). To make it (somewhat) pretty, I need to be able to tokenize a single statement worth of D source code at compile time. Right now, the syntax requires manual tokenization:

    mixin(parallelArrayOp(
        "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
    ));

where lhs, op1, op2, op3 are arrays. I'd like it to be something like:

    mixin(parallelArrayOp(
        "lhs[] = op1[] * op2[] / op3[]"
    ));

Does anyone have/is there any easy way to write a compile-time D tokenizer?
Aug 25 2011
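[Editorial note: for the coarse, array-op-level splitting dsimcha describes, a full D lexer may not even be needed. The following is a minimal sketch of a CTFE-friendly splitter, a hypothetical helper that is not part of the thread's code. It recognizes only identifiers (keeping a trailing "[]" attached, as parallelArrayOp expects) and single-character operators.]

```d
import std.ascii : isAlphaNum, isWhite;

// Split a D array-op expression into the coarse tokens parallelArrayOp
// takes. Pure slicing and appending only, so it evaluates under CTFE.
string[] tokenizeArrayOp(string s)
{
    string[] toks;
    size_t i = 0;
    while (i < s.length)
    {
        if (isWhite(s[i])) { ++i; continue; }
        if (isAlphaNum(s[i]) || s[i] == '_')
        {
            size_t start = i;
            while (i < s.length && (isAlphaNum(s[i]) || s[i] == '_')) ++i;
            // Keep a following "[]" attached to the identifier, e.g. "op1[]".
            if (i + 1 < s.length && s[i] == '[' && s[i + 1] == ']')
                i += 2;
            toks ~= s[start .. i];
        }
        else
        {
            toks ~= s[i .. i + 1]; // single-char operator: = * / + - etc.
            ++i;
        }
    }
    return toks;
}

// Checked at compile time:
static assert(tokenizeArrayOp("lhs[] = op1[] * op2[] / op3[]")
    == ["lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"]);
```

This deliberately punts on multi-character operators, literals, and nested indexing; it only covers the shape of expression shown in the post.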
On 08/26/2011 03:08 AM, dsimcha wrote:
> I'm working on a parallel array ops implementation for
> std.parallel_algorithm. (For the latest work in progress see
> https://github.com/dsimcha/parallel_algorithm/blob/master/parallel_algorithm.d ).
> To make it (somewhat) pretty, I need to be able to tokenize a single
> statement worth of D source code at compile time. Right now, the syntax
> requires manual tokenization:
>
>     mixin(parallelArrayOp(
>         "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
>     ));

That is not real tokenization. Can you go with "lhs", "[", "]", "=", ... ?

> where lhs, op1, op2, op3 are arrays. I'd like it to be something like:
>
>     mixin(parallelArrayOp(
>         "lhs[] = op1[] * op2[] / op3[]"
>     ));
>
> Does anyone have/is there any easy way to write a compile-time D
> tokenizer?

I have written an almost complete tokenizer in D. Making it compile-time capable should be rather trivial; I will give it a try. (It will also convert embedded numerals to the correct type, etc.)

If you don't need a complete tokenizer: what are the tokenizer features you need?
Aug 25 2011
On 26.08.2011 03:08, dsimcha wrote:
> I'm working on a parallel array ops implementation for
> std.parallel_algorithm. To make it (somewhat) pretty, I need to be able
> to tokenize a single statement worth of D source code at compile time.
> Right now, the syntax requires manual tokenization:
>
>     mixin(parallelArrayOp(
>         "lhs[]", "=", "op1[]", "*", "op2[]", "/", "op3[]"
>     ));
>
> where lhs, op1, op2, op3 are arrays. I'd like it to be something like:
>
>     mixin(parallelArrayOp(
>         "lhs[] = op1[] * op2[] / op3[]"
>     ));
>
> Does anyone have/is there any easy way to write a compile-time D
> tokenizer?

The lexer used by Visual D is also CTFE capable:

http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d

As Timon pointed out, it will separate into D tokens, not the more combined elements in your array. Here's my small CTFE test:

    ///////////////////////////////////////////////////////////////////////
    int[] ctfeLexer(string s)
    {
        Lexer lex;
        int state;
        uint pos;
        int[] ids;
        while(pos < s.length)
        {
            uint prevpos = pos;
            int id;
            int type = lex.scan(state, s, pos, id);
            assert(prevpos < pos);
            if(!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
                ids ~= id;
        }
        return ids;
    }

    unittest
    {
        static assert(ctfeLexer(q{int /* comment to skip */ a;}) ==
            [ TOK_int, TOK_Identifier, TOK_semicolon ]);
    }

If you want the tokens as strings rather than just the token IDs, you can collect "s[prevpos .. pos]" instead of "id" into an array.
Aug 26 2011
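[Editorial note: following Rainer's closing suggestion, a string-collecting variant of his ctfeLexer might look like the sketch below. It assumes the same Lexer API from vdc/lexer.d shown in his post (Lexer.scan and Lexer.isCommentOrSpace) and has not been tested against the actual Visual D sources.]

```d
// Same loop as Rainer's ctfeLexer, but collecting the token slices
// themselves instead of the numeric token IDs. For the input
// q{lhs[] = op1[]} this should yield the individual D tokens
// "lhs", "[", "]", "=", "op1", "[", "]" -- not the combined elements
// parallelArrayOp currently takes.
string[] ctfeTokenize(string s)
{
    Lexer lex;
    int state;
    uint pos;
    string[] toks;
    while (pos < s.length)
    {
        uint prevpos = pos;
        int id;
        int type = lex.scan(state, s, pos, id);
        assert(prevpos < pos);
        if (!Lexer.isCommentOrSpace(type, s[prevpos .. pos]))
            toks ~= s[prevpos .. pos];
    }
    return toks;
}
```

A caller would still have to re-glue identifier + "[" + "]" sequences into the "op1[]" form parallelArrayOp expects.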
== Quote from Rainer Schuetze (r.sagitario gmx.de)'s article
> The lexer used by Visual D is also CTFE capable:
> http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d
> As Timon pointed out, it will separate into D tokens, not the more
> combined elements in your array. Here's my small CTFE test:

Thanks, but I've come to the conclusion that this lexer is way too big a dependency for something as small as parallel array ops, unless it were to be integrated into Phobos by itself. I'll just stick with the ugly syntax. Unfortunately, according to my benchmarks, array ops may be so memory bandwidth-bound that parallelization doesn't yield very good speedups anyhow.
Aug 26 2011
dsimcha wrote:
> == Quote from Rainer Schuetze (r.sagitario gmx.de)'s article
>> The lexer used by Visual D is also CTFE capable:
>> http://www.dsource.org/projects/visuald/browser/trunk/vdc/lexer.d
>
> Thanks, but I've come to the conclusion that this lexer is way too big a
> dependency for something as small as parallel array ops, unless it were
> to be integrated into Phobos by itself. I'll just stick with the ugly
> syntax. Unfortunately, according to my benchmarks, array ops may be so
> memory bandwidth-bound that parallelization doesn't yield very good
> speedups anyhow.

Totally. Anything below BLAS3 is memory-limited, not CPU-limited. Even then, cache prefetching has as big an impact as the number of processors.
Aug 27 2011
On 8/27/2011 5:37 AM, Don wrote:
> dsimcha wrote:
>> Unfortunately, according to my benchmarks, array ops may be so memory
>> bandwidth-bound that parallelization doesn't yield very good speedups
>> anyhow.
>
> Totally. Anything below BLAS3 is memory-limited, not CPU-limited. Even
> then, cache prefetching has as big an impact as the number of
> processors.

I think the "memory bandwidth-bound" statement actually applies to a lot of what I tried to do in std.parallel_algorithm. Much of it shows far-below-linear speedups, but this can't be explained by communication overhead, because the speedup relative to the serial algorithm doesn't improve when I make the problem and work unit sizes huge.
Aug 27 2011
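[Editorial note: the bandwidth argument in this subthread can be checked with a back-of-envelope benchmark. For lhs[] = op1[] * op2[] / op3[] on double[], each element moves 4 x 8 = 32 bytes through memory for only 2 floating-point ops, so the memory bus, not the core count, sets the ceiling. The sketch below uses the modern Phobos API (std.datetime.stopwatch, std.parallelism), which postdates this 2011 thread; timings are machine-dependent and no numbers are claimed.]

```d
import std.datetime.stopwatch : StopWatch;
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    enum n = 50_000_000;
    auto lhs = new double[n], a = new double[n],
         b = new double[n], c = new double[n];
    a[] = 2.0; b[] = 3.0; c[] = 4.0;

    StopWatch sw;
    sw.start();
    lhs[] = a[] * b[] / c[];            // serial built-in array op
    sw.stop();
    auto serialMs = sw.peek.total!"msecs";

    sw.reset();
    sw.start();
    foreach (i; parallel(iota(n)))      // parallel element-wise version
        lhs[i] = a[i] * b[i] / c[i];
    sw.stop();
    auto parMs = sw.peek.total!"msecs";

    // On bandwidth-bound hardware the two times come out close, because
    // both versions move the same ~32 * n bytes over the same memory bus.
    writefln("serial: %s ms, parallel: %s ms", serialMs, parMs);
}
```

If the parallel loop fails to beat the serial array op even at this problem size, that is consistent with dsimcha's observation that the speedup does not improve as the problem grows.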