digitalmars.D - PROPOSAL: opSeq()
- Russell Lewis (65/65) Apr 07 2008 PROPOSAL: A way to handle sequences of expressions which otherwise would...
- Bill Baxter (21/27) Apr 07 2008 Are you familiar with the "trailing delegates" proposal?
- downs (11/19) Apr 07 2008 FWIW and just FYI, the least closing brackets can be done with my_for(i=...
- Russell Lewis (70/88) Apr 08 2008 Yes, I am familiar with the concept. My proposal is a generalization of...
- Bill Baxter (14/123) Apr 08 2008 Ok. Good examples. Here's another that I suppose would be possible:
- Russell Lewis (18/20) Apr 09 2008 That is something that I have worried about, as well, and I haven't done...
- Frits van Bommel (21/44) Apr 09 2008 Well, one ambiguity is stuff like: "foo 1 -2". Is this foo.opSeq(1 - 2)
- Russell Lewis (2/24) Apr 09 2008 Good points. I'll ponder 'em.
PROPOSAL: A way to handle sequences of expressions which otherwise would have been syntax errors

EXAMPLE CODE:

    my_for(i=0, i<10, i++) { <code> }

PARSER DETAILS:

Add a grammar rule that works as follows:

    expression: expression expression+

(I'm not sure exactly where in the associativity hierarchy it should go. Maybe assign expression?)

GRAMMAR DETAILS:

Any time that we parse the above rule, the left-hand expression must be a "sequence handler." A sequence handler is either a delegate, or a struct which implements the function "opSeq()". The number and type of arguments of the handler determine how many, and what type, of expressions can follow the handler. The return value from the handler can be void, or a value. If the handler has fewer arguments than we have expressions in the sequence, then the return value from the first handler may be a second handler, and thus we can chain handlers. If the types of the expressions don't match, or the sequence of expressions has too few elements, then we have a syntax error.

Handlers are always right-associative. This means that if we have a series of expressions:

    handler1 expressionA handler2 expressionB

then this first becomes:

    handler1 expressionA handler2(expressionB)

Then, if handler1 has 2 arguments, it becomes

    handler1(expressionA, handler2(expressionB))

However, if handler1 has only 1 argument, then it must return a handler for the second expression:

    handler1(expressionA)(handler2(expressionB))

IMPLEMENTATION EXAMPLE:

    // my_for().
    //
    // Note that the "lazy void" overload of opSeq handles single-line
    // bodies with no {} while the "void delegate()" overload handles
    // bodies with {}.
    MyFor my_for(lazy void init, lazy bool test, lazy void inc)
    {
        MyFor ret;
        ret.init = init;
        ret.test = test;
        ret.inc = inc;
        return ret;
    }

    struct MyFor
    {
        void delegate() init;
        bool delegate() test;
        void delegate() inc;

        void opSeq(lazy void block) { opSeq({ block(); }); }
        void opSeq(void delegate() block)
        {
            init();
            while(test()) { block(); inc(); }
        }
    }
Apr 07 2008
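The rewrite that the proposal asks the compiler to perform can be spelled out by hand. The following is a minimal sketch, assuming opSeq is just an ordinary method (the operator itself does not exist) and using explicit delegates in place of the lazy parameters; the names myFor, setup, cond and step are hypothetical and chosen for illustration only.

```d
// Hand-written stand-in for the proposed handler. "opSeq" is an ordinary
// method here; no compiler rewrites anything into a call to it.
struct MyFor
{
    void delegate() setup;
    bool delegate() cond;
    void delegate() step;

    void opSeq(void delegate() block)
    {
        setup();
        while (cond())
        {
            block();
            step();
        }
    }
}

// Hypothetical helper that builds the handler from three delegates.
MyFor myFor(void delegate() setup, bool delegate() cond, void delegate() step)
{
    return MyFor(setup, cond, step);
}

unittest
{
    int i, sum;
    // Proposed spelling:  my_for(i=0, i<4, i++) { sum += i; }
    // Desugared by hand:
    myFor({ i = 0; }, () => i < 4, { i++; }).opSeq({ sum += i; });
    assert(sum == 6); // 0 + 1 + 2 + 3
}
```

Under the proposal, the trailing block would be matched against the parameter list of opSeq, so the explicit `.opSeq(...)` call is the part the new grammar rule would hide.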
Russell Lewis wrote:
> PROPOSAL: A way to handle sequences of expressions which otherwise would have been syntax errors
>
> EXAMPLE CODE:
>
>     my_for(i=0, i<10, i++) { <code> }

Are you familiar with the "trailing delegates" proposal? Basically the idea there is that any {<code>} block following a function call would be treated as an extra argument to the function. So if you write the function:

    void my_for(lazy void init, lazy bool test, lazy void inc, void delegate()) { ... }

then your EXAMPLE_CODE above would call that function.

Your proposal would have one benefit over that in that you could make "my_for" a varargs function if you wanted to. Though, the trailing delegates idea could probably be fixed to handle that too. Like by making the trailing delegate the first argument instead of the last (kinda like what opIndexAssign does).

Overall I think trailing delegates sounds like a simpler, more elegant approach. Can you point out any other benefits of your proposal that trailing delegate args would not have?

I believe Walter's response previously has been that we should just get used to looking at things like:

    my_for(i=0,i<10,i++,{<code>});

instead of adding complications to the grammar to support such things.

--bb
Apr 07 2008
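The form attributed to Walter above, with the block passed as an ordinary last argument, needs no grammar change. A minimal sketch, assuming lazy parameters for the three loop clauses and a plain delegate for the body; the names myFor, setup, cond, step and block are hypothetical.

```d
// Library-level for loop: the body is an ordinary delegate argument,
// and the three clauses are lazy so they are re-evaluated on each use.
void myFor(lazy void setup, lazy bool cond, lazy void step,
           void delegate() block)
{
    setup();
    while (cond())
    {
        block();
        step();
    }
}

unittest
{
    int i, sum;
    myFor(i = 0, i < 4, i++, { sum += i; });
    assert(sum == 6); // 0 + 1 + 2 + 3
    assert(i == 4);
}
```

The trailing delegates idea would only move `{ sum += i; }` outside the closing parenthesis; the function signature would stay as written here.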
Bill Baxter wrote:
> I believe Walter's response previously has been that we should just get used to looking at things like:
>
>     my_for(i=0,i<10,i++,{<code>});
>
> instead of adding complications to the grammar to support such things.
>
> --bb

FWIW and just FYI, the fewest closing brackets can be done with

    my_for(i=0, i<10, i++) = {<code>};

using an overloaded opAssign. To make it flexible, template opAssign and make it lazy to allow chaining; i.e.

    my_for(...) = your_for(...) = {<code>};

For example, I use this in dglut:

    const string LazyCall="
      static if (is(T==void)) t();
      else static if (is(T==void delegate())) t()();
      else static assert(false, T.stringof);
    ";

Of course, I'd still rather have trailing DGs or full infix support. ^^

--downs
Apr 07 2008
Bill Baxter wrote:
> Russell Lewis wrote:
>> PROPOSAL: A way to handle sequences of expressions which otherwise would have been syntax errors
>>
>> EXAMPLE CODE:
>>
>>     my_for(i=0, i<10, i++) { <code> }
>
> Are you familiar with the "trailing delegates" proposal? Basically the idea there is that any {<code>} block following a function call would be treated as an extra argument to the function. So if you write the function:
>
>     void my_for(lazy void init, lazy bool test, lazy void inc, void delegate()) { ... }
>
> then your EXAMPLE_CODE above would call that function.

Yes, I am familiar with the concept. My proposal is a generalization of that which is able to handle any type of expression, and also to handle multiple expressions.

OPEN QUESTION: What happens if an opSeq-type struct is *not* followed by anything? Do we need syntax to indicate whether that is legal or not?

You asked how opSeq is better than trailing delegates, so here are some more examples of things that opSeq can do:

1) Bare statements. Take a look at my implementation of the MyFor struct from the original post. One of the overloads of opSeq takes "lazy void block", which means that this syntax is also legal:

    my_for(i=0, i<10, i++) a = a+1;

2) Suffixes. People have suggested that the expression 3 + 2i be something that can be implemented entirely as a library. If i was a variable and we supported "opSeqRev", then it would be easy!

3) Multiple arguments. Trailing delegates can't implement complex syntaxes, such as do...while. opSeq can. At the bottom of this post, I'll post code that will handle all of the following:

    MyWhile(a != b) <bare statement>;
    MyWhile(a != b) { <block> }
    MyDo <bare statement> MyWhile(a != b);
    MyDo { <block> } MyWhile(a != b);

4) Generalized syntax. The examples above indicate to me that a lot of D's syntax could be implemented in a library using opSeq. Would that allow many of D's constructs to be first class entities? Might that allow us to implement more functional-language type features?

Here's the example code I promised:

BEGIN CODE

    struct While
    {
        bool delegate() cond;

        void opSeq(lazy void bareStatement) { opSeq({ bareStatement(); }); }
        void opSeq(void delegate() block)
        {
            if(cond())
            {
              BEGIN_LOOP: // so I don't have to use D's while!
                block();
                if(cond())
                    goto BEGIN_LOOP;
            }
        }
    }

    While MyWhile(lazy bool cond)
    {
        While ret;
        ret.cond = cond;
        return ret;
    }

    struct Do
    {
        void opSeq(lazy void bareStatement, While the_while)
        {
            opSeq({ bareStatement(); }, the_while);
        }
        void opSeq(void delegate() block, While the_while)
        {
            block();
            the_while block;
        }
    }

    // this isn't a function, it's a variable.  that's because
    // the use of MyDo doesn't use parens.
    Do MyDo;

END CODE
Apr 08 2008
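The two-operand case in point 3 is the one trailing delegates cannot express, so it may help to see what the handler call would look like once written out. A minimal sketch, assuming opSeq is an ordinary method, explicit delegates instead of lazy parameters, and D's built-in while inside the helper for brevity; myWhile and theWhile are hypothetical names.

```d
struct While
{
    bool delegate() cond;

    void opSeq(void delegate() block)
    {
        while (cond())
            block();
    }
}

struct Do
{
    // Two operands: the body, then the trailing While handler.
    void opSeq(void delegate() block, While theWhile)
    {
        block();               // a do-loop body always runs once
        theWhile.opSeq(block); // afterwards it behaves like a plain while
    }
}

// Hypothetical helper standing in for MyWhile(lazy bool cond).
While myWhile(bool delegate() cond)
{
    return While(cond);
}

unittest
{
    int n = 10, runs;
    // Proposed spelling:  MyDo { runs++; n++; } MyWhile(n < 10);
    Do().opSeq({ runs++; n++; }, myWhile(() => n < 10));
    assert(runs == 1); // the condition was false from the start
}
```

The proposal's right-associative rule is what would turn `MyDo { ... } MyWhile(...)` into this single two-argument call.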
Ok. Good examples. Here's another that I suppose would be possible:

5) Cast-like syntaxes. For instance the to! template in Phobos 2.x and Tango acts like a cast more or less, but you have to parenthesize the argument. Currently:

    int x = 5;
    string y = to!(string)(x); // ok!
    string z = to!(string) x;  // error!

But with your opSeq, I think the latter could be made legal, too. IIUC. I mention this because I keep forgetting to put those parentheses around to!'s argument because it just feels so darn much like a cast.

It's an interesting idea. Are you sure it doesn't kill the-ease-of-parsing requirement for the grammar?

--bb
Apr 08 2008
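For comparison with the to! example above, here is a minimal sketch of the spellings that present-day Phobos accepts. It assumes a current D compiler with uniform function call syntax, which is not something the thread discusses; only std.conv.to is used.

```d
import std.conv : to;

unittest
{
    int x = 5;

    string y = to!(string)(x); // the parenthesized form from the post
    string z = x.to!string;    // member-call spelling, no parentheses around x

    assert(y == "5");
    assert(z == "5");
}
```

Neither spelling is the juxtaposition form `to!(string) x` that opSeq would enable; the second one only removes the parentheses around the argument.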
Bill Baxter wrote:
> It's an interesting idea. Are you sure it doesn't kill the-ease-of-parsing requirement for the grammar?

That is something that I have worried about, as well, and I haven't done a rock-solid analysis of it. However, my hand-waving argument is that we parse the code without any knowledge of the types (we don't know which are opSeq handlers and which are not). If our parsing shows us that we have a sequence of expressions without any sort of operator between them, then we interpret that using the opSeq parse rule:

    expression: expression expression ...

Then, in semantic analysis, we would decide whether that syntax is valid or not. Since opSeq is right-associative, we start at the far-right of any chain of expressions, and see if the next-to-last expression is an opSeq handler; if so, it must take 1 argument, and the type must match the rightmost expression. If not, then we work left, and so on.

Mechanically, I think I can argue that this doesn't make the parser any more complex. What I don't know for sure, yet, is whether it introduces ambiguities into the grammar. Those often require a tool to find. :(

Russ
Apr 09 2008
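The right-to-left pass described above can be modeled on a flat list of already-parsed operands. This is only a sketch under a strong assumption: each handler's operand count is known up front as a number, whereas the proposal would decide it from opSeq's parameter types during semantic analysis. Item, arity and resolve are hypothetical names, and the output is just a string showing the grouping.

```d
import std.array : join;
import std.exception : enforce;

// Hypothetical model: arity is how many operands a handler's opSeq takes;
// 0 marks a plain expression.
struct Item
{
    string text;
    size_t arity;
}

string resolve(Item[] seq)
{
    enforce(seq.length > 0, "empty sequence");
    string[] pending; // operands resolved so far, nearest one last

    foreach_reverse (item; seq)
    {
        if (item.arity == 0)
        {
            pending ~= item.text;
            continue;
        }
        enforce(pending.length >= item.arity,
                "too few expressions after " ~ item.text);
        string[] args;
        foreach (k; 0 .. item.arity)
        {
            args ~= pending[$ - 1];
            pending = pending[0 .. $ - 1];
        }
        pending ~= item.text ~ "(" ~ args.join(", ") ~ ")";
    }

    // Anything left over has to be taken by a handler returned from the
    // leftmost call, which is the chaining case.
    enforce(pending.length == 1 || seq[0].arity > 0,
            "leftover expressions and no handler to take them");
    string result = pending[$ - 1];
    foreach_reverse (rest; pending[0 .. $ - 1])
        result ~= "(" ~ rest ~ ")";
    return result;
}

unittest
{
    auto two = [Item("handler1", 2), Item("expressionA"),
                Item("handler2", 1), Item("expressionB")];
    assert(resolve(two) == "handler1(expressionA, handler2(expressionB))");

    auto one = two.dup;
    one[0].arity = 1;
    assert(resolve(one) == "handler1(expressionA)(handler2(expressionB))");
}
```

Both assertions reproduce the handler1/handler2 groupings given in the original proposal. The model says nothing about ambiguity in the grammar itself, since it starts from operands that have already been separated.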
Well, one ambiguity is stuff like: "foo 1 -2". Is this foo.opSeq(1 - 2) (i.e. foo.opSeq(-1)) or foo.opSeq(1, -2)?

Ditto for '~' (concatenation versus bitwise negation), '&' (bitwise-and versus address-of), '!' (template instantiation versus logical negation), '.' ("member of" versus "look up in the global scope"), '+' (addition versus numeric identity function), '*' (multiplication versus dereferencing).

If you only allow _unexpected_ expressions, as you suggest, that would mean always choosing the first alternative above. That would mean you'd have to disambiguate the unary versions of those operators by placing them in parentheses: "foo 1 (-2)" instead of the initial example.

But that leaves another ambiguity: what about "foo x (-2)"? That would translate to foo.opSeq(x(-2)). I don't think this one can be resolved, even placing parentheses around x doesn't work. For example, if x is a delegate, the expression would mean the same thing with or without parentheses around it, so there would be no way to call Foo.opSeq(void delegate(int), int) except explicitly.

Besides, if you're going to place parentheses around all the operands you might as well overload opCall and be done with it, without any syntax extensions or added ambiguity at all.
Apr 09 2008
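The opCall alternative mentioned at the end of the post above already works, and it makes the two readings of "foo 1 -2" explicit because the comma decides. A minimal sketch, assuming a hypothetical Foo that simply records the operands it receives.

```d
struct Foo
{
    int[] seen;

    // One typesafe-variadic opCall accepts any number of int operands.
    void opCall(int[] args...)
    {
        seen = args.dup;
    }
}

unittest
{
    Foo foo;

    foo(1 - 2);  // binary minus: a single operand
    assert(foo.seen == [-1]);

    foo(1, -2);  // unary minus: two operands
    assert(foo.seen == [1, -2]);
}
```

With juxtaposition there is no comma to tell these apart, which is the ambiguity the post describes.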
Good points. I'll ponder 'em.
Apr 09 2008