D - Ideas for language
- Marko Tintor marko pkj.co.yu (49/49) Mar 01 2003 1) shorter relational expressions
- Ilya Minkov (43/92) Mar 01 2003 Hm. Requires some thought on whether that's implementable.
- Antti Sykari (101/152) Mar 01 2003 Many of the suggestions below have been given earlier.
- Achillefs Margaritis (15/170) Mar 02 2003 Ada has "smart unions" named "discriminants". In the declaration of th...
- Ilya Minkov (75/192) Mar 02 2003 And it has been implemented in Python because of "being pretty obvious"....
- Daniel Yokomiso (62/254) Mar 03 2003 Hi,
- Ilya Minkov (18/33) Mar 03 2003 It's not all that clear to me why. You have to scan every function, so
- Bill Cox (38/45) Mar 03 2003 I can help you there. Breaking loops in a directed graph requires only
- Daniel Yokomiso (12/45) Mar 03 2003 As I said "if the compiler CAN'T annotate the object code with purity
- Mike Wynn (3/9) Mar 03 2003 look at LUA (www.lua.org) the only thing that is require is a change in ...
1) shorter relational expressions

It is easier to write

    a <= b < c == d

instead of

    a <= b && b < c && c == d

+ b and c are evaluated only once

2) multiple assignment

    a,b,c = A,B,C;

is executed like this: A,B,C is evaluated from left to right, then the values are assigned to a,b,c from left to right (a=A, b=B, c=C)

3) ^ power, =>, <= and <=> logical operators

    8^4      ...  8*8*8*8
    a => b   ...  !a || b
    a <= b   ...  b => a
    a <=> b  ...  (a => b) && (a <= b)

4) optimization idea

If function F has no side effects and its arguments are constants, it can be computed at compile time.

5) better switch

old switch:

    switch(exp0) {
        case constexpr1: command1
        case constexpr2: command2
        case constexpr3: command3
        default: command4
    }

is expanded:

    if(exp0 == constexpr1) goto label1;
    if(exp0 == constexpr2) goto label2;
    if(exp0 == constexpr3) goto label3;
    goto label4;
    label1: command1;
    label2: command2;
    label3: command3;
    label4: command4;

better switch:

    switch(exp0)
        case(exp1) command1
        case(exp2) command2
        case(exp3) command3
        else command4

is expanded:

    if(exp0 == exp1) command1
    else if(exp0 == exp2) command2
    else if(exp0 == exp3) command3
    else command4
Mar 01 2003
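As an illustration of point 1 only: Python already accepts chained comparisons, so the "evaluated only once" claim can be observed there. The sketch below is not D and says nothing about how a D compiler would implement the feature; the helper `traced` and the list `calls` are hypothetical names introduced just to record which operands get evaluated.

```python
# Records the order in which operands are evaluated.
calls = []

def traced(name, value):
    """Return value, noting that the operand called `name` was evaluated."""
    calls.append(name)
    return value

def chained(a, b, c, d):
    # Chained form: each operand is evaluated at most once, left to right,
    # and evaluation stops at the first comparison that fails.
    return traced("a", a) <= traced("b", b) < traced("c", c) == traced("d", d)

def expanded(a, b, c, d):
    # Hand-written && form: the shared operands b and c are evaluated twice.
    return (traced("a", a) <= traced("b", b)
            and traced("b", b) < traced("c", c)
            and traced("c", c) == traced("d", d))
```

When the operands are function calls with side effects, the two forms are therefore not interchangeable, which is the point of the "+ b and c are evaluated only once" remark.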
Welcome stranger!

Marko wrote:
> 1) shorter relational expressions
> it is easier to write a <= b < c == d instead of a <= b && b < c && c == d
> + b and c are evaluated only once

Hm. Requires some thought on whether that's implementable.

    RelationalExp -> [Exp RelOp]* Exp

Hm.

> 2) multiple assignment
> a,b,c = A,B,C; is executed like this: A,B,C is evaluated from left to right,
> then values are assigned to a,b,c from left to right (a=A, b=B, c=C)

We've gone through such things a couple of times. The question is: can you (or anyone) find examples which would make this feature useful?

> 3) ^ power, =>, <= and <=> logical operators
> 8^4 ... 8*8*8*8
> a => b ... !a || b
> a <= b ... b => a
> a <=> b ... (a => b) && (a <= b)

Yes. However, ^ is taken by XOR. How about ** ? I also wanted division int/int to give a float, and to have a separate integer division operator. This would decimate the number of stupid numeric bugs, and would not lead to many new bugs, because a compiler would warn about a type mismatch.

> 4) optimization idea
> if function F has no side effect and its arguments are constants it can be
> computed at compile time

Many, many people have had this good idea. :) This would also give a number of other interesting things. Example: regexps could be saved in a program in compiled form, instead of being translated at run time. Though it doesn't buy much with regexps, it would with larger interpreted sub-languages.

I guess there's some problem checking the purity of functions though. It would mean a recursive analysis of (almost) the whole program. A function would qualify if it accesses only constant globals, if any, and calls only functions that also qualify, if any. Can a recursive set of functions be qualified?

> 5) better switch
> old switch:
>     switch(exp0) { case constexpr1: command1 case constexpr2: command2
>     case constexpr3: command3 default: command4 }
> is expanded:
>     if(exp0 == constexpr1) goto label1; if(exp0 == constexpr2) goto label2;
>     if(exp0 == constexpr3) goto label3; goto label4;
>     label1: command1; label2: command2; label3: command3; label4: command4;
> better switch:
>     switch(exp0) case(exp1) command1 case(exp2) command2 case(exp3) command3
>     else command4
> is expanded:
>     if(exp0 == exp1) command1 else if(exp0 == exp2) command2
>     else if(exp0 == exp3) command3 else command4

One thing: a switch is NEVER CONVERTED TO IF's!!! It is a means to create a jump table: "take an input value, do some math on it which yields an index, make a table lookup, jump to the address noted in the table". That's why switch is so darned efficient - only a few CPU cycles!!! And you see why it has such semantics in C.

What you mean here is that a "break" is implicit. Walter doesn't want it. He's not exactly a young person, and i guess he suspects that if he does it, he'd be having bugs because it doesn't work the "good old C way". And many other programmers as well. He'd rather give you a better compiler which will tell you whenever you're missing a "break". I guess that'd also be his argument about division. A compiler cannot warn you about division, though.

There's another problem with it: how do you check for ranges in "switch"? Now you can write "case a: case b: case c"; otherwise you would have to think up a new syntax for it. Which could be like array slicing, or inclusive... Which would anyway lead programmers to make huge slices and bloat the jump table, which is not a good idea at all.

One thing i find very important to implement is a "smart union", which knows its current state. It would fix the unsafety of the normal union and make serialisation possible.

-i.
Mar 01 2003
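The jump-table description in the post above ("do some math, look up a table, jump") can be sketched roughly as follows. This is only a model in Python: a list of functions stands in for a table of code addresses, and it makes no claim about what any real compiler emits or how many cycles it takes. The case values 10..12, the constant `BASE` and the `command*` functions are assumptions made up for the example.

```python
def command1(): return "command1"
def command2(): return "command2"
def command3(): return "command3"
def command4(): return "command4"   # plays the role of "default"

BASE = 10                            # smallest case value (assumed: cases 10, 11, 12)
JUMP_TABLE = [command1, command2, command3]

def switch_table(value):
    """Table dispatch: one subtraction, one bounds check, one indexed lookup."""
    index = value - BASE             # the "math" that yields an index
    if 0 <= index < len(JUMP_TABLE):
        return JUMP_TABLE[index]()   # "jump to the address noted in the table"
    return command4()

def switch_if_chain(value):
    """The if/else expansion from the original post, for comparison."""
    if value == 10:
        return command1()
    elif value == 11:
        return command2()
    elif value == 12:
        return command3()
    else:
        return command4()
```

Both functions give the same answers; the table version does a fixed amount of work regardless of the number of cases, but only when the case values are dense enough for a table to make sense, which is where the worry about huge ranges bloating the table comes from.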
Many of the suggestions below have been given earlier. However, perhaps it's good to repeat why some of them are worth considering.

Ilya Minkov <midiclub 8ung.at> writes:
> Marko wrote:
>> 1) shorter relational expressions
>> it is easier to write a <= b < c == d instead of a <= b && b < c && c == d
>> + b and c are evaluated only once

First of all, I think that here is a useful feature, at least for the relational (<, <=, >=, >) expression part. "a < b < c" is semantically obvious to everyone. It's compact and more readable than "a < b && b < c", and causes fewer bugs: people do write things like "1 < a < 5" by accident, and get bitten. Most importantly, it's comfortable to use.

The only downside IMO is that there are some subtle issues to decide, and hence the feature requires some time to design and implement. Such as:

- Is the operation short-circuited? (Adding another short-circuited "operator" in the language -- but why not?)
- Are the operations evaluated only once? (Logically, yes, since they are written in the source code only once)
- How does this interact with operator overloading? (My guess would be "translate to a<b && b<c && ... first, then do overloading")
- Apply that to == and != (and === and !==), too? How about "a < b == x < y"? (I wouldn't)

Someone might also say that all extra features are bad, because they require time and effort to learn and teach and write about. For example, the old complaint that there are 4 ways to increment a variable in C. But this applies only to features which aren't obviously easy to use, and this one is. Also, I could envision it to be close to obvious to implement.

> Hm. Requires some thought on whether that's implementable.
> RelationalExp -> [Exp RelOp]* Exp.

Most probably everything is implementable :) Maybe not in a straightforward manner (like direct translation from a<b<c to a<b&&b<c -- that would duplicate the expressions, which is probably not desired), but implementable anyway. Probably a change in the back-end/intermediate representation of the compiler to take into account sequential CmpExps and do things like "eval the first and second; compare; if false, quit; eval the third; compare with the second; etc."

>> 2) multiple assignment
>> a,b,c = A,B,C; is executed like this: A,B,C is evaluated from left to right,
>> then values are assigned to a,b,c from left to right (a=A, b=B, c=C)

This also has subtle issues. Joe R. Newbie will try this (or, as swap is likely to be implemented as a library routine, something similar):

    void swap(inout int a, inout int b) { a, b = b, a; }

which translates to:

    a = b; b = a;

and leads to problems. Should this be the default behavior, or something which first assigns to temporary values?

Other things to consider:

- should functions return multiple values
- if f returns two values, what does "a, b = f()" mean?
- if f returns two values, what does "a = f()" mean?
- if f returns one value, what does "a, b = f(), y()" mean?
- do these features fit into the grammar seamlessly?

> We've gone through such things a couple of times. The question is: can you
> (or anyone) find examples which would make this feature useful.

With the presence of "out" parameters, I'm not sure that assigning multiple values is that useful. But hey, at least it looks cool :-)

>> 3) ^ power, =>, <= and <=> logical operators
>> 8^4 ... 8*8*8*8
>> a => b ... !a || b
>> a <= b ... b => a
>> a <=> b ... (a => b) && (a <= b)
> Yes. However ^ is taken by XOR. How about ** ?

If I were implementing a language from scratch, I'd probably use ^ as an exponentiation operator. ^ doesn't particularly look like "xor" if you haven't done much bit-level C programming. (Nor does | look like "or", but that's another issue altogether...) ** clashes with multiplication combined with pointer dereference.

<= also clashes with the "less-or-equal" operator. "<-", "->", and "<->" could be used if logical operators were required. (At least if the -> pointer syntax were to be demolished.) But still, there's the issue of:

    a<-b; // a <- b or a < -b ?

>> 4) optimization idea
>> if function F has no side effect and its arguments are constants it can be
>> computed at compile time
> Many, many people have had this good idea. :) This would also give a number
> of other interesting things. Example: regexp could be saved in a program in
> a compiled form, instead of translating them at run-time. Though it doesn't
> buy much with regexp, it would with larger interpreted sub-languages. I
> guess there's some problem checking purity of functions though. It would
> mean a need for recursive analysis of the (almost) whole program. All
> functions would qualify, which only access constant globals if any, and
> only functions also qualified if any. Can a recursive set of functions be
> qualified?

Having pure compile-time functions would be very neat, but effectively it would require a D compiler to be a D interpreter at the same time. Dunno about the purity checking. I suppose you will very soon find out the purity of a function when the compiler starts interpreting it and eventually tries to read a global variable, format your hard disk, send a network packet, or do something else "impure" ;)

An alternative would be to require a different kind of syntax for pure functions, effectively making them a different type of function. And all functions that are pure should be declared pure, or we have yet again the annoying problem I mentioned in the other post: we really know we have pure functions, but we can't use them because they are not declared pure.

[switch]

> One thing: a switch is NEVER CONVERTED TO IF's!!! It is a means to create a
> jump table: "take an input value, do some math on it which yields an index,
> make a table lookup, jump to the address noted in the table". That's why
> switch is so darned efficient - only a few CPU cycles!!! And you see why it
> has such semantics in C.

It just *might* be converted to if statements, though, and you'll never know unless you disassemble the object code ;)

> I guess that'd also be his argument about division. A compiler cannot warn
> you about division, though.

Sometimes it sounds like a nice idea to make all arithmetic operators like "div: Int x Int -> Int" for each integral type Int. Only it isn't so.

Case 1: current machine architectures have operations like "mul: Int32 x Int32 -> Int64" and "div: Int64 x Int32 -> Int32 x Int32" (both quotient and remainder computed);

Case 2: integer division actually produces a rational number, and probably the technically most feasible solution would be using floats.

But I guess that some C compatibility is in order sometimes. At least it might make the language more comfortable to some people.

> There's another problem with it: how do you check for ranges in "switch"?
> Now you can write "case a: case b: case c"; otherwise you would have to
> think up a new syntax for it. Which could be like array slicing, or
> inclusive... Which would anyway lead programmers to make huge slices and
> bloat the jump table, which is not a good idea at all.

Sounds like a good place for the syntax "case 1..5:"

> One thing i find very important to implement is a "smart union", which
> knows its current state. It would fix the unsafety of the normal union and
> make serialisation possible.

In C++, this can be achieved with metaprogramming ( http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Variant ), although it isn't very simple. An ideal language would provide the right abstractions for implementing things like a smart union, and then include it in the standard library. (I can dream, can't I...)

D gets better and better with time (by which I mean delegates and everything)... but there are still a couple of feature requests that keep coming up. Perhaps some kind of wiki, bugzilla or similar would be suitable for tracking feature requests and storing comments, possibly even voting for them.

-Antti
Mar 01 2003
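The swap problem raised in the post above comes down to whether "a, b = b, a" is expanded into sequential assignments or evaluated into temporaries first. The following Python sketch models both expansions explicitly. It deliberately avoids Python's own tuple assignment so that each step is visible, and the function names are made up for the illustration.

```python
def swap_sequential(a, b):
    """Literal expansion 'a = b; b = a': the old value of a is lost."""
    a = b
    b = a          # a was already overwritten, so b gets its own value back
    return a, b

def swap_with_temporaries(a, b):
    """Evaluate the whole right-hand side first, then assign left to right."""
    t0 = b         # right-hand side, first element
    t1 = a         # right-hand side, second element
    a = t0
    b = t1
    return a, b
```

With inputs 1 and 2 the sequential version yields (2, 2), while the version with temporaries yields (2, 1). The second reading matches the original proposal's wording that A,B,C is evaluated first and the values are assigned afterwards.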
Ada has "smart unions" named "discriminants". In the declaration of the object, the programmer can declare a record that is parameterised according to a passed value. For example:

    record variant(value:integer) is
        case value
            when 0 => param1:string;
            when 1 => param1:integer;
        end case;
    end record;

    stringvar: variant(0);
    intvar: variant(1);

"Antti Sykari" <jsykari gamma.hut.fi> wrote in message news:87el5q8srb.fsf hoastest1-8c.hoasnet.inet.fi...
>> One thing i find very important to implement is a "smart union", which
>> knows its current state. It would fix the unsafety of the normal union and
>> make serialisation possible.
> In C++, this can be achieved with metaprogramming
> ( http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Variant )
> although it isn't very simple. An ideal language would provide the right
> abstractions for implementing things like a smart union, and then include
> it in the standard library. (I can dream, can't I...)
Mar 02 2003
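For readers who do not know Ada, the idea of a union that "knows its current state" can be sketched as a tagged value whose tag is checked on construction. The Python below is only an analogy to the record above, not a translation of Ada semantics; the class `Variant`, the table `PAYLOAD_TYPES` and the function `describe` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union

# Which payload type each tag selects; mirrors the two "when" arms above.
PAYLOAD_TYPES = {0: str, 1: int}

@dataclass(frozen=True)
class Variant:
    tag: int                   # the discriminant: records the current state
    value: Union[str, int]     # the payload selected by the tag

    def __post_init__(self):
        expected = PAYLOAD_TYPES.get(self.tag)
        if expected is None:
            raise ValueError(f"unknown tag {self.tag}")
        if not isinstance(self.value, expected):
            raise TypeError(f"tag {self.tag} requires {expected.__name__}")

def describe(v):
    """Consumers branch on the tag instead of guessing the payload type."""
    if v.tag == 0:
        return "string:" + v.value
    return "integer:" + str(v.value)
```

Because the tag travels with the value, code that reads the union (including a serialiser) can always tell which member is live, which is the safety property asked for in the thread.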
Antti Sykari wrote:
> Many of the suggestions below have been given earlier. However, perhaps
> it's good to repeat why some of them are worth considering.
>
> First of all, I think that here is a useful feature. At least for the
> relational (<, <=, >=, >) expression part. "a < b < c" is semantically
> obvious to everyone. It's compact and more readable than "a < b && b < c",
> and causes fewer bugs: people do write things like "1 < a < 5" by accident,
> and get bitten. Most importantly, it's comfortable to use.

And it has been implemented in Python because of "being pretty obvious". Python is a language that allows you to do obvious things the obvious way. I'll write an article summarizing all the unusual decisions made in its design and post it here. D could learn something from it.

> The only downside IMO is that there are some subtle issues to decide and
> hence the feature requires some time to design and implement. Such as:
> - Is the operation short-circuited? (Adding another short-circuited
>   "operator" in the language -- but why not?)
> - Are the operations evaluated only once? (Logically, yes, since they are
>   written in the source code only once)

Evaluation order has always been undefined. I guess it also isn't stated how many times a function is called if it's in the same sequence group (C defines "sequence points", remember?). You could only safely use pure functions in expressions so far, and that's how it has to stay.

> - How does this interact with operator overloading? (My guess would be
>   "translate to a<b && b<c && ... first, then do overloading")
> - Apply that to == and != (and === and !==), too? How about
>   "a < b == x < y"? (I wouldn't)

Ouch.

> Someone might also say that all extra features are bad, because they
> require time and effort to learn and teach and write about. For example,
> the old complaint that there are 4 ways to increment a variable in C. But
> this applies only to features which aren't obviously easy to use, and this
> one is. Also I could envision it to be close to obvious to implement.

No, this point has not really been criticised, because it's usually obvious when you increment vars, one way or another. What has been criticised is that there are generally too many ways to do obvious things, making them unobvious at first sight. All almost equally bad.

>> 2) multiple assignment
>> a,b,c = A,B,C; is executed like this: A,B,C is evaluated from left to right,
>> then values are assigned to a,b,c from left to right (a=A, b=B, c=C)
> This also has subtle issues. Joe R. Newbie will try this (or as swap is
> likely to be implemented as a library routine, something similar):
>     void swap(inout int a, inout int b) { a, b = b, a; }
> which translates to:
>     a = b; b = a;
> and leads to problems. Should this be the default behavior or something
> which first assigns to temporary values?

No way should it work like that!!! Such a feature, if introduced, should work the *obvious* way. This feature has been known as tuples in other languages, and should work the same way.

> Other things to consider:

They have to mean obvious things. In general, Daniel Yokomiso has already given this topic some thought, and he has probably come up with some suitable solution. I'll have to take a look at it, or we could simply ask him. He develops his own impure functional language, which could supersede Haskell and OCaml. :)

> - should functions return multiple values

Yes. Tuples.

> - if f returns two values, what does "a, b = f()" mean?

The obvious.

> - if f returns two values, what does "a = f()" mean?

Error. Discarded values have already been a major plague in C. You forget the function call parentheses - and whoops! I guess you know that. If someone means to use one return value, then he should state that, like "a, null = f()"

> - if f returns one value, what does "a, b = f(), y()" mean?

Error.

> - do these features fit into the grammar seamlessly?

Dunno. There should be a way to make them fit into the grammar. I have not read the compiler sources yet, and i'm going to. And i might then do what Burton promised, but has not done so far: write documentation on them.

>> Yes. However ^ is taken by XOR. How about ** ?
> If I were implementing a language from scratch, I'd probably use ^ as an
> exponentiation operator. ^ doesn't particularly look like "xor" if you
> haven't done much bit-level C programming. (Nor does | look like "or" but
> that's another issue altogether...) ** clashes with multiplication
> combined with pointer dereference.

It's no problem. You can't write "x+++++y", you have to separate it with spaces - now, you could require that "* *" is mul and dereference, and "**" is power. I also proposed once that if you have a function with 2 parameters and one return value, it can be called like "y = a 'fun' b", which expands to "y = fun(a, b)". Some lexical means is probably required to recognise in-fix functions.

> <= also clashes with the "less-or-equal" operator.

I'm a fool, i've noticed that too late.

> "<-", "->", and "<->" could be used if logical operators were required.
> (At least if -> pointer syntax were to be demolished) But still, there's
> the issue of: a<-b; // a <- b or a < -b ?

It's not a problem, it's a decision question. It's solved the same way as so far.

> Having pure compile-time functions would be very neat, but effectively it
> would require a D compiler to be a D interpreter at the same time. Dunno
> about the purity checking. I suppose you will very soon find out the purity
> of a function when the compiler starts interpreting a function and
> eventually tries to read a global variable, format your hard disk, or send
> a network packet, or something else "impure" ;)

No, that's really not a problem. Do you know how a compiler's semantic analyser and the constant wrappers work? No, they are not exactly interpreters, but they are very close to that. I have mentioned the very clean criteria for the purity of a function; here i re-formulate them in a more straightforward form:

- if a function is external (ie no source is available for it), it is impure. Unless it's somewhere explicitly stated otherwise. (ie const qualifier in declaration?)
- the function body is scanned for variable accesses. If it accesses any global variables that are not constant, it is impure. Even if it only reads them, because some other function might have modified them, making the function yield inconsistent results.
- the function body is scanned for function calls. All these functions must also qualify as pure, else this one isn't.

> An alternative would be to require some different kind of syntax for pure
> functions, effectively making them a different type of function. And all
> functions that are pure should be declared pure, or we have yet again the
> annoying problem I mentioned in the other post: we really know we have pure
> functions, but we can't use them because they are not declared pure.

const qualifier on the return type? I can't imagine anything else it could mean.

> [switch]
> It just *might* be converted to if statements, though, and you'll never
> know unless you disassemble the object code ;)

Urgh, well, the initial math contains IFs, but the switch body generally doesn't. Though sure it might, it's not the basic idea of it. But when you allow for range syntax a..b, there would be a real need for a compiler to split these switches into several jump tables connected with IFs. Though this might be good, because it could make the source more terse and reduce the probability of a bug... BUT THEN it would also be a good idea to introduce a pascalese range/set type.

> Sometimes it sounds like a nice idea to make all arithmetic operators like
> "div: Int x Int -> Int" for each integral type Int. Only it isn't so.
> Case 1: current machine architectures have operations like
> "mul: Int32 x Int32 -> Int64" and "div: Int64 x Int32 -> Int32 x Int32"
> (both quotient and remainder computed);

Basically yes.

> Case 2: integer division actually produces a rational number, and probably
> the technically most feasible solution would be using floats.

ieek, i've been looking for a new word for "extended" and got the wrong one, i meant "real". "float" is very limited and should not be used for intermediates.

> But I guess that some C compatibility is in order sometimes. At least it
> might make the language more comfortable to some people.

"seem to be more comfortable" :) But for a C successor it's kind of vital.

>> One thing i find very important to implement is a "smart union", which
>> knows its current state. It would fix the unsafety of the normal union and
>> make serialisation possible.
> In C++, this can be achieved with metaprogramming
> ( http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Variant )
> although it isn't very simple. An ideal language would provide the right
> abstractions for implementing things like a smart union, and then include
> it in the standard library. (I can dream, can't I...)

Nope. Some languages don't include a union at all, but instead a smart union. Dynamic languages usually don't have stuff like that at all, since you can always change a variable's type and check types.

> D gets better and better with time (by which I mean delegates and
> everything)... but there are still a couple of feature requests that keep
> coming up. Perhaps some kind of wiki, bugzilla or similar would be suitable
> for tracking feature requests and storing comments, possibly even voting
> for them.

No one reads these. That's why we have this newsgroup :)

-i.
Mar 02 2003
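The three purity criteria listed in the post above can be turned into a small whole-program analysis. The sketch below is one possible reading, not anything a D compiler is known to do: it starts from the optimistic assumption that every function passing the first two criteria is pure, then repeatedly strikes out functions that call something impure until nothing changes. Under that assumption a mutually recursive set of functions stays pure as long as no member violates a criterion, which is one answer to the earlier question about recursion. `FunctionInfo`, its fields and `pure_functions` are hypothetical names for the illustration.

```python
from dataclasses import dataclass

@dataclass
class FunctionInfo:
    external: bool = False                 # no source available
    declared_pure: bool = False            # explicit promise for externals
    touches_mutable_globals: bool = False  # reads or writes a non-constant global
    calls: frozenset = frozenset()         # names of functions called in the body

def pure_functions(program):
    """Return the names in `program` (name -> FunctionInfo) judged pure."""
    # Criteria 1 and 2: externals are impure unless declared otherwise,
    # and any access to a non-constant global disqualifies a function.
    pure = set()
    for name, info in program.items():
        if info.external:
            if info.declared_pure:
                pure.add(name)
        elif not info.touches_mutable_globals:
            pure.add(name)

    # Criterion 3: every callee must be pure too. Iterate to a fixed point
    # so that impurity propagates up through chains of callers.
    changed = True
    while changed:
        changed = False
        for name in sorted(pure):
            info = program[name]
            if info.external:
                continue
            if any(callee not in pure for callee in info.calls):
                pure.discard(name)
                changed = True
    return pure
```

A callee that is not present in the program at all is treated as impure, on the same grounds as an external function without source.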
Hi, Comments embedded. In article <b3ub9r$shi$1 digitaldaemon.com>, Ilya Minkov says...Antti Sykari wrote:Hmmm, IMO we could just copy Icon ( http://www.cs.arizona.edu/icon/ ) generators, at least a piece of them, to implement this correctly, without leaving semantic problems. It works like this: if a < b == x < y then write ("Ok!") This evaluates left to right: 1 - if a < b it returns b, else it fails (failure is a generator thing). 2 - using the b value returned it compares to x. If they're equal, it returns x value, else it fails. 3 - using the x value returned it compares to y. If less then, it returns y, else fails. 4 - if a value was returned it continues to the then part. The nice thing about this is that using success/failure instead of true vs. false for relational expressions lead to better syntax and semantics. In Icon one can write: if y < (x | 5) then write("y=", y) and it'll work correctly, comparing y with x and with 5. Also more powerful stuff can be written: if (a | b | c) = (d | e | f) then write("Ok!") Of course this would lead to big changes in D, but it can do many obvious things possible (like the "y < (x | 5)" stuff). It also can be used to implement multiple return values with same type.Many of the suggestions below have been given earlier. However, perhaps it's good to repeat why some of them are worth considering. First of all, I think that here is a useful feature. At least for the relational (<, <=, >=, >) expression part. "a < b < c" is semantically obvious to everyone. It's compact and more readable than "a < b && b < c", and causes less bugs: people do write things like "1 < a < 5" in accident, and get bitten. Most importantly, it's comfortable to use.And it has been implemented in Python because of "being pretty obvious". Python is a language that allows to do obvious things the obvious way. I'll write an article, summarizing all unusual decisions made in its design and post it here. 
D could learn something from it.

The only downside IMO is that there are some subtle issues to decide and hence the feature requires some time to design and implement. Such as:

- Is the operation short-circuited? (Adding another short-circuited "operator" in the language -- but why not?)
- Are the operations evaluated only once? (Logically, yes, since they are written in the source code only once)

Evaluation order has always been undefined. I guess it also doesn't state how many times a function is called if it's in the same sequence group (C defines "sequence points", remember?). You could only safely use pure functions in expressions so far, and that's how it has to stay.

- How does this interact with operator overloading? (My guess would be "translate to a<b && b<c && ... first, then do overloading")
- Apply that to == and != (and === and !==), too? How about "a < b == x < y"? (I wouldn't)

Ouch.

I cheat ;-) It has some simple solutions for this, like letting the compiler create all the temporary variables and deal with any evaluation order problems, like "int x, y = i++, i++;". Eon has no side-effects in expressions, so it can get away with tuples. D has to deal with these problems. But I think that tuples are a nice thing to have, including tuple constructors (bind them together) and tuple deconstructors (tear them apart). Using an iterative Fibonacci solution:

    int fib(int n)
    in {
        assert(n > 0);
    }
    body {
        int a, b = 0, 1;
        for (int i = 0; i < n; i++) {
            a, b = b, a + b;
        }
        return a;
    }

tuples lead to cleaner syntax, without temp variables. This is toy code, but it's pretty :-)

Someone might also say that all extra features are bad, because they require time and effort to learn and teach and write about. For example, the old complaint that there are 4 ways to increment a variable in C. But this applies only to features which aren't obviously easy to use, and this one is.
Also I could envision it to be close to obvious to implement.

No, this point has not really been criticised, because it's usually obvious when you increment vars, one way or another. What has been criticised is that there are generally too many ways to do obvious things, making them unobvious at first sight. All almost equally bad.

No way should it work like that!!! Such a feature, if introduced, should work the *obvious* way. This feature has been known as tuples in other languages, and should work the same way.

This also has subtle issues. Joe R. Newbie will try this (or, as swap is likely to be implemented as a library routine, something similar):

    void swap(inout int a, inout int b) { a, b = b, a; }

which translates to:

    a = b; b = a;

and leads to problems. Should this be the default behavior or something which first assigns to temporary values?

2) multiple assignment a,b,c = A,B,C; is executed like this: A,B,C is evaluated from left to right then values are assigned to a,b,c from left to right (a=A, b=B, c=C)

Other things to consider:

They have to mean obvious things. In general, Daniel Yokomiso has already given some thought to this topic, and he has probably come up with some suitable solution. I'll have to take a look at it, or we could simply ask him. He develops his own impure functional language, which could supersede Haskell and OCaml. :)

Unless a is of "(int, int)" type.

- should functions return multiple values

Yes. Tuples.

- if f returns two values, what does "a, b = f()" mean?

The obvious.

- if f returns two values, what does "a = f()" mean?

Error. Discarded values have already been a major plague in C. You forget the function call parentheses - and whoops! I guess you know that. If someone means to use one return value, then he should state that, like "a, null = f()"

Depends on a and b types.

    int f();
    (int, int, int) y();
    int a;
    (int, int, int) b;
    a, b = f(), y();

should compile and run ok.
At least it's "pretty obvious".

- if f returns one value, what does "a, b = f(), y()" mean?

Error.

As a side note, this is similar to the Hindley-Milner type inference algorithm. Its complexity is big (IIRC it's greater than exponential space), so it may lead to longer compile times if the compiler can't annotate the object code with purity marks.

- do these features fit into the grammar seamlessly?

Dunno. There should be a way to make them fit into the grammar. I have not read the compiler sources yet, and I'm going to. And I might then do what Burton promised, but has not done so far: write documentation on them.

It's no problem. You can't write "x+++++y", you have to separate it with spaces - now, you could require that "* *" is mul and dereference, and "**" is power. I also proposed once that if you have a function with 2 parameters and one return value, it can be called like "y = a 'fun' b", which expands to "y = fun(a, b)". Some lexical means is probably required to recognise infix functions.

Yes. However ^ is taken by XOR. How about ** ?

If I were implementing a language from scratch, I'd probably use ^ as an exponentiation operator. ^ doesn't particularly look like "xor" if you haven't done much bit-level C programming. (Nor does | look like "or" but that's another issue altogether...) ** clashes with multiplication combined with pointer dereference.

<= also clashes with the "less-or-equal" operator.

I'm a fool, I noticed that too late.

"<-", "->", and "<->" could be used if logical operators were required. (At least if -> pointer syntax were to be demolished) But still, there's the issue of:

    a<-b; // a <- b or a < -b ?

It's not a problem, it's a decision question. It's solved the same way as so far.

Having pure compile-time functions would be very neat, but effectively it would require a D compiler to be a D interpreter at the same time. Dunno about the purity checking.
I suppose you will very soon find out the purity of a function when the compiler starts interpreting a function and eventually tries to read a global variable, format your hard disk, or send a network packet, or something else "impure" ;)

No, that's really not a problem. Do you know how a compiler's semantic analyser and the constant wrappers work? No, they are not exactly interpreters, but they are very close to that. I have mentioned the very clean criteria for the purity of a function; here I re-formulate them in a more straightforward form:

- if a function is external (i.e. no source is available for it), it is impure. Unless it's explicitly stated otherwise somewhere. (i.e. const qualifier in declaration?)
- the function body is scanned for variable accesses. If it accesses any global variables that are not constant, it is impure. Even if it only reads them, because some other function might have modified them, making the function yield inconsistent results.
- the function body is scanned for function calls. All these functions must also qualify as pure, else this one isn't.

Another nice case for tuples.

An alternative would be to require some different kind of syntax for pure functions, effectively making them a different type of function. And all functions that are pure should be declared pure, or we have yet again the annoying problem I mentioned in the other post: we really know we have pure functions, but we can't use them because they are not declared pure.

const qualifier on return type? I can't imagine anything else it could mean.

[switch] It just *might* be converted to if statements, though, and you'll never know unless you disassemble the object code ;)

Urgh, well, the initial math contains IFs, but the switch body generally doesn't. Though sure it might, it's not the basic idea of it. But when you allow for range syntax a..b, there would be a real need for a compiler to split these switches into several jump tables connected with IFs.
Though this might be good, because it could make the source more terse and reduce the probability of a bug... BUT THEN it would also be a good idea to introduce a Pascal-style range/set type.

Sometimes it sounds like a nice idea to make all arithmetic operators like "div: Int x Int -> Int" for each integral type Int. Only it isn't so. Case 1: current machine architectures have operations like "mul: Int32 x Int32 -> Int64" and "div: Int64 x Int32 -> Int32 x Int32" (both quotient and remainder computed);

Basically yes.

I second the wiki suggestion (I suggested that some time ago). At least we could keep related discussions together. Right now we keep having the same discussions about certain stuff.

Best regards,
Daniel Yokomiso.

"Beware of bugs in the above code; I have only proved it correct, not tried it."
- Donald Knuth (in a memo to Peter van Emde Boas)

Case 2: integer division actually produces a rational number, and probably the technically most feasible solution would be using floats.

Eek, I've been looking for a new word for "extended" and got the wrong one; I meant "real". "float" is very limited and should not be used for intermediates.

But I guess that some C compatibility is in order sometimes. At least it might make the language more comfortable to some people.

"seem to be more comfortable" :) But for a C successor it's kind of vital.

Nope. Some languages don't include a union at all, but instead a smart union. Dynamic languages usually don't have anything like that at all, since you can always change a variable's type and check types.

One thing I find very important to implement is a "smart union", which knows its current state. It would fix the unsafety of the normal union and make serialisation possible.

In C++, this can be achieved with metaprogramming ( http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Variant) although it isn't very simple.
An ideal language would provide the right abstractions for implementing things like a smart union, and then include it in the standard library. (I can dream, can't I...)

D gets better and better with time (by which I mean delegates and everything)... but there are still a couple of feature requests that keep coming up. Perhaps some kind of wiki, bugzilla or similar would be suitable for tracking feature requests and storing comments, possibly even voting for them.

No one reads these. That's why we have this newsgroup :)

-i.
Mar 03 2003
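The questions raised above about chained relational expressions (is the chain short-circuited, and is each operand evaluated only once?) can be checked against Python, which the thread cites as already having the feature. The sketch below is only an illustration of Python's behaviour, not a proposal for how D should do it; the helper names `traced`, `chained`, `expanded` and `run` are made up for this example.

```python
calls = []

def traced(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

def chained(a, b, c):
    # Built-in chaining: b is evaluated once, and c is never evaluated
    # when a < b is already false (short-circuit).
    return traced("a", a) < traced("b", b) < traced("c", c)

def expanded(a, b, c):
    # The hand-written expansion evaluates b twice.
    return traced("a", a) < traced("b", b) and traced("b", b) < traced("c", c)

def run(fn, *args):
    """Call fn and return (result, list of operands evaluated, in order)."""
    calls.clear()
    result = fn(*args)
    return result, list(calls)

print(run(chained, 1, 2, 3))
print(run(expanded, 1, 2, 3))
print(run(chained, 5, 2, 3))
```

So in Python the answer to both questions is yes: the chain behaves like the `&&` expansion, except that the shared middle operand is computed only once.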
Daniel Yokomiso wrote:

In article <b3ub9r$shi$1 digitaldaemon.com>, Ilya Minkov says...

It's not all that clear to me why. You have to scan every function, so that a compiled unit has all safe functions qualified as being safe. You just have to scan them in the correct order, which is identified by recursion. The only real problem would be to resolve mutually recursive functions. As long as there are no mutually recursive functions, or they are left "impure", I assume the time complexity would be near linear. Well, someone might one day come up with an additional pass of analysis for mutually recursive functions, and that would also resolve that problem... with an impact on complexity. I'm not sure, what complexity does it take to identify loops in a directed graph? (HUGE?) I assume that mutual recursion analysis can be limited to groups of only 2-3 functions, and that would already cover the usual cases. It doesn't have to be exhaustive you know, as opposed to type inference. BTW, doesn't OCaml somehow circumvent doing the complete Hindley-Milner analysis for types?

-i.

- if a function is external (i.e. no source is available for it), it is impure. Unless it's explicitly stated otherwise somewhere. (i.e. const qualifier in declaration?)
- the function body is scanned for variable accesses. If it accesses any global variables that are not constant, it is impure. Even if it only reads them, because some other function might have modified them, making the function yield inconsistent results.
- the function body is scanned for function calls. All these functions must also qualify as pure, else this one isn't.

As a side note, this is similar to the Hindley-Milner type inference algorithm. Its complexity is big (IIRC it's greater than exponential space), so it may lead to longer compile times if the compiler can't annotate the object code with purity marks.
Mar 03 2003
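The three purity criteria discussed above (external functions, non-constant global access, and calls to other functions) can be sketched as a memoized walk over the call graph. This is a minimal sketch under stated assumptions, not anything a D compiler does: the table layout and the field names `external`, `declared_pure`, `reads_globals` and `calls` are hypothetical, mutually recursive functions are conservatively left impure as suggested in the post, and direct self-recursion is allowed as an extra assumption of this sketch.

```python
def infer_purity(functions):
    """Map each function name to True (pure) or False (impure).

    `functions` maps a name to a dict with these hypothetical fields:
      external       -- no source is available
      declared_pure  -- explicitly stated pure (consulted only if external)
      reads_globals  -- body accesses a non-constant global
      calls          -- names of the functions called in the body
    """
    purity = {}
    in_progress = set()

    def visit(name):
        if name in purity:
            return purity[name]
        info = functions.get(name)
        if info is None or info.get("external", False):
            # Unknown or external: impure unless explicitly declared pure.
            result = bool(info and info.get("declared_pure", False))
        elif name in in_progress:
            # We came back to a function still being analysed:
            # mutual recursion, so stay conservative.
            return False
        elif info.get("reads_globals", False):
            result = False
        else:
            in_progress.add(name)
            # Direct self-calls are ignored; every other callee must be pure.
            callees = [c for c in info.get("calls", []) if c != name]
            result = all([visit(c) for c in callees])
            in_progress.discard(name)
        purity[name] = result
        return result

    for name in functions:
        visit(name)
    return purity

table = {
    "square":      {"calls": []},
    "sum_squares": {"calls": ["square"]},
    "log":         {"external": True},
    "noisy":       {"calls": ["square", "log"]},
    "counter":     {"reads_globals": True, "calls": []},
    "sin":         {"external": True, "declared_pure": True},
    "wave":        {"calls": ["sin"]},
    "even":        {"calls": ["odd"]},
    "odd":         {"calls": ["even"]},
    "fact":        {"calls": ["fact"]},
}
result = infer_purity(table)
print(result)
```

Each function is analysed once and each call edge is followed once, so the walk is linear in the size of the call graph, which matches the "near linear" guess in the post for the case where mutual recursion is simply given up on.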
Ilya Minkov wrote:

...

Well, someone might one day come up with an additional pass of analysis for mutually recursive functions, and that would also resolve that problem... with an impact on complexity. I'm not sure, what complexity does it take to identify loops in a directed graph? (HUGE?) I assume that mutual recursion analysis can be limited to groups of only 2-3 functions, and that would already cover the usual cases. It doesn't have to be exhaustive you know, as opposed to type inference.

I can help you there. Breaking loops in a directed graph requires only a couple of simple linear passes. Here's some pseudo code:

    breakLoops(Graph graph)
    {
        Node node;

        clearNodeFlags(graph);
        foreach(graph, node) {
            if(!node.visited) {
                breakLoopsFromNode(node);
            }
        }
    }

    // Reset both flags on every node before the search starts.
    clearNodeFlags(Graph graph)
    {
        Node node;

        foreach(graph, node) {
            node.visited = false;
            node.marked = false;
        }
    }

    // Depth-first search; "marked" means the node is on the current path.
    breakLoopsFromNode(Node node)
    {
        Node otherNode;
        Edge edge;

        node.visited = true;
        node.marked = true;
        foreach(node.outEdges, edge) {
            otherNode = edge.toNode;
            if(otherNode.marked) {
                edge.isLoopEdge = true; // Here's where you break loops
            } else if(!otherNode.visited) {
                breakLoopsFromNode(otherNode);
            }
        }
        node.marked = false;
    }

-- Bill
Mar 03 2003
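The pseudo code above can be turned into a small runnable version. The following is a sketch in Python, assuming the graph is given as a dictionary from node name to a list of successor names (that representation, and the names `find_loop_edges` and `call_graph`, are choices made for this example only). The `visited` and `on_path` sets play the roles of the `visited` and `marked` flags.

```python
def find_loop_edges(graph):
    """Return the edges that close a loop, found by one depth-first search.

    `graph` maps each node to a list of the nodes it points to.
    """
    visited = set()   # nodes whose search has started ("visited" flag)
    on_path = set()   # nodes on the current DFS path ("marked" flag)
    loop_edges = []

    def walk(node):
        visited.add(node)
        on_path.add(node)
        for target in graph.get(node, []):
            if target in on_path:
                # The edge points back into the current path: it closes a loop.
                loop_edges.append((node, target))
            elif target not in visited:
                walk(target)
        on_path.discard(node)

    for node in graph:
        if node not in visited:
            walk(node)
    return loop_edges

call_graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["d"]}
loop_edges = find_loop_edges(call_graph)
print(loop_edges)

# Dropping the reported edges leaves a graph without loops.
acyclic = {n: [t for t in ts if (n, t) not in loop_edges]
           for n, ts in call_graph.items()}
print(find_loop_edges(acyclic))
```

Every node is entered once and every edge is examined once, so the cost is linear in nodes plus edges. The recursive form is kept for clarity; a very deep graph would need an explicit stack instead, since Python limits recursion depth.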
"Ilya Minkov" <midiclub 8ung.at> wrote in message news:b40g4b$25gf$1 digitaldaemon.com...

Daniel Yokomiso wrote:

As I said, "if the compiler CAN'T annotate the object code with purity marks". If it annotates every module, then it's OK. There's still a problem with delegate parameters: a function may be pure if a delegate parameter is pure, but if the parameter isn't, then it can't be pure. This is the case for higher-order functions like map, fold, filter, etc.

In article <b3ub9r$shi$1 digitaldaemon.com>, Ilya Minkov says...

It's not all that clear to me why. You have to scan every function, so that a compiled unit has all safe functions qualified as being safe. You just have to scan them in the correct order, which is identified by recursion.

- if a function is external (i.e. no source is available for it), it is impure. Unless it's explicitly stated otherwise somewhere. (i.e. const qualifier in declaration?)
- the function body is scanned for variable accesses. If it accesses any global variables that are not constant, it is impure. Even if it only reads them, because some other function might have modified them, making the function yield inconsistent results.
- the function body is scanned for function calls. All these functions must also qualify as pure, else this one isn't.

As a side note, this is similar to the Hindley-Milner type inference algorithm. Its complexity is big (IIRC it's greater than exponential space), so it may lead to longer compile times if the compiler can't annotate the object code with purity marks.

The only real problem would be to resolve mutually recursive functions. As long as there are no mutually recursive functions, or they are left "impure", I assume the time complexity would be near linear. Well, someone might one day come up with an additional pass of analysis for mutually recursive functions, and that would also resolve that problem... with an impact on complexity. I'm not sure, what complexity does it take to identify loops in a directed graph? (HUGE?)
I assume that mutual recursion analysis can be limited to groups of only 2-3 functions, and that would already cover the usual cases. It doesn't have to be exhaustive you know, as opposed to type inference. BTW, doesn't OCaml somehow circumvent doing the complete Hindley-Milner analysis for types?

I don't know about this, but I guess they don't.

-i.
Mar 03 2003
Other things to consider:

- should functions return multiple values
- if f returns two values, what does "a, b = f()" mean?
- if f returns two values, what does "a = f()" mean?
- if f returns one value, what does "a, b = f(), y()" mean?
- do these features fit into the grammar seamlessly?

Look at Lua (www.lua.org): the only thing that is required is a change in the comma operator, and as order is not important, why have an operator that enforces order? :)
Mar 03 2003
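To make the multiple-assignment questions above concrete, here is how they play out in a language that already has tuples. This is a sketch in Python, given only as a point of comparison for the proposed D syntax; the function names are invented for the example. It contrasts the naive sequential translation of swap with the "evaluate the whole right side first" semantics, and repeats the iterative Fibonacci from earlier in the thread.

```python
def sequential_swap(a, b):
    # The naive translation "a = b; b = a;": the old value of a is lost.
    a = b
    b = a
    return a, b

def tuple_swap(a, b):
    # Tuple semantics: the right side is fully evaluated before any store.
    a, b = b, a
    return a, b

def fib(n):
    # Iterative Fibonacci without a temporary variable.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def div_mod(x, y):
    # A function returning two values, packed as a tuple.
    return x // y, x % y

quotient, remainder = div_mod(17, 5)   # "a, b = f()": unpack both values
both = div_mod(17, 5)                  # "a = f()": a holds the whole tuple
quotient_only, _ = div_mod(17, 5)      # explicit discard of one value

print(sequential_swap(1, 2), tuple_swap(1, 2), fib(10))
print(quotient, remainder, both, quotient_only)
```

Note that Python answers the "a = f()" question differently from the "Error" position taken in the thread: the single target simply receives the whole tuple, which corresponds to the "unless a is of (int, int) type" case.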