digitalmars.D - Generalized switch statement (a la pattern matching)
- =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= (58/58) Oct 27 2011 Hi,
- Norbert Nemec (6/64) Oct 27 2011 Where is the fundamental advantage compared to the following?
- bearophile (7/13) Oct 28 2011 The advantages are visible even with the D final switch, but are more ge...
Hi, Many functional languages have what is called pattern matching. That is, matching a value against a set of expressions, testing for exact equality. let x = fooOrBarOrBaz() match x with | "foo" -> printf "got foo" | "bar" -> printf "got bar" | "baz" -> printf "got baz" It's dead simple, and works more or less exactly like a switch: Walk through each case until a value matching x is found. Then, execute that branch. There are three major differences, however: 1) If no match could be found, an exception (in D's case, this would be an Error-derived type) is thrown. 2) A 'default' branch is specified by matching against a plain variable (typically _ if unused). This always matches, since it's just a plain binding of the matched-against value to another variable. 3) The cases in the pattern match can be *any* expression. They don't have to be compile-time constants. This is extremely flexible, and some would argue that it is what the switch statement always should have been. Of course, the values in each case have to be compatible or implicitly convertible to the type of the value being matched against. It is worth noting that pattern matching does not ruin compiler optimization of switches. It merely takes some extra effort to determine that all cases are constant. Furthermore, point (2) naturally leads to the question: What happens if you match a value against an already-bound variable? In functional languages, what happens is typically shadowing (i.e. the already-bound variable becomes 'hidden'). In D, I think it would make more sense to match against the variable's value, since D doesn't have shadowing anywhere else AFAIK. Additionally, binding the value to a new variable in the 'default' case might not make sense in D. When you hit the 'default:' label, you already have the switched-upon value in scope anyway. Another trait of pattern matching in functional languages is that a pattern match is usually an expression. This means that whatever value is returned from the matched case (if any) is the result of the pattern match expression. I don't think that this would 'fit in' in D. Then we'd have to make if-then-else, while, for, foreach, do-while, etc expressions too, for consistency, which doesn't really seem at home in an imperative language. Furthermore, there is the issue of introducing a new keyword. I don't think this is a good idea, especially not a common word like "match" or something along those lines (and __match would just be outright ugly). So, I propose the following changes: 1) The switch statement should be generalized to allow any expression in cases. If the expression is an existing variable, the switched-upon value will be matched against the value of that variable. If the expression is an unbound variable, it is a compile-time error. 2) If no case is matched, and no 'default' case is present in the switch, an Error (say, SwitchCaseError or whatever) should be raised. I don't think this is a bad idea, since we're already moving towards deprecating lack of 'default' in non-final switches. Point (1) is the most important one here. (2) is not crucial for good pattern matching capabilities, and is more of an aid in debugging. What do you folks think? Would this have a place in D? - Alex
Oct 27 2011
Where is the fundamental advantage compared to the following? if(x == "foo") writeln("got foo") else if(x == "foo") writeln("got foo") else if(x == "foo") writeln("got foo") else assert(false); On 27.10.2011 14:05, Alex Rønne Petersen wrote:Hi, Many functional languages have what is called pattern matching. That is, matching a value against a set of expressions, testing for exact equality. let x = fooOrBarOrBaz() match x with | "foo" -> printf "got foo" | "bar" -> printf "got bar" | "baz" -> printf "got baz" It's dead simple, and works more or less exactly like a switch: Walk through each case until a value matching x is found. Then, execute that branch. There are three major differences, however: 1) If no match could be found, an exception (in D's case, this would be an Error-derived type) is thrown. 2) A 'default' branch is specified by matching against a plain variable (typically _ if unused). This always matches, since it's just a plain binding of the matched-against value to another variable. 3) The cases in the pattern match can be *any* expression. They don't have to be compile-time constants. This is extremely flexible, and some would argue that it is what the switch statement always should have been. Of course, the values in each case have to be compatible or implicitly convertible to the type of the value being matched against. It is worth noting that pattern matching does not ruin compiler optimization of switches. It merely takes some extra effort to determine that all cases are constant. Furthermore, point (2) naturally leads to the question: What happens if you match a value against an already-bound variable? In functional languages, what happens is typically shadowing (i.e. the already-bound variable becomes 'hidden'). In D, I think it would make more sense to match against the variable's value, since D doesn't have shadowing anywhere else AFAIK. Additionally, binding the value to a new variable in the 'default' case might not make sense in D. When you hit the 'default:' label, you already have the switched-upon value in scope anyway. Another trait of pattern matching in functional languages is that a pattern match is usually an expression. This means that whatever value is returned from the matched case (if any) is the result of the pattern match expression. I don't think that this would 'fit in' in D. Then we'd have to make if-then-else, while, for, foreach, do-while, etc expressions too, for consistency, which doesn't really seem at home in an imperative language. Furthermore, there is the issue of introducing a new keyword. I don't think this is a good idea, especially not a common word like "match" or something along those lines (and __match would just be outright ugly). So, I propose the following changes: 1) The switch statement should be generalized to allow any expression in cases. If the expression is an existing variable, the switched-upon value will be matched against the value of that variable. If the expression is an unbound variable, it is a compile-time error. 2) If no case is matched, and no 'default' case is present in the switch, an Error (say, SwitchCaseError or whatever) should be raised. I don't think this is a bad idea, since we're already moving towards deprecating lack of 'default' in non-final switches. Point (1) is the most important one here. (2) is not crucial for good pattern matching capabilities, and is more of an aid in debugging. What do you folks think? Would this have a place in D? - Alex
Oct 27 2011
Norbert Nemec:Where is the fundamental advantage compared to the following? if(x == "foo") writeln("got foo") else if(x == "foo") writeln("got foo") else if(x == "foo") writeln("got foo") else assert(false);The advantages are visible even with the D final switch, but are more generalized. The various cases are formatted nicely with less syntax noise, this allows you to see them better. And the compiler gives an error if you duplicate cases, or if some cases are missing. This is also useful if you modify the data structures themselves (adding or removing cases); the compiler gives errors for all the patterns you have to fix after the change. This helps write correct code both the first time, and later during code updates. Regarding pattern matching in D, I'd like a "just" improvement of D switches: http://d.puremagic.com/issues/show_bug.cgi?id=596 Bye, bearophile
Oct 28 2011