digitalmars.D - Generalized switch statement (a la pattern matching)
- =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= (58/58) Oct 27 2011 Hi,
- Norbert Nemec (6/64) Oct 27 2011 Where is the fundamental advantage compared to the following?
- bearophile (7/13) Oct 28 2011 The advantages are visible even with the D final switch, but are more ge...
Hi, Many functional languages have what is called pattern matching. That is, matching a value against a set of expressions, testing for exact equality. let x = fooOrBarOrBaz() match x with | "foo" -> printf "got foo" | "bar" -> printf "got bar" | "baz" -> printf "got baz" It's dead simple, and works more or less exactly like a switch: Walk through each case until a value matching x is found. Then, execute that branch. There are three major differences, however: 1) If no match could be found, an exception (in D's case, this would be an Error-derived type) is thrown. 2) A 'default' branch is specified by matching against a plain variable (typically _ if unused). This always matches, since it's just a plain binding of the matched-against value to another variable. 3) The cases in the pattern match can be *any* expression. They don't have to be compile-time constants. This is extremely flexible, and some would argue that it is what the switch statement always should have been. Of course, the values in each case have to be compatible or implicitly convertible to the type of the value being matched against. It is worth noting that pattern matching does not ruin compiler optimization of switches. It merely takes some extra effort to determine that all cases are constant. Furthermore, point (2) naturally leads to the question: What happens if you match a value against an already-bound variable? In functional languages, what happens is typically shadowing (i.e. the already-bound variable becomes 'hidden'). In D, I think it would make more sense to match against the variable's value, since D doesn't have shadowing anywhere else AFAIK. Additionally, binding the value to a new variable in the 'default' case might not make sense in D. When you hit the 'default:' label, you already have the switched-upon value in scope anyway. Another trait of pattern matching in functional languages is that a pattern match is usually an expression. This means that whatever value is returned from the matched case (if any) is the result of the pattern match expression. I don't think that this would 'fit in' in D. Then we'd have to make if-then-else, while, for, foreach, do-while, etc expressions too, for consistency, which doesn't really seem at home in an imperative language. Furthermore, there is the issue of introducing a new keyword. I don't think this is a good idea, especially not a common word like "match" or something along those lines (and __match would just be outright ugly). So, I propose the following changes: 1) The switch statement should be generalized to allow any expression in cases. If the expression is an existing variable, the switched-upon value will be matched against the value of that variable. If the expression is an unbound variable, it is a compile-time error. 2) If no case is matched, and no 'default' case is present in the switch, an Error (say, SwitchCaseError or whatever) should be raised. I don't think this is a bad idea, since we're already moving towards deprecating lack of 'default' in non-final switches. Point (1) is the most important one here. (2) is not crucial for good pattern matching capabilities, and is more of an aid in debugging. What do you folks think? Would this have a place in D? - Alex
Oct 27 2011
Where is the fundamental advantage compared to the following?
if(x == "foo") writeln("got foo")
else if(x == "foo") writeln("got foo")
else if(x == "foo") writeln("got foo")
else assert(false);
On 27.10.2011 14:05, Alex Rønne Petersen wrote:
Hi,
Many functional languages have what is called pattern matching. That is,
matching a value against a set of expressions, testing for exact equality.
let x = fooOrBarOrBaz()
match x with
| "foo" -> printf "got foo"
| "bar" -> printf "got bar"
| "baz" -> printf "got baz"
It's dead simple, and works more or less exactly like a switch: Walk
through each case until a value matching x is found. Then, execute that
branch.
There are three major differences, however:
1) If no match could be found, an exception (in D's case, this would be
an Error-derived type) is thrown.
2) A 'default' branch is specified by matching against a plain variable
(typically _ if unused). This always matches, since it's just a plain
binding of the matched-against value to another variable.
3) The cases in the pattern match can be *any* expression. They don't
have to be compile-time constants. This is extremely flexible, and some
would argue that it is what the switch statement always should have
been. Of course, the values in each case have to be compatible or
implicitly convertible to the type of the value being matched against.
It is worth noting that pattern matching does not ruin compiler
optimization of switches. It merely takes some extra effort to determine
that all cases are constant.
Furthermore, point (2) naturally leads to the question: What happens if
you match a value against an already-bound variable? In functional
languages, what happens is typically shadowing (i.e. the already-bound
variable becomes 'hidden'). In D, I think it would make more sense to
match against the variable's value, since D doesn't have shadowing
anywhere else AFAIK.
Additionally, binding the value to a new variable in the 'default' case
might not make sense in D. When you hit the 'default:' label, you
already have the switched-upon value in scope anyway.
Another trait of pattern matching in functional languages is that a
pattern match is usually an expression. This means that whatever value
is returned from the matched case (if any) is the result of the pattern
match expression. I don't think that this would 'fit in' in D. Then we'd
have to make if-then-else, while, for, foreach, do-while, etc
expressions too, for consistency, which doesn't really seem at home in
an imperative language.
Furthermore, there is the issue of introducing a new keyword. I don't
think this is a good idea, especially not a common word like "match" or
something along those lines (and __match would just be outright ugly).
So, I propose the following changes:
1) The switch statement should be generalized to allow any expression in
cases. If the expression is an existing variable, the switched-upon
value will be matched against the value of that variable. If the
expression is an unbound variable, it is a compile-time error.
2) If no case is matched, and no 'default' case is present in the
switch, an Error (say, SwitchCaseError or whatever) should be raised. I
don't think this is a bad idea, since we're already moving towards
deprecating lack of 'default' in non-final switches.
Point (1) is the most important one here. (2) is not crucial for good
pattern matching capabilities, and is more of an aid in debugging.
What do you folks think? Would this have a place in D?
- Alex
Oct 27 2011
Norbert Nemec:
Where is the fundamental advantage compared to the following?
if(x == "foo") writeln("got foo")
else if(x == "foo") writeln("got foo")
else if(x == "foo") writeln("got foo")
else assert(false);
The advantages are visible even with the D final switch, but are more
generalized. The various cases are formatted nicely with less syntax noise,
this allows you to see them better. And the compiler gives an error if you
duplicate cases, or if some cases are missing.
This is also useful if you modify the data structures themselves (adding or
removing cases); the compiler gives errors for all the patterns you have to fix
after the change. This helps write correct code both the first time, and later
during code updates.
Regarding pattern matching in D, I'd like a "just" improvement of D switches:
http://d.puremagic.com/issues/show_bug.cgi?id=596
Bye,
bearophile
Oct 28 2011








bearophile <bearophileHUGS lycos.com>