digitalmars.D - Pattern matching as a library

Simen Kjaeraas <simen.kjaras gmail.com> writes:
As I once again bemoaned D's lack of pattern matching yesterday, 
I was inspired to create this implementation[0], which plays to 
D's strengths, allows for user-defined matching, and has a fairly 
usable syntax. The core usage looks like this:

unittest {
   auto a = tuple(1, "foo");
   auto b = match(a) (
     _!(int, "foo") = (int i) => 1,
     _!(_, _)       = ()      => 0
   );
   assert(b == 1);
}

With the user-defined matching implemented as follows:

struct Tuple(T...) {
    // Implementation

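   // Magic happens here
   bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
     foreach (i, e; p.pattern) {
       static if (isTypeTuple!e) {
         // A type in the pattern captures the field into the next argument.
         enum n = countTypes!(p.pattern[0..i]);
         args[n] = fields[i];
       } else static if (!ignore!e) {
         // A value in the pattern must equal the corresponding field.
         if (fields[i] != e) {
           return false;
         }
       }
     }
     // Every element of the pattern matched.
     return true;
   }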
}
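
(A minimal, self-contained sketch of the same loop as a free function over std.typecons.Tuple. The names matchTuple, Wildcard, isCapture and countCaptures are hypothetical and not taken from pattern.d; Wildcard stands in for `_`, and the pattern is passed as template arguments instead of a Pattern value. It assumes a compiler recent enough to support static foreach.)

```d
import std.traits : isTypeTuple;
import std.typecons : tuple;

// Hypothetical stand-in for the `_` wildcard.
struct Wildcard {}

// True when a pattern element is a type other than Wildcard,
// i.e. when it should capture the corresponding field.
template isCapture(e...) if (e.length == 1) {
    static if (isTypeTuple!e)
        enum isCapture = !is(e[0] == Wildcard);
    else
        enum isCapture = false;
}

// Number of capturing elements in a pattern prefix.
template countCaptures(pattern...) {
    static if (pattern.length == 0)
        enum size_t countCaptures = 0;
    else
        enum size_t countCaptures =
            (isCapture!(pattern[0]) ? 1 : 0) + countCaptures!(pattern[1 .. $]);
}

template matchTuple(pattern...) {
    bool matchTuple(T, Args...)(T value, ref Args args) {
        static foreach (i; 0 .. pattern.length) {
            static if (isCapture!(pattern[i])) {
                // Copy the field into the next free output argument.
                args[countCaptures!(pattern[0 .. i])] = value[i];
            } else static if (!isTypeTuple!(pattern[i])) {
                // A literal in the pattern must equal the field.
                if (value[i] != pattern[i]) return false;
            }
            // Otherwise the element is Wildcard: nothing to check.
        }
        return true;
    }
}

unittest {
    auto t = tuple(1, "foo");
    int i;
    assert(matchTuple!(int, "foo")(t, i) && i == 1);
    assert(!matchTuple!(int, "bar")(t, i));
    assert(matchTuple!(Wildcard, Wildcard)(t));
}
```

Note that a capture is written before later elements are checked, so an output argument may be modified even when the overall match fails.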

Or for Algebraic:

struct Algebraic(T...) {
   union {
     T fields;
   }
   size_t which;

   bool opMatch(Pattern, Type)(Pattern p, ref Type args)
       if (staticIndexOf!(Type, T) > -1) {
     enum index = staticIndexOf!(Type, T);
     if (index == which) {
       args = fields[index];
       return true;
     }
     return false;
   }
}
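
(For comparison, a sketch of the same check against Phobos's existing std.variant.Algebraic, which has no opMatch of its own. The free function matchAlgebraic is a hypothetical helper, not part of pattern.d; it relies on Algebraic's peek, which returns null when the requested type is not the one currently stored.)

```d
import std.variant : Algebraic;

// Hypothetical helper: succeeds only when `value` currently holds a
// `Type`, and copies the stored value into `arg` in that case.
bool matchAlgebraic(A, Type)(ref A value, ref Type arg) {
    if (auto p = value.peek!Type) {
        arg = *p;
        return true;
    }
    return false;
}

unittest {
    Algebraic!(int, string) v = 42;
    int i;
    string s;
    assert(matchAlgebraic(v, i) && i == 42); // holds an int
    assert(!matchAlgebraic(v, s));           // does not hold a string
}
```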

The main problem I see is the temporary allocation of function 
arguments on line 124 of the linked source[0] and their assignment 
in opMatch, but I currently don't have a better solution.

Also, while I very much dislike using _ for an identifier, I feel 
it may be the best alternative here - it conveys the meaning of 
'don't care' for the pattern, and doesn't stand out like a sore 
thumb before the exclamation mark. Other suggestions are welcome.

The code is available here, and I encourage everyone to play with 
it and critique:

[0]: https://github.com/Biotronic/Collectanea/blob/master/biotronic/pattern.d
Mar 12 2016
Jacob Carlborg <doob me.com> writes:
On 12/03/16 14:12, Simen Kjaeraas wrote:
 As I once again bemoaned D's lack of pattern matching yesterday, I was
 inspired to create this[0] implementation, that plays to D's strengths,
 allows for user-defined matching, and has a fairly usable syntax. The
 core usage looks like this:

 unittest {
    auto a = tuple(1, "foo");
    auto b = match(a) (
      _!(int, "foo") = (int i) => 1,
      _!(_, _)       = ()      => 0
    );
    assert(b == 1);
 }
What kind of syntax is that? Is "match" returning a struct with opCall that is called immediately?
 With the user-defined matching implemented as follows:

 struct Tuple(T...) {
     // Implementation

    // Magic happens here
    bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
      foreach (i, e; p.pattern) {
        static if (isTypeTuple!e) {
          enum n = countTypes!(p.pattern[0..i]);
          args[n] = fields[i];
        } else static if (!ignore!e) {
          if (fields[i] != e) {
            return false;
          }
        }
      }
    }
 }
Is the tuple iterating all patterns to see if there's a match? Shouldn't that be the job of the match function?
 Or for Algebraic:

 struct Algebraic(T...) {
    union {
      T fields;
    }
    size_t which;

    bool opMatch(Pattern, Type)(Pattern p, ref Type args) if
 (staticIndexOf!(Type, T) > -1) {
      enum index = staticIndexOf!(Type, T);
      if (index == which) {
        args = fields[index];
        return true;
      }
      return false;
    }
 }

 The main problem I see is the temporary allocation of function arguments
 on line 124 and their assignment in opMatch, but I currently don't have
 a better solution.

 Also, while I very much dislike using _ for an identifier, I feel it may
 be the best alternative here - it conveys the meaning of 'don't care'
 for the pattern, and doesn't stand out like a sore thumb before the
 exclamation mark. Other suggestions are welcome.
I've started implementing a pattern matching function as well. It has a
syntax that only uses compile-time parameters, because both types and
values can be passed. I'm not entirely sure on the syntax yet. I'll have
to see what's possible to implement. Some suggestions:

auto a = tuple(1, "foo");
auto b = match!(a,
    int, "foo", (int i) => 1,
    _, _, () => 0,
);

If the pull request for inspecting templates is ever merged, it won't be
necessary to have typed lambdas:

auto b = match!(a,
    int, "foo", (i) => 1,
    _, _, () => 0,
);

If you only want to match types it could look like this:

auto b = match!(a,
    (int a) => 1
);

Matching a value:

auto b = match!(a,
    1, () => 2
);

Matching a pattern:

auto b = match!(a,
    (a, b) => a + b
);

The else pattern (executed if nothing else matches):

auto b = match!(a,
    () => 4,
);

It would be nice if it was possible to verify at compile time if at least
one pattern will match. For example:

match!("foo",
    (int a) => 1,
);

It's clear that the above pattern can never match.

-- 
/Jacob Carlborg
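
(A rough sketch of the compile-time check wished for above, limited to the simplest case of type-only handlers that take a single parameter. The name anyAccepts is hypothetical and belongs to neither implementation discussed in this thread. A handler counts as accepting T whenever a T can be passed to it, so implicit conversions are included.)

```d
import std.meta : anySatisfy;

// Hypothetical helper: true if at least one handler can be called with a T.
template anyAccepts(T, handlers...) {
    enum accepts(alias handler) = is(typeof(handler(T.init)));
    enum anyAccepts = anySatisfy!(accepts, handlers);
}

// An int can be passed to (int a) => 1, so this pattern can match.
static assert( anyAccepts!(int, (int a) => 1));
// A string cannot, so a match on "foo" with only this handler is dead code.
static assert(!anyAccepts!(string, (int a) => 1));
// Adding a string handler makes the match viable again.
static assert( anyAccepts!(string, (int a) => 1, (string s) => 2));
```

A match template could place a static assert on such a predicate to reject a call where no handler can ever fire.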
Mar 12 2016
Simen Kjaeraas <simen.kjaras gmail.com> writes:
On Saturday, 12 March 2016 at 20:56:47 UTC, Jacob Carlborg wrote:
 On 12/03/16 14:12, Simen Kjaeraas wrote:
 As I once again bemoaned D's lack of pattern matching 
 yesterday, I was
 inspired to create this[0] implementation, that plays to D's 
 strengths,
 allows for user-defined matching, and has a fairly usable 
 syntax. The
 core usage looks like this:

 unittest {
    auto a = tuple(1, "foo");
    auto b = match(a) (
      _!(int, "foo") = (int i) => 1,
      _!(_, _)       = ()      => 0
    );
    assert(b == 1);
 }
What kind of syntax is that? Is "match" returning a struct with opCall that is called immediately?
Indeed. The goal was to make it look similar to a switch statement. I actually started out with an idea for expanding switch with pattern matching using lowerings, then noticed I could do most of the stuff I wanted without compiler changes.
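
(A minimal sketch of that call shape, assuming nothing from pattern.d: match returns a struct, and the second pair of parentheses invokes its opCall. The names Arm, on and Matcher are hypothetical, and plain values compared with == stand in for the `_!(...) = handler` patterns.)

```d
// Hypothetical arm type: an expected value paired with a handler.
struct Arm(V, F) {
    V expected;
    F handler;
}

auto on(V, F)(V expected, F handler) {
    return Arm!(V, F)(expected, handler);
}

struct Matcher(T) {
    T value;

    // Invoked by the second pair of parentheses in match(x)(...).
    int opCall(Arms...)(Arms arms) {
        foreach (arm; arms) {
            if (arm.expected == value) {
                return arm.handler();
            }
        }
        assert(0, "no arm matched");
    }
}

auto match(T)(T value) {
    Matcher!T m;
    m.value = value;
    return m;
}

unittest {
    auto b = match(2)(
        on(1, () => 10),
        on(2, () => 20)
    );
    assert(b == 20);
}
```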
 With the user-defined matching implemented as follows:

 struct Tuple(T...) {
     // Implementation

    // Magic happens here
    bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
      foreach (i, e; p.pattern) {
        static if (isTypeTuple!e) {
          enum n = countTypes!(p.pattern[0..i]);
          args[n] = fields[i];
        } else static if (!ignore!e) {
          if (fields[i] != e) {
            return false;
          }
        }
      }
    }
 }
Is the tuple iterating all patterns to see if there's a match? Shouldn't that be the job of the match function?
The match function goes through the list of patterns and for each one
asks the tuple if opMatch returns true for that pattern. If it does, the
function assigned that pattern is called with the values assigned to args.

opMatch here is checking for each element of the pattern if it matches
the corresponding element of the tuple. Since the pattern is available
at compile-time, opMatch can deny patterns it doesn't like (e.g. trying
to match a Tuple!(int, string) with a string).

The match function is really only a framework for having similar
matching syntax for dissimilar types.

If the capability of matching patterns to types were in the match
function, how could a user type override it? Matching on a
Tuple!(string, string) is different from matching on an
Algebraic!(int[], Foo*) is different from matching on a specialized user
type that wants to do something real weird (I'm not sure what that'd be,
but I'm sure there are people who will want to).
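
(The division of labour described here can be sketched as follows. This is an illustration under assumptions, not the code in pattern.d: Point, OnXAxis, Anywhere, Arm, arm and matchFirst are all hypothetical names. The framework only walks the arms; the matched type's opMatch overloads decide what each pattern means and fill in the handler's arguments.)

```d
import std.traits : Parameters;

// Hypothetical patterns for a 2D point.
struct OnXAxis {}   // matches when y == 0 and captures x
struct Anywhere {}  // always matches, captures nothing

struct Point {
    int x, y;

    // The user type decides what each pattern means.
    bool opMatch(OnXAxis, ref int capturedX) const {
        if (y != 0) return false;
        capturedX = x;
        return true;
    }

    bool opMatch(Anywhere) const {
        return true;
    }
}

// Hypothetical arm: a pattern paired with its handler.
struct Arm(P, F) {
    P pattern;
    F handler;
}

auto arm(P, F)(P pattern, F handler) {
    return Arm!(P, F)(pattern, handler);
}

// The framework: try each arm in order and call the first handler
// whose pattern the value accepts.
int matchFirst(T, Arms...)(T value, Arms arms) {
    foreach (a; arms) {
        // Temporaries that opMatch fills in before the handler runs.
        Parameters!(typeof(a.handler)) args;
        if (value.opMatch(a.pattern, args)) {
            return a.handler(args);
        }
    }
    assert(0, "no pattern matched");
}

unittest {
    auto onAxis = matchFirst(Point(3, 0),
        arm(OnXAxis(), (int x) => x * 2),
        arm(Anywhere(), () => -1));
    assert(onAxis == 6);
}
```

The args temporaries are the same kind of temporary allocation the first post lists as its main problem. A pattern/handler pair for which Point has no opMatch overload fails to compile, which is one way a type can deny patterns it doesn't like.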
 I've started implementing a pattern matching function as well. 
 It has a syntax that only use compile time parameters, because 
 both types and values can be passed. I'm not entirely sure on 
 the syntax yet. I'll have to see what's possible to implement. 
 Some suggestions:

 auto a = tuple(1, "foo");
 auto b = match!(a,
     int, "foo", (int i) => 1,
     _, _, () => 0,
 );
That works. I feel the grouping is looser than in my example, and that the pattern doesn't stand out from the rest of the expression, but it certainly works, and there are some benefits to that syntax.
 If the pull request for inspecting templates ever will be 
 merged it won't be necessary to have typed lambdas:

 auto b = match!(a,
     int, "foo", (i) => 1,
     _, _, () => 0,
 );
There's a problem using that syntax? It works for me in a toy example:

http://dpaste.dzfl.pl/7360ee90b344

Sorry about the lack of comments and stuff, but it's 3:30AM, and I
probably shouldn't be programming now.
Mar 12 2016
Simen Kjaeraas <simen.kjaras gmail.com> writes:
On Sunday, 13 March 2016 at 02:33:49 UTC, Simen Kjaeraas wrote:
 http://dpaste.dzfl.pl/7360ee90b344
Dammit, 3:30AM was apparently too late, and some bad code leaked through.
I managed to sidestep a problem by writing nonsense code. The problem I
get can be reduced to this:

struct S {
    void foo(alias a)() {}
}

unittest {
    S s;
    s.foo!((int i) => 1); // Works
    s.foo!(i => 1);       // Fails
}

Result:

foo.d(8): Error: template instance foo!((i) => 1) cannot use local '__lambda1' as parameter to non-global template foo(alias a)()

Bah. It's in bugzilla as bug 5710[1] (with the most discussion), 3051,
3052, 11098, 12285, 12576 and 15564, and has a $150 bounty on
bountysource:

https://www.bountysource.com/issues/1375082-cannot-use-delegates-as-parameters-to-non-global-template

Maybe it's time I learnt how DMD is put together and earn $150...

Now, there's a way to work around that, by putting the lambda in an
intermediate type. Sadly, that runs afoul of bug 15794[2] for the typed
lambda, so I'm stumped for now.

[1]: https://issues.dlang.org/show_bug.cgi?id=5710
[2]: https://issues.dlang.org/show_bug.cgi?id=15794
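
(One way to sidestep the restriction in the reduced case, sketched under the assumption that the operation does not have to be a member: a module-level function template called through UFCS, which is how std.algorithm functions such as map accept local lambdas. The names apply and base are hypothetical. This is not the intermediate-type workaround mentioned above, and it does not help when the template needs access to private members from another module.)

```d
struct S {
    int base;
}

// Module-level template: the lambda's frame is the only context needed,
// so both typed and untyped local lambdas are accepted.
int apply(alias fn)(ref S s) {
    return fn(s.base);
}

unittest {
    auto s = S(20);
    int offset = 2;                           // captured by the second lambda
    assert(s.apply!((int i) => i + 1) == 21); // typed lambda
    assert(s.apply!(i => i + offset) == 22);  // untyped lambda, local capture
}
```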
Mar 13 2016
Jacob Carlborg <doob me.com> writes:
On 13/03/16 03:33, Simen Kjaeraas wrote:

 The match function goes through the list of patterns and for each one
 asks the tuple if opMatch returns true for that pattern. If it does, the
 function assigned that pattern is called with the values assigned to args.

 opMatch here is checking for each element of the pattern if it matches
 the corresponding element of the tuple. Since the pattern is available
 at compile-time, opMatch can deny patterns it doesn't like (e.g. trying
 to match a Tuple!(int, string) with a string).

 The match function is really only a framework for having similar
 matching syntax for dissimilar types.

 If the capability of matching patterns to types were in the match
 function, how could a user type override it? Matching on a
 Tuple!(string, string) is different from matching on an
 Algebraic!(int[], Foo*) is different from matching on a specialized user
 type that wants to do something real weird (I'm not sure what that'd be,
 but I'm sure there are people who will want to).
I misinterpreted the code.
 That works. I feel the grouping is looser than in my example, and that
 the pattern doesn't stand out from the rest of the expression, but it
 certainly works, and there are some benefits to that syntax.
Yeah, that's the downside.
 If the pull request for inspecting templates ever will be merged it
 won't be necessary to have typed lambdas:

 auto b = match!(a,
     int, "foo", (i) => 1,
     _, _, () => 0,
 );
There's a problem using that syntax? It works for me in a toy example:
I've not actually tried that particular syntax. I came to think of it
when I saw your syntax. I've mostly been experimenting with the syntax
later in my post.

-- 
/Jacob Carlborg
Mar 13 2016