digitalmars.D - Pattern matching as a library


Simen Kjaeraas <simen.kjaras gmail.com> writes:

As I once again bemoaned D's lack of pattern matching yesterday, 
I was inspired to create this[0] implementation, that plays to 
D's strengths, allows for user-defined matching, and has a fairly 
usable syntax. The core usage looks like this:

unittest {
   auto a = tuple(1, "foo");
   auto b = match(a) (
     _!(int, "foo") = (int i) => 1,
     _!(_, _)       = ()      => 0
   );
   assert(b == 1);
}

With the user-defined matching implemented as follows:

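struct Tuple(T...) {
   // Implementation

   // Magic happens here
   bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
     foreach (i, e; p.pattern) {
       static if (isTypeTuple!e) {
         // A type in the pattern binds the field to the next handler argument.
         enum n = countTypes!(p.pattern[0..i]);
         args[n] = fields[i];
       } else static if (!ignore!e) {
         // A value in the pattern must equal the corresponding field.
         if (fields[i] != e) {
           return false;
         }
       }
     }
     // Every element of the pattern matched.
     return true;
   }
}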

Or for Algebraic:

struct Algebraic(T...) {
   union {
     T fields;
   }
   size_t which;

   bool opMatch(Pattern, Type)(Pattern p, ref Type args)
       if (staticIndexOf!(Type, T) > -1) {
     enum index = staticIndexOf!(Type, T);
     if (index == which) {
       args = fields[index];
       return true;
     }
     return false;
   }
}

The main problem I see is the temporary allocation of function 
arguments on line 124 and their assignment in opMatch, but I 
currently don't have a better solution.

Also, while I very much dislike using _ for an identifier, I feel 
it may be the best alternative here - it conveys the meaning of 
'don't care' for the pattern, and doesn't stand out like a sore 
thumb before the exclamation mark. Other suggestions are welcome.

The code is available here, and I encourage everyone to play with 
it and critique:

[0]: 
https://github.com/Biotronic/Collectanea/blob/master/biotronic/pattern.d

Mar 12 2016

Jacob Carlborg <doob me.com> writes:

On 12/03/16 14:12, Simen Kjaeraas wrote:
 As I once again bemoaned D's lack of pattern matching yesterday, I was
 inspired to create this[0] implementation, that plays to D's strengths,
 allows for user-defined matching, and has a fairly usable syntax. The
 core usage looks like this:

 unittest {
    auto a = tuple(1, "foo");
    auto b = match(a) (
      _!(int, "foo") = (int i) => 1,
      _!(_, _)       = ()      => 0
    );
    assert(b == 1);
 }

What kind of syntax is that? Is "match" returning a struct with opCall 
that is called immediately?

 With the user-defined matching implemented as follows:

 struct Tuple(T...) {
     // Implementation

    // Magic happens here
    bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
      foreach (i, e; p.pattern) {
        static if (isTypeTuple!e) {
          enum n = countTypes!(p.pattern[0..i]);
          args[n] = fields[i];
        } else static if (!ignore!e) {
          if (fields[i] != e) {
            return false;
          }
        }
      }
    }
 }

Is the tuple iterating all patterns to see if there's a match? Shouldn't 
that be the job for the match function?

 Or for Algebraic:

 struct Algebraic(T...) {
    union {
      T fields;
    }
    size_t which;

    bool opMatch(Pattern, Type)(Pattern p, ref Type args) if
 (staticIndexOf!(Type, T) > -1) {
      enum index = staticIndexOf!(Type, T);
      if (index == which) {
        args = fields[index];
        return true;
      }
      return false;
    }
 }

 The main problem I see is the temporary allocation of function arguments
 on line 124 and their assignment in opMatch, but I currently don't have
 a better solution.

 Also, while I very much dislike using _ for an identifier, I feel it may
 be the best alternative here - it conveys the meaning of 'don't care'
 for the pattern, and doesn't stand out like a sore thumb before the
 exclamation mark. Other suggestions are welcome.

I've started implementing a pattern matching function as well. It has a 
syntax that only uses compile-time parameters, because both types and 
values can be passed. I'm not entirely sure about the syntax yet. I'll have 
to see what's possible to implement. Some suggestions:

auto a = tuple(1, "foo");
auto b = match!(a,
     int, "foo", (int i) => 1,
     _, _, () => 0,
);

If the pull request for inspecting templates is ever merged, it 
won't be necessary to have typed lambdas:

auto b = match!(a,
     int, "foo", (i) => 1,
     _, _, () => 0,
);

If you only want to match types it could look like this:

auto b = match!(a,
     (int a) => 1
);

Matching a value:

auto b = match!(a,
     1, () => 2
);

Matching a pattern:

auto b = match!(a,
     (a, b) => a + b
);

The else pattern (executed if nothing else matches):

auto b = match!(a,
     () => 4,
);

It would be nice if it were possible to verify at compile time that at 
least one pattern can match. For example:

match!("foo",
     (int a) => 1,
);

It's clear that the above pattern can never match.
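
One way such a check could be sketched is a static assert over the handler parameter types. This is an illustration only: `accepts`, `anyAccepts`, and `firstFit` are hypothetical names, the handlers are passed as runtime arguments rather than through the `match!(...)` syntax above, and every handler is assumed to return int.

```d
// True if a handler of type H can be called with a value of type T.
enum bool accepts(H, T) = is(typeof((H h, T t) { h(t); }));

// True if at least one handler type in Handlers accepts a T.
template anyAccepts(T, Handlers...)
{
    static if (Handlers.length == 0)
        enum bool anyAccepts = false;
    else
        enum bool anyAccepts = accepts!(Handlers[0], T)
            || anyAccepts!(T, Handlers[1 .. $]);
}

// Hypothetical helper: calls the first handler whose parameter fits the value.
int firstFit(T, Handlers...)(T value, Handlers handlers)
{
    // Rejects the call at compile time when no handler could ever match.
    static assert(anyAccepts!(T, Handlers),
        "no handler can accept a value of type " ~ T.stringof);

    static if (accepts!(Handlers[0], T))
        return handlers[0](value);
    else
        return firstFit(value, handlers[1 .. $]);
}

unittest
{
    // The string handler is the only one that fits.
    assert(firstFit("foo", (int a) => 1, (string s) => 2) == 2);

    // A lone int handler can never match a string, so this does not compile.
    static assert(!__traits(compiles, firstFit("foo", (int a) => 1)));
}
```

Note that this only checks parameter types, so value patterns such as "foo" would still need a runtime test.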

-- 
/Jacob Carlborg

Mar 12 2016

Simen Kjaeraas <simen.kjaras gmail.com> writes:

On Saturday, 12 March 2016 at 20:56:47 UTC, Jacob Carlborg wrote:
 On 12/03/16 14:12, Simen Kjaeraas wrote:
 As I once again bemoaned D's lack of pattern matching yesterday, I was
 inspired to create this[0] implementation, that plays to D's strengths,
 allows for user-defined matching, and has a fairly usable syntax. The
 core usage looks like this:

 unittest {
    auto a = tuple(1, "foo");
    auto b = match(a) (
      _!(int, "foo") = (int i) => 1,
      _!(_, _)       = ()      => 0
    );
    assert(b == 1);
 }

 What kind of syntax is that? Is "match" returning a struct with 
 opCall that is called immediately?

Indeed. The goal was to make it look similar to a switch 
statement. I actually started out with an idea for expanding 
switch with pattern matching using lowerings, then noticed I 
could do most of the stuff I wanted without compiler changes.
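
The call shape itself needs nothing special from the compiler. A minimal sketch, not the code from the linked repository: `match` returns a struct whose `opCall` receives the cases. `Case` and `Matcher` are hypothetical names, and the patterns here are plain runtime predicates instead of `_!(...)` expressions.

```d
import std.typecons : tuple;

// Hypothetical pairing of a runtime predicate with the handler to run.
struct Case(T)
{
    bool function(T) matches;
    int function(T) handler;
}

struct Matcher(T)
{
    T value;

    // Declaring opCall disables struct literal syntax, so an explicit
    // constructor is needed to build a Matcher.
    this(T value) { this.value = value; }

    // Invoked by the second pair of parentheses in `match(a)(...)`.
    int opCall(Case!T[] cases...)
    {
        foreach (c; cases)
        {
            if (c.matches(value))
                return c.handler(value);
        }
        assert(0, "no pattern matched");
    }
}

// The first pair of parentheses only captures the value.
Matcher!T match(T)(T value)
{
    return Matcher!T(value);
}

unittest
{
    auto a = tuple(1, "foo");
    alias T = typeof(a);
    auto b = match(a)(
        Case!T((T t) => t[1] == "foo", (T t) => 1),
        Case!T((T t) => true,          (T t) => 0)
    );
    assert(b == 1);
}
```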


 With the user-defined matching implemented as follows:

 struct Tuple(T...) {
     // Implementation

    // Magic happens here
    bool opMatch(Pattern, Args...)(Pattern p, ref Args args) {
      foreach (i, e; p.pattern) {
        static if (isTypeTuple!e) {
          enum n = countTypes!(p.pattern[0..i]);
          args[n] = fields[i];
        } else static if (!ignore!e) {
          if (fields[i] != e) {
            return false;
          }
        }
      }
    }
 }

 Is the tuple iterating all patterns to see if there's a match? 
 Shouldn't that be the job for the match function?

The match function goes through the list of patterns and for each 
one asks the tuple if opMatch returns true for that pattern. If 
it does, the function assigned that pattern is called with the 
values assigned to args.

opMatch here is checking for each element of the pattern if it 
matches the corresponding element of the tuple. Since the pattern 
is available at compile-time, opMatch can deny patterns it 
doesn't like (e.g. trying to match a Tuple!(int, string) with a 
string).

The match function is really only a framework for having similar 
matching syntax for dissimilar types.

If the capability of matching patterns to types were in the match 
function, how could a user type override it? Matching on a 
Tuple!(string, string) is different from matching on an 
Algebraic!(int[], Foo*) is different from matching on a 
specialized user type that wants to do something real weird (I'm 
not sure what that'd be, but I'm sure there are people who will 
want to).
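
That division of labour can be sketched as follows. This is a simplified illustration rather than the linked implementation: `Equals`, `Any`, `Case`, `on`, `matchFirst`, and `Celsius` are hypothetical names, patterns are runtime values rather than compile-time template arguments, and no values are bound to handler arguments.

```d
// Hypothetical patterns: match an exact value, or match anything.
struct Equals(T) { T expected; }
struct Any {}

// A pattern paired with the handler to run when it matches.
struct Case(P)
{
    P pattern;
    string function() handler;
}

Case!P on(P)(P pattern, string function() handler)
{
    return Case!P(pattern, handler);
}

// The framework: try each case in order and let the value decide.
string matchFirst(T, Cases...)(T value, Cases cases)
{
    foreach (c; cases) // unrolled at compile time, one iteration per case
    {
        if (value.opMatch(c.pattern))
            return c.handler();
    }
    assert(0, "no pattern matched");
}

// A user type that defines what matching means for itself.
struct Celsius
{
    double degrees;

    bool opMatch(Equals!double p) const { return degrees == p.expected; }
    bool opMatch(Any) const { return true; }
}

unittest
{
    auto t = Celsius(21.5);
    auto r = matchFirst(t,
        on(Equals!double(0.0), () => "freezing"),
        on(Any(), () => "other"));
    assert(r == "other");
}
```

In this sketch, a pattern type that `Celsius` has no `opMatch` overload for fails to compile, which is the type's way of denying patterns it doesn't like.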


 I've started implementing a pattern matching function as well. 
 It has a syntax that only uses compile-time parameters, because 
 both types and values can be passed. I'm not entirely sure about 
 the syntax yet. I'll have to see what's possible to implement. 
 Some suggestions:

 auto a = tuple(1, "foo");
 auto b = match!(a,
     int, "foo", (int i) => 1,
     _, _, () => 0,
 );

That works. I feel the grouping is looser than in my example, and 
that the pattern doesn't stand out from the rest of the 
expression, but it certainly works, and there are some benefits 
to that syntax.


 If the pull request for inspecting templates is ever merged, 
 it won't be necessary to have typed lambdas:

 auto b = match!(a,
     int, "foo", (i) => 1,
     _, _, () => 0,
 );

There's a problem using that syntax? It works for me in a toy 
example:

http://dpaste.dzfl.pl/7360ee90b344

Sorry about the lack of comments and stuff, but it's 3:30AM, and 
I probably shouldn't be programming now.

Mar 12 2016

Simen Kjaeraas <simen.kjaras gmail.com> writes:

On Sunday, 13 March 2016 at 02:33:49 UTC, Simen Kjaeraas wrote:
 http://dpaste.dzfl.pl/7360ee90b344

Dammit, 3:30AM was apparently too late, and some bad code leaked 
through. I managed to sidestep a problem by writing nonsense 
code. The problem I get can be reduced to this:

struct S {
     void foo(alias a)() {}
}

unittest {
     S s;
     s.foo!((int i) => 1); // Works
     s.foo!(i => 1); // Fails
}

Result:
foo.d(8): Error: template instance foo!((i) => 1) cannot use local '__lambda1' as parameter to non-global template foo(alias a)()

Bah. It's in bugzilla as bug 5710[1] (with the most discussion), 
3051, 3052, 11098, 12285, 12576 and 15564, and has a $150 bounty 
on bountysource:
https://www.bountysource.com/issues/1375082-cannot-use-delegates-as-parameters-to-non-global-template

Maybe it's time I learnt how DMD is put together and earn $150...


Now, there's a way to work around that, by putting the lambda in 
an intermediate type. Sadly, that runs afoul of bug 15794[2] for 
the typed lambda, so I'm stumped for now.

[1]: https://issues.dlang.org/show_bug.cgi?id=5710
[2]: https://issues.dlang.org/show_bug.cgi?id=15794

Mar 13 2016

Jacob Carlborg <doob me.com> writes:

On 13/03/16 03:33, Simen Kjaeraas wrote:

 The match function goes through the list of patterns and for each one
 asks the tuple if opMatch returns true for that pattern. If it does, the
 function assigned that pattern is called with the values assigned to args.

 opMatch here is checking for each element of the pattern if it matches
 the corresponding element of the tuple. Since the pattern is available
 at compile-time, opMatch can deny patterns it doesn't like (e.g. trying
 to match a Tuple!(int, string) with a string).

 The match function is really only a framework for having similar
 matching syntax for dissimilar types.

 If the capability of matching patterns to types were in the match
 function, how could a user type override it? Matching on a
 Tuple!(string, string) is different from matching on an
 Algebraic!(int[], Foo*) is different from matching on a specialized user
 type that wants to do something real weird (I'm not sure what that'd be,
 but I'm sure there are people who will want to).

I misinterpreted the code.

 That works. I feel the grouping is looser than in my example, and that
 the pattern doesn't stand out from the rest of the expression, but it
 certainly works, and there are some benefits to that syntax.

Yeah, that's the downside.

 If the pull request for inspecting templates is ever merged, it
 won't be necessary to have typed lambdas:

 auto b = match!(a,
     int, "foo", (i) => 1,
     _, _, () => 0,
 );

 There's a problem using that syntax? It works for me in a toy example:

I've not actually tried that particular syntax. I came to think of it 
when I saw your syntax. I've mostly been experimenting with the syntax 
later in my post.

-- 
/Jacob Carlborg

Mar 13 2016
