www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Grokking ranges: some new algorithms and ranges

reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
Hello,

(Uh, first time poster, so hi to all!)

I discovered D2 ranges a few months ago and decided to get a grip on them for
the past
few weeks by coding some new ranges. As I'm also reading on Haskell, Clojure
and Scala,
I tried to code in D some functions seen elsewhere.

All in all, I'm having a lot of fun and feel like I'm beginning to grok
templates and ranges. I've
a module with some new functions inspired by those of std.range and
std.algorithm and was wondering if they could be interesting for someone else.

Here is what I have right now (I've a few other small functions I'll write this
week). I hope the list is not too indigest... If someone is interested by
discussing them, I'm all ears. I also don't know if it's exactly kasher to post
a list like this (at least I didn't attach the module), so tell me if I must
move the discussion somewhere else.

Algorithms:

* bimap: an extension of map for mapping binary function on a range ->
bimap!"a+b"(r) (functions can be standard functions or strings, like in
std.algorithm). Application: difference lists (bimap!"b-a"(r))
* nmap : an extension of bimap to map a n-ary function on a range ->
nmap!("a*b+sin(c)", 3)(r). Functions can be standard funs or strings. For
strings, you must indicate the desired arity (unary, binary, whatever).
Application: moving average: nmap!("(a+b+c+d)*0.25", 4)(r)

btw, this uses a template called naryFun(alias fun, size_t arity) which is the
generalization of std.functional.unaryFun and .binaryFun: it produces a
templated n-args function from a string, the arguments being in the 'a'-'z'
range. I've yet to code a CT heuristics determining the probable arity from the
string (I've a runtime one, it works, I just have to translate it into
templates).

* bifilter: same idea, extending std.algorithm.filter to work with binary
predicates -> filter"a>b"(r) -> returns a lazy range whose elements are the
pairs verifying the predicate.
* nfilter: as before, a generalization of this idea: filtering data with a
n-ary (n-args) predicate. Returns arrays of length n satisfying the predicate
-> nfilter!("a<b && b<c",3)(r)

For all these functions, there is also an optional argument: the numbers of
steps taken on the range at each popFront. That is : 
int[] r1 = [0,1,2,3,4,5,6,7];
bfilter!"a<b"(r1) : do you want to apply "a<b" on [0,1], [1,2], [2,3]
(overlapping segments), ... or on [0,1], [2,3], [4,5] (disjoint segments)?

* tmap: maps a n-args function on n different ranges in parallel (in lockstep)
-> nmap!"max"(r1, r2,r3) -> will lazily produce a range with the max of each
tuple. For a 'string' function (tmap!"a+b"(r1, r2)) 'a' means elements from the
first range, 'b'from  the second, etc.
* tfilter: filters n ranges in parallel with a n-args predicate.
In both cases, the range can have different element types, as long as the
predicate or the mapped function is compatible.
Example: say I've r1 = [0,1,2,3], r2 = "abcdef". tmap!"replicate(a+1,b)"(r1,
r2) will produce "a", "bb", "ccc", "dddd" and then stops, as r1 is exhausted.

* comprehension: a first try, maybe watered-down, list comprehension for D. The
syntax is comprehension!(gen, pred)(ranges). It internally (and lazily)
produces the combinations of all input ranges' elements, filter them with
n-args pred and apply n-args gen on them.
For example:

// find all pythagorean triplets (triplets (a,b,c) verifying a*a+b*b==c*c) for
a,b, and c between 1 and 9. 
// numbers(1,10) is just a small range, akin to iota.
auto lc = comprehension!("tuple(a,b,c)", "a*a + b*b == c*c")(numbers(1,10),
numbers(1,10), numbers(1,10))
Output : tuple(3,4,5), tuple(4,3,5)

* setComprehension: as before, but elements are given only once. Uses a range
wrapper AsSet!R which filter values already produced.
* parallelComprehension: as before, inspired by Haskell. Just iterates the
input ranges in parallel (in lockstep), filters them and generates the values.

* flatMap: maps a range-returning function on elements of a range, and flatten
the resulting ranges. If I understand 'Programming in Scala' correctly, that
may be the last necessary element to have full-blown list comprehension in D...

* reduceR: well, Haskell has fold, foldr, so I guess D needed it also. Reduces
a range 'from the right'. Useful for some problems (for example, computing a
real given its continued fraction).
* iterate!(fun)(seed): repeatedly apply fun to seed, produces a range of values.
Example: 
	auto powersOf2 = iterate!"2*a"(1L); // 1, 2, 4, 8, 16, ... (as longs)
	auto sqrt2 = iterate!"(a+2/a)/2"(1.0); // converges to sqrt(2) -> 1.41421...
* scan / scanR: produces a lazy range with all intermediate values of
reduce/reduceR. Useful to get, well, intermediate values.
Example: auto factorials = scan!"a*b"(1L, numbers(1, int.max)) -> 1, 1, 2, 6,
24, 120, ...
* unfold: equivalent to Haskell unfold. It's the opposite of reduce: given a
seed value and a function, it will generates an infinite range.
I've some nice examples using it to lazily calculate prime numbers or the
continued fraction of a real. It's a cousin of std.range.recurrence.

operation on ranges:

* some!pred(range) returns true iff some element in range verify predicate pred
* all!pred(range) returns true iff all elements in range verify pred.
* drop(n, range) returns a copy of range with the n first values dropped
* dropWhile!pred(range) drops value as long as pred if verified ->
dropWhile!"a<3"(r). It works also with a n-args predicate.
* popFrontWhile!pred(range). As st.range.popFrontN, it affects the target range.
* cutAt(n, range): cuts a range in two parts at index n. Returns a tuple.
* cutWhen!pred(range): cuts a range in two parts when pred is verified.
* takeWhile!pred(r): takes value in a range as long as pred is verified. Exists
for unary predicates, but also for n-ary predicates.
* contains(r1, r2): returns true iff r1 contains all r2 (as one block)
* rotate(n, range): takes the first n elements, chain them at the end.

new ranges:

* replicateRange(n, range): returns a range copied n times.
* cartesianProduct(r1, r2): a range whose elements are all possible pairs of
elements (as tuples).
* combinations(r1, r2, ...) a generalization of cartesianProduct, for n ranges.
Lazily produces all possible combinations of elements by taking
one in each input range.
* flatten!level(range): flatten a range of ranges (of rank as deep as you
want). Default behavior is maximum flattening:
int[] r1 = [0,1,2,3];
int[][] r2 = [r1, [4], [5,6]];
int[][][] r3 = [r2, [[7], [8]], r2];
assert(equal(flatten(r3), [0,1,2,3,4,5,6,7,8,0,1,2,3,4,5,6][])); // Max flat
* concat: just one level of flattening. Akin the Haskell concat.
* pairRange: a small zip-like range taking a two ranges. Has for values the
2-tuples created by the input ranges values.
* tupleOfRanges: the same, but generalized to n ranges. I coded this even as
std.range.zip exists, as Zip produce Proxy tuples (to allow iteration  beyond
the shortest range and to allow swapping/sorting), but I needed something with
std.typecons.Tuples as element types. That way interoperation between ranges is
easier.
* unzip : extracts a range from a PairRange/TupleRange
* permutations. As the name says, it's a range producing all possible
permutations of another range's elements. Don't use it on ranges with more than
17 elements, as 18! > size_t.max...
* stutter!n(range): takes a range and repeat n times its elements. Sorry for
the name, but repeat and replicate were already taken.
Eg: stutter!3([0,1,2,3][]) -> 0,0,0,1,1,1,2,2,2,3,3,3.
* extremities(range): alternates between range's two extremities. A cousin of
std.range.Radial
extremities([0,1,2,3,4,5][]) -> 0,5,1,4,2,3.
* transverse(r1, r2, r3, ...): like std.range.Transversal, but iterates on all
'columns', stopping when one of the ranges is exhausted. Transverse has for
element type the common types of the input ranges.
* segment!(segmentLength, segmentStep)(range): segments a range into
segmentLength elements segments, advancing by segmentStep steps.
segment!!3([0,1,2,3,4,5,6][]) -> [0,1,2], [1,2,3], [2,3,4], ... (step defaults
to 1).
* without(r1, r2): a range like r1, but with elements of r2 dropped.
 "banana".without("aaa") -> "bnn".

Ok, that's about it. Thanks for those who read it all.
I've some other things, but more minor. If someone is interested, I can send
them the module (it's only one module for now). I'm no professional coder, so
my code may well be awful, I honestly don't know.

I've also some ideas about extending some std.range/algorithm: for example Map
could be a bidirectional range if the mapped-on range is bidir, and also have a
length or give random-access if possible. Chain has some bugs with infinite
ranges which are easily squashed, and so on.

Ah, and I also have a Chainable!() template to inject some templated opCat
capabilities in ranges, as long as Chain has some also. It allows you to write
delicious-looking code like this:
auto p = [2,3] ~ unfold!nextPrime([2,3]);
auto p2 = filter!"a<100"(p) ~ map!"a%3"(take(100, p)); // Or whatever...


Bye,

Philippe
Nov 01 2009
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Philippe Sigaud wrote:
 (Uh, first time poster, so hi to all!)
Glad you could join us!
 I discovered D2 ranges a few months ago and decided to get a grip on them for
the past
 few weeks by coding some new ranges. As I'm also reading on Haskell, Clojure
and Scala,
 I tried to code in D some functions seen elsewhere.
 
 All in all, I'm having a lot of fun and feel like I'm beginning to grok
templates and ranges. I've
 a module with some new functions inspired by those of std.range and
std.algorithm and was wondering if they could be interesting for someone else.
What you're doing is great fodder for an article. Care to write one?
Nov 01 2009
parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 2, 2009 at 00:47, Walter Bright <newshound1 digitalmars.com>wrote:

 Philippe Sigaud wrote:

 (Uh, first time poster, so hi to all!)
Glad you could join us!
Thanks, Walter, and thank you for D. I did find the NG easily and even found a way to have gmail read it and put it into my mail box. If only I could remember how I ddi it... Anyway, all I can say is that entering D was quite easy, as I'm part of your target population: years of C & C++, some Jave, some dabbling in Python. Compared to C++ were I never did any template, templates in D are relatively easy to wrap your mind around. And I fell in love with the range idea, being immersed in Clojure's Seq or Haskell lists right now. As some other people here would say, I'm just a run-of-the-mill programmer discovering functional programming and the power of sequences. But eh, those were fun to code.
 I discovered D2 ranges a few months ago and decided to get a grip on them
 for the past
 few weeks by coding some new ranges. As I'm also reading on Haskell,
 Clojure and Scala,
 I tried to code in D some functions seen elsewhere.

 All in all, I'm having a lot of fun and feel like I'm beginning to grok
 templates and ranges. I've
 a module with some new functions inspired by those of std.range and
 std.algorithm and was wondering if they could be interesting for someone
 else.
What you're doing is great fodder for an article. Care to write one?
You know, I'm pretty sure my code is no so good to look at. As I said, I'm no professional coder. I guess if all goes well, I can write something on my experience as a complete newbie. My line of work is completely different, so I'm pretty sure I'm rediscovering things known in CS (particulary the functional world) for years... Philippe
Nov 01 2009
parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Philippe Sigaud wrote:

 On Mon, Nov 2, 2009 at 00:47, Walter Bright
 <newshound1 digitalmars.com>wrote:
...
 What you're doing is great fodder for an article. Care to write one?
You know, I'm pretty sure my code is no so good to look at. As I said, I'm no professional coder. I guess if all goes well, I can write something on my experience as a complete newbie. My line of work is completely different, so I'm pretty sure I'm rediscovering things known in CS (particulary the functional world) for years... Philippe
If you're up to it, something like a 'post-mortem' style article about your experiences would be very interesting!
Nov 02 2009
next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Lutger wrote:
 Philippe Sigaud wrote:
 
 On Mon, Nov 2, 2009 at 00:47, Walter Bright
 <newshound1 digitalmars.com>wrote:
...
 What you're doing is great fodder for an article. Care to write one?
You know, I'm pretty sure my code is no so good to look at. As I said, I'm no professional coder. I guess if all goes well, I can write something on my experience as a complete newbie. My line of work is completely different, so I'm pretty sure I'm rediscovering things known in CS (particulary the functional world) for years... Philippe
If you're up to it, something like a 'post-mortem' style article about your experiences would be very interesting!
I agree. Don't worry about not being a professional writer, you write well enough.
Nov 02 2009
prev sibling parent reply Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 2, 2009 at 18:20, Lutger <lutger.blijdestijn gmail.com> wrote:


 If you're up to it, something like a 'post-mortem' style article about your
 experiences would be very interesting!
You mean like "How I came to love templates" or "Trying to be fun(ctional)"? Philippe
Nov 02 2009
parent Lutger <lutger.blijdestijn gmail.com> writes:
Philippe Sigaud wrote:

 On Mon, Nov 2, 2009 at 18:20, Lutger <lutger.blijdestijn gmail.com> wrote:
 
 
 If you're up to it, something like a 'post-mortem' style article about
 your experiences would be very interesting!
You mean like "How I came to love templates" or "Trying to be fun(ctional)"? Philippe
Exactly: something in first person.
Nov 02 2009
prev sibling next sibling parent reply Sam Hu <samhu.samhu nospam.com> writes:
Philippe Sigaud Wrote:
 I've some other things, but more minor. If someone is interested, I can send
them the module (it's only one module for now). > Bye,
 
 Philippe
 
 
May I ask for one copy?Thank you so much. Sam Hu (samhu.samhu gmail.com)
Nov 01 2009
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 2, 2009 at 02:00, Sam Hu <samhu.samhu nospam.com> wrote:

 Philippe Sigaud Wrote:
 I've some other things, but more minor. If someone is interested, I can
send them the module (it's only one module for now). > Bye,
 Philippe
May I ask for one copy?Thank you so much.
I just send you one, along with the Ddoc-generated html. I know it's still a mess: at the very least, it should be cut into 2-3 modules and functions grouped together by family, and as Andrei said, the next step would to factor everything. Don't hesitate to bash it on the head and send me some feedback. Philippe
Nov 01 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Philippe Sigaud wrote:
 Hello,
 
 (Uh, first time poster, so hi to all!)
[snip] Hi and welcome. This is very interesting work - if you want to contribute it to Phobos at least for inspiration, you may want to put it in the public domain on your website or really anywhere on the Net. I'm thinking of factoring some of your APIs a bit, for example bimap is really an application of a binary function over two streams, one of which is a delay of the other. Andrei
Nov 01 2009
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 2, 2009 at 05:13, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 Philippe Sigaud wrote:

 Hello,

 (Uh, first time poster, so hi to all!)
[snip] Hi and welcome. This is very interesting work -
Thanks! And thank you for your good work there, and for the nuggets I found while perusing std.* code.
 if you want to contribute it to Phobos at least for inspiration, you may
 want to put it in the public domain on your website or really anywhere on
 the Net.
Yes, I see that. I'll open an account on dsource.org tonight (It's 7 am where I'm sitting).
 I'm thinking of factoring some of your APIs a bit, for example bimap is
 really an application of a binary function over two streams, one of which is
 a delay of the other.
You're perfectly right. Damn, and I saw that somewhere else. It's a way to transform what a called array ranges into tuple-ranges. Yes OK, the subjacent segmentation of ranges could be done by delaying some copy. Um, it's more uniform this way, as everything could be zips and such and it's even more general : you can create disjoint segments. Thanks! That shouldn't be too difficult to code: make copies, drop first elements in a ramp, zip them into a tuple-range, iterate on them, eventually with stride. By the way, if you know any easy/elegant way to transform a standard function into a typecons.Tuple-accepting one, I'd be quite interested to see it. I return tuples everywhere and, though doable, I don"t find that to easy to plug another range on it. It's my main point of ugliness in the code, from my pov. At the begining, I just wanted to have some map!(tuplify!foo)(zip(ranges)), but somehow it didn't work. I'll try that again. Anyway, I'll go public, clean it somewhat and let this on the backburner from some time, see what comes out of it. And I've some bugs for std.range (mainly in chain). I gather the right way to do that would be to put it on bugzilla? Is it OK to propose some slight code change? (chain assumes opIndexAssign or .length I think, and has some trouble if there is an infinite range somewhere, hasLength does not work for ranges with .length enclosed in static if, which is a common case, etc.) Bye, Philippe
Nov 01 2009
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Philippe Sigaud:

Hello, it seems you have re-done with ranges for D2 a good amount of things I
did for D1:
http://www.fantascienza.net/leonardo/so/libs_d.zip

- bimap/nmap/map: having a single map able to do all that is better. The same
for filters.
- setComprehension: better to have a set(range).
- comprehension: see select() in my dlibs.

Bye,
bearophile
Nov 02 2009
parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
On Mon, Nov 2, 2009 at 10:14, bearophile <bearophileHUGS lycos.com> wrote:

 Philippe Sigaud:

 Hello, it seems you have re-done with ranges for D2 a good amount of things
 I did for D1:
 http://www.fantascienza.net/leonardo/so/libs_d.zip

 Ah yes, I meant to look at your libraries, but forgot to do it. I'll have a
look right now, to see how you solved some problems.
 - bimap/nmap/map: having a single map able to do all that is better. The
 same for filters.
Well, yes and no. It has grown organically, from the simple structs (mapping and filtering with binary functions) to the more complex ones. I have a few other ranges I want to do and then will begin the cutting down and refactoring. I coded a range segmentation as delays on a zip-range tonight (thanks Andrei for the idea!) and it seems to work. That will allow me to factor some things away. Also, I like having generic and adaptative functions as much as the next guy, but when I tried to put every functionality inside one struct, it became some kind of godzilla of static ifs. So right now, I prefer having some specialised functions to do a more precise work, sometimes even more efficiently, that I can (theoritically) plug one into another. And, right now, I'm limited by the fact that templates can only have one variadic type, for self-evident reasons. So you cannot have both * map!(fun1, fun2, fun3...)(r) behavior (mapping one range with a tuple of functions) that is, std.algo.map behavior, and, * map!(fun, R...)(R manyRanges) (mapping one function on many ranges in parallel). because the ... part in the template can exist only once. map(fun..., R ...)(R ranges) cannot exist. Well it can, but that means separating the funs from the ranges by filtering on the variadic arguments at compile time and I'm a bit leery of doing this, for not that much functionality added. - setComprehension: better to have a set(range).

I'm afraid I disagree. I would expect set(range) to create a Set collection
or to transform a range into a set (I called that asSet!R)
And anyway, I guess I'll just provide an asSet!R wrapper on one hand and
comprehension on the other hand.


 - comprehension: see select() in my dlibs.
I will and gladly, because I'd like to know how you've tackled some problems. Right now, the comprehension range I propose can do all the computations you'll see everywhere on blogs and wikipedia: // python S = (2*x for x in count() if x**2 > 3) // D auto S = comprehension!("2*a", "a*a>3")(numbers()); But not the rebindings: // Scala for (p <- persons; if !p.isMale; c <- p.children) yield (c.name, p.name) There, c comes from p.children, which is a list created anew for each p. It means that inside the for expression, p can be used to create new ranges. Obviously, I can nest the standard comprehensions (first generating the p's, then the c's from the p and creating the tuples(p,c) from there). But I gather I can have it done internally, with a combinations of flatMap and such. Philippe
Nov 02 2009