## digitalmars.D - hashGroup, pairwise

• bearophile (97/97) Jan 06 2015 In Phobos we have std.array.assocArray that allows to build a
```In Phobos we have std.array.assocArray that allows to build a
built-in associative array from a range of key-value Phobos
Tuples (it will become mostly legacy when we will have a good
associative array in Phobos and built-in tuples in D, that is the
opposite of the data structures we have today).

But there is another very common pattern of creation of
associative arrays, where you want to "accumulate" inside the
values.

The basic form of accumulation is counting:

//A
const a = [1, 2, 1, 3, 5, 1, 2];
uint[int] aa1;
foreach (x; a)
aa1[x]++;
aa1.writeln; // [1:3, 5:1, 2:2, 3:1]

Another example is just emumeration in an array:

//B
const b = [-1, 2, 1, 3, 5, -1, -2];
int[][int] aa2;
foreach (x; b)
aa2[x.abs] ~= x;
aa2.writeln; // [1:[-1, 1, -1], 5:[5], 2:[2, -2], 3:[3]]

Or even in a set (here implemented with another associative array
of int-bools):

//C
bool[int][int] aa3;
foreach (x; b)
aa3[x.abs][x] = true;
aa3.writeln; // [1:[1:true, -1:true], 5:[5:true], 2:[2:true,
-2:true], 3:[3:true]]

There are ways to perform those operations with Phobos today, but
the code is so bad looking that it's better to skip this.

So I suggest to add a std.array.hashGroup function to Phobos that
as argument takes a range, and as template argument takes one or
two functions.

The first function is a mapping key unary function.

The second function is optional and takes as argument the
sub-range of grouped values.

The patterns above become:

//A
a.hashGroup!(id, count).writeln;

//B
a.hashGroup!abs.writeln;

//C
a.hashGroup!(abs, items =>
items.zip(true.repeat).assocArray)).writeln;

Where "id" is the identity function (not present in Phobos):

//A
a.hashGroup!(x => x, count).writeln;

Once Phobos has a Set data structure with a constructor helper
function that accepts a range as argument you can also write just:

//C
a.hashGroup!(abs, set).writeln;

----------------------

Another very commonly useful range function is pairwise
(https://issues.dlang.org/show_bug.cgi?id=6788 ). It's a range
similar to cartesianProduct, but it generates a different and
equally commonly useful sequence:

[1, 2, 3, 4].pairwise
Or:
[1, 2, 3, 4].pairwise!2

should yield:

(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)

[1, 2, 3, 4].pairwise!3

should yield:

(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)

So despite its name pairwise() can generate more than just pairs
(so better names are welcome).

Like in this lower level code that generates those outputs:

void main() {
import std.stdio, std.range, std.algorithm;
auto a = [1, 2, 3, 4];

foreach (immutable i; 0 .. a.length)
foreach (immutable j; i + 1 .. a.length)
writefln("(%d, %d)", a[i], a[j]);

writeln;

foreach (immutable i; 0 .. a.length)
foreach (immutable j; i + 1 .. a.length)
foreach (immutable k; j + 1 .. a.length)
writefln("(%d, %d, %d)", a[i], a[j], a[k]);
}

Essentially pairwise() encapsulates in a handy range the very
common pattern of nested loops that iterate on distinct pairs,
triples, etc.

----------------------