digitalmars.D - hashGroup, pairwise
- bearophile (97/97) Jan 06 2015 In Phobos we have std.array.assocArray that allows to build a
In Phobos we have std.array.assocArray that allows to build a built-in associative array from a range of key-value Phobos Tuples (it will become mostly legacy when we will have a good associative array in Phobos and built-in tuples in D, that is the opposite of the data structures we have today). But there is another very common pattern of creation of associative arrays, where you want to "accumulate" inside the values. The basic form of accumulation is counting: //A const a = [1, 2, 1, 3, 5, 1, 2]; uint[int] aa1; foreach (x; a) aa1[x]++; aa1.writeln; // [1:3, 5:1, 2:2, 3:1] Another example is just emumeration in an array: //B const b = [-1, 2, 1, 3, 5, -1, -2]; int[][int] aa2; foreach (x; b) aa2[x.abs] ~= x; aa2.writeln; // [1:[-1, 1, -1], 5:[5], 2:[2, -2], 3:[3]] Or even in a set (here implemented with another associative array of int-bools): //C bool[int][int] aa3; foreach (x; b) aa3[x.abs][x] = true; aa3.writeln; // [1:[1:true, -1:true], 5:[5:true], 2:[2:true, -2:true], 3:[3:true]] There are ways to perform those operations with Phobos today, but the code is so bad looking that it's better to skip this. So I suggest to add a std.array.hashGroup function to Phobos that as argument takes a range, and as template argument takes one or two functions. The first function is a mapping key unary function. The second function is optional and takes as argument the sub-range of grouped values. The patterns above become: //A a.hashGroup!(id, count).writeln; //B a.hashGroup!abs.writeln; //C a.hashGroup!(abs, items => items.zip(true.repeat).assocArray)).writeln; Where "id" is the identity function (not present in Phobos): //A a.hashGroup!(x => x, count).writeln; Once Phobos has a Set data structure with a constructor helper function that accepts a range as argument you can also write just: //C a.hashGroup!(abs, set).writeln; ---------------------- Another very commonly useful range function is pairwise (https://issues.dlang.org/show_bug.cgi?id=6788 ). It's a range similar to cartesianProduct, but it generates a different and equally commonly useful sequence: [1, 2, 3, 4].pairwise Or: [1, 2, 3, 4].pairwise!2 should yield: (1, 2) (1, 3) (1, 4) (2, 3) (2, 4) (3, 4) [1, 2, 3, 4].pairwise!3 should yield: (1, 2, 3) (1, 2, 4) (1, 3, 4) (2, 3, 4) So despite its name pairwise() can generate more than just pairs (so better names are welcome). Like in this lower level code that generates those outputs: void main() { import std.stdio, std.range, std.algorithm; auto a = [1, 2, 3, 4]; foreach (immutable i; 0 .. a.length) foreach (immutable j; i + 1 .. a.length) writefln("(%d, %d)", a[i], a[j]); writeln; foreach (immutable i; 0 .. a.length) foreach (immutable j; i + 1 .. a.length) foreach (immutable k; j + 1 .. a.length) writefln("(%d, %d, %d)", a[i], a[j], a[k]); } Essentially pairwise() encapsulates in a handy range the very common pattern of nested loops that iterate on distinct pairs, triples, etc. ---------------------- http://msdn.microsoft.com/en-us/library/ee353635.aspx Bye, bearophile
Jan 06 2015