## digitalmars.D.learn - Set Intersection of SortedRanges

- =?UTF-8?B?Tm9yZGzDtnc=?= (4/4) Dec 05 2016 What's the fastest way of calculating a set-intersection of two
- =?UTF-8?B?Tm9yZGzDtnc=?= (4/6) Dec 05 2016 Ahh, setops has intersection aswell:
- =?UTF-8?B?Tm9yZGzDtnc=?= (7/10) Dec 05 2016 Ahh again, but current Phobos is currently not optimized for the
- Nicholas Wilson (6/18) Dec 05 2016 The double binary search linked only finds *one* element of the
- Nicholas Wilson (56/78) Dec 05 2016 Hmm not sure that will work.
- Nicholas Wilson (2/2) Dec 05 2016 Whoops forgot to add checks for .empty on the ranges, and I don't
- =?UTF-8?B?Tm9yZGzDtnc=?= (5/6) Dec 06 2016 This gives good guidance on three different approaches.

What's the fastest way of calculating a set-intersection of two or more SortedRanges (all containing unique values)? Any typical branchings of the algorithm depending on the lengths of the SortedRanges?

Dec 05 2016

On Monday, 5 December 2016 at 21:34:39 UTC, Nordlöw wrote:What's the fastest way of calculating a set-intersection of two or more SortedRanges (all containing unique values)?Ahh, setops has intersection aswell: https://dlang.org/phobos/std_algorithm_setops.html#setIntersection I should have a guessed that.

Dec 05 2016

On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:Ahh, setops has intersection aswell: https://dlang.org/phobos/std_algorithm_setops.html#setIntersection I should have a guessed that.Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?

Dec 05 2016

On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:The double binary search linked only finds *one* element of the intersection. It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.Ahh, setops has intersection aswell: https://dlang.org/phobos/std_algorithm_setops.html#setIntersection I should have a guessed that.Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?

Dec 05 2016

On Tuesday, 6 December 2016 at 01:46:38 UTC, Nicholas Wilson wrote:On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:Hmm not sure that will work. The annoying property of this problem is calculating .empty and having .empty not mutate the ranges. however if you settle for an eager empty something like auto setIntersect(RoR,alias _pred)(RoR ror) if (allSatisfy!(RoR, isSortedRange!_pred) && allSatisfy!(RoR, isSame!(ElementType) && !allSatisfy!(RoR, isRandomAccessRange)) { enum lengths = RoR.length; alias elem = ElementType!(RoR[0]); alias pred = BinaryFun!_pred; struct SetIntersect { RoR ror; elem[lengths] fronts = void; elem front = void; bool valid = false; this(RoR r) { ror = r; populate(); } void populate() { foreach(i, ref r; ror) fronts[i] = r.front; } void advance() { foreach( ref r; ror) r.popFront(); } void popFront() { valid = false; } bool empty() { if (valid) return false; advance(); populate(); if (reduce!(equal)(fronts)) return false; elem m = reduce!(pred)(fronts); foreach(i, ref r; ror) { while(!pred(r.front,m)) r.popFront(); } populate(); valid = reduce!(equal)(fronts)); return !valid; } } }On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:The double binary search linked only finds *one* element of the intersection. It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.Ahh, setops has intersection aswell: https://dlang.org/phobos/std_algorithm_setops.html#setIntersection I should have a guessed that.Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?

Dec 05 2016

Whoops forgot to add checks for .empty on the ranges, and I don't think reduce!equal work but you get the point.

Dec 05 2016

On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:Has anybody already done this?This gives good guidance on three different approaches. http://stackoverflow.com/questions/2400157/the-intersection-of-two-sorted-arrays/4601106#4601106 Luckily we already have galloping search in Phobos :) I'll do a couple of benchmarks.

Dec 06 2016