www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Set Intersection of SortedRanges

reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
What's the fastest way of calculating a set-intersection of two 
or more SortedRanges (all containing unique values)?

Any typical branchings of the algorithm depending on the lengths 
of the SortedRanges?
Dec 05 2016
parent reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Monday, 5 December 2016 at 21:34:39 UTC, Nordlöw wrote:
 What's the fastest way of calculating a set-intersection of two 
 or more SortedRanges (all containing unique values)?
Ahh, setops has intersection aswell: https://dlang.org/phobos/std_algorithm_setops.html#setIntersection I should have a guessed that.
Dec 05 2016
parent reply =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
 Ahh, setops has intersection aswell:

 https://dlang.org/phobos/std_algorithm_setops.html#setIntersection

 I should have a guessed that.
Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?
Dec 05 2016
next sibling parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
 On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
 Ahh, setops has intersection aswell:

 https://dlang.org/phobos/std_algorithm_setops.html#setIntersection

 I should have a guessed that.
Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?
The double binary search linked only finds *one* element of the intersection. It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.
Dec 05 2016
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Tuesday, 6 December 2016 at 01:46:38 UTC, Nicholas Wilson 
wrote:
 On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
 On Monday, 5 December 2016 at 21:48:49 UTC, Nordlöw wrote:
 Ahh, setops has intersection aswell:

 https://dlang.org/phobos/std_algorithm_setops.html#setIntersection

 I should have a guessed that.
Ahh again, but current Phobos is currently not optimized for the case when all inputs are SortedArrays. In that case a double binary search algorithm as describe here http://cs.stackexchange.com/a/37124/25769 might be faster. Has anybody already done this?
The double binary search linked only finds *one* element of the intersection. It should be as simple as a linear find (over the range of ranges) that returns (i.e. caches to .front) if the elements intersect.
Hmm not sure that will work. The annoying property of this problem is calculating .empty and having .empty not mutate the ranges. however if you settle for an eager empty something like auto setIntersect(RoR,alias _pred)(RoR ror) if (allSatisfy!(RoR, isSortedRange!_pred) && allSatisfy!(RoR, isSame!(ElementType) && !allSatisfy!(RoR, isRandomAccessRange)) { enum lengths = RoR.length; alias elem = ElementType!(RoR[0]); alias pred = BinaryFun!_pred; struct SetIntersect { RoR ror; elem[lengths] fronts = void; elem front = void; bool valid = false; this(RoR r) { ror = r; populate(); } void populate() { foreach(i, ref r; ror) fronts[i] = r.front; } void advance() { foreach( ref r; ror) r.popFront(); } void popFront() { valid = false; } bool empty() { if (valid) return false; advance(); populate(); if (reduce!(equal)(fronts)) return false; elem m = reduce!(pred)(fronts); foreach(i, ref r; ror) { while(!pred(r.front,m)) r.popFront(); } populate(); valid = reduce!(equal)(fronts)); return !valid; } } }
Dec 05 2016
parent Nicholas Wilson <iamthewilsonator hotmail.com> writes:
Whoops forgot to add checks for .empty on the ranges, and I don't 
think reduce!equal work but you get the point.
Dec 05 2016
prev sibling parent =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Monday, 5 December 2016 at 22:10:34 UTC, Nordlöw wrote:
 Has anybody already done this?
This gives good guidance on three different approaches. Luckily we already have galloping search in Phobos :) I'll do a couple of benchmarks.
Dec 06 2016