www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 6133] New: Improvements to RedBlackTree

http://d.puremagic.com/issues/show_bug.cgi?id=6133

           Summary: Improvements to RedBlackTree
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: soywiz gmail.com



09:17:18 PDT ---
I have made some modifications to RedBlackTree to include a new template
argument "hasStats" that will include information about the count of the left
and right childs.

I have also included some new methods that allow to do in O(log(N)) time the
following things:
* Determine the length of a RedBlackTree.Range (previously you had to iterate
over all the elements to get that information) the only thing you were able to
do in constant time was get the length of the full collection.
* Determine the position of an element in the ordered list in O(log(N)) (that
is the number of elements that are lesser than one Node)
* Slice elements from a Range by position (something like SKIP and LIMIT from
SQL) in O(log(N))

This is extreme useful to do rankings. For example: I have a tree with users
that have a score and I want to know the position of a User ordered by the
score. Also I want to get the users nearby and do pagination without having a
linear cost.

With hasStats to true, the size of Nodes will be 28 instead of 20 on DMD
2/32bits and will allow counting up to 2^^32 nodes. Inserting and deleting is a
bit slower, but allows to do some new operations in O(log(N)).

The class template definition changed from:
class RedBlackTree(T, alias less = "a < b", bool allowDuplicates = false)
to:
class RedBlackTree(T, alias less = "a < b", bool allowDuplicates = false, bool
hasStats = false)

RedBlackTree.Range now contain these new methods. All O(log(N)):
- Range skip(int skipCount)
- Range limit(int limitCount)
- int length()

RedBlackTree now contain these new methods. All O(log(N)):
- int countLesser(Node node) and an alias getNodePosition // Obtains the
position of a Node.
- Node locateNodeAtPosition(int positionToFind) // Obtains the node that
occupies a determinated position
- Range all() // Shortcut to get a range with all elements to querying.

I have made these methods with a fallback when not using hasStats that will
perform O(n).

The implementation (with a self-containing a demo comparing times):
http://code.google.com/p/dutils/source/browse/trunk/rbtree_with_stats/rbtree_with_stats.d?r=132
http://dutils.googlecode.com/svn-history/r132/trunk/rbtree_with_stats/rbtree_with_stats.d

This is a sketch and I had to do some modifications: change Range from struct
to class, change some fields to public...
I think if will be useful for some people, and since it's a new template option
 should not break anything nor being slower or ocupying more memory if not
using.

I wanted to share it to see if you think it's useful. If so, I can improve the
API, clean things up, make unittesting and create a patch for phobos.

Regards.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 09 2011