digitalmars.D - About std.container.RedBlackTree
- bearophile (81/81) Jan 10 2011 I've had to use a search tree, so RedBlackTree was the right data struct...
- Steven Schveighoffer (66/159) Jan 11 2011 I will do this, when I have some free time. If you want to submit some ...
- spir (15/23) Jan 11 2011 I have thought at this naming issue, precisely, for a while.
- Lars T. Kyllingstad (4/25) Jan 12 2011 I seem to remember this being discussed before, and 'put' being rejected...
- Steven Schveighoffer (4/28) Jan 13 2011 technically, it is an output range ;)
- bearophile (19/54) Jan 13 2011 It's not your fault. The forward reference errors I've found with RedBla...
- bearophile (3/4) Jan 13 2011 http://d.puremagic.com/issues/show_bug.cgi?id=5451
I've had to use a search tree, so RedBlackTree was the right data structure. It seems to do what I need, so thank you for this useful data structure. Some of the things I write here are questions or things that show my ignorance about this implementation. --------------------- Please add some usage examples to this page, this is important and helps people reduce a lot the number of experiments to do to use this tree: --------------------- This doesn't seem to work, it gives a forward reference error: import std.container: RedBlackTree; RedBlackTree!int t; void main() { t = RedBlackTree!int(1); } This gives a similar error: import std.container: RedBlackTree; struct Foo { RedBlackTree!int t; void initialize() { t = RedBlackTree!int(1); } } void main() {} --------------------- I need to create an empty tree and add items to it later (where I declare a fixed-sized array of trees I don't know the items to add). How do you do it? This code doesn't work: import std.container: RedBlackTree; void main() { auto t = RedBlackTree!int(); t.insert(1); } --------------------- Is the tree insert() method documented in the HTML docs? --------------------- A tree is a kind of set, so instead of "insert()" I'd like a name like "add()". (But maybe this is not standard in D). --------------------- In theory an helper redBlackTree() function allows to write just this, with no need to write types: redBlackTree(1, 2, 3) --------------------- I have tried to use printTree(), but I have failed. I don't know what to give to it and the docs don't say that it requires -unittest If it's a private debug function then there's no need to give it a ddoc comment. --------------------- I've seen that the tree doesn't seem to contain a length. Using walkLength is an option, but a possible idea is to replace: struct RedBlackTree(T,alias less = "a < b",bool allowDuplicates = false) With: struct RedBlackTree(T, alias less="a < b", bool allowDuplicates=false, bool withLength=false) --------------------- If you need to add many value nodes quickly to the tree a memory pool may speed up mass allocation. This has some disadvantages too. --------------------- If I have to create a set of epsilon-approximate floating point numbers, what's the most efficient way to use a RedBlackTree to see if it contains a value x (or values very close to x), and othrwise add x to the tree? I was thinking about something like this, but it doesn't look very good because it performs three lookups (this function returns the number of items added): int approxInsert(RedBlackTree!double set, double x, double epsilon) { if (x in set) { return 0; } else { auto nextOnes = set.upperBound(x); if (!nextOnes.empty() && (nextOnes.front - x) < epsilon) { return 0; } auto precedentOnes = set.lowerBound(x); if (!precedentOnes.empty() && (x - precedentOnes.back) < epsilon) { return 0; } else { set.insert(x); return 1; } } } --------------------- This code runs with no errors, is this correct? The two Foos have the same x but different y: import std.container: RedBlackTree; struct Foo { int x, y; } void main() { auto t = RedBlackTree!(Foo, "a.x < b.x")(Foo(1,1)); assert(Foo(1,2) in t); } (In practice this looks like a tree map instead of a tree set.) If this is correct then I suggest to add this example to the std_container.html page. --------------------- Bye and thank you, bearophile
Jan 10 2011
On Mon, 10 Jan 2011 18:14:31 -0500, bearophile <bearophileHUGS lycos.com> wrote:I've had to use a search tree, so RedBlackTree was the right data structure. It seems to do what I need, so thank you for this useful data structure. Some of the things I write here are questions or things that show my ignorance about this implementation. --------------------- Please add some usage examples to this page, this is important and helps people reduce a lot the number of experiments to do to use this tree:I will do this, when I have some free time. If you want to submit some examples, I would gladly include them.--------------------- This doesn't seem to work, it gives a forward reference error: import std.container: RedBlackTree; RedBlackTree!int t; void main() { t = RedBlackTree!int(1); }Grrr... I had issues with forward references (you can see from this comment: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std container.d#L4071), I thought by reordering the functions I had fixed it, but apparently, it resurfaces under certain conditions. Please vote for that bug. http://d.puremagic.com/issues/show_bug.cgi?id=2810 I don't really know what to do about fixing it. Most likely any 'fix' I try will result in some other situation not compiling. I probably should just avoid using auto, not being able to declare a red black tree as a global variable is a huge limitation.--------------------- I need to create an empty tree and add items to it later (where I declare a fixed-sized array of trees I don't know the items to add). How do you do it? This code doesn't work: import std.container: RedBlackTree; void main() { auto t = RedBlackTree!int(); t.insert(1); }RedBlackTree must be initialized with a constructor. Otherwise, your root node is null. I chose this path instead of checking for null on every function. I realize the mistake -- you cannot create an empty tree, because you cannot have a default constructor. I have another function that I use to help create trees during unit tests because IFTI can be weird. I will make this function public and always present, then you can create an empty tree like this: auto t = RedBlackTree!int.create(); If Andrei decides eventually that containers should be classes, then this problem goes away. Please bugzillize this--------------------- Is the tree insert() method documented in the HTML docs?I thought this would do it, but apparently it doesn't: http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/container.d#L4457 I will try to make those docs show up.--------------------- A tree is a kind of set, so instead of "insert()" I'd like a name like "add()". (But maybe this is not standard in D).The function names must be consistent across containers, because the point is that complexity and semantic requirements are attached to the function name. The function names were decided long ago by Andrei, and I don't think insert is a bad name (I believe std::set and std::map in C++ STL uses insert).--------------------- In theory an helper redBlackTree() function allows to write just this, with no need to write types: redBlackTree(1, 2, 3)Yes, this should be done. Please make a bugzilla report. In fact, this can extend to all std.container types.--------------------- I have tried to use printTree(), but I have failed. I don't know what to give to it and the docs don't say that it requires -unittest If it's a private debug function then there's no need to give it a ddoc comment.It is a private debug function, only enabled when version = doRBChecks is enabled. When developing the red black node, the red-black algorithms to fix the tree are very complex and error prone to write. This function basically printed the tree layout in a horribly ugly fashion when the red-black properties were not preserved. It helped me find bugs, but is mostly no-longer needed unless I try some more optimizations. Please ignore the function. I will make sure the comment is not ddoc'd.--------------------- I've seen that the tree doesn't seem to contain a length. Using walkLength is an option, but a possible idea is to replace: struct RedBlackTree(T,alias less = "a < b",bool allowDuplicates = false) With: struct RedBlackTree(T, alias less="a < b", bool allowDuplicates=false, bool withLength=false)The reason for this is to keep it a reference-type (pImpl style), but I realize that I can easily fix this (I can just make the root node contain a length field). Please file a bugzilla to add length.--------------------- If you need to add many value nodes quickly to the tree a memory pool may speed up mass allocation. This has some disadvantages too.I have done this in dcollections, and it helps immensely in node-based containers. It is not clear to me how std.container will incorporate these types of things, so Andrei and I decided to just defer it for now. RedBlackTree is easily modified to use an allocator, all allocation and de-allocations are done via a centralized function. In fact, the same RBNode is used in dcollections. The allocator I created in dcollections is also the default allocator in Tango containers too. I anticipate that something like it will eventually make its way into phobos.--------------------- If I have to create a set of epsilon-approximate floating point numbers, what's the most efficient way to use a RedBlackTree to see if it contains a value x (or values very close to x), and othrwise add x to the tree? I was thinking about something like this, but it doesn't look very good because it performs three lookups (this function returns the number of items added): int approxInsert(RedBlackTree!double set, double x, double epsilon) { if (x in set) { return 0; } else { auto nextOnes = set.upperBound(x); if (!nextOnes.empty() && (nextOnes.front - x) < epsilon) { return 0; } auto precedentOnes = set.lowerBound(x); if (!precedentOnes.empty() && (x - precedentOnes.back) < epsilon) { return 0; } else { set.insert(x); return 1; } } }What about changing the predicate? I know it will likely violate the requirements of a strict ordering, but it should probably work.--------------------- This code runs with no errors, is this correct? The two Foos have the same x but different y: import std.container: RedBlackTree; struct Foo { int x, y; } void main() { auto t = RedBlackTree!(Foo, "a.x < b.x")(Foo(1,1)); assert(Foo(1,2) in t); } (In practice this looks like a tree map instead of a tree set.) If this is correct then I suggest to add this example to the std_container.html page.Yes, the intention is that you can create a map type in this fashion. If you define a predicate, then it is up to you to ensure the predicate makes sense for you. I'm unsure whether a map type could actually be implemented using RedBlackTree, but if it's not possible, I will make it possible.--------------------- Bye and thank you, bearophileThank you for your detailed analysis! I will hope to fix all the problems soon. -Steve
Jan 11 2011
On 01/11/2011 02:22 PM, Steven Schveighoffer wrote:I have thought at this naming issue, precisely, for a while. "add" is bad because of connotation with "addition". D does not use '+' as operator for putting new elements in a container: this is a very sensible choice imo. "insert" is bad because of "in-between" connotation: does not fit when putting an element at the end of a seq, even less for unordered containers. "put" instead seems to me the right term, obvious and general enough: one puts a new element in there. This can nicely adapt to very diverse container types such as sequences including stacks (no explicite index --> put at end), sets/AAs, trees,... Denis _________________ vita es estrany spir.wikidot.comA tree is a kind of set, so instead of "insert()" I'd like a name like "add()". (But maybe this is not standard in D).The function names must be consistent across containers, because the point is that complexity and semantic requirements are attached to the function name. The function names were decided long ago by Andrei, and I don't think insert is a bad name (I believe std::set and std::map in C++ STL uses insert).
Jan 11 2011
On Tue, 11 Jan 2011 15:13:13 +0100, spir wrote:On 01/11/2011 02:22 PM, Steven Schveighoffer wrote:I seem to remember this being discussed before, and 'put' being rejected for fear of confusion with the output range interface. -LarsI have thought at this naming issue, precisely, for a while. "add" is bad because of connotation with "addition". D does not use '+' as operator for putting new elements in a container: this is a very sensible choice imo. "insert" is bad because of "in-between" connotation: does not fit when putting an element at the end of a seq, even less for unordered containers. "put" instead seems to me the right term, obvious and general enough: one puts a new element in there. This can nicely adapt to very diverse container types such as sequences including stacks (no explicite index --> put at end), sets/AAs, trees,...A tree is a kind of set, so instead of "insert()" I'd like a name like "add()". (But maybe this is not standard in D).The function names must be consistent across containers, because the point is that complexity and semantic requirements are attached to the function name. The function names were decided long ago by Andrei, and I don't think insert is a bad name (I believe std::set and std::map in C++ STL uses insert).
Jan 12 2011
On Wed, 12 Jan 2011 03:45:11 -0500, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:On Tue, 11 Jan 2011 15:13:13 +0100, spir wrote:technically, it is an output range ;) -SteveOn 01/11/2011 02:22 PM, Steven Schveighoffer wrote:I seem to remember this being discussed before, and 'put' being rejected for fear of confusion with the output range interface.I have thought at this naming issue, precisely, for a while. "add" is bad because of connotation with "addition". D does not use '+' as operator for putting new elements in a container: this is a very sensible choice imo. "insert" is bad because of "in-between" connotation: does not fit when putting an element at the end of a seq, even less for unordered containers. "put" instead seems to me the right term, obvious and general enough: one puts a new element in there. This can nicely adapt to very diverse container types such as sequences including stacks (no explicite index --> put at end), sets/AAs, trees,...A tree is a kind of set, so instead of "insert()" I'd like a name like "add()". (But maybe this is not standard in D).The function names must be consistent across containers, because the point is that complexity and semantic requirements are attached to the function name. The function names were decided long ago by Andrei, and I don't think insert is a bad name (I believe std::set and std::map in C++ STL uses insert).
Jan 13 2011
Sorry for being so late with my answer. I am busy for some days. Steven Schveighoffer:I don't really know what to do about fixing it.It's not your fault. The forward reference errors I've found with RedBlackTree are so bad that in the end I've created a different algorithm that doesn't use a search tree.I probably should just avoid using auto,Good.not being able to declare a red black tree as a global variable is a huge limitation.<Also as instance variable for a struct.RedBlackTree must be initialized with a constructor. Otherwise, your root node is null. I chose this path instead of checking for null on every function.This is OK, given your original design choice of using PIMPL.auto t = RedBlackTree!int.create(); Please bugzillize thisOK.The function names must be consistent across containers, because the point is that complexity and semantic requirements are attached to the function name. The function names were decided long ago by Andrei, and I don't think insert is a bad name (I believe std::set and std::map in C++ STL uses insert).Python set uses "add", I prefer it for a set :-) But "insert" is acceptable.OK. I will create a single bugzilla report for RedBlackTree.redBlackTree(1, 2, 3)Yes, this should be done. Please make a bugzilla report. In fact, this can extend to all std.container types.It helped me find bugs, but is mostly no-longer needed unless I try some more optimizations. Please ignore the function. I will make sure the comment is not ddoc'd.Please hide it, but don't remove it from the module :-)OK. The withLength template argument plus some static ifs allow to have zero overhead for people that don't need the O(1) length. I think withLength is better true on default, because removing the length is an optimization...With: struct RedBlackTree(T, alias less="a < b", bool allowDuplicates=false, bool withLength=false)The reason for this is to keep it a reference-type (pImpl style), but I realize that I can easily fix this (I can just make the root node contain a length field). Please file a bugzilla to add length.I have done this in dcollections, and it helps immensely in node-based containers.I agree, I have seen similar things (D1 programs that become twice faster, etc), that's why I have suggested it.It is not clear to me how std.container will incorporate these types of things, so Andrei and I decided to just defer it for now.A template trait...What about changing the predicate? I know it will likely violate the requirements of a strict ordering, but it should probably work.I am not sure it will work, but in the end I have used a very different solution. Implementing an approximate (floating point) hash with a search tree is a common enough need, so you may add something like that approxInsert() as an free templated function in the std.collections module :-)Yes, the intention is that you can create a map type in this fashion. If you define a predicate, then it is up to you to ensure the predicate makes sense for you.OK. Then probably this important explanation (and example too, if you want) must be added to the RedBlackTree docs.Thank you for your detailed analysis!You are welcome. And thank you for implementing this important collection. Bye, bearophile
Jan 13 2011
OK. I will create a single bugzilla report for RedBlackTree.http://d.puremagic.com/issues/show_bug.cgi?id=5451 Bye, bearophile
Jan 13 2011