www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - std.container: RedBlackTree questions

reply "Ivan Kazmenko" <gassa mail.ru> writes:
Hi!

The key points of this lengthy letter are:

(0) I find RedBlackTree hard to use, and below is one story of 
trying to get it to work.

(1) Perhaps I did something wrong a few times.  Please show me a 
better way to go.

(2) Perhaps the library code could be improved as well to make 
the process more intuitive.

-----

I am trying to use RedBlackTree container to maintain a set of 
Elems, Elem being a non-trivial struct.  The problem is that I 
find the container hard to use.

First, I have to note that, instead of storing Elems directly in 
the RedBlackTree, I have a dynamic array of Elems and a 
RedBlackTree of integers pointing to that array to improve 
performance.  In my case, the latter proved to be a few times 
faster than the former because RedBlackTree moves the values 
around quite a bit.  Please comment if there is a common way to 
achieve a similar performance gain without decoupling.

Anyway, instead of storing say "data[a]" and "data[b]" in the 
tree, I store their indices "a" and "b" and compare these 
integers like "data[a] < data[b]".  Below is the story of my few 
steps to making this work.  I'll appreciate any comments on how I 
could improve or take a better direction on any of the steps.

The compiler is DMD 2.063.2, no compile options.  Struct Elem is 
replaced by an alias to long for simplicity.


1. The first try is to specify the comparison directly:
-----
import std.container;
alias Elem = long;
Elem [] data;
RedBlackTree !(int, "data[a] < data[b]") tree;
void main () { }
-----

This gives the following error:
rbt1.d(7): Error: template instance RedBlackTree!(int, "data[a] < 
data[b]")
does not match template declaration RedBlackTree(T, alias less = 
"a < b",
bool allowDuplicates = false) if 
(is(typeof(binaryFun!(less)(T.init, T.init))))

OK, that was predictable.  I never got to getting this to work 
beyond a simple "a < b" or "a + 1 > b + 1" or the like.  Is that 
even possible?


2. Let us try a lambda:
-----
import std.container;
alias Elem = long;
Elem [] data;
RedBlackTree !(int, (a, b) => data[a] < data[b]) tree;
void main ()
{
	tree = redBlackTree !((a, b) => data[a] < data[b]) (new int [0]);
}
-----

Fine with the declaration, but later, an error again:
rbt2.d(11): Error: cannot implicitly convert expression 
(redBlackTree(new
int[](0u))) of type rbt2.main.RedBlackTree!(int, 
__lambda6).RedBlackTree to
rbt2.RedBlackTree!(int, __lambda3).RedBlackTree

I see.  The compiler cannot tell that the lambdas are the same.  
Oh well.


3. An ordinary function then:
-----
import std.container;
alias Elem = long;
Elem [] data;
bool less_data (int a, int b) {return data[a] < data[b];}
RedBlackTree !(int, less_data) tree;
void main ()
{
	tree = redBlackTree !(less_data) (new int [0]);
}
-----

Aaaaaaand:
phobos\std\container.d(6553): Error: less_data (int a, int b) is 
not callable using argument types ()
phobos\std\range.d(611): Error: static assert "Cannot put a 
char[] into a Appender!(string)"
phobos\std\format.d(1433): instantiated from here: 
put!(Appender!(string), char[])
phobos\std\format.d(1335): instantiated from here: 
formatUnsigned!(Appender!(string), char)
phobos\std\format.d(1309): instantiated from here: 
formatIntegral!(Appender!(string), ulong, char)
phobos\std\format.d(2950): ... (4 instantiations, -v to show) ...
phobos\std\container.d(5541): instantiated from here: 
Tuple!(bool, "added", RBNode!(int)*, "n")
rbt3.d(12): instantiated from here: RedBlackTree!(int, less_data)

Ouch.  What?  That does not look like a user-friendly message at 
all.  Am I doing something very wrong?..


4. After a bit of guessing, I got this working by adding an empty 
template argument to the function.  Here it goes:
-----
import std.container;
alias Elem = long;
Elem [] data;
bool less_data () (int a, int b) {return data[a] < data[b];}
RedBlackTree !(int, less_data) tree;
void main ()
{
	tree = redBlackTree !(less_data) (new int [0]);
	data = [4, 3, 5];
	tree.insert (0);
	tree.insert (1);
	tree.insert (2);
}
-----

This compiles and runs fine.  Or does it?  Adding "-unittest" 
compiler option produces another bunch of errors:
phobos\std\container.d(5672): Error: void has no value
phobos\std\container.d(5672): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(5946): Error: void has no value
phobos\std\container.d(5946): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(6021): Error: void has no value
phobos\std\container.d(6021): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(6067): Error: void has no value
phobos\std\container.d(6067): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(6108): Error: void has no value
phobos\std\container.d(6108): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(6229): Error: void has no value
phobos\std\container.d(6229): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
phobos\std\container.d(6328): Error: void has no value
phobos\std\container.d(6328): Error: incompatible types for 
((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'

So, now I have a working program unless I want to run a unittest, 
in which case, it does not compile.  I wonder what's so wrong 
with the examples 3 and 4, and how do I get them to compile 
regardless of compiler options?

On a relevant note, I find the unittests of RedBlackTree a bit 
excessive even when they compile successfully.  They seem to test 
the integrity of the whole tree every time a tree operation takes 
place, and that makes the unittests version of my local code run 
too slowly.  Is there a way to turn unittests on only for user 
code and turn them off for the standard library?

Ivan Kazmenko.
Aug 01 2013
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko wrote:
 On a relevant note, I find the unittests of RedBlackTree a bit 
 excessive even when they compile successfully.  They seem to 
 test the integrity of the whole tree every time a tree 
 operation takes place, and that makes the unittests version of 
 my local code run too slowly.  Is there a way to turn unittests 
 on only for user code and turn them off for the standard 
 library?

 Ivan Kazmenko.
Unless you've compiled phobos with -unittest, the unittests in the standard library won't even exist in the binary, let alone take up time to run. Unless... I'm mistaken and actually for some bizarre reason unittests get dragged in from the import files, but that seems very unlikely.
Aug 01 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Thursday, 1 August 2013 at 12:55:30 UTC, John Colvin wrote:
 On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko wrote:
 On a relevant note, I find the unittests of RedBlackTree a bit 
 excessive even when they compile successfully.  They seem to 
 test the integrity of the whole tree every time a tree 
 operation takes place, and that makes the unittests version of 
 my local code run too slowly.  Is there a way to turn 
 unittests on only for user code and turn them off for the 
 standard library?

 Ivan Kazmenko.
Unless you've compiled phobos with -unittest, the unittests in the standard library won't even exist in the binary, let alone take up time to run. Unless... I'm mistaken and actually for some bizarre reason unittests get dragged in from the import files, but that seems very unlikely.
There is a
 version(unittest) version = RBDoChecks;
line and the following
 version(RBDoChecks) check();
calls in the tree implementation. Perhaps the approach is special to RedBlackTree. I agree that RBDoChecks can be useful occasionally, for example, when there is a bug in comparison function. But I would be happy if the aforementioned line is removed from the library, or at least a way to override it is provided.
Aug 01 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Thursday, 1 August 2013 at 13:34:31 UTC, Ivan Kazmenko wrote:
 On Thursday, 1 August 2013 at 12:55:30 UTC, John Colvin wrote:
 On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko 
 wrote:
 On a relevant note, I find the unittests of RedBlackTree a 
 bit excessive even when they compile successfully.  They seem 
 to test the integrity of the whole tree every time a tree 
 operation takes place, and that makes the unittests version 
 of my local code run too slowly.  Is there a way to turn 
 unittests on only for user code and turn them off for the 
 standard library?

 Ivan Kazmenko.
Unless you've compiled phobos with -unittest, the unittests in the standard library won't even exist in the binary, let alone take up time to run. Unless... I'm mistaken and actually for some bizarre reason unittests get dragged in from the import files, but that seems very unlikely.
There is a
 version(unittest) version = RBDoChecks;
line and the following
 version(RBDoChecks) check();
calls in the tree implementation. Perhaps the approach is special to RedBlackTree. I agree that RBDoChecks can be useful occasionally, for example, when there is a bug in comparison function. But I would be happy if the aforementioned line is removed from the library, or at least a way to override it is provided.
I'm confused. I think none of RedBlackTree code is pre-compiled since it has compile-time parameters. But when I comment the
 version(unittest) version = RBDoChecks;
line in Phobos and recompile my example 4, I still get the same errors with "-unittest" compiler option. What's going on?
Aug 02 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 2 August 2013 at 14:36:55 UTC, Ivan Kazmenko wrote:
 On Thursday, 1 August 2013 at 13:34:31 UTC, Ivan Kazmenko wrote:
 On Thursday, 1 August 2013 at 12:55:30 UTC, John Colvin wrote:
 On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko 
 wrote:
 On a relevant note, I find the unittests of RedBlackTree a 
 bit excessive even when they compile successfully.  They 
 seem to test the integrity of the whole tree every time a 
 tree operation takes place, and that makes the unittests 
 version of my local code run too slowly.  Is there a way to 
 turn unittests on only for user code and turn them off for 
 the standard library?

 Ivan Kazmenko.
Unless you've compiled phobos with -unittest, the unittests in the standard library won't even exist in the binary, let alone take up time to run. Unless... I'm mistaken and actually for some bizarre reason unittests get dragged in from the import files, but that seems very unlikely.
There is a
 version(unittest) version = RBDoChecks;
line and the following
 version(RBDoChecks) check();
calls in the tree implementation. Perhaps the approach is special to RedBlackTree. I agree that RBDoChecks can be useful occasionally, for example, when there is a bug in comparison function. But I would be happy if the aforementioned line is removed from the library, or at least a way to override it is provided.
I'm confused. I think none of RedBlackTree code is pre-compiled since it has compile-time parameters. But when I comment the
 version(unittest) version = RBDoChecks;
line in Phobos and recompile my example 4, I still get the same errors with "-unittest" compiler option. What's going on?
ugh yeah that's not nice. There is also the variable doUnittest in there The unittests really need sorting out there. My suggestion for now would be to strip out all the unittests from it yourself. Also, create a bug report for it and hopefully someone will fix it.
Aug 02 2013
prev sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko wrote:
 Hi!

 The key points of this lengthy letter are:

 (0) I find RedBlackTree hard to use, and below is one story of 
 trying to get it to work.

 (1) Perhaps I did something wrong a few times.  Please show me 
 a better way to go.

 (2) Perhaps the library code could be improved as well to make 
 the process more intuitive.

 -----

 I am trying to use RedBlackTree container to maintain a set of 
 Elems, Elem being a non-trivial struct.  The problem is that I 
 find the container hard to use.

 First, I have to note that, instead of storing Elems directly 
 in the RedBlackTree, I have a dynamic array of Elems and a 
 RedBlackTree of integers pointing to that array to improve 
 performance.  In my case, the latter proved to be a few times 
 faster than the former because RedBlackTree moves the values 
 around quite a bit.  Please comment if there is a common way to 
 achieve a similar performance gain without decoupling.

 Anyway, instead of storing say "data[a]" and "data[b]" in the 
 tree, I store their indices "a" and "b" and compare these 
 integers like "data[a] < data[b]".  Below is the story of my 
 few steps to making this work.  I'll appreciate any comments on 
 how I could improve or take a better direction on any of the 
 steps.

 The compiler is DMD 2.063.2, no compile options.  Struct Elem 
 is replaced by an alias to long for simplicity.


 1. The first try is to specify the comparison directly:
 -----
 import std.container;
 alias Elem = long;
 Elem [] data;
 RedBlackTree !(int, "data[a] < data[b]") tree;
 void main () { }
 -----

 This gives the following error:
 rbt1.d(7): Error: template instance RedBlackTree!(int, "data[a] 
 < data[b]")
 does not match template declaration RedBlackTree(T, alias less 
 = "a < b",
 bool allowDuplicates = false) if 
 (is(typeof(binaryFun!(less)(T.init, T.init))))

 OK, that was predictable.  I never got to getting this to work 
 beyond a simple "a < b" or "a + 1 > b + 1" or the like.  Is 
 that even possible?


 2. Let us try a lambda:
 -----
 import std.container;
 alias Elem = long;
 Elem [] data;
 RedBlackTree !(int, (a, b) => data[a] < data[b]) tree;
 void main ()
 {
 	tree = redBlackTree !((a, b) => data[a] < data[b]) (new int 
 [0]);
 }
 -----

 Fine with the declaration, but later, an error again:
 rbt2.d(11): Error: cannot implicitly convert expression 
 (redBlackTree(new
 int[](0u))) of type rbt2.main.RedBlackTree!(int, 
 __lambda6).RedBlackTree to
 rbt2.RedBlackTree!(int, __lambda3).RedBlackTree

 I see.  The compiler cannot tell that the lambdas are the same.
  Oh well.


 3. An ordinary function then:
 -----
 import std.container;
 alias Elem = long;
 Elem [] data;
 bool less_data (int a, int b) {return data[a] < data[b];}
 RedBlackTree !(int, less_data) tree;
 void main ()
 {
 	tree = redBlackTree !(less_data) (new int [0]);
 }
 -----

 Aaaaaaand:
 phobos\std\container.d(6553): Error: less_data (int a, int b) 
 is not callable using argument types ()
 phobos\std\range.d(611): Error: static assert "Cannot put a 
 char[] into a Appender!(string)"
 phobos\std\format.d(1433): instantiated from here: 
 put!(Appender!(string), char[])
 phobos\std\format.d(1335): instantiated from here: 
 formatUnsigned!(Appender!(string), char)
 phobos\std\format.d(1309): instantiated from here: 
 formatIntegral!(Appender!(string), ulong, char)
 phobos\std\format.d(2950): ... (4 instantiations, -v to show) 
 ...
 phobos\std\container.d(5541): instantiated from here: 
 Tuple!(bool, "added", RBNode!(int)*, "n")
 rbt3.d(12): instantiated from here: RedBlackTree!(int, 
 less_data)

 Ouch.  What?  That does not look like a user-friendly message 
 at all.  Am I doing something very wrong?..


 4. After a bit of guessing, I got this working by adding an 
 empty template argument to the function.  Here it goes:
 -----
 import std.container;
 alias Elem = long;
 Elem [] data;
 bool less_data () (int a, int b) {return data[a] < data[b];}
 RedBlackTree !(int, less_data) tree;
 void main ()
 {
 	tree = redBlackTree !(less_data) (new int [0]);
 	data = [4, 3, 5];
 	tree.insert (0);
 	tree.insert (1);
 	tree.insert (2);
 }
 -----

 This compiles and runs fine.  Or does it?  Adding "-unittest" 
 compiler option produces another bunch of errors:
 phobos\std\container.d(5672): Error: void has no value
 phobos\std\container.d(5672): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(5946): Error: void has no value
 phobos\std\container.d(5946): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(6021): Error: void has no value
 phobos\std\container.d(6021): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(6067): Error: void has no value
 phobos\std\container.d(6067): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(6108): Error: void has no value
 phobos\std\container.d(6108): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(6229): Error: void has no value
 phobos\std\container.d(6229): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'
 phobos\std\container.d(6328): Error: void has no value
 phobos\std\container.d(6328): Error: incompatible types for 
 ((less_data()(int a, int b)) == ("a < b")): 'void' and 'string'

 So, now I have a working program unless I want to run a 
 unittest, in which case, it does not compile.  I wonder what's 
 so wrong with the examples 3 and 4, and how do I get them to 
 compile regardless of compiler options?

 On a relevant note, I find the unittests of RedBlackTree a bit 
 excessive even when they compile successfully.  They seem to 
 test the integrity of the whole tree every time a tree 
 operation takes place, and that makes the unittests version of 
 my local code run too slowly.  Is there a way to turn unittests 
 on only for user code and turn them off for the standard 
 library?

 Ivan Kazmenko.
N°4 is clearly a bug in the implementation. N°3, I'm not sure what is going on with the "put" bugs, but it seems to be fixed in head. In both case, one of the problems is that "redBlackTree!less_data" seems to be taking the wrong overload, which explains some of your problems. I'd use an explicit: tree = new RedBlackTree!(int, less_data)(); //No surprises In any case, yes, it is buggy.
Aug 01 2013
parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Thursday, 1 August 2013 at 14:51:14 UTC, monarch_dodra wrote:
 On Thursday, 1 August 2013 at 12:27:51 UTC, Ivan Kazmenko wrote:
 I am trying to use RedBlackTree container to maintain a set of 
 Elems, Elem being a non-trivial struct.  The problem is that I 
 find the container hard to use.
 <...>
 So, now I have a working program unless I want to run a 
 unittest, in which case, it does not compile.  I wonder what's 
 so wrong with the examples 3 and 4, and how do I get them to 
 compile regardless of compiler options?

 On a relevant note, I find the unittests of RedBlackTree a bit 
 excessive even when they compile successfully.  They seem to 
 test the integrity of the whole tree every time a tree 
 operation takes place, and that makes the unittests version of 
 my local code run too slowly.  Is there a way to turn 
 unittests on only for user code and turn them off for the 
 standard library?

 Ivan Kazmenko.
N°4 is clearly a bug in the implementation. N°3, I'm not sure what is going on with the "put" bugs, but it seems to be fixed in head. In both case, one of the problems is that "redBlackTree!less_data" seems to be taking the wrong overload, which explains some of your problems. I'd use an explicit: tree = new RedBlackTree!(int, less_data)(); //No surprises In any case, yes, it is buggy.
Thank you for the answer. The explicit way indeed helps to compile Example 3 (without unittests) using the official DMD 2.063.2 release. So I should create an issue describing my problems with example 4 but not example 3? And perhaps a separate one for the slowdown with unittests turned on. Right? What about examples 1 and/or 2, can they be patched to work, too? And if not, why?
Aug 02 2013