
digitalmars.D.learn - std.container.RedBlackTree versus C++ std::set

reply "Ivan Kazmenko" <gassa mail.ru> writes:
Hi!

I'm learning to use D collections properly, and I'm looking for a 
sorted data structure with logarithmic access time (i.e., a 
binary search tree will do, but a hash table would not help). As 
far as I can see, std.container.RedBlackTree is exactly what I 
need. However, I am not sure if I use it as intended as its 
performance seems inferior to a C++ STL solution (e.g., std::set).

To be more specific, right now I wonder what is the best (or 
intended) way to store an object in the RedBlackTree: should it 
be a class reference, or a struct (passed by value), or something 
quirkier like an integer pointing into an array or a simple 
pointer. The rest of my program suggested using structs, but the 
whole thing turned out to be rather slow, and the profiler told 
me that these structs were being copied around much more than I 
anticipated.

And so I wrote a minimalistic test program to check the number of 
copy (postblit) constructor calls. Here is the D version:

-----
import std.container;
import std.stdio;

immutable int LIMIT = 100000;

struct element
{
	static int postblit_counter;

	long x;

	int opCmp (ref element other)
	{
		return (x > other.x) - (x < other.x);
	}

	this (long nx)
	{
		x = nx;
	}

	this (this)
	{
		assert (x == x);
		postblit_counter++;
	}
}

alias RedBlackTree !(element) container;

void main ()
{
	auto f = new container ();
	element.postblit_counter = 0;
	foreach (i; 0..LIMIT)
	{
		f.insert (element (i));
	}
	writefln ("%s", element.postblit_counter);
}
-----

And now here is a C++ equivalent:

-----
#include <cstdio>
#include <set>
#include <stdint.h>

const int LIMIT = 100000;

using namespace std;

struct element
{
	static int postblit_counter;

	int64_t x;

	bool operator < (const element other) const
	{
		return (x < other.x);
	}

	element (int64_t nx)
	{
		x = nx;
	}

	element (const element & other)
	{
		x = other.x;
		postblit_counter++;
	}
};

int element::postblit_counter;

typedef set <element> container;

int main (void)
{
	container f;
	element::postblit_counter = 0;
	for (int i = 0; i < LIMIT; i++)
	{
		f.insert (element (i));
	}
	printf ("%d\n", element::postblit_counter);
	return 0;
}
-----

And the results are:
D2 (DMD 2.059, -O):             11,389,556
C++ (MinGW GCC 4.7.2, -O2):      3,072,387

As you can see, in order to insert 100,000 elements, D needs a 
few times more copy constructor calls than C++. However, as far 
as I know, the internal structure used by C++ std::set is the 
very same red-black tree! Am I doing something wrong? And if not, 
what is this cost paid for, are there any features that 
RedBlackTree possesses and STL set doesn't?

Personally, I don't see why at all we should call the copy 
constructor more than once per element. I mean, if we intend to 
build a generic data structure, we sure need an internal node 
object with some extra bytes (internal references and counters) 
per each element, at least in the case of a red-black tree. So 
why don't we just bind each element to that internal node once 
and for all, and then, as long as the node is in the structure, 
use the data contained in it only by reference? What do we 
achieve if we choose to use it by value somewhere?

And if the intended design is "use class references", will that 
create an overhead for garbage collector later on?

...well, thank you for reading it to this point.

-----
Ivan Kazmenko.
Feb 13 2013
next sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
P.S. More on C++ version:

 Personally, I don't see why at all we should call the copy 
 constructor more than once per element. I mean, if we intend to 
 build a generic data structure, we sure need an internal node 
 object with some extra bytes (internal references and counters) 
 per each element, at least in the case of a red-black tree. So 
 why don't we just bind each element to that internal node once 
 and for all, and then, as long as the node is in the structure, 
 use the data contained in it only by reference? What do we 
 achieve if we choose to use it by value somewhere?
Well, when I just changed "bool operator < (const element other) const" to "bool operator < (const element & other) const" in the C++ version of the program, it gave me exactly 100,000 copy constructor calls, which justifies the above paragraph.

Ouch... I had hoped GCC -O2 would optimize such obvious cases; perhaps it's not that obvious from the compiler's point of view.

-----
Ivan Kazmenko.
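The same effect can be reproduced in D with a minimal standalone sketch (illustrative names, not the code from this thread): a predicate that takes its arguments by value runs the postblit on every call, while a by-ref predicate does not.

-----
import std.stdio;

struct Counted
{
	static int copies;
	int x;

	this (this) { copies++; }  // counts every by-value copy
}

bool lessByValue (Counted a, Counted b) { return a.x < b.x; }        // two postblits per call
bool lessByRef (ref Counted a, ref Counted b) { return a.x < b.x; }  // no postblits

void main ()
{
	auto a = Counted (1), b = Counted (2);
	Counted.copies = 0;
	lessByValue (a, b);
	writeln (Counted.copies);  // 2
	Counted.copies = 0;
	lessByRef (a, b);
	writeln (Counted.copies);  // 0
}
-----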
Feb 13 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, February 14, 2013 00:33:26 Ivan Kazmenko wrote:
 P.S. More on C++ version:
 Personally, I don't see why at all we should call the copy
 constructor more than once per element. I mean, if we intend to
 build a generic data structure, we sure need an internal node
 object with some extra bytes (internal references and counters)
 per each element, at least in the case of a red-black tree. So
 why don't we just bind each element to that internal node once
 and for all, and then, as long as the node is in the structure,
 use the data contained in it only by reference? What do we
 achieve if we choose to use it by value somewhere?
Well, when I just changed "bool operator < (const element other) const" to "bool operator < (const element & other) const" in the C++ version of the program, it gave me the exact 100,000 copy constructor calls which justifies the above paragraph. Ouch... I had hopes GCC -O2 optimizes such obvious cases; perhaps it's not that obvious from the compiler point of view.
The biggest performance problem is the GC. The nodes in the RBT are allocated with the GC, so depending on how many elements you're inserting, how often you're removing them, etc., it could have performance issues. A better GC is in the works but obviously hasn't made it in yet. Also, work is being done on designing custom allocators that std.container can use, but that also has a ways to go. So, the main items which would help make std.container's RBT perform on par with std::set are in the works but not ready yet.

Also, it could make a big difference if you use gdc or ldc rather than dmd. dmd compiles code much faster than they do, but its optimizer is much worse - gdc and ldc consistently produce faster code, and if you're compiling the C++ code with gcc, then gdc is a much better comparison anyway, since then both the C++ and D code are using the same compiler backend.

- Jonathan M Davis
Feb 13 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
First, thank you all for the replies!

Jonathan M Davis wrote:
 Also, it could make a big difference if you use gdc or ldc 
 rather than dmd.
Thank you for the suggestion, I'll check that. However, I don't expect the GCC optimizer to reduce the number of calls in this specific test, since it did not kick in in the more obvious case, as I mentioned.

Right now, I am more concerned with the performance of the front-end (RedBlackTree): if it currently can't be tuned to perform fast enough, I'll just know that I should implement my own version of a binary search tree container for my needs.

Rob T wrote:
 You can check if disabling the GC just before the insert process
 improves the performance. You may see 3x performance 
 improvement.
 Disabling is safe provided you re-enable, this can be done
 reliably with scope(exit) or something similar.
Hmm, it indeed performed faster, though not 3x in my setup. However, the number of copy constructor calls stayed the same.
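For reference, the disable/re-enable pattern Rob T describes would look roughly like the following minimal sketch (a standalone example built around the original test, not the exact code used here).

-----
import core.memory : GC;
import std.container;

void main ()
{
	auto f = new RedBlackTree !(int);
	GC.disable ();              // no collections during the bulk insert
	scope (exit) GC.enable ();  // reliably re-enabled, even if an exception is thrown
	foreach (i; 0 .. 100_000)
	{
		f.insert (i);
	}
}
-----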
Steven Schveighoffer wrote:
 I find the number of postblit calls excessive too.
 I will have to look into why that happens.
 I can say that once an element is allocated, it is not moved or 
 copied.
Could it be easily enforced in the library? Something like this: when accessing data only for internal use in the data structure, use an interface that disables assignment? Can such cases be distinguished from the ones when the library user actively wants to copy data?

Steven Schveighoffer wrote:
 I will note that std.container.RedBlackTree is a port of 
 dcollections'
 RBTree implementation.
Thank you for pointing that out. Perhaps I'll have to try the test with that implementation, too.

monarch_dodra wrote:
 Keep in mind that C++ and D have very different philosophies
 regarding copy construction.
 ...
 The conclusion is that the comparison is not fair: D's pass by
 value is not *very* different from C++'s pass by ref (in amount
 of data copied).
Well, perhaps I didn't make myself clear enough. The comparison I posted is not intended to be a fair comparison of languages in the general case! It is just my use case stripped to a minimal example that still shows the number of calls correctly. In the actual use case, I have something like

-----
struct element
{
	long value;
	int [] one;
	int [] two;

	this (this)
	{
		one = one.dup;
		two = two.dup;
	}
}
-----

Sure, I could think of some other way to represent the data I need as an object; this one just seems the most intuitive.

monarch_dodra wrote:
 If you *do* want a fair-er comparison, then I'd suggest you
 implement a ".dup" on your object, and see how many times THAT
 gets called. Guess what: 0.
Right, but it's the copy constructor that gets called when I copy a struct, and in these cases, I actually want it to copy the arrays as well. Otherwise, the element object may end up in a bad state.

Do you imply that I should implement dup for every struct I create and then leave the copy constructor empty? As the library makes use of the copy constructor and not dup, I fear it would be an incorrect design, since the library will make broken copies of my struct and pass them around.

monarch_dodra wrote:
 I'm not saying D's approach is perfect, just that D's library is
 optimized for D-like types.
and Rob T wrote:
 I agree. There are cases where structs make a lot of sense,
 usually when they are very simple simple and contain no pointers
 or references, otherwise structs should be avoided in favor of
 classes to avoid doing copy/move constructors and to avoid
 concerns over performance optimizations. With classes, only
 certain points in your code require that a duplicate copy be 
 made
 of the class instance, the majority of code need only to pass
 around a reference which is the default behavior - easy and 
 fast!
So, you suggest passing things by reference once they grow beyond several bytes in size? Wouldn't that mean having to constantly pay attention to some counter-intuitive code when I actually want my objects to behave like values, not references (e.g., matrices or other complex math objects)? In my example case with the arrays one and two, I do want my whole object to obey value semantics.

Once again, thank you all for the attention.

-----
Ivan Kazmenko.
Feb 14 2013
parent reply "Rob T" <alanb ucora.com> writes:
When I look at the std.container source code, it seems that the 
payload element is passed by value multiple times unnecessarily, 
so to minimize copy construction you'll have to implement element 
as a class and implement a dup function for it.

I expect performance will increase substantially even with the 
extra heap allocations.

Alternatively, you can implement your own version of RedBlackTree 
with pass by ref optimizations for insertion of struct elements.

--rt
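A rough sketch of that class-plus-dup approach (illustrative names only, not code from std.container or from the thread):

-----
class Element
{
	long value;
	int [] one;
	int [] two;

	this (long value, int [] one, int [] two)
	{
		this.value = value;
		this.one = one;
		this.two = two;
	}

	// Deep copies happen only when explicitly requested,
	// instead of implicitly on every pass-by-value.
	Element dup ()
	{
		return new Element (value, one.dup, two.dup);
	}
}
-----

Everywhere else only the class reference is passed around, so inserting into the tree copies a pointer rather than running a postblit.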
Feb 14 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:

 When I look at the std.container source code, it seems that the payload  
 element is passed by value multiple times unnecessarily, so to minimize  
 copy construction you'll have to implement element as a class and  
 implement a dup function for it.

 I expect performance will increase substantially even with the extra  
 heap allocations.

 Alternatively, you can implement your own version of RedBlackTree with  
 pass by ref optimizations for insertion of struct elements.
If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.

Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar.

-Steve
Feb 14 2013
next sibling parent "Rob T" <alanb ucora.com> writes:
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven 
Schveighoffer wrote:
 On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> 
 wrote:

 When I look at the std.container source code, it seems that 
 the payload element is passed by value multiple times 
 unnecessarily, so to minimize copy construction you'll have to 
 implement element as a class and implement a dup function for 
 it.

 I expect performance will increase substantially even with the 
 extra heap allocations.

 Alternatively, you can implement your own version of 
 RedBlackTree with pass by ref optimizations for insertion of 
 struct elements.
If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.
You can make it work by implementing two insert functions, one with pass by ref and one with pass by value. The pass by value implementation simply calls the pass by ref version, and everything from there is pass by ref where it makes sense as an optimization.

I understand the implementation difficulties, because we have multiple situations to consider and optimize for, such as simple POD which can often pass by value efficiently, but then we have complex structs that do not pass by value efficiently and should be passed by ref instead, but there's no way to detect this easily (I suppose you could try looking at the size and adjust the pass-by method somehow, but ...), and finally if you accept a class then you have different needs, because a class is a pointer, not a POD value; again I suppose you can detect if a class is being passed and adjust the implementation, but that introduces more complexity ...

In some ways this is where C++ got it right from a consistency POV, where a class is no different from a struct, so whatever works for a class also works for a struct. In D, structs and classes operate in fundamentally different ways, making implementing generics more difficult to do.
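A sketch of that two-overload idea (a standalone illustration, not the actual std.container code; the separate insertRef name just sidesteps overload-resolution details):

-----
struct Tree (T)
{
	// rvalues land here and are forwarded exactly once ...
	void insert (T value)
	{
		insertRef (value);
	}

	// ... while lvalues can go straight to the by-ref version with no copy.
	void insertRef (ref T value)
	{
		// allocate a node, copy the payload once, link it into the tree, etc.
	}
}
-----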
 Unfortunately, D is in this state where taking a ref parameter 
 means strictly lvalue.  So passing rvalues will not work.  D 
 does not have the notion that C++ has where const type& can 
 accept both rvalue and lvalue.  We desperately need something 
 similar.

 -Steve
Yes, I agree. --rt
Feb 14 2013
prev sibling next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven 
Schveighoffer wrote:
 On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> 
 wrote:

 When I look at the std.container source code, it seems that 
 the payload element is passed by value multiple times 
 unnecessarily, so to minimize copy construction you'll have to 
 implement element as a class and implement a dup function for 
 it.

 I expect performance will increase substantially even with the 
 extra heap allocations.

 Alternatively, you can implement your own version of 
 RedBlackTree with pass by ref optimizations for insertion of 
 struct elements.
If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types. Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar. -Steve
There have been discussions about such a thing for almost a year - but nothing has changed so far.
Feb 14 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:
 There are discussions for such thing for almost a year - but 
 nothing has changed so far.
I think it's finally on the priority radar, but there are still other things in more dire need of being resolved first, like @property and the lack of shared library support, and probably something else too ...

--rt
Feb 14 2013
parent "Namespace" <rswhite4 googlemail.com> writes:
On Thursday, 14 February 2013 at 20:20:59 UTC, Rob T wrote:
 On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:
 There are discussions for such thing for almost a year - but 
 nothing has changed so far.
I think its finally on the priority radar, but there's still other things in more dire need to be resolved first, like property and lack of shared library support, and probably something else too ... --rt
Yes, not to mention
 - A package system for Phobos
 - Half Float Datatypes
 - JSON
 - bug fixes and merging of new features in D1 (although no further development of D1 should be done since December 31)
and probably something else too.
Feb 14 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, February 14, 2013 14:31:36 Steven Schveighoffer wrote:
 On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:
 When I look at the std.container source code, it seems that the payload
 element is passed by value multiple times unnecessarily, so to minimize
 copy construction you'll have to implement element as a class and
 implement a dup function for it.
 
 I expect performance will increase substantially even with the extra
 heap allocations.
 
 Alternatively, you can implement your own version of RedBlackTree with
 pass by ref optimizations for insertion of struct elements.
If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.
And Walter and Andrei both seem to think that having expensive postblits is a design mistake. The range stuff sort of tries to support it with the move* primitives, but Andrei's been wanting to argue for just assuming that postblits are cheap. And Walter was actually arguing at one point for making it illegal (probably by getting rid of postblit all together - I don't remember the details). Plenty of the rest of us don't agree, but to some extent, it is true that having an expensive postblit is asking for it.

On a related note, we still need a solution for dealing with const postblits (probably by introducing copy constructors - IIRC Andrei was considering phasing out postblits in favor of copy constructors, which would be unnecessary, but we do need a way to deal with deep copying const structs which hold reference types that need to be deep copied).
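To illustrate the two postblit styles being contrasted in this discussion (a sketch with made-up types, not code from the thread):

-----
struct DeepCopy
{
	int [] data;
	this (this) { data = data.dup; }      // expensive: O(n) work on every copy
}

struct RefCounted
{
	int* count;
	this (this) { if (count) ++*count; }  // cheap: the reference-counting style
}
-----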
 Unfortunately, D is in this state where taking a ref parameter means
 strictly lvalue. So passing rvalues will not work. D does not have the
 notion that C++ has where const type& can accept both rvalue and lvalue.
 We desperately need something similar.
auto ref is the current solution for templated functions, but we do need a more general solution. DIP25 got created in part due to discussions on that, but it probably needs to be fully sorted out before we can fully sort out the const& solution.

- Jonathan M Davis
Feb 14 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Thursday, 14 February 2013 at 20:53:26 UTC, Jonathan M Davis 
wrote:
 And Walter and Andrei both seem to think that having expensive 
 postlbits is a
 design mistake. The range stuff sort of tries to support it 
 with the move*
 primitives, but Andrei's been wanting to argue for just 
 assuming that
 postblits are cheap. And Walter was actually arguing at one 
 point for making
 it illegal (probably by getting rid of postblit all together - 
 I don't
 remember the details). Plenty of the rest of us don't agree, 
 but to some
 extent, it is true that having an expensive postblit is asking 
 for it. On a
 related note, we still need a solution for dealing with const 
 postblits
 (probably be introducing copy constructors - IIRC Andrei was 
 considering
 phasing out postblits in favor of copy constructors, which 
 would be
 unnecessary, but we do need a way to deal with deep copying 
 const structs
 which hold reference types which need to be deep copied.
When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?

Regarding replacing expensive postblits with copy constructors - how does that help? The expense will remain the same - if you need a transitive copy, you need a transitive copy.

Thanks
Dan
Feb 15 2013
next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
 When you say things like "Andrei was considering phasing out 
 postblits..." I get nervous. Can we please have some comments 
 from Andrei/Walter about what the plans are? I'm not asking for 
 the ultimate solution - just to know the general direction and 
 where this issue stands at present. Is there anything any of us 
 can do to help move this forward?
Make a new thread where Walter and Andrei may submit an official statement. There are certainly others interested, and this thread is probably not the most suitable place for it.
Feb 15 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 19:43:07 Namespace wrote:
 When you say things like "Andrei was considering phasing out
 postblits..." I get nervous. Can we please have some comments
 from Andrei/Walter about what the plans are? I'm not asking for
 the ultimate solution - just to know the general direction and
 where this issue stands at present. Is there anything any of us
 can do to help move this forward?
Make a new thread where Walter and Andrei may submit an official staement. There are certainly other interested and this thread is probably not for more suitable.
They're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made.

- Jonathan M Davis
Feb 15 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 They're not going to do that until they've decided, and AFAIK, 
 they haven't.
 Attempting to essentially confront them and get them to say 
 what they plan to
 do generally doesn't get you anywhere - especially since 
 they're unlikely to
 say much until they actually are planning to do it, and it's 
 usually the case
 that no decision has been made.

 - Jonathan M Davis
I did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
Feb 15 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 21:16:51 Namespace wrote:
 They're not going to do that until they've decided, and AFAIK,
 they haven't.
 Attempting to essentially confront them and get them to say
 what they plan to
 do generally doesn't get you anywhere - especially since
 they're unlikely to
 say much until they actually are planning to do it, and it's
 usually the case
 that no decision has been made.
 
 - Jonathan M Davis
I did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
I know. And my answer is the same. There's nothing special about the const& issue and how that's going. That's how it pretty much always goes. Until it becomes a priority for Walter, it'll go nowhere. Sometimes, if someone makes it a priority and implements it (creating a pull request for it), then that solves the problem, but sometimes that's still not enough, because Walter needs to sort out whether it's the direction that we want to go, which means making it a priority to sort that out, which means that the pull request sits around for a while and may or may not make it in.

Bug fixes and more minor language changes do get merged all the time, but if a language change is significant enough, it generally requires that Walter spend some time on it and make a decision on it (even if he's not implementing the changes), and he's generally very slow about that if it's not near the top of his list of priorities.

- Jonathan M Davis
Feb 15 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 21:37:46 Jonathan M Davis wrote:
 On Friday, February 15, 2013 21:16:51 Namespace wrote:
 They're not going to do that until they've decided, and AFAIK,
 they haven't.
 Attempting to essentially confront them and get them to say
 what they plan to
 do generally doesn't get you anywhere - especially since
 they're unlikely to
 say much until they actually are planning to do it, and it's
 usually the case
 that no decision has been made.
 
 - Jonathan M Davis
I did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
I know. And my answer is the same. There's nothing special about the const& issue and how that's going. That's how it pretty much always goes. Until it becomes a priority for Walter, it'll go nowhere. Sometimes, if someone makes it a priority and implements it (creating a pull request for it), then that solves the problem, but sometimes that's still not enough, because Walter needs to sort out whether it's the direction that we want to go, which means making it a priority to sort that out, which means that the pull request sits around for a while and may or may not make it in. Bug fixes and more minor language changes do get merged all the time, but if a language change is significant enough, it generally requires that Walter spend some time on it and make a decision on it (even if he's not implementing the changes), and he's generally very slow about that if it's not near the top of his list of priorities.
I should probably add that bringing up discussions on how to solve problems in the language can be of benefit, because they often result in good discussions that help lead toward a solution, and that can lead towards that solution ending up in the language (and depending on the discussion Andrei and/or Walter will get involved). But simply asking them about the state of things or essentially confronting them and trying to get them to give official statements on their plans doesn't generally work.

If nothing else, they simply don't generally say what they're planning to do before they've actually decided on it. They might start discussions to discuss something that they're considering, but they're unlikely to state any kind of actual plans before they've actually decided, which generally means no official statements about what's coming or planned.

- Jonathan M Davis
Feb 15 2013
next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
Again: My intention was not const&.

And you're right, but there have been so many discussions about const& 
(since dmd 2.057; also in the last few days), and as with every 
discussion here, the topic changes after page 2.
I'm also very sure that neither Walter nor Andrei sees an 
(important) reason for something similar to const&, because they 
don't need it. And if you don't need something, the priority for 
such a thing is very low.
So everything we can do (after that many requests and 
discussions) is to wait and see what they decide, and when.
I count the versions.
Feb 15 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 22:41:24 Namespace wrote:
 Again: My intention was not const&.
I know. What I'm saying applies in general.
 And you're right, but there was so many discussions about const&
 (since dmd 2.057; also in the last few days) and as every
 discussion here: after page 2 is the topic changed.
 I'm also very sure that neither Walter nor Andrei see a
 (important) reason for something similar as const& because they
 don't need it. And if you don't need something, the priority for
 such thing is very low.
 So everything we can do (after that much requests and
 discussions) is to wait what and when they will decide something.
 I count the versions.
They definitely agree that it's a problem. They just don't see it as having as high a priority as you do. For instance, as far as ref-related problems go, the issue that DIP 25 covers is something that they consider to be a much bigger issue (since it deals with @safe and SafeD).

It'll be fixed though. It's just a question of how and when.

- Jonathan M Davis
Feb 15 2013
parent "Namespace" <rswhite4 googlemail.com> writes:
 They definitely agree that it's a problem. They just don't see 
 it as having as
 high a priority as you do.
For me it is not a priority (anymore). I use C++ again.
 It's just a question of how and when.

 - Jonathan M Davis
Yes as I said: I count the versions. :)
Feb 15 2013
prev sibling parent "Dan" <dbdavidson yahoo.com> writes:
On Friday, 15 February 2013 at 20:58:30 UTC, Jonathan M Davis 
wrote:
 I should probably add that bringing up discussions on how to 
 solve problems in
 the language can be of benefit, because they often result in 
 good discussions
 that help lead toward a solution, and that can lead towards 
 that solution
 ending up in the language (and depending on the discussion 
 Andrei and/or
 Walter will get involved). But simply asking them about the 
 state of things or
 essentially contronting them and trying to get them to give 
 official statements
 on their plans doesn't generally work. If nothing else, they 
 simply don't
 generally say what they're planning to do before they've 
 actually decided on
 it. They might start discussions to discuss something that 
 they're
 considering, but they're unlikely to state any kind of actual 
 plans before
 they've actually decided, which generally means no official 
 statements about
 what's coming or planned.

 - Jonathan M Davis
Well, that is a somewhat passive yet diplomatic stance. If the suggestion is "get involved, suggest a fix or a solution and maybe you'll get your answers", that is reasonable. But for my part, I am not a language developer - just a language user and fan of D who is betting on D by using it for my livelihood.

Regarding postblits - you've mentioned several times that Andrei and Walter have discussed and have a solution in the works. Does it make sense to go do a DIP or something just to elicit feedback on a topic?

You have some of the most helpful responses I've seen on the group - extremely detailed and accurate. Thanks. However, being in the know, you sometimes let slip a little bit here and there that causes me to give pause and at a minimum want to get more information. Like "And Walter was actually arguing at one point for making it illegal (probably by getting rid of postblit all together - I don't remember the details)". It is not unreasonable for users to want to know the future direction of a certain at-risk feature.

Regarding a suggestion for postblit/copy ctor, to get things going here is mine:

- Implement copy constructors for structs. Allow for overloading on 'this' and 'const this', but no qualifier on the constructor itself.
- Allow the user to disable copy constructors.
- Provide for a default copy constructor that does regular copies of fundamental types, shallow copies of arrays and associative arrays, and calls corresponding copy constructors (or postblits during some grace period).

In terms of migration path:

- Allow the struct designer to choose to use either postblit or copy constructor, but not both. For the case of using them to provide deep copy semantics, the approaches are different. For postblits you are asking "what fields do I need to copy", with the comfort of knowing all other fields were already copied with the blit. For copy constructors it is different, because you have to think about all fields, since a blit will not have happened first. To ease the transition, provide the ability for the copy constructor to call blitSourceIntoThis(source).
- Leave postblits alone. Allow them to continue as is and phase them out 1 or more years after successful implementation of copy constructors.
- Often the only purpose of the copy constructor is to do a deep copy. This could easily be provided by the compiler or phobos. Further, consider standardizing on the ".dup" and ".idup" conventions for this purpose.

Thanks
Dan
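A rough sketch of what such a struct copy constructor might look like, next to the postblit it would replace (hypothetical syntax for illustration; it is not part of the language as of this thread):

-----
struct S
{
	int [] data;

	// postblit style: the blit has already copied every field,
	// the body only patches up what must not be shared
	// this (this) { data = data.dup; }

	// copy-constructor style: the copy is built field by field,
	// so const/immutable sources could be handled at construction time
	this (ref const S other)
	{
		data = other.data.dup;
	}
}
-----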
Feb 16 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 19:14:21 Dan wrote:
 When you say things like "Andrei was considering phasing out
 postblits..." I get nervous. Can we please have some comments
 from Andrei/Walter about what the plans are? I'm not asking for
 the ultimate solution - just to know the general direction and
 where this issue stands at present. Is there anything any of us
 can do to help move this forward?
Yeah, they're unlikely to say much until something has actually been decided. Frequently, if you try and pin them down on stuff like that, you won't get an answer, especially if you ask out of the blue. But regardless, I wouldn't expect postblits to be gotten rid of (because that would break code). Rather, they would be discouraged, and newer code would be written to use copy constructors instead. But nothing has been decided for certain AFAIK.
 Regarding replacing expensive postblits with copy constructors -
 how does that help? The expense will remain the same - if you
 need a transitive copy, you need a transitive copy.
The problem has nothing to do with cost. The problem is const and immutable. It's impossible to write a const or immutable postblit constructor, because postblit operates by doing the copy and then allowing you to change it, whereas const and immutable cannot be changed. You need to construct the copy as const or immutable, which means you need to make the changes when you make the copy, which basically means that you need a copy constructor.

Efficiency comes in when arguing whether a postblit or copy constructor is appropriate in the first place or whether they can be implemented efficiently. Walter seems to be of the opinion that they should only be used in cases where they're actually efficient (i.e. you never use them for deep copying, only for handling stuff like ref-counting). He's much more in favor of COW. But much as he thinks that way, I don't think that there's any real risk of that being codified in the language at this point. Not only would it be very difficult to restrict postblit like that (if not impossible), but it would break tons of code (which Walter is very much against), and most everyone would scream (and several already did when he brought it up).

Andrei likes the idea of basically requiring that ranges have O(1) postblits/copy constructors so that algorithms can assume that copying is cheap, but he's never decided that we're going that way for certain, and even if we did, it would just mean that we wouldn't write code which cared about the expense of copying ranges, and anyone who wrote ranges that were expensive to copy would take a performance hit. Honestly, I suspect that that's mostly where we are already. And given how ranges work, an expensive postblit doesn't really make sense anyway.

No, I think that the only real concern here is if/when we'll get copy constructors so that we can copy const or immutable structs that require postblits or copy constructors. Walter and Andrei are very concerned about achieving and maintaining stability of the language and library, so any major breaking changes (especially to the language) would need to have very good reasons and would likely be discussed in detail in the newsgroup first. So, while I think that some of what they've said would be of some concern if D were still being designed (as opposed to being stabilized), given where we are, it's unlikely to be a concern.

- Jonathan M Davis
Feb 15 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Feb 2013 13:14:21 -0500, Dan <dbdavidson yahoo.com> wrote:

 When you say things like "Andrei was considering phasing out  
 postblits..." I get nervous. Can we please have some comments from  
 Andrei/Walter about what the plans are? I'm not asking for the ultimate  
 solution - just to know the general direction and where this issue  
 stands at present. Is there anything any of us can do to help move this  
 forward?
I think the plan was to *assume* postblits were inexpensive, not to disallow them.

-Steve
Feb 15 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 16:53:36 Steven Schveighoffer wrote:
 On Fri, 15 Feb 2013 13:14:21 -0500, Dan <dbdavidson yahoo.com> wrote:
 When you say things like "Andrei was considering phasing out
 postblits..." I get nervous. Can we please have some comments from
 Andrei/Walter about what the plans are? I'm not asking for the ultimate
 solution - just to know the general direction and where this issue
 stands at present. Is there anything any of us can do to help move this
 forward?
I think the plan was to *assume* postblits were inexpensive, not to disallow them.
Yes, but as I recall, in discussions on const and postblit, Andrei made a comment about how postblit hadn't really worked out and we need copy constructors (in which case, we'd move to using copy constructors as the norm). But I don't believe that Andrei suggested getting rid of postblit completely (due to the code breakage it would cause if nothing else), and while I think that Walter and Andrei would deal with the postblit issue very differently if they were to start over, there are no plans at present to get rid of it.

- Jonathan M Davis
Feb 15 2013
prev sibling parent reply "Dan" <dbdavidson yahoo.com> writes:
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven 
Schveighoffer wrote:
 If it was pass by ref, then rbt.insert(5) would not work.  And 
 in fact, I wouldn't want things passed by ref if the element is 
 int.
What is the reason for the second objection - is it just performance? Is performance really an issue? From this thread it looks like passing fundamental types by ref or by value is not really a big deal in terms of performance. OTOH, as size and/or postblit execution gets expensive, the cost *is* significant.

---------------------------
4 bytes: using cref_(int size) took 29[ms]
4 bytes: using inref(int size) took 29[ms]
4 bytes: using in___(int size) took 30[ms]
8 bytes: using cref_(int size) took 29[ms]
8 bytes: using inref(int size) took 28[ms]
8 bytes: using in___(int size) took 31[ms]
...
128 bytes: using cref_(int size) took 29[ms]
128 bytes: using inref(int size) took 29[ms]
128 bytes: using in___(int size) took 290[ms]
---------------------------
 I have to admit, I did not consider expensive postblits when I 
 designed it.  Almost all my testing is with integer types.
For performance, it seems by ref should always be preferred in generic code, because you can not know the cost of postblit - or of copy construction, if that were a future solution for structs.

Granted, the no-rvalue passing is a pain - but is it a big deal in library/generic code?

Thanks,
Dan
Feb 15 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 20:12:08 Dan wrote:
 Granted the no rvalue passing is a pain - but is it a big deal in
 library/generic code?
Yes, it's a big deal. Any place that it's part of an API, it's a big deal. It forces you to create what are essentially useless variables all over the place if you require ref. We _will_ have a solution similar to const& at some point, and that's the correct way to go to solve this problem, but that situation still needs to be sorted out (and the situation with DIP 25 - http://wiki.dlang.org/DIP25 - could have an impact on that, given that it also has to do with changes to ref).

- Jonathan M Davis
Feb 15 2013
prev sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
Rob T wrote:
 When I look at the std.container source code, it seems that the 
 payload element is passed by value multiple times 
 unnecessarily, so to minimize copy construction you'll have to 
 implement element as a class and implement a dup function for 
 it.
Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration:

-----
alias RedBlackTree !(element) container;
-----

to

-----
bool lessfun (ref element a, ref element b)
{
	return a < b;
}

alias RedBlackTree !(element, lessfun) container;
-----

... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program.

This raises the question of the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial, given a string like "a < b", to produce a function taking its arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate the "(ref a, ref b)" instead of the "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments.

-----
Ivan Kazmenko.
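One possible shape of that criterion, as a sketch (illustrative only, not a proposed std.functional change; it reuses the element struct from the original post):

-----
template defaultLess (T)
{
	static if (is (T == struct))
		bool defaultLess (ref T a, ref T b) { return a < b; }  // structs: compare in place
	else
		bool defaultLess (T a, T b) { return a < b; }          // everything else: by value
}

alias RedBlackTree !(element, defaultLess !element) container;
-----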
Feb 14 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ivan Kazmenko:

 Thank you all for assistance, I've finally pinned it down!
This is great. The first step to solving a problem is to find it. The second step is to formalize it. So I suggest you create a small benchmark that shows what you mean (it should not include RedBlackTree) and put it in Bugzilla, with some timings.

Bye,
bearophile
Feb 14 2013
parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 14 February 2013 at 20:44:13 UTC, bearophile wrote:
 Ivan Kazmenko:

 Thank you all for assistance, I've finally pinned it down!
This is great. The first step to solve a problem is to find it. The second step is to formalize it. So I suggest you to create a small benchmark that shows what you mean (it should not include RedBlackTree) and put it in Bugzilla, with some timings. Bye, bearophile
I've already begun investigating, and am now benching my solution. But filing issues never hurt anyone.
Feb 14 2013
prev sibling next sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
A small but well-intentioned piece of advice from me is this article: 
http://dlang.org/dstyle.html
"The names of user-defined types should be PascalCased, which is 
the same as camelCased except that the first letter is uppercase."
Feb 14 2013
parent "Ivan Kazmenko" <gassa mail.ru> writes:
Namespace wrote:
 A small but nice intentioned advice of me is this article: 
 http://dlang.org/dstyle.html
 "The names of user-defined types should be PascalCased, which 
 is the same as camelCased except that the first letter is 
 uppercase."
Thanks for pointing that out. Indeed, PascalCase seems convenient to use with D structs. On the other hand, I still have aesthetic considerations against camelCase and prefer underscores_between_words for variables in the programs I write for personal use.
Feb 14 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Feb 2013 15:33:19 -0500, Ivan Kazmenko <gassa mail.ru> wrote:

 Rob T wrote:
 When I look at the std.container source code, it seems that the payload  
 element is passed by value multiple times unnecessarily, so to minimize  
 copy construction you'll have to implement element as a class and  
 implement a dup function for it.
Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration: ----- alias RedBlackTree !(element) container; ----- to ----- bool lessfun (ref element a, ref element b) { return a < b; } alias RedBlackTree !(element, lessfun) container; ----- ... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program. This raises the question to the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments.
It is great that you found this!

Again, the issue here is D has restrictions on ref parameters. There are no such restrictions on pass-by-value parameters. So by default, the predicate tries by-value.

In this case, RedBlackTree is always assured of having lvalues for the predicate, so I can probably make the default use pass-by-ref. If you could file a bug, that would be most helpful.

http://d.puremagic.com/issues/enter_bug.cgi?product=D

-Steve
Feb 14 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
Steven Schveighoffer wrote:

 Again, the issue here is D has restrictions on ref parameters.  
 There are no such restrictions on pass-by-value parameters.  So 
 by default, the predicate tries by-value.
I am starting to understand the difficulty with ref being too restrictive. Some kind of "const ref" in the language would have helped indeed. But it would probably cause some allocation difficulties for actual constants... or worse?
 In this case, RedBlackTree is always assured of having lvalues 
 for the predicate, so I can probably make the default use 
 pass-by-ref.  If you could file a bug, that would be most 
 helpful.
Done! http://d.puremagic.com/issues/show_bug.cgi?id=9513
Feb 14 2013
parent "Namespace" <rswhite4 googlemail.com> writes:
 I am starting to understand the difficulty with ref being too 
 restrictive.  Some kind of "const ref" in the language would 
 have helped indeed.  But it would probably cause some 
 allocation difficulties for actual constants... or worse?
Search for 'auto ref' or, right after that, 'rvalue references' and read some of the discussions. It's instructive, and then you may understand the whole topic a little better. No thread comes to a solution, and unfortunately the answers from Walter and Andrei are very, very thin on the subject.
Feb 15 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Feb 2013 15:33:19 -0500, Ivan Kazmenko <gassa mail.ru> wrote:

 ... I get only exactly 3 * LIMIT postblit constructor calls, which is  
 300,000 instead of 11,389,556.  While I still want to find where the  
 excess 200,000 calls originate from, this is definitely asymptotically  
 better than before
As I said elsewhere, the issue is D not having a good solution for accepting both rvalues and lvalues by ref. Jonathan mentioned auto ref, but it's not exactly implemented correctly.

In C++, it just takes all parameters as const T&, and that works for both lvalues and rvalues.

The extra 200k copies are from the implementation taking all parameters by value. If we didn't do that, sans a working auto ref, things like tree.insert(element(5)) would fail to compile (cannot pass an rvalue by reference).

-Steve
Feb 14 2013
parent "Namespace" <rswhite4 googlemail.com> writes:
 As I said elsewhere, the issue is D not having a good solution 
 for accepting both rvalues and lvalues by ref.  Jonathan 
 mentioned auto ref, but it's not exactly implemented correctly.

 In C++, it just takes all parameters as const T&, and that 
 works for both lvalues and rvalues.

 The extra 200k copies are from the implementation taking all 
 parameters by value.  If we didn't do that, sans a working auto 
 ref, things like tree.insert(element(5)) would fail to compile 
 (cannot pass an rvalue by reference)

 -Steve
And Jonathan also said that 'auto ref' is not the solution for non-template functions. Probably 'auto ref' will never work for non-template functions. So a general solution must be found first.
Feb 14 2013
prev sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 14 February 2013 at 20:33:21 UTC, Ivan Kazmenko 
wrote:
 Rob T wrote:
 When I look at the std.container source code, it seems that 
 the payload element is passed by value multiple times 
 unnecessarily, so to minimize copy construction you'll have to 
 implement element as a class and implement a dup function for 
 it.
Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration: ----- alias RedBlackTree !(element) container; ----- to ----- bool lessfun (ref element a, ref element b) { return a < b; } alias RedBlackTree !(element, lessfun) container; ----- ... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program. This raises the question to the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments. ----- Ivan Kazmenko.
Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)

As convenient as opCmp is for the implementation, it is not always the best in terms of raw power. If you can write a straight-up less(S1, S2) function, such as "a.i < b.i", then using that can buy you a hefty amount of time.
Feb 15 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra gmail.com>  
wrote:

 Also keep in mind that "a < b" is implemented as two calls to  
 "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two  
 calls to "something < something else" (!)
Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).

HOWEVER, when you get to the point of executing a == b, it's implemented as !(a < b) && !(b < a), which could more efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0. In the case of red black tree however, the nature of traversal makes it so that you only have the redundant call once (when you have found the insertion spot, you already have called one half, you just have to call the other half).

For dcollections, I don't accept a < b, I accept the int-returning compare function (which I think is pass by ref, so shouldn't have this problem). The std.container.RedBlackTree is a port of the same code, but adapted to the more phobos-style functions.

-Steve
Feb 15 2013
next sibling parent reply "Dan" <dbdavidson yahoo.com> writes:
On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer 
wrote:
 On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra 
 <monarchdodra gmail.com> wrote:

 Also keep in mind that "a < b" is implemented as two calls to 
 "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented 
 as two calls to "something < something else" (!)
Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).
Yes. But isn't opCmp still more work than just a<b? It must be because it returns three states instead of two. So, I think the general warning on opCmp is warranted, especially with structs.

For structs we could use compile time reflection to provide automatic efficient opCmp behavior. Section 3.1 here outlines a way:

https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf

Any comments appreciated.

Thanks
Dan
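As an illustration of the general idea (my own sketch, not the code from that paper), a mixin template can generate a field-by-field opCmp at compile time:

-----
mixin template FieldwiseOpCmp ()
{
	int opCmp (ref const typeof (this) other) const
	{
		foreach (i, _; this.tupleof)
		{
			if (this.tupleof [i] < other.tupleof [i]) return -1;
			if (other.tupleof [i] < this.tupleof [i]) return 1;
		}
		return 0;
	}
}

struct Point
{
	int x, y;
	mixin FieldwiseOpCmp;
}
-----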
 HOWEVER, when you get to the point of executing a == b, it's 
 implemented as !(a < b) && !(b < a), which could more 
 efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0.
Feb 15 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Feb 2013 13:02:39 -0500, Dan <dbdavidson yahoo.com> wrote:

 On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:
 On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra  
 <monarchdodra gmail.com> wrote:

 Also keep in mind that "a < b" is implemented as two calls to  
 "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two  
 calls to "something < something else" (!)
Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).
Yes. But isn't opCmp still more work than just a<b?
It depends:

bool less(int a, int b)
{
   return a < b;
}

In assembly (pardon my ignorance of asm syntax):

ld a AX;
sub AX, b;
jneg L1; // jump if AX is less than zero
ld AX 0;
ret;
L1:
ld AX 1;
ret;

With opcmp:

int opCmp(int a, int b)
{
   return a - b;
}

Which simplifies the assembly significantly. Note we don't have to shoehorn the result into a bool (which must be 0 or 1).

For other cases, such as a complex struct, less might be more efficient. And really, when it comes down to it, it all depends on how you use it. If most of the time you are using a < b or b < a, then less certainly can be more efficient (but not necessarily). But if you want to do <=, ==, or >=, then opCmp can be more efficient.

-Steve
Feb 15 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer 
wrote:
 With opcmp:

 int opCmp(int a, int b)
 {
    return a - b;
 }
Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Feb 15 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Feb 15, 2013 at 11:49:53PM +0100, monarch_dodra wrote:
 On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer
 wrote:
With opcmp:

int opCmp(int a, int b)
{
   return a - b;
}
Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
You know that this is the origin of C's strcmp, right? Ever wondered why the C standard says that it returns <0, 0, or >0, as opposed to fixed values like -1, 0, 1? Now you know why: it's because you could implement strcmp by repeatedly subtracting respective pairs of characters from the two strings, and return the result when it's no longer zero (in some CPUs' assembly language, this is a single branch-on-nonzero instruction). No need for extraneous overhead to convert the result of the comparison into fixed values of any sort. D simply inherited from the genius of C's original designers. :)

Of course, this optimization may no longer be relevant on today's CPUs, due to pipelining, hazards, branch prediction, etc.

T

-- 
If creativity is stifled by rigid discipline, then it is not true creativity.
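The subtraction idiom, transplanted into D for illustration (a sketch, not how Phobos or the C library actually implement string comparison):

-----
int strcmpLike (const (char)[] a, const (char)[] b)
{
	foreach (i; 0 .. (a.length < b.length ? a.length : b.length))
	{
		int diff = a [i] - b [i];
		if (diff != 0)
			return diff;  // only the sign matters, so the raw difference is enough
	}
	return cast (int) a.length - cast (int) b.length;
}
-----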
Feb 15 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, February 15, 2013 23:49:53 monarch_dodra wrote:
 On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer
 
 wrote:
 With opcmp:
 
 int opCmp(int a, int b)
 {
    return a - b;
 }
Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Except that you have to worry about overflow if you use subtraction, so it's actually a bad idea in the general case. - Jonathan M Davis
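
For example (quick sketch): with plain 32-bit ints the subtraction can wrap around, while a comparison-based opCmp (like the one in the original post) cannot.

struct Safe
{
	int x;

	int opCmp (const Safe other) const
	{
		return (x > other.x) - (x < other.x);   // no overflow possible
	}
}

unittest
{
	int a = int.max;
	int b = -1;
	assert (Safe (a) > Safe (b));   // the comparison gives the right answer
	assert (a - b < 0);             // a - b wraps around to a negative value
}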
Feb 15 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Feb 2013 17:49:53 -0500, monarch_dodra <monarchdodra gmail.com>  
wrote:

 On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer wrote:
 With opcmp:

 int opCmp(int a, int b)
 {
    return a - b;
 }
Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Actually, I'm not as smart as you think ;)

There needs to be some extra checking, possibly with the carry flag.

For example:

0x70000000 - (-0x70000000)

would be less than zero as a resulting integer.

But it would work with shorts or bytes!

In any case, the example was bad; the point that which one is better depends on the situation still stands.

-Steve
Feb 15 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Feb 2013 18:41:51 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 There needs to be some extra checking, possibly with the carry flag.
And further to that point, my generated assembly for a < b also is similarly flawed, so it still may be better to do opCmp. -Steve
Feb 15 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Actually, I'm not as smart as you think ;)

 There needs to be some extra checking, possibly with the carry 
 flag.

 For example:

 0x70000000 - (-0x70000000)

 would be less than zero as a resulting integer.

 But it would work with shorts or bytes!
I remember a page about this problem in the old D wiki. I was unable to find it in the new D wiki.

Bye,
bearophile
Feb 15 2013
prev sibling parent "Dan" <dbdavidson yahoo.com> writes:
On Friday, 15 February 2013 at 23:41:50 UTC, Steven Schveighoffer 
wrote:
 In any case, the example was bad, the point that which is 
 better depends on the situation is still correct.
The compiler has to do something with "a<b". I would hope that for a fundamental type it can avoid the lowering and just do whatever assembly is optimal. For a non-fundamental type, what is the case where (a.opCmp(b)<0) is more optimal than a comparable (a.opLess(b))?

It is easy to create an inefficient implementation of opCmp that calls "<" twice:

struct S
{
	string s;

	int opCmp (const ref S other)
	{
		if (s < other.s)
		{
			return -1;
		}
		else if (other.s < s)
		{
			return 1;
		}
		else
		{
			return 0;
		}
	}
}

My point was, you can use compile time reflection to generate a suitable opCmp that uses s.opCmp if it exists or does the long version of two comparisons if not.

Thanks
Dan
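
A rough sketch of that (the helper name is made up, and this is untested):

int compareField (T) (ref const T a, ref const T b)
{
	static if (is (typeof (a.opCmp (b)) : int))
	{
		return a.opCmp (b);          // one call when the member type has opCmp
	}
	else
	{
		return (b < a) - (a < b);    // otherwise fall back to two '<' comparisons
	}
}

struct S
{
	string s;

	int opCmp (const ref S other) const
	{
		return compareField (s, other.s);
	}
}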
Feb 15 2013
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer 
wrote:
 On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra 
 <monarchdodra gmail.com> wrote:

 Also keep in mind that "a < b" is implemented as two calls to 
 "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented 
 as two calls to "something < something else" (!)
Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).
Right... sorry. But still, calling opCmp is usually a bit more expensive than a simpler function dedicated to giving you a < b. 99% of the time, I agree that it doesn't matter much, and the gain in semantic power trumps performance, but for a dedicated algorithm, e.g. sort, there is a definite performance difference...
 HOWEVER, when you get to the point of executing a == b, it's 
 implemented  as !(a < b) && !(b < a), which could more 
 efficiently be rewritten as  a.opEquals(b) or a.opCmp(b) == 0.
And that.
Feb 15 2013
prev sibling next sibling parent reply "Rob T" <alanb ucora.com> writes:
You can check whether disabling the GC just before the insert process 
improves the performance. You may see a 3x performance improvement. 
Disabling is safe provided you re-enable it afterwards; this can be 
done reliably with scope(exit) or something similar.

import core.memory;

// ...


void main ()
{
	auto f = new container ();
	element.postblit_counter = 0;
	GC.disable;    // no collections during the allocation-heavy inserts
	foreach (i; 0..LIMIT)
	{
		f.insert (element (i));
	}
	GC.enable;     // turn collections back on once the bulk insert is done
	writefln ("%s", element.postblit_counter);
}
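
Or, using scope(exit) so that re-enabling cannot be skipped by an early return or an exception (same program, only two lines differ):

void main ()
{
	auto f = new container ();
	element.postblit_counter = 0;
	GC.disable;
	scope(exit) GC.enable;
	foreach (i; 0..LIMIT)
	{
		f.insert (element (i));
	}
	writefln ("%s", element.postblit_counter);
}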

--rt
Feb 13 2013
parent reply FG <home fgda.pl> writes:
On 2013-02-14 01:09, Rob T wrote:
 You can check if disabling the GC just before the insert process improves the
 performance. You may see 3x performance improvement. Disabling is safe provided
 you re-enable, this can be done reliably with scope(exit) or something similar.
How did you know? It was 3x in my case. :) Well, even more, but the program had to wait 2 seconds at the end to collect.

With LIMIT at 10M: g++: 5.0s, gdc: 27.0s, and 8.7s with GC.disable.

Internal memory handling by containers - yeah, can't wait to see that happen!
Feb 13 2013
parent "Rob T" <alanb ucora.com> writes:
On Thursday, 14 February 2013 at 00:25:15 UTC, FG wrote:
 On 2013-02-14 01:09, Rob T wrote:
 You can check if disabling the GC just before the insert 
 process improves the
 performance. You may see 3x performance improvement. Disabling 
 is safe provided
 you re-enable, this can be done reliably with scope(exit) or 
 something similar.
How did you know? It was 3x in my case. :) Well, even more but the program had to wait 2 seconds at the end to collect. With LIMIT at 10M, g++: 5.0s, gdc: 27.0s and 8.7s with GC.disable. Internal memory handling by containers - yeah, can't wait to see that happen!
Oh yes, I forgot to mention you can expect a ~2 second additional delay from re-enabling the GC.

How did I know? Read this ...
http://forum.dlang.org/thread/waarzqtfcxuzhzdelhtt forum.dlang.org

So it seems that we can get close to g++ performance with some manual tweaking of the GC, which is good, but you have to know this and know where to make the adjustments.

What we really need is a better GC with manual fine-tuning control over how it operates, with an ability to get feedback from it to understand where and when it may be doing something stupid. Even better would be if we could use D completely without the GC, which may be a requirement for certain types of applications. Unfortunately, a certain subset of D depends on there being a method of automated garbage collection in place, so currently you cannot use all of D without a GC.

--rt
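
For what it's worth, you can already choose where that pause happens by collecting explicitly at a convenient point (a sketch):

import core.memory;

void buildPhase ()
{
	GC.disable;
	scope(exit) GC.enable;

	// ... do the allocation-heavy work (e.g. the RedBlackTree inserts) ...

	GC.collect ();    // pay the collection cost here, at a point we choose
}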
Feb 13 2013
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 13 Feb 2013 18:22:02 -0500, Ivan Kazmenko <gassa mail.ru> wrote:

 Hi!

 I'm learning to use D collections properly, and I'm looking for a sorted  
 data structure with logarithmic access time (i.e., a binary search tree  
 will do, but a hash table would not help). As far as I can see,  
 std.container.RedBlackTree is exactly what I need. However, I am not  
 sure if I use it as intended as its performance seems inferior to a C++  
 STL solution (e.g., std::set).

 To be more specific, right now I wonder what is the best (or intended)  
 way to store an object in the RedBlackTree: should it be a class  
 reference, or a struct (passed by value), or something quirkier like an  
 integer pointing into an array or a simple pointer. The rest of my  
 program suggested to use structs, but the whole thing turned out to be  
 rather slow, and the profiler told me that these structs are being  
 copied around much more than I anticipated.

 And so I wrote a minimalistic test program to check the number of copy  
 (postblit) constructor calls. Here is the D version:

 -----
 import std.container;
 import std.stdio;

 immutable int LIMIT = 100000;

 struct element
 {
 	static int postblit_counter;

 	long x;

 	int opCmp (ref element other)
 	{
 		return (x > other.x) - (x < other.x);
 	}

 	this (long nx)
 	{
 		x = nx;
 	}

 	this (ref this)
 	{
 		assert (x == x);
 		postblit_counter++;
 	}
 }

 alias RedBlackTree !(element) container;

 void main ()
 {
 	auto f = new container ();
 	element.postblit_counter = 0;
 	foreach (i; 0..LIMIT)
 	{
 		f.insert (element (i));
 	}
 	writefln ("%s", element.postblit_counter);
 }
 -----

 And now here is a C++ equivalent:

 -----
 #include <cstdio>
 #include <set>
 #include <stdint.h>

 const int LIMIT = 100000;

 using namespace std;

 struct element
 {
 	static int postblit_counter;

 	int64_t x;

 	bool operator < (const element other) const
 	{
 		return (x < other.x);
 	}

 	element (int64_t nx)
 	{
 		x = nx;
 	}

 	element (const element & other)
 	{
 		postblit_counter++;
 	}
 };

 int element::postblit_counter;

 typedef set <element> container;

 int main (void)
 {
 	container f;
 	element::postblit_counter = 0;
 	for (int i = 0; i < LIMIT; i++)
 	{
 		f.insert (element (i));
 	}
 	printf ("%d\n", element::postblit_counter);
 	return 0;
 }
 -----

 And the results are:
 D2 (DMD 2.059, -O):             11,389,556
 C++ (MinGW GCC 4.7.2, -O2):      3,072,387

 As you can see, in order to insert 100,000 elements, D needs a few times  
 more copy constructor calls than C++. However, as far as I know, the  
 internal structure used by C++ std::set is the very same red-black tree!  
 Am I doing something wrong? And if not, what is this cost paid for, are  
 there any features that RedBlackTree possesses and STL set doesn't?

 Personally, I don't see why at all we should call the copy constructor  
 more than once per element. I mean, if we intend to build a generic data  
 structure, we sure need an internal node object with some extra bytes  
 (internal references and counters) per each element, at least in the  
 case of a red-black tree. So why don't we just bind each element to that  
 internal node once and for all, and then, as long as the node is in the  
 structure, use the data contained in it only by reference? What do we  
 achieve if we choose to use it by value somewhere?
I find the number of postblit calls excessive too. I will have to look into why that happens.

I can say that once an element is allocated, it is not moved or copied. That is, the red-black algorithm does not copy the data around inside the tree; it rotates by adjusting pointers. I required this because it makes cursors sane (they always point at the same element).

I will note that std.container.RedBlackTree is a port of dcollections' RBTree implementation. As std.container's API is unfinished, there certainly is room for improvement on the std.container version. Dcollections' API is more complete, and may perform better. Specifically, RBTree inside dcollections uses a custom allocator that should significantly reduce runtime.
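
Conceptually (this is only an illustration, not the actual std.container code), each element lives in one heap-allocated node and a rotation just relinks pointers:

struct Node(T)
{
	T value;         // copied into the node exactly once, on insert
	Node!T* left;
	Node!T* right;
	Node!T* parent;
	bool red;
}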
 And if the intended design is "use class references", will that create  
 an overhead for garbage collector later on?
It is not designed only for classes; your usage above should be perfectly valid.

-Steve
Feb 13 2013
prev sibling next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 13 February 2013 at 23:22:03 UTC, Ivan Kazmenko 
wrote:
 Hi!
 -----
 Ivan Kazmenko.
Keep in mind that C++ and D have very different philosophies regarding copy construction.

C++ has "strong ownership", so for example, whenever you copy a string/vector, or pass it by value, the whole damn thing has to be completely duplicated. It's so expensive that C++ has gone out of its way to use pass-by-reference semantics, as well as (now) rvalue references.

D, on the other hand (being GC), has a "shallow ownership" philosophy: basically, when you make a copy, nothing happens apart from the binary copy of the object. You want to pass a string by value? That's 8 bytes of data. Period. The *most* complex CC I have *ever* seen in D merely increments a reference counter. That's it. 90% (99%?) of the time, there isn't even a postblit implemented.

Because of this, D has embraced pass by value. Not to mention, if you are using classes, those are already pointer types. Why pass them by ref?

--------

The conclusion is that the comparison is not fair: D's pass by value is not *very* different from C++'s pass by ref (in the amount of data copied). If you *do* want a fairer comparison, then I'd suggest you implement a ".dup" on your object, and see how many times THAT gets called. Guess what: 0.

I'm not saying D's approach is perfect, just that D's library is optimized for D-like types. Just the same, a lot of the STL is not properly optimized for little POD types, as they get passed around by ref, when their actual size is the same as the pointer referencing them...
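
E.g. something along these lines (a stand-in struct with a hypothetical "dup"; RedBlackTree never calls such a method, it only ever does the cheap bit copy plus postblit):

struct Elem
{
	static int dup_counter;
	long x;

	Elem dup () const    // an explicit deep copy, in the spirit of a C++ copy constructor
	{
		dup_counter++;
		return Elem (x);
	}
}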
Feb 13 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Thursday, 14 February 2013 at 06:56:38 UTC, monarch_dodra 
wrote:
 On Wednesday, 13 February 2013 at 23:22:03 UTC, Ivan Kazmenko 
 wrote:
 Hi!
 -----
 Ivan Kazmenko.
Keep in mind that C++ and D have very different philosophies regarding copy construction. C++ has "strong ownership", so for example, whenever you copy a string/vector, or pass it by value, the whole damn thing has to be completely duplicated. It's so expansive that C++ has gone out of its way to use pass-by-reference semantics, as well (now) RValue references. D, on the other hand (being GC), has a "shallow ownership" philosopy: Basically, when you make a copy, nothing happens, appart from the binary copy of the object. You want to pass a string by value? that's 8 bytes of data. Period. The *most* complex CC I have *ever* seen in D merely increments a reference counter. That's it. 90% (99%?) of the time, there isn't even a postblit implemented. Because of this, D has embraced pass by value. Not to mention, if you are using classes, those are already pointer types. Why pass them by ref? -------- The conclusion is that the comparison is not fair: D's pass by value is not *very* different from C++'s pass by ref (in amount of data copied). If you *do* want a fair-er comparison, then I'd suggest you implement a ".dup" on your object, and see how many times THAT gets called. Guess what: 0. I'm not saying D's approach is perfect, just that D's library is optimized for D-like types. Just the same, a lot the stl is not properlly optimized for little POD types, as they get passed around by ref, when their actual size is the same as the pointer referencing them...
I agree. There are cases where structs make a lot of sense, usually when they are very simple and contain no pointers or references; otherwise structs should be avoided in favor of classes, to avoid writing copy/move constructors and to avoid concerns over performance optimizations.

With classes, only certain points in your code require that a duplicate copy be made of the class instance; the majority of code need only pass around a reference, which is the default behavior - easy and fast!

--rt
Feb 14 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 I agree. There are cases where structs make a lot of sense, 
 usually when they are very simple simple and contain no 
 pointers or references, otherwise structs should be avoided in 
 favor of classes to avoid doing copy/move constructors and to 
 avoid concerns over performance optimizations. With classes, 
 only certain points in your code require that a duplicate copy 
 be made of the class instance, the majority of code need only 
 to pass around a reference which is the default behavior - easy 
 and fast!

 --rt
It sounds like the Java philosophy: objects are always better. Or have I misunderstood?

In any case, intensive use of classes/objects instead of structs would also put an enormous load on the heap. I usually try to use as few classes as possible.
Feb 14 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 14 February 2013 at 10:23:18 UTC, Namespace wrote:
 I agree. There are cases where structs make a lot of sense, 
 usually when they are very simple simple and contain no 
 pointers or references, otherwise structs should be avoided in 
 favor of classes to avoid doing copy/move constructors and to 
 avoid concerns over performance optimizations. With classes, 
 only certain points in your code require that a duplicate copy 
 be made of the class instance, the majority of code need only 
 to pass around a reference which is the default behavior - 
 easy and fast!

 --rt
It sounds like Java philosophy: Objects are always better. Or have I misunderstood? In any case, a intensive use of classes / objects, instead of structs would be also an enormous heap effort. I usually try to use as few classes as possible.
It's a matter of balance. If you start having really complex objects (and big, e.g. > 100 bytes), then classes tend to scale better. If having a class is *really* too much overhead, but your objects start getting too big to pass around by value, you can just new them on the heap, and you'll get the "best" (or worst?) of both worlds.

Another good balance is stack-based struct pointer wrappers to an implementation: you can pass them by value, but they carry a complex payload. The advantage of doing this over a naked array is the static type. The *dis*-advantage is that D has no standard default initialization scheme (classes do, though). Most things in Phobos use this scheme.

The point I (we?) are trying to get across is that *usually* (not a hard rule) copying things in D is *expected* to be trivial and cheap. If this is not the case, then the tools you'll interface with will not work optimally.
Feb 14 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 Another good balance are stack based struct pointer wrappers to 
 implementation : You can pass them by value, but they carry a 
 complex payload.
I'm not sure what that is. Can you give a small example?
Feb 14 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 14 February 2013 at 10:44:22 UTC, Namespace wrote:
 Another good balance are stack based struct pointer wrappers 
 to implementation : You can pass them by value, but they carry 
 a complex payload.
I'm not sure what that is. Can you give a small example?
struct S
{
    static struct Payload
    {
        //Tons of data here
    }
    Payload* _p;

    //functions
}

RefCounted is implemented that way. Most of the containers are also implemented that way. Associative arrays are also implemented that way under the hood.
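
For instance, copying such a wrapper only copies the pointer (the constructor below is added just for the sketch):

struct S
{
    static struct Payload
    {
        int[1000] data;    // tons of data here
    }
    Payload* _p;

    this (int n) { _p = new Payload; }
}

void demo ()
{
    auto a = S (1);
    auto b = a;               // copies just the pointer
    assert (a._p is b._p);    // both now share the same payload
}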
Feb 14 2013
parent reply "Namespace" <rswhite4 googlemail.com> writes:
 struct S
 {
     static struct Payload
     {
         //Tons of data here
     }
     Payload* _p;

     //fonctions
 }

 Ref counted is implemented that way. most of the containers are 
 also implemented that way. associative arrays are also 
 implemented that way under the hood.
But you have to allocate '_p' again on the heap. I see no advantage over classes, except that these structures are just not null by default. Is that the advantage?
Feb 14 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 14 February 2013 at 10:58:19 UTC, Namespace wrote:
 struct S
 {
    static struct Payload
    {
        //Tons of data here
    }
    Payload* _p;

    //fonctions
 }

 Ref counted is implemented that way. most of the containers 
 are also implemented that way. associative arrays are also 
 implemented that way under the hood.
But you have to allocate '_p' again on the heap.
Well, yeah, that's the point. I'm not saying it's the best fit for everything. You are basically trading construction costs for copy costs.
 I see no advantage over classes, except that these structures 
 are just not null by default.
Actually (and IMO, this is a very big problem), these structures are *always* null by default. There is no easy way to "default initialize" structs in D :(
 Is that the advantage?
The advantage is deterministic RAII. With classes you only get non-deterministic RAII. For example: File. File will close the underlying FILE* when the last File is destroyed. But no later.
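
A tiny illustration of the deterministic part (the "resource" is just a flag here, not a real FILE*):

struct Handle
{
	static bool open;

	this (int) { open = true; }
	~this ()   { open = false; }    // runs exactly when the struct goes out of scope
}

unittest
{
	{
		auto h = Handle (0);
		assert (Handle.open);
	}                               // destructor has already run here, deterministically
	assert (!Handle.open);
}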
Feb 14 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, February 14, 2013 13:27:30 monarch_dodra wrote:
 Actually, (and IMO, this is a very big problem), these structures
 are *always* null by default. There is no easy way to "default
 initialize" structs in D :(
You mean there's no way to default-construct them. They're always default-initialized to their init value if you don't explicitly initialize them.

- Jonathan M Davis
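
For example (a minimal sketch):

struct S
{
	int x = 42;    // this becomes part of S.init
	int y;         // default-initialized to 0
}

unittest
{
	S s;                    // no constructor runs; s is a bit copy of S.init
	assert (s == S.init);
	assert (s.x == 42 && s.y == 0);
}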
Feb 14 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
14-Feb-2013 03:22, Ivan Kazmenko wrote:
 Hi!

 I'm learning to use D collections properly, and I'm looking for a sorted
 data structure with logarithmic access time (i.e., a binary search tree
 will do, but a hash table would not help). As far as I can see,
 std.container.RedBlackTree is exactly what I need. However, I am not
 sure if I use it as intended as its performance seems inferior to a C++
 STL solution (e.g., std::set).

 To be more specific, right now I wonder what is the best (or intended)
 way to store an object in the RedBlackTree: should it be a class
 reference, or a struct (passed by value), or something quirkier like an
 integer pointing into an array or a simple pointer. The rest of my
 program suggested to use structs, but the whole thing turned out to be
 rather slow, and the profiler told me that these structs are being
 copied around much more than I anticipated.

 And so I wrote a minimalistic test program to check the number of copy
 (postblit) constructor calls. Here is the D version:
[snip]
 And the results are:
 D2 (DMD 2.059, -O):             11,389,556
I'd add: -release -inline, or it may not inline away temporary copies.

-- 
Dmitry Olshansky
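
I.e. something like (the source file name here is just an example):

dmd -O -release -inline rbtree_test.d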
Feb 14 2013
parent "Ivan Kazmenko" <gassa mail.ru> writes:
Dmitry Olshansky wrote:
 D2 (DMD 2.059, -O):             11,389,556
I'd add : -release -inline or it may not inline away temporary copies.
Thank you for the suggestion. However, the result after "-O -release -inline" is still 11,389,556.
Feb 14 2013
prev sibling parent reply "Namespace" <rswhite4 googlemail.com> writes:
If I use (as you do)
----
this(ref this) {
----
I get 0 as output. (dmd 2.062 beta1)
If I remove the 'ref' I get 11389556.
Feb 14 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Feb 2013 09:47:31 -0500, Namespace <rswhite4 googlemail.com>  
wrote:

 If I use (as you do)
 ----
 this(ref this) {
 ----
 I get 0 as output. (dmd 2.062 beta1)
 If I remove the 'ref' I get 11389556.
this(ref this) is not a postblit. That it would even compile is a bug. -Steve
Feb 14 2013
prev sibling parent "Ivan Kazmenko" <gassa mail.ru> writes:
 If I use (as you do)
 ----
 this(ref this) {
 ----
 I get 0 as output. (dmd 2.062 beta1)
 If I remove the 'ref' I get 11389556.
Ouch, sorry! That's a copy & paste bug I introduced when posting here. Actually, I use this (this), of course.

I tried this (ref this) at some point; it does indeed compile, surprisingly, but it's not what gets called on copying (and thus is of no use).
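
For reference, the form that actually is the postblit (a minimal sketch with a stand-in struct):

struct S
{
	static int postblit_counter;
	int payload;

	this (this)    // postblit: runs after the built-in bit copy
	{
		postblit_counter++;
	}
}

unittest
{
	S a;
	S b = a;       // triggers the postblit
	assert (S.postblit_counter == 1);
}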
Feb 14 2013