digitalmars.D.learn - std.container.RedBlackTree versus C++ std::set
- Ivan Kazmenko (112/112) Feb 13 2013 Hi!
- Ivan Kazmenko (11/20) Feb 13 2013 Well, when I just changed
- Jonathan M Davis (14/33) Feb 13 2013 The biggest performance problem is the GC. The nodes in the RBT are allo...
- Ivan Kazmenko (64/99) Feb 14 2013 Thank you for the suggestion, I'll check that. However, I don't
- Rob T (9/9) Feb 14 2013 When I look at the std.container source code, it seems that the
- Steven Schveighoffer (10/18) Feb 14 2013 If it was pass by ref, then rbt.insert(5) would not work. And in fact, ...
- Rob T (25/50) Feb 14 2013 You can make it work by implementing two insert functions, one
- Namespace (4/29) Feb 14 2013 There are discussions for such thing for almost a year - but
- Rob T (6/8) Feb 14 2013 I think its finally on the priority radar, but there's still
- Namespace (8/17) Feb 14 2013 Yes, not to mention
- Jonathan M Davis (18/38) Feb 14 2013 And Walter and Andrei both seem to think that having expensive postlbits...
- Dan (13/36) Feb 15 2013 When you say things like "Andrei was considering phasing out
- Namespace (4/10) Feb 15 2013 Make a new thread where Walter and Andrei may submit an official
- Jonathan M Davis (7/18) Feb 15 2013 They're not going to do that until they've decided, and AFAIK, they have...
- Namespace (4/14) Feb 15 2013 I did not mean the const& thing. This is hopeless and will
- Jonathan M Davis (15/31) Feb 15 2013 I know. And my answer is the same. There's nothing special about the con...
- Jonathan M Davis (14/45) Feb 15 2013 I should probably add that bringing up discussions on how to solve probl...
- Namespace (11/11) Feb 15 2013 Again: My intention was not const&.
- Jonathan M Davis (8/19) Feb 15 2013 They definitely agree that it's a problem. They just don't see it as hav...
- Namespace (2/7) Feb 15 2013 Yes as I said: I count the versions. :)
- Dan (48/72) Feb 16 2013 Well, that is a somewhat passive yet diplomatic stance. If the
- Jonathan M Davis (41/50) Feb 15 2013 Yeah, they're unlikely to say much until something has actually been dec...
- Steven Schveighoffer (4/10) Feb 15 2013 I think the plan was to *assume* postblits were inexpensive, not to
- Jonathan M Davis (10/20) Feb 15 2013 Yes, but as I recall, in discussions on const and postblit, Andrei made ...
- Dan (27/32) Feb 15 2013 What is the reason for the second objection - is just performance?
- Jonathan M Davis (9/11) Feb 15 2013 Yes, it's a big deal. Any place that it's part of an API, it's a big dea...
- Ivan Kazmenko (36/41) Feb 14 2013 Thank you all for assistance, I've finally pinned it down! It's
- bearophile (7/8) Feb 14 2013 This is great. The first step to solve a problem is to find it.
- monarch_dodra (3/11) Feb 14 2013 I've already begun investigating, and am now benching my solution.
- Namespace (4/4) Feb 14 2013 A small but nice intentioned advice of me is this article:
- Ivan Kazmenko (6/11) Feb 14 2013 Thanks for pointing that out. Indeed, PascalCase seems
- Steven Schveighoffer (10/45) Feb 14 2013 It is great that you found this!
- Ivan Kazmenko (6/13) Feb 14 2013 I am starting to understand the difficulty with ref being too
- Namespace (5/9) Feb 15 2013 Search for 'auto ref' or right after 'rvalue references' and read
- Steven Schveighoffer (11/15) Feb 14 2013 As I said elsewhere, the issue is D not having a good solution for
- Namespace (4/14) Feb 14 2013 And Jonathan said also that 'auto ref' is not the solution for
- monarch_dodra (10/51) Feb 15 2013 Also keep in mind that "a < b" is implemented as two calls to
- Steven Schveighoffer (16/19) Feb 15 2013 Huh?
- Dan (12/23) Feb 15 2013 Yes. But isn't opCmp still more work than just a
- Steven Schveighoffer (27/40) Feb 15 2013 It depends:
- monarch_dodra (5/10) Feb 15 2013 Genius!
- H. S. Teoh (16/29) Feb 15 2013 You know that this is the origin of C's strcmp, right? Ever wondered why
- Jonathan M Davis (4/20) Feb 15 2013 Except that you have to worry about overflow if you use subtraction, so ...
- Steven Schveighoffer (11/21) Feb 15 2013 Actually, I'm not as smart as you think ;)
- Steven Schveighoffer (5/6) Feb 15 2013 And further to that point, my generated assembly for a < b also is
- bearophile (5/12) Feb 15 2013 I remember a page about such problem in the old D wiki. I was
- Dan (26/28) Feb 15 2013 The compiler has to do something with "a
- monarch_dodra (9/20) Feb 15 2013 Right... sorry.
- Rob T (19/19) Feb 13 2013 You can check if disabling the GC just before the insert process
- FG (5/8) Feb 13 2013 How did you know? It was 3x in my case. :)
- Rob T (17/31) Feb 13 2013 Oh yes, I forgot to mention you can expect ~2 second additional
- Steven Schveighoffer (15/121) Feb 13 2013 I find the number of postblit calls excessive too. I will have to look ...
- monarch_dodra (31/34) Feb 13 2013 Keep in mind that C++ and D have very different philosophies
- Rob T (11/45) Feb 14 2013 I agree. There are cases where structs make a lot of sense,
- Namespace (5/15) Feb 14 2013 It sounds like Java philosophy: Objects are always better.
- monarch_dodra (18/34) Feb 14 2013 It's a matter of balance. If you start having really complex
- Namespace (2/5) Feb 14 2013 I'm not sure what that is.
- monarch_dodra (13/18) Feb 14 2013 struct S
- Namespace (4/16) Feb 14 2013 But you have to allocate '_p' again on the heap.
- monarch_dodra (11/29) Feb 14 2013 Well, yeah, that's the point. I'm not saying it's a best fit for
- Jonathan M Davis (4/7) Feb 14 2013 You mean there's no way to default-construct them. They're always defaul...
- Dmitry Olshansky (7/25) Feb 14 2013 I'd add :
- Ivan Kazmenko (3/7) Feb 14 2013 Thank you for the suggestion. However, the result after "-O
- Namespace (6/6) Feb 14 2013 If I use (as you do)
- Steven Schveighoffer (4/10) Feb 14 2013 this(ref this) is not a postblit. That it would even compile is a bug.
- Ivan Kazmenko (5/11) Feb 14 2013 Ouch, sorry! That's a copy & paste bug I introduced when posting
Hi!

I'm learning to use D collections properly, and I'm looking for a sorted data structure with logarithmic access time (i.e., a binary search tree will do, but a hash table would not help). As far as I can see, std.container.RedBlackTree is exactly what I need. However, I am not sure if I use it as intended as its performance seems inferior to a C++ STL solution (e.g., std::set).

To be more specific, right now I wonder what is the best (or intended) way to store an object in the RedBlackTree: should it be a class reference, or a struct (passed by value), or something quirkier like an integer pointing into an array or a simple pointer. The rest of my program suggested to use structs, but the whole thing turned out to be rather slow, and the profiler told me that these structs are being copied around much more than I anticipated. And so I wrote a minimalistic test program to check the number of copy (postblit) constructor calls. Here is the D version:

-----
import std.container;
import std.stdio;

immutable int LIMIT = 100000;

struct element
{
    static int postblit_counter;
    long x;

    int opCmp (ref element other)
    {
        return (x > other.x) - (x < other.x);
    }

    this (long nx)
    {
        x = nx;
    }

    this (ref this)
    {
        assert (x == x);
        postblit_counter++;
    }
}

alias RedBlackTree !(element) container;

void main ()
{
    auto f = new container ();
    element.postblit_counter = 0;
    foreach (i; 0..LIMIT)
    {
        f.insert (element (i));
    }
    writefln ("%s", element.postblit_counter);
}
-----

And now here is a C++ equivalent:

-----
#include <cstdio>
#include <set>
#include <stdint.h>

const int LIMIT = 100000;

using namespace std;

struct element
{
    static int postblit_counter;
    int64_t x;

    bool operator < (const element other) const
    {
        return (x < other.x);
    }

    element (int64_t nx)
    {
        x = nx;
    }

    element (const element & other)
    {
        postblit_counter++;
    }
};

int element::postblit_counter;

typedef set <element> container;

int main (void)
{
    container f;
    element::postblit_counter = 0;
    for (int i = 0; i < LIMIT; i++)
    {
        f.insert (element (i));
    }
    printf ("%d\n", element::postblit_counter);
    return 0;
}
-----

And the results are:
D2 (DMD 2.059, -O): 11,389,556
C++ (MinGW GCC 4.7.2, -O2): 3,072,387

As you can see, in order to insert 100,000 elements, D needs a few times more copy constructor calls than C++. However, as far as I know, the internal structure used by C++ std::set is the very same red-black tree! Am I doing something wrong? And if not, what is this cost paid for, are there any features that RedBlackTree possesses and STL set doesn't?

Personally, I don't see why at all we should call the copy constructor more than once per element. I mean, if we intend to build a generic data structure, we sure need an internal node object with some extra bytes (internal references and counters) per each element, at least in the case of a red-black tree. So why don't we just bind each element to that internal node once and for all, and then, as long as the node is in the structure, use the data contained in it only by reference? What do we achieve if we choose to use it by value somewhere?

And if the intended design is "use class references", will that create an overhead for garbage collector later on?

...well, thank you for reading it to this point.

-----
Ivan Kazmenko.
Feb 13 2013
P.S. More on C++ version:Personally, I don't see why at all we should call the copy constructor more than once per element. I mean, if we intend to build a generic data structure, we sure need an internal node object with some extra bytes (internal references and counters) per each element, at least in the case of a red-black tree. So why don't we just bind each element to that internal node once and for all, and then, as long as the node is in the structure, use the data contained in it only by reference? What do we achieve if we choose to use it by value somewhere?Well, when I just changed "bool operator < (const element other) const" to "bool operator < (const element & other) const" in the C++ version of the program, it gave me the exact 100,000 copy constructor calls which justifies the above paragraph. Ouch... I had hopes GCC -O2 optimizes such obvious cases; perhaps it's not that obvious from the compiler point of view. ----- Ivan Kazmenko.
Feb 13 2013
On Thursday, February 14, 2013 00:33:26 Ivan Kazmenko wrote:P.S. More on C++ version:The biggest performance problem is the GC. The nodes in the RBT are allocated with the GC, so depending on how many elements you're inserting, how often you're removing them, etc., it could have performance issues. A better GC is in the works but obviously hasn't made it in yet. Also, work is being done on designing custom allocators that std.container can use, but that also has a ways to go. So, the main items which would help make std.container's RBT perform on par with std::set are in the works but not ready yet. Also, it could make a big difference if you use gdc or ldc rather than dmd. dmd compiles code much faster than they do, but its optimizer is much worse - gdc and ldc consistently produce faster code, and if you're compiling the C++ code with gcc, then gdc is a much better comparison anyway, since then both the C++ and D code are using the same compiler backend. - Jonathan M DavisPersonally, I don't see why at all we should call the copy constructor more than once per element. I mean, if we intend to build a generic data structure, we sure need an internal node object with some extra bytes (internal references and counters) per each element, at least in the case of a red-black tree. So why don't we just bind each element to that internal node once and for all, and then, as long as the node is in the structure, use the data contained in it only by reference? What do we achieve if we choose to use it by value somewhere?Well, when I just changed "bool operator < (const element other) const" to "bool operator < (const element & other) const" in the C++ version of the program, it gave me the exact 100,000 copy constructor calls which justifies the above paragraph. Ouch... I had hopes GCC -O2 optimizes such obvious cases; perhaps it's not that obvious from the compiler point of view.
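[A minimal sketch of the GC point above - temporarily suspending collections around a burst of inserts, as is also suggested further down the thread. The container and the loop bound are just placeholders for the sketch:]

-----
import core.memory : GC;
import std.container : RedBlackTree;

void main()
{
    auto tree = new RedBlackTree!int();

    GC.disable();              // suspend collection cycles during the insert burst
    scope(exit) GC.enable();   // always re-enable, even if an exception is thrown

    foreach (i; 0 .. 100_000)
        tree.insert(i);
}
-----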
Feb 13 2013
First, thank you all for the replies! Jonathan M Davis wrote:Also, it could make a big difference if you use gdc or ldc rather than dmd.Thank you for the suggestion, I'll check that. However, I don't expect the GCC optimizer to reduce the number of calls in this specific test, since it did not kick in in the more obvious case as I mentioned. Right now, I am more concerned with the performance of the front-end (RedBlackTree): if it currently can't be tuned to perform fast enough, I'll just know that I should implement my own version of a binary search tree container for my needs. Rob T wrote:You can check if disabling the GC just before the insert process improves the performance. You may see 3x performance improvement. Disabling is safe provided you re-enable, this can be done reliably with scope(exit) or something similar.Hmm, it indeed performed faster, though not 3x in my setup. However, the number of copy constructor calls stayed the same. Steven Schveighoffer wrote:I find the number of postblit calls excessive too. I will have to look into why that happens. I can say that once an element is allocated, it is not moved or copied.Could it be easily enforced in the library? Something like this: when accessing data only for internal use in the data structure, use an interface that disables assignment? Can such cases be distinguished from the ones when the library user actively wants to copy data? Steven Schveighoffer wrote:I will note that std.container.RedBlackTree is a port of dcollections' RBTree implementation.Thank you for pointing that out. Perhaps I'll have to try the test with that implementation, too. monarch_dodra wrote:Keep in mind that C++ and D have very different philosophies regarding copy construction. ... The conclusion is that the comparison is not fair: D's pass by value is not *very* different from C++'s pass by ref (in amount of data copied).Well, perhaps I didn't make myself clear enough. The comparison I posted is not intended to be a fair comparison of languages in the general case! It is just my use case stripped to a minimal example that still shows the number of calls correctly. In the actual use case, I have something like ----- struct element { long value; int [] one; int [] two; this (this) { one = one.dup; two = two.dup; } } ----- Sure I could think of some other way to represent the data I need as an object, this one just seems the most intuitive. monarch_dodra wrote:If you *do* want a fair-er comparison, then I'd suggest you implement a ".dup" on your object, and see how many times THAT gets called. Guess what: 0.Right, but it's the copy constructor that gets called when I copy a struct, and in these cases, I actually want it to copy the arrays as well. Otherwise, the element object may end up in a bad state. Do you imply that I should implement dup for every struct I create and then leave the copy constructor empty? As the library makes use of the copy constructor and not dup, I fear it would be an incorrect design, since the library will make broken copies of my struct and pass them around. monarch_dodra wrote:I'm not saying D's approach is perfect, just that D's library is optimized for D-like types.and Rob T wrote:I agree. There are cases where structs make a lot of sense, usually when they are very simple simple and contain no pointers or references, otherwise structs should be avoided in favor of classes to avoid doing copy/move constructors and to avoid concerns over performance optimizations. 
With classes, only certain points in your code require that a duplicate copy be made of the class instance, the majority of code need only to pass around a reference which is the default behavior - easy and fast!So, you suggest passing things by reference once they grow beyond several bytes in size? Wouldn't that mean having to constantly pay attention to some counter-intuitive code when I actually want my objects to behave like values, not references (e.g., matrices or other complex math objects)? In my example case with arrays one and two, I do want my whole object to obey value semantics. Once again, thank you all for attention. ----- Ivan Kazmenko.
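[For illustration, a rough sketch of the two designs being weighed here - value semantics via a deep-copying postblit versus a class whose deep copies are explicit through a dup method. The type and field names are made up for the sketch:]

-----
// 1) value semantics: every copy deep-copies the arrays automatically
struct ElementV
{
    long value;
    int[] one, two;
    this(this) { one = one.dup; two = two.dup; }
}

// 2) reference semantics: copies are cheap references, and a deep copy
//    is only made when explicitly requested via dup
class ElementR
{
    long value;
    int[] one, two;

    ElementR dup() const
    {
        auto copy = new ElementR;
        copy.value = value;
        copy.one = one.dup;
        copy.two = two.dup;
        return copy;
    }
}

void main()
{
    auto a = ElementV(1, [1, 2], [3]);
    auto b = a;          // postblit runs: the arrays are duplicated here

    auto c = new ElementR;
    auto d = c;          // just another reference to the same object
    auto e = c.dup();    // deep copy only when explicitly asked for
}
-----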
Feb 14 2013
When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it. I expect performance will increase substantially even with the extra heap allocations. Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements. --rt
Feb 14 2013
On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it. I expect performance will increase substantially even with the extra heap allocations. Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types. Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar. -Steve
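[A tiny sketch of the lvalue-only restriction on ref described here; the function names are hypothetical:]

-----
void takeByRef(ref int x) { }
void takeByValue(int x) { }

void main()
{
    int i = 5;
    takeByRef(i);      // fine: i is an lvalue
    //takeByRef(5);    // error: the literal 5 is an rvalue and cannot bind to ref
    takeByValue(5);    // by value accepts both, at the cost of a copy
}
-----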
Feb 14 2013
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:You can make it work by implementing two insert functions, one with pass by ref and one with pass by value. The pass by value implementation simply calls the pass by ref version, and everything from there is pass by ref where it makes sense as an optimization. I understand the implementation difficulties, because we have multiple situations to consider and optimize for, such as simple POD which can often pass by value efficiently, but then we have complex structs that do not pass by value efficiently and should be passed by ref instead, but there's no way to detect this easily (I suppose you could try looking at the size and adjust the pass-by method somehow, but ...), and finally if you accept a class then you have different needs because a class is a pointer not a POD value, again I suppose you can detect if a class is being passed and adjust the implementation but that introduces more complexity ... In some ways this is where C++ got it right from a consistency POV, where a class is no different from a struct, so whatever works for a class also works for a struct. In D, structs and classes operate in fundamentally different ways making implementing generics more difficult to do.When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it. I expect performance will increase substantially even with the extra heap allocations. Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar. -SteveYes, I agree. --rt
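[A rough sketch of that two-function idea, with made-up names; a separately named ref entry point is used here to sidestep the question of overloading purely on ref-ness:]

-----
struct Container(T)
{
    private T[] payloads;          // stand-in storage for the sketch

    // hot path: the payload arrives by reference, so the only copy made is
    // the one that stores it in the container (the append below)
    void insertRef(ref T value)
    {
        payloads ~= value;
    }

    // convenience shim for rvalues such as c.insert(T(5)); the parameter is
    // an lvalue inside the function body, so it can be forwarded by ref
    void insert(T value)
    {
        insertRef(value);
    }
}

void main()
{
    Container!int c;
    int x = 3;
    c.insertRef(x);   // lvalue path, no extra copy
    c.insert(5);      // rvalue path, one copy at the call
}
-----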
Feb 14 2013
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:There are discussions for such thing for almost a year - but nothing has changed so far.When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it. I expect performance will increase substantially even with the extra heap allocations. Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types. Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar. -Steve
Feb 14 2013
On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:There are discussions for such thing for almost a year - but nothing has changed so far.I think it's finally on the priority radar, but there's still other things in more dire need to be resolved first, like @property and lack of shared library support, and probably something else too ... --rt
Feb 14 2013
On Thursday, 14 February 2013 at 20:20:59 UTC, Rob T wrote:On Thursday, 14 February 2013 at 19:56:33 UTC, Namespace wrote:Yes, not to mention:
- A package system for Phobos
- Half Float Datatypes
- JSON
- bug fixes and merging of new features in D1 (although no further development of D1 should be done since December 31)
and probably something else too.

There are discussions for such thing for almost a year - but nothing has changed so far.I think it's finally on the priority radar, but there's still other things in more dire need to be resolved first, like @property and lack of shared library support, and probably something else too ... --rt
Feb 14 2013
On Thursday, February 14, 2013 14:31:36 Steven Schveighoffer wrote:On Thu, 14 Feb 2013 13:45:30 -0500, Rob T <alanb ucora.com> wrote:And Walter and Andrei both seem to think that having expensive postblits is a design mistake. The range stuff sort of tries to support it with the move* primitives, but Andrei's been wanting to argue for just assuming that postblits are cheap. And Walter was actually arguing at one point for making it illegal (probably by getting rid of postblit all together - I don't remember the details). Plenty of the rest of us don't agree, but to some extent, it is true that having an expensive postblit is asking for it. On a related note, we still need a solution for dealing with const postblits (probably by introducing copy constructors - IIRC Andrei was considering phasing out postblits in favor of copy constructors, which would be unnecessary, but we do need a way to deal with deep copying const structs which hold reference types which need to be deep copied).When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it. I expect performance will increase substantially even with the extra heap allocations. Alternatively, you can implement your own version of RedBlackTree with pass by ref optimizations for insertion of struct elements.If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int. I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.Unfortunately, D is in this state where taking a ref parameter means strictly lvalue. So passing rvalues will not work. D does not have the notion that C++ has where const type& can accept both rvalue and lvalue. We desperately need something similar.auto ref is the current solution for templated functions, but we do need a more general solution. DIP25 got created in part due to discussions on that, but it probably needs to be fully sorted out before we can fully sort out the const& solution. - Jonathan M Davis
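[A minimal sketch of the const problem described above: the postblit needs to mutate the freshly blitted copy, which a const copy forbids. The commented-out line is the one a compiler rejects:]

-----
struct S
{
    int[] data;
    this(this) { data = data.dup; }   // the postblit fixes up the copy after the blit
}

void main()
{
    const S a = S([1, 2, 3]);
    S m = S([4, 5, 6]);
    S m2 = m;          // fine: a mutable copy can be modified after the blit
    //const S b = a;   // rejected: the copy is const, so the postblit cannot modify it
}
-----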
Feb 14 2013
On Thursday, 14 February 2013 at 20:53:26 UTC, Jonathan M Davis wrote:And Walter and Andrei both seem to think that having expensive postlbits is a design mistake. The range stuff sort of tries to support it with the move* primitives, but Andrei's been wanting to argue for just assuming that postblits are cheap. And Walter was actually arguing at one point for making it illegal (probably by getting rid of postblit all together - I don't remember the details). Plenty of the rest of us don't agree, but to some extent, it is true that having an expensive postblit is asking for it. On a related note, we still need a solution for dealing with const postblits (probably be introducing copy constructors - IIRC Andrei was considering phasing out postblits in favor of copy constructors, which would be unnecessary, but we do need a way to deal with deep copying const structs which hold reference types which need to be deep copied.When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward? Regarding replacing expensive postblits with copy constructors - how does that help? The expense will remain the same - if you need a transitive copy, you need a transitive copy. Thanks Dan
Feb 15 2013
When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?Make a new thread where Walter and Andrei may submit an official statement. There are certainly others interested, and this thread is probably not the most suitable place for it.
Feb 15 2013
On Friday, February 15, 2013 19:43:07 Namespace wrote:They're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made. - Jonathan M DavisWhen you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?Make a new thread where Walter and Andrei may submit an official staement. There are certainly other interested and this thread is probably not for more suitable.
Feb 15 2013
They're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made. - Jonathan M DavisI did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
Feb 15 2013
On Friday, February 15, 2013 21:16:51 Namespace wrote:I know. And my answer is the same. There's nothing special about the const& issue and how that's going. That's how it pretty much always goes. Until it becomes a priority for Walter, it'll go nowhere. Sometimes, if someone makes it a priority and implements it (creating a pull request for it), then that solves the problem, but sometimes that's still not enough, because Walter needs to sort out whether it's the direction that we want to go, which means making it a priority to sort that out, which means that the pull request sits around for a while and may or may not make it in. Bug fixes and more minor language changes do get merged all the time, but if a language change is significant enough, it generally requires that Walter spend some time on it and make a decision on it (even if he's not implementing the changes), and he's generally very slow about that if it's not near the top of his list of priorities. - Jonathan M DavisThey're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made. - Jonathan M DavisI did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
Feb 15 2013
On Friday, February 15, 2013 21:37:46 Jonathan M Davis wrote:On Friday, February 15, 2013 21:16:51 Namespace wrote:I should probably add that bringing up discussions on how to solve problems in the language can be of benefit, because they often result in good discussions that help lead toward a solution, and that can lead towards that solution ending up in the language (and depending on the discussion Andrei and/or Walter will get involved). But simply asking them about the state of things or essentially contronting them and trying to get them to give official statements on their plans doesn't generally work. If nothing else, they simply don't generally say what they're planning to do before they've actually decided on it. They might start discussions to discuss something that they're considering, but they're unlikely to state any kind of actual plans before they've actually decided, which generally means no official statements about what's coming or planned. - Jonathan M DavisI know. And my answer is the same. There's nothing special about the const& issue and how that's going. That's how it pretty much always goes. Until it becomes a priority for Walter, it'll go nowhere. Sometimes, if someone makes it a priority and implements it (creating a pull request for it), then that solves the problem, but sometimes that's still not enough, because Walter needs to sort out whether it's the direction that we want to go, which means making it a priority to sort that out, which means that the pull request sits around for a while and may or may not make it in. Bug fixes and more minor language changes do get merged all the time, but if a language change is significant enough, it generally requires that Walter spend some time on it and make a decision on it (even if he's not implementing the changes), and he's generally very slow about that if it's not near the top of his list of priorities.They're not going to do that until they've decided, and AFAIK, they haven't. Attempting to essentially confront them and get them to say what they plan to do generally doesn't get you anywhere - especially since they're unlikely to say much until they actually are planning to do it, and it's usually the case that no decision has been made. - Jonathan M DavisI did not mean the const& thing. This is hopeless and will certainly take some months or years. My post was based on "[...] probably by getting rid of postblit all together.".
Feb 15 2013
Again: My intention was not const&. And you're right, but there have been so many discussions about const& (since dmd 2.057; also in the last few days), and as with every discussion here, the topic changes after page 2. I'm also very sure that neither Walter nor Andrei see an (important) reason for something similar to const&, because they don't need it. And if you don't need something, the priority for such a thing is very low. So everything we can do (after that many requests and discussions) is to wait for what and when they will decide something. I count the versions.
Feb 15 2013
On Friday, February 15, 2013 22:41:24 Namespace wrote:Again: My intention was not const&.I know. What I'm saying applies in general.And you're right, but there was so many discussions about const& (since dmd 2.057; also in the last few days) and as every discussion here: after page 2 is the topic changed. I'm also very sure that neither Walter nor Andrei see a (important) reason for something similar as const& because they don't need it. And if you don't need something, the priority for such thing is very low. So everything we can do (after that much requests and discussions) is to wait what and when they will decide something. I count the versions.They definitely agree that it's a problem. They just don't see it as having as high a priority as you do. For instance, as far as ref-related problems go, the issue that DIP 25 covers is something that they consider to be a much bigger issue (since it deals with safe and SafeD). It'll be fixed though. It's just a question of how and when. - Jonathan M Davis
Feb 15 2013
They definitely agree that it's a problem. They just don't see it as having as high a priority as you do.For me it is not a priority (anymore). I use C++ again.It's just a question of how and when. - Jonathan M DavisYes as I said: I count the versions. :)
Feb 15 2013
On Friday, 15 February 2013 at 20:58:30 UTC, Jonathan M Davis wrote:I should probably add that bringing up discussions on how to solve problems in the language can be of benefit, because they often result in good discussions that help lead toward a solution, and that can lead towards that solution ending up in the language (and depending on the discussion Andrei and/or Walter will get involved). But simply asking them about the state of things or essentially confronting them and trying to get them to give official statements on their plans doesn't generally work. If nothing else, they simply don't generally say what they're planning to do before they've actually decided on it. They might start discussions to discuss something that they're considering, but they're unlikely to state any kind of actual plans before they've actually decided, which generally means no official statements about what's coming or planned. - Jonathan M Davis

Well, that is a somewhat passive yet diplomatic stance. If the suggestion is "get involved, suggest a fix or a solution and maybe you'll get your answers", that is reasonable. But for my part, I am not a language developer - just a language user and fan of D who is betting on D by using it for my livelihood.

Regarding postblits - you've mentioned several times that Andrei and Walter have discussed and have a solution in the works. Does it make sense to go do a DIP or something just to elicit feedback on a topic? You have some of the most helpful responses I've seen on the group - extremely detailed and accurate. Thanks. However, being in the know, you sometimes let slip a little bit here and there that causes me to give pause and at a minimum want to get more information. Like "And Walter was actually arguing at one point for making it illegal (probably by getting rid of postblit all together - I don't remember the details)". It is not unreasonable for users to want to know what is the future direction for a certain at-risk feature.

Regarding a suggestion for postblit/copy ctor to get things going, here is mine:
- Implement copy constructors for structs. Allow for overloading on 'this' and 'const this', but no qualifier on the constructor itself.
- Allow the user to disable copy constructors.
- Provide for a default copy constructor that does regular copies of fundamental types, shallow copies of arrays and associative arrays, and calls corresponding copy constructors (or postblits during some grace period).

In terms of migration path:
- Allow the struct designer to choose to use either postblit or copy constructor but not both. For the case of using them to provide deep copy semantics the approaches are different. For postblits you are asking "what fields do I need to copy" with the comfort of knowing all other fields were already copied with the blit. For copy constructors it is different because you have to think about all fields since a blit will not have happened first. To ease the transition, provide the ability for the copy constructor to call blitSourceIntoThis(source).
- Leave postblits alone. Allow them to continue as is and phase them out 1 or more years after successful implementation of copy constructors.
- Often the only purpose of the copy constructor is to do a deep copy. This could easily be provided by the compiler or phobos. Further, consider standardizing on the ".dup" and ".idup" conventions for this purpose.

Thanks
Dan
Feb 16 2013
On Friday, February 15, 2013 19:14:21 Dan wrote:When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?Yeah, they're unlikely to say much until something has actually been decided. Frequently, if you try and pin them down on stuff like that, you won't get an answer, especially if you ask out of the blue. But regardless, I wouldn't expect postblits to be gotten rid of (because that would break code). Rather, they would be discouraged, and newer code would be written to use copy constructors instead. But nothing has been decided for certain AFAIK.Regarding replacing expensive postblits with copy constructors - how does that help? The expense will remain the same - if you need a transitive copy, you need a transitive copy.The problem has nothing to do with const. The problem const and immutable. It's impossible to write a const or immutable postblit constructor, because postblit operates by doing the copy and then allowing you to change it, whereas const and immutable cannot be changed. You need to construct the copy as const or immutable, which means you need to make the changes when you make the copy, which basically means that you need a copy constructor. Efficiency comes in when arguing whether a postblit or copy constructor is appropriate in the first place or whether they can be implemented efficiently. Walter seems to be of the opinion that they should only be used in cases where they're actually efficient (i.e. you never use them for deep copying, only for handling stuff like ref-counting). He's much more in favor of COW. But much as he thinks that way, I don't think that there's any real risk of that being codified in the language at this point. Not only would it be very difficult to restrict postblit like that (if not impossible), but it would break tons of code (which Walter is very much against), and most everyone would scream (and several already did when he brought it up). Andrei likes the idea of basically requiring that ranges have O(1) postblits/copy constructors so that algorithms can assume that copying is cheap, but he's never decided that we're going that way for certain, and even if we did, it would just mean that we wouldn't write code which cared about the expense of copying ranges, and anyone who wrote ranges that were expensive to copy would take a performance hit. Honestly, I suspect that that's mostly where we are already. And given how ranges work, an expensive postblit doesn't really make sense anyway. No, I think that the only real concern here is if/when we'll get copy constructors so that we can copy const or immutable structs that require postblits or copy constructors. Walter and Andrei are very concerned about achieving and maintaining stability of the language and library, so any major breaking changes (especially to the language) would need to have very good reasons and would likely be discussed in detail in the newsgroup first. So, while I think that some of what they've said would be of some concern if D were still being designed (as opposed to being stabilized), given where we are, it's unlikely to be a concern. - Jonathan M Davis
Feb 15 2013
On Fri, 15 Feb 2013 13:14:21 -0500, Dan <dbdavidson yahoo.com> wrote:When you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?I think the plan was to *assume* postblits were inexpensive, not to disallow them. -Steve
Feb 15 2013
On Friday, February 15, 2013 16:53:36 Steven Schveighoffer wrote:On Fri, 15 Feb 2013 13:14:21 -0500, Dan <dbdavidson yahoo.com> wrote:Yes, but as I recall, in discussions on const and postblit, Andrei made a comment about how postblit hadn't really worked out and we need copy constructors (in which case, we'd move to normally using copy constructors as the norm). But I don't believe that Andrei suggested getting rid of postblit completely (due to the code breakage it would cause if nothing else), and while I think that Walter and Andrei would deal with the postblit issue very differently if they were to start over, there are no plans at present to get rid of it. - Jonathan M DavisWhen you say things like "Andrei was considering phasing out postblits..." I get nervous. Can we please have some comments from Andrei/Walter about what the plans are? I'm not asking for the ultimate solution - just to know the general direction and where this issue stands at present. Is there anything any of us can do to help move this forward?I think the plan was to *assume* postblits were inexpensive, not to disallow them.
Feb 15 2013
On Thursday, 14 February 2013 at 19:31:36 UTC, Steven Schveighoffer wrote:If it was pass by ref, then rbt.insert(5) would not work. And in fact, I wouldn't want things passed by ref if the element is int.

What is the reason for the second objection - is it just performance? Is performance really an issue? From this thread it looks like fundamental types by ref or value is not really a big deal in terms of performance. OTOH - as size and/or postblit execution gets expensive the cost *is* significant.

---------------------------
4 bytes: using cref_(int size) took 29[ms]
4 bytes: using inref(int size) took 29[ms]
4 bytes: using in___(int size) took 30[ms]
8 bytes: using cref_(int size) took 29[ms]
8 bytes: using inref(int size) took 28[ms]
8 bytes: using in___(int size) took 31[ms]
...
128 bytes: using cref_(int size) took 29[ms]
128 bytes: using inref(int size) took 29[ms]
128 bytes: using in___(int size) took 290[ms]
---------------------------

I have to admit, I did not consider expensive postblits when I designed it. Almost all my testing is with integer types.

For performance, it seems by ref should always be preferred in generic code because you cannot know the cost of postblit - or copy construction if that were a future solution for structs. Granted the no rvalue passing is a pain - but is it a big deal in library/generic code?

Thanks, Dan
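[The harness behind those numbers isn't shown; a rough sketch of how such a by-value versus by-ref comparison could be set up is below. Names, sizes, and the timing mechanism are made up for the sketch, and absolute numbers will differ:]

-----
import std.datetime : Clock;
import std.stdio : writefln;

// hypothetical payload whose size can be varied
struct Payload(size_t N)
{
    ubyte[N] bytes;
}

bool byValue(T)(T a, T b) { return a.bytes[0] <= b.bytes[0]; }
bool byRef(T)(ref T a, ref T b) { return a.bytes[0] <= b.bytes[0]; }

void main()
{
    enum iterations = 1_000_000;
    Payload!128 a, b;

    auto start = Clock.currTime();
    foreach (i; 0 .. iterations)
        byValue(a, b);
    writefln("by value: %s ms", (Clock.currTime() - start).total!"msecs");

    start = Clock.currTime();
    foreach (i; 0 .. iterations)
        byRef(a, b);
    writefln("by ref:   %s ms", (Clock.currTime() - start).total!"msecs");
}
-----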
Feb 15 2013
On Friday, February 15, 2013 20:12:08 Dan wrote:Granted the no rvalue passing is a pain - but is it a big deal in library/generic code?Yes, it's a big deal. Any place that it's part of an API, it's a big deal. It forces you to create what are essentially useless variables all over the place if you require ref. We _will_ have a solution similar to const& at some point, and that's the correct way to solve this problem, but that situation still needs to be sorted out (and the situation with DIP 25 - http://wiki.dlang.org/DIP25 - could have an impact on that given that it also has to do with changes to ref). - Jonathan M Davis
Feb 15 2013
Rob T wrote:When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.

Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration:

-----
alias RedBlackTree !(element) container;
-----

to

-----
bool lessfun (ref element a, ref element b)
{
    return a < b;
}

alias RedBlackTree !(element, lessfun) container;
-----

... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program.

This raises the question about the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments.

-----
Ivan Kazmenko.
Feb 14 2013
Ivan Kazmenko:Thank you all for assistance, I've finally pinned it down!This is great. The first step to solve a problem is to find it. The second step is to formalize it. So I suggest you to create a small benchmark that shows what you mean (it should not include RedBlackTree) and put it in Bugzilla, with some timings. Bye, bearophile
Feb 14 2013
On Thursday, 14 February 2013 at 20:44:13 UTC, bearophile wrote:Ivan Kazmenko:I've already begun investigating, and am now benching my solution. But filing issues never hurt anyone.Thank you all for assistance, I've finally pinned it down!This is great. The first step to solve a problem is to find it. The second step is to formalize it. So I suggest you to create a small benchmark that shows what you mean (it should not include RedBlackTree) and put it in Bugzilla, with some timings. Bye, bearophile
Feb 14 2013
A small but well-intentioned piece of advice from me is this article: http://dlang.org/dstyle.html "The names of user-defined types should be PascalCased, which is the same as camelCased except that the first letter is uppercase."
Feb 14 2013
Namespace wrote:A small but nice intentioned advice of me is this article: http://dlang.org/dstyle.html "The names of user-defined types should be PascalCased, which is the same as camelCased except that the first letter is uppercase."Thanks for pointing that out. Indeed, PascalCase seems convenient to use with D structs. On the other side, I still have aesthetic considerations against camelCase and prefer underscores_between_words for variables in the programs I write for personal use.
Feb 14 2013
On Thu, 14 Feb 2013 15:33:19 -0500, Ivan Kazmenko <gassa mail.ru> wrote:Rob T wrote:It is great that you found this! Again, the issue here is D has restrictions on ref parameters. There are no such restrictions on pass-by-value parameters. So by default, the predicate tries by-value. In this case, RedBlackTree is always assured of having lvalues for the predicate, so I can probably make the default use pass-by-ref. If you could file a bug, that would be most helpful. http://d.puremagic.com/issues/enter_bug.cgi?product=D -SteveWhen I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration: ----- alias RedBlackTree !(element) container; ----- to ----- bool lessfun (ref element a, ref element b) { return a < b; } alias RedBlackTree !(element, lessfun) container; ----- ... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program. This raises the question to the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments.
Feb 14 2013
Steven Schveighoffer wrote:Again, the issue here is D has restrictions on ref parameters. There are no such restrictions on pass-by-value parameters. So by default, the predicate tries by-value.I am starting to understand the difficulty with ref being too restrictive. Some kind of "const ref" in the language would have helped indeed. But it would probably cause some allocation difficulties for actual constants... or worse?In this case, RedBlackTree is always assured of having lvalues for the predicate, so I can probably make the default use pass-by-ref. If you could file a bug, that would be most helpful.Done! http://d.puremagic.com/issues/show_bug.cgi?id=9513
Feb 14 2013
I am starting to understand the difficulty with ref being too restrictive. Some kind of "const ref" in the language would have helped indeed. But it would probably cause some allocation difficulties for actual constants... or worse?Search for 'auto ref' or right after 'rvalue references' and read some of the discussions. It's instructive and then you can understand the whole topic maybe a little more. No thread comes to a solution and unfortunately the answer of Walter and Andrei are very very thin on the subject.
Feb 15 2013
On Thu, 14 Feb 2013 15:33:19 -0500, Ivan Kazmenko <gassa mail.ru> wrote:... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than beforeAs I said elsewhere, the issue is D not having a good solution for accepting both rvalues and lvalues by ref. Jonathan mentioned auto ref, but it's not exactly implemented correctly. In C++, it just takes all parameters as const T&, and that works for both lvalues and rvalues. The extra 200k copies are from the implementation taking all parameters by value. If we didn't do that, sans a working auto ref, things like tree.insert(element(5)) would fail to compile (cannot pass an rvalue by reference) -Steve
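[For template functions, at least, auto ref does accept both kinds of argument - a tiny sketch:]

-----
// auto ref only works on template functions: the compiler instantiates a
// by-ref version for lvalue arguments and a by-value version for rvalues
void take(T)(auto ref T x)
{
    // __traits(isRef, x) could be inspected here to see which variant was chosen
}

void main()
{
    int i = 5;
    take(i);  // lvalue: bound by reference
    take(5);  // rvalue: passed by value into its own instantiation
}
-----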
Feb 14 2013
As I said elsewhere, the issue is D not having a good solution for accepting both rvalues and lvalues by ref. Jonathan mentioned auto ref, but it's not exactly implemented correctly. In C++, it just takes all parameters as const T&, and that works for both lvalues and rvalues. The extra 200k copies are from the implementation taking all parameters by value. If we didn't do that, sans a working auto ref, things like tree.insert(element(5)) would fail to compile (cannot pass an rvalue by reference) -SteveAnd Jonathan said also that 'auto ref' is not the solution for non-template functions. Probably 'auto ref' will never work for non-template functions. So a general solution must be found first.
Feb 14 2013
On Thursday, 14 February 2013 at 20:33:21 UTC, Ivan Kazmenko wrote:Rob T wrote:Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!) as convenient as opCmp is for the implementation implementation, it is not always the best in terms of raw power. If you have can write a straight up less(S1, S2) function, such as "a.i < b.i", then using that can buy you a hefty amount of time.When I look at the std.container source code, it seems that the payload element is passed by value multiple times unnecessarily, so to minimize copy construction you'll have to implement element as a class and implement a dup function for it.Thank you all for assistance, I've finally pinned it down! It's the default magic "a < b" that devours the most copies of my precious structs! Once I change the declaration: ----- alias RedBlackTree !(element) container; ----- to ----- bool lessfun (ref element a, ref element b) { return a < b; } alias RedBlackTree !(element, lessfun) container; ----- ... I get only exactly 3 * LIMIT postblit constructor calls, which is 300,000 instead of 11,389,556. While I still want to find where the excess 200,000 calls originate from, this is definitely asymptotically better than before: O (n) versus O (n log n) calls. And it is now less of a bottleneck in my original program. This raises the question to the default behavior of std.functional.binaryFun. Sure, for integers, class references and small types, the current way is the best. On the other hand, for large structs and maybe some other complex objects that obey value semantics, it seems it would be beneficial to, given a string like "a < b", produce a function taking arguments by reference and not by value. Maybe there is some simple criterion (size of type greater than size_t - seems bad? type is a struct - better?) which can be used with static if to generate "(ref a, ref b)" instead of "(a, b)" version by default. Of course, all that could only work if there is a more detailed hierarchy of binary functions, at least a binaryFunConst taking const arguments. ----- Ivan Kazmenko.
Feb 15 2013
On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra gmail.com> wrote:Also keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b). HOWEVER, when you get to the point of executing a == b, it's implemented as !(a < b) && !(b < a), which could more efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0. In the case of red black tree however, the nature of traversal makes it so that you only have the redundant call once (when you have found the insertion spot, you already have called one half, you just have to call the other half). For dcollections, I don't accept a < b, I accept the int-returning compare function (which I think is pass by ref, so shouldn't have this problem). The std.container.RedBlackTree is a port of the same code, but adapted to the more phobos-style functions. -Steve
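[A small sketch of the rewrites being discussed, for a struct with a made-up key field:]

-----
struct S
{
    int x;
    int opCmp(ref const S other) const
    {
        return (x > other.x) - (x < other.x);
    }
}

void main()
{
    S a = S(1);
    S b = S(2);
    // the compiler lowers a < b to a single call: a.opCmp(b) < 0
    assert((a < b) == (a.opCmp(b) < 0));
    // the equivalence test a less-based container performs
    assert((!(a < b) && !(b < a)) == (a.opCmp(b) == 0));
}
-----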
Feb 15 2013
On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:On Fri, 15 Feb 2013 12:11:55 -0500, monarch_dodra <monarchdodra gmail.com> wrote:Yes. But isn't opCmp still more work than just a<b? It must be because it returns three states instead of two. So, I think the general warning on opCmp is warranted especially with structs. For structs we could use compile time reflection to provide automatic efficient opCmp behavior. Section 3.1 here outlines a way: https://github.com/patefacio/d-help/blob/master/doc/canonical.pdf Any comments appreciated. Thanks DanAlso keep in mind that "a < b" is implemented as two calls to "a.opCmp(b)", and "a.opCmp(b)" is itself usually implemented as two calls to "something < something else" (!)Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).HOWEVER, when you get to the point of executing a == b, it's implemented as !(a < b) && !(b < a), which could more efficiently be rewritten as a.opEquals(b) or a.opCmp(b) == 0.
Feb 15 2013
On Fri, 15 Feb 2013 13:02:39 -0500, Dan <dbdavidson yahoo.com> wrote:

> On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:
>> Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).
>
> Yes. But isn't opCmp still more work than just a<b?

It depends:

bool less(int a, int b) { return a < b; }

In assembly (pardon my ignorance of asm syntax):

ld a AX;
sub AX, b;
jneg L1;   // jump if AX is less than zero
ld AX 0;
ret;
L1:
ld AX 1;
ret;

With opcmp:

int opCmp(int a, int b) { return a - b; }

Which simplifies the assembly significantly. Note we don't have to
shoehorn the result into a bool (which must be 0 or 1).

For other cases, such as a complex struct, less might be more efficient.
And really, when it comes down to it, it all depends on how you use it.
If most of the time you are using a < b or b < a, then less certainly can
be more efficient (but not necessarily). But if you want to do <=, ==, or
>=, then opCmp can be more efficient.

-Steve
Feb 15 2013
On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer wrote:With opcmp: int opCmp(int a, int b) { return a - b; }Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Feb 15 2013
On Fri, Feb 15, 2013 at 11:49:53PM +0100, monarch_dodra wrote:On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer wrote:You know that this is the origin of C's strcmp, right? Ever wondered why the C standard says that it returns <0, 0, or >0, as opposed to fixed values like -1, 0, 1? Now you know why: it's because you could implement strcmp by repeatedly subtracting respective pairs of characters from the two strings, and return the result when it's no longer zero (in some CPUs' assembly language, this is a single branch-on-nonzero instruction). No need for extraneous overhead to convert the result of the comparison into fixed values of any sort. D simply inherited from the genius of C's original designers. :) Of course, this optimization may no longer be relevant in today's CPU's, due to pipelining, hazards, branch prediction, etc.. T -- If creativity is stifled by rigid discipline, then it is not true creativity.With opcmp: int opCmp(int a, int b) { return a - b; }Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Feb 15 2013
On Friday, February 15, 2013 23:49:53 monarch_dodra wrote:On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer wrote:Except that you have to worry about overflow if you use subtraction, so it's actually a bad idea in the general case. - Jonathan M DavisWith opcmp: int opCmp(int a, int b) { return a - b; }Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Feb 15 2013
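A subtraction-free integral comparison avoids the overflow problem
entirely; it is the same trick the thread's element.opCmp already uses
(cmpNoOverflow is an illustrative name):

-----
// Safe for the full range of long: the comparisons cannot wrap around,
// unlike `a - b`, which overflows for distant operands.
int cmpNoOverflow (long a, long b)
{
    return (a > b) - (a < b);   // yields -1, 0 or 1
}

unittest
{
    assert (cmpNoOverflow (long.min, long.max) < 0); // a - b would overflow here
    assert (cmpNoOverflow (3, 3) == 0);
    assert (cmpNoOverflow (7, 2) > 0);
}
-----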
On Fri, 15 Feb 2013 17:49:53 -0500, monarch_dodra <monarchdodra gmail.com> wrote:On Friday, 15 February 2013 at 22:06:55 UTC, Steven Schveighoffer wrote:Actually, I'm not as smart as you think ;) There needs to be some extra checking, possibly with the carry flag. For example: 0x70000000 - (-0x70000000) would be less than zero as a resulting integer. But it would work with shorts or bytes! In any case, the example was bad, the point that which is better depends on the situation is still correct. -SteveWith opcmp: int opCmp(int a, int b) { return a - b; }Genius! Now I feel bad for every time I've done integral opCmp with double ternary operators :(
Feb 15 2013
On Fri, 15 Feb 2013 18:41:51 -0500, Steven Schveighoffer <schveiguy yahoo.com> wrote:There needs to be some extra checking, possibly with the carry flag.And further to that point, my generated assembly for a < b also is similarly flawed, so it still may be better to do opCmp. -Steve
Feb 15 2013
Steven Schveighoffer:Actually, I'm not as smart as you think ;) There needs to be some extra checking, possibly with the carry flag. For example: 0x70000000 - (-0x70000000) would be less than zero as a resulting integer. But it would work with shorts or bytes!I remember a page about such problem in the old D wiki. I was unable to find it inside the new D wiki. Bye, bearophile
Feb 15 2013
On Friday, 15 February 2013 at 23:41:50 UTC, Steven Schveighoffer wrote:

> In any case, the example was bad, the point that which is better depends
> on the situation is still correct.

The compiler has to do something with "a<b". I would hope for a
fundamental type it can avoid the lowering and just do whatever assembly
is optimal. For a non-fundamental type, what is the case where
(a.opCmp(b)<0) is more optimal than a comparable (a.opLess(b))?

It is easy to create an inefficient implementation of opCmp that calls
"<" twice.

struct S {
  string s;
  int opCmp ( const ref S other ) {
    if ( s<other.s ) {
      return -1;
    } else if ( other.s<s ) {
      return 1;
    } else {
      return 0;
    }
  }
}

My point was, you can use compile time reflection to generate a suitable
opCmp that uses s.opCmp if it exists or does the long version of two
comparisons if not.

Thanks
Dan
Feb 15 2013
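One way such a reflection-generated opCmp might look is sketched below;
this is an illustration of the idea, not the code from the linked paper,
and genericOpCmp is a hypothetical name. It walks the fields via tupleof,
uses a member's own opCmp where one exists, and otherwise falls back to
the two-comparison form.

-----
int genericOpCmp (T) (ref T a, ref T b)
{
    foreach (i, _; a.tupleof)
    {
        static if (__traits(compiles, a.tupleof[i].opCmp(b.tupleof[i])))
        {
            // Member provides a three-way comparison: use it directly.
            int c = a.tupleof[i].opCmp(b.tupleof[i]);
            if (c != 0) return c;
        }
        else
        {
            // Fall back to the two-comparison form for this field.
            if (a.tupleof[i] < b.tupleof[i]) return -1;
            if (b.tupleof[i] < a.tupleof[i]) return 1;
        }
    }
    return 0;
}

struct Pair
{
    string s;   // string has no opCmp member; falls back to '<'
    int n;
    int opCmp (ref Pair other) { return genericOpCmp(this, other); }
}

unittest
{
    auto p = Pair("abc", 1);
    auto q = Pair("abd", 1);
    assert (p < q);
}
-----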
On Friday, 15 February 2013 at 17:42:30 UTC, Steven Schveighoffer wrote:

> Huh? a < b is implemented as a.opCmp(b) < 0, not two calls to a.opCmp(b).

Right... sorry. But still, calling opCmp is usually a bit more expensive
than a simpler function dedicated to giving you a < b. 99% of the time, I
agree that it doesn't matter much, and the gain in semantic power trumps
performance, but for a dedicated algorithm, e.g. sort, there is a
definite performance difference...

> HOWEVER, when you get to the point of executing a == b, it's implemented
> as !(a < b) && !(b < a), which could more efficiently be rewritten as
> a.opEquals(b) or a.opCmp(b) == 0.

And that.
Feb 15 2013
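For instance, with std.algorithm's sort the difference is just a matter
of which predicate you hand it; Rec and its key field below are
illustrative:

-----
import std.algorithm : sort;

struct Rec
{
    long key;
    long[7] padding;
    int opCmp (ref const Rec o) const { return (key > o.key) - (key < o.key); }
}

void sortRecs (Rec[] recs)
{
    // Goes through Rec.opCmp for every comparison:
    sort!"a < b"(recs);

    // Compares the key fields directly, skipping the three-way protocol:
    sort!((a, b) => a.key < b.key)(recs);
}
-----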
You can check if disabling the GC just before the insert process improves
the performance. You may see a 3x performance improvement. Disabling is
safe provided you re-enable; this can be done reliably with scope(exit)
or something similar.

import core.memory;
// ...
void main ()
{
    auto f = new container ();
    element.postblit_counter = 0;
    GC.disable;
    foreach (i; 0..LIMIT)
    {
        f.insert (element (i));
    }
    GC.enable;
    writefln ("%s", element.postblit_counter);
}

--rt
Feb 13 2013
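The scope(exit) variant mentioned above could look like this, reusing the
element, container and LIMIT declarations (and imports) from the earlier
test program:

-----
import core.memory;

void main ()
{
    auto f = new container ();
    element.postblit_counter = 0;

    GC.disable();
    scope(exit) GC.enable();   // re-enabled even on exception or early return

    foreach (i; 0..LIMIT)
    {
        f.insert (element (i));
    }
    writefln ("%s", element.postblit_counter);
}
-----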
On 2013-02-14 01:09, Rob T wrote:You can check if disabling the GC just before the insert process improves the performance. You may see 3x performance improvement. Disabling is safe provided you re-enable, this can be done reliably with scope(exit) or something similar.How did you know? It was 3x in my case. :) Well, even more but the program had to wait 2 seconds at the end to collect. With LIMIT at 10M, g++: 5.0s, gdc: 27.0s and 8.7s with GC.disable. Internal memory handling by containers - yeah, can't wait to see that happen!
Feb 13 2013
On Thursday, 14 February 2013 at 00:25:15 UTC, FG wrote:

> On 2013-02-14 01:09, Rob T wrote:
>> You can check if disabling the GC just before the insert process
>> improves the performance. You may see a 3x performance improvement.
>
> How did you know? It was 3x in my case. :) Well, even more but the
> program had to wait 2 seconds at the end to collect. With LIMIT at 10M,
> g++: 5.0s, gdc: 27.0s and 8.7s with GC.disable.

Oh yes, I forgot to mention you can expect a ~2 second additional delay
from re-enabling the GC.

How did I know? Read this ...
http://forum.dlang.org/thread/waarzqtfcxuzhzdelhtt forum.dlang.org

So it seems that we can get close to g++ performance with some manual
tweaking of the GC, which is good, but you have to know this and know
where to make the adjustments.

What we really need is a better GC with manual fine-tuning control over
how it operates, with an ability to get feedback from it to understand
where and when it may be doing something stupid, and even better than
that is if we could use D completely without the GC, which may be a
requirement for certain types of applications. Unfortunately, a certain
subset of D depends on there being a method of automated garbage
collection in place, so currently you cannot use all of D without a GC.

--rt
Feb 13 2013
On Wed, 13 Feb 2013 18:22:02 -0500, Ivan Kazmenko <gassa mail.ru> wrote:

> Hi!
>
> I'm learning to use D collections properly, and I'm looking for a sorted
> data structure with logarithmic access time (i.e., a binary search tree
> will do, but a hash table would not help). As far as I can see,
> std.container.RedBlackTree is exactly what I need. However, I am not
> sure if I use it as intended as its performance seems inferior to a C++
> STL solution (e.g., std::set).
>
> To be more specific, right now I wonder what is the best (or intended)
> way to store an object in the RedBlackTree: should it be a class
> reference, or a struct (passed by value), or something quirkier like an
> integer pointing into an array or a simple pointer. The rest of my
> program suggested to use structs, but the whole thing turned out to be
> rather slow, and the profiler told me that these structs are being
> copied around much more than I anticipated. And so I wrote a
> minimalistic test program to check the number of copy (postblit)
> constructor calls.
>
> Here is the D version:
>
> -----
> import std.container;
> import std.stdio;
>
> immutable int LIMIT = 100000;
>
> struct element
> {
>     static int postblit_counter;
>     long x;
>
>     int opCmp (ref element other)
>     {
>         return (x > other.x) - (x < other.x);
>     }
>
>     this (long nx)
>     {
>         x = nx;
>     }
>
>     this (ref this)
>     {
>         assert (x == x);
>         postblit_counter++;
>     }
> }
>
> alias RedBlackTree !(element) container;
>
> void main ()
> {
>     auto f = new container ();
>     element.postblit_counter = 0;
>     foreach (i; 0..LIMIT)
>     {
>         f.insert (element (i));
>     }
>     writefln ("%s", element.postblit_counter);
> }
> -----
>
> And now here is a C++ equivalent:
>
> -----
> #include <cstdio>
> #include <set>
> #include <stdint.h>
>
> const int LIMIT = 100000;
>
> using namespace std;
>
> struct element
> {
>     static int postblit_counter;
>     int64_t x;
>
>     bool operator < (const element other) const
>     {
>         return (x < other.x);
>     }
>
>     element (int64_t nx)
>     {
>         x = nx;
>     }
>
>     element (const element & other)
>     {
>         postblit_counter++;
>     }
> };
>
> int element::postblit_counter;
>
> typedef set <element> container;
>
> int main (void)
> {
>     container f;
>     element::postblit_counter = 0;
>     for (int i = 0; i < LIMIT; i++)
>     {
>         f.insert (element (i));
>     }
>     printf ("%d\n", element::postblit_counter);
>     return 0;
> }
> -----
>
> And the results are:
>
> D2 (DMD 2.059, -O): 11,389,556
> C++ (MinGW GCC 4.7.2, -O2): 3,072,387
>
> As you can see, in order to insert 100,000 elements, D needs a few times
> more copy constructor calls than C++. However, as far as I know, the
> internal structure used by C++ std::set is the very same red-black tree!
> Am I doing something wrong? And if not, what is this cost paid for, are
> there any features that RedBlackTree possesses and STL set doesn't?
>
> Personally, I don't see why at all we should call the copy constructor
> more than once per element. I mean, if we intend to build a generic data
> structure, we sure need an internal node object with some extra bytes
> (internal references and counters) per each element, at least in the
> case of a red-black tree. So why don't we just bind each element to that
> internal node once and for all, and then, as long as the node is in the
> structure, use the data contained in it only by reference? What do we
> achieve if we choose to use it by value somewhere?

I find the number of postblit calls excessive too. I will have to look
into why that happens.

I can say that once an element is allocated, it is not moved or copied.
That is, the red black algorithm does not copy the data around inside the
tree, it rotates by adjusting pointers. I required this because it makes
cursors sane (they always point at the same element).

I will note that std.container.RedBlackTree is a port of dcollections'
RBTree implementation. As std.container's API is unfinished, there
certainly is room for improvement on the std.container version.
Dcollections' API is more complete, and may perform better. Specifically,
RBTree inside dcollections uses a custom allocator that should
significantly reduce runtime.

> And if the intended design is "use class references", will that create
> an overhead for garbage collector later on?

It is not designed only for classes, your usage above should be perfectly
valid.

-Steve
Feb 13 2013
On Wednesday, 13 February 2013 at 23:22:03 UTC, Ivan Kazmenko wrote:

> Hi!
> -----
> Ivan Kazmenko.

Keep in mind that C++ and D have very different philosophies regarding
copy construction.

C++ has "strong ownership", so for example, whenever you copy a
string/vector, or pass it by value, the whole damn thing has to be
completely duplicated. It's so expensive that C++ has gone out of its way
to use pass-by-reference semantics, as well as (now) rvalue references.

D, on the other hand (being GC), has a "shallow ownership" philosophy:
basically, when you make a copy, nothing happens, apart from the binary
copy of the object. You want to pass a string by value? That's 8 bytes of
data. Period. The *most* complex CC I have *ever* seen in D merely
increments a reference counter. That's it. 90% (99%?) of the time, there
isn't even a postblit implemented.

Because of this, D has embraced pass by value. Not to mention, if you are
using classes, those are already pointer types. Why pass them by ref?

--------

The conclusion is that the comparison is not fair: D's pass by value is
not *very* different from C++'s pass by ref (in amount of data copied).
If you *do* want a fairer comparison, then I'd suggest you implement a
".dup" on your object, and see how many times THAT gets called. Guess
what: 0.

I'm not saying D's approach is perfect, just that D's library is
optimized for D-like types. Just the same, a lot of the STL is not
properly optimized for little POD types, as they get passed around by
ref, when their actual size is the same as the pointer referencing
them...
Feb 13 2013
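A sketch of that .dup suggestion applied to the thread's element struct
(dup and dup_counter are illustrative additions, and the comparison
operator is omitted for brevity):

-----
struct element
{
    static int postblit_counter;
    static int dup_counter;
    long x;

    this (long nx) { x = nx; }
    this (this) { postblit_counter++; }

    // Explicit deep copy, analogous to what a C++ copy constructor pays for.
    element dup () const
    {
        dup_counter++;
        return element (x);
    }
}
-----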
On Thursday, 14 February 2013 at 06:56:38 UTC, monarch_dodra wrote:

> Keep in mind that C++ and D have very different philosophies regarding
> copy construction.
> [...]
> Just the same, a lot of the STL is not properly optimized for little POD
> types, as they get passed around by ref, when their actual size is the
> same as the pointer referencing them...

I agree. There are cases where structs make a lot of sense, usually when
they are very simple and contain no pointers or references; otherwise
structs should be avoided in favor of classes, to avoid writing copy/move
constructors and to avoid concerns over performance optimizations.

With classes, only certain points in your code require that a duplicate
copy be made of the class instance; the majority of code needs only to
pass around a reference, which is the default behavior - easy and fast!

--rt
Feb 14 2013
> I agree. There are cases where structs make a lot of sense, usually when
> they are very simple and contain no pointers or references; otherwise
> structs should be avoided in favor of classes [...]

It sounds like the Java philosophy: objects are always better. Or have I
misunderstood? In any case, an intensive use of classes / objects instead
of structs would also put enormous pressure on the heap. I usually try to
use as few classes as possible.
Feb 14 2013
On Thursday, 14 February 2013 at 10:23:18 UTC, Namespace wrote:

> It sounds like the Java philosophy: objects are always better. Or have I
> misunderstood? In any case, an intensive use of classes / objects
> instead of structs would also put enormous pressure on the heap. I
> usually try to use as few classes as possible.

It's a matter of balance. If you start having really complex objects (and
big, eg > 100 bytes), then classes tend to scale better.

If having a class is *really* too much overhead, but your objects start
getting too big to pass around by value, you can just new them on the
heap, and you'll get the "best" (or worst?) of both worlds.

Another good compromise is stack-based struct wrappers around a pointer
to the implementation: you can pass them by value, but they carry a
complex payload. The advantage of doing this over a naked array is the
static type. The *dis*-advantage is that D has no standard default
initialization scheme (classes do though). Most things in phobos use this
scheme.

The point I (we?) are trying to get across is that *usually* (not a hard
rule) copying things in D is *expected* to be trivial and cheap. If this
is not the case, then the tools you'll interface with will not work
optimally.
Feb 14 2013
> Another good compromise is stack-based struct wrappers around a pointer
> to the implementation: you can pass them by value, but they carry a
> complex payload.

I'm not sure what that is. Can you give a small example?
Feb 14 2013
On Thursday, 14 February 2013 at 10:44:22 UTC, Namespace wrote:

>> Another good compromise is stack-based struct wrappers around a pointer
>> to the implementation: you can pass them by value, but they carry a
>> complex payload.
>
> I'm not sure what that is. Can you give a small example?

struct S
{
    static struct Payload
    {
        //Tons of data here
    }

    Payload* _p;

    //functions
}

Ref counted is implemented that way. Most of the containers are also
implemented that way. Associative arrays are also implemented that way
under the hood.
Feb 14 2013
struct S { static struct Payload { //Tons of data here } Payload* _p; //fonctions } Ref counted is implemented that way. most of the containers are also implemented that way. associative arrays are also implemented that way under the hood.But you have to allocate '_p' again on the heap. I see no advantage over classes, except that these structures are just not null by default. Is that the advantage?
Feb 14 2013
On Thursday, 14 February 2013 at 10:58:19 UTC, Namespace wrote:Well, yeah, that's the point. I'm not saying it's a best fit for everything. You are basically trading construction costs for copy costs.struct S { static struct Payload { //Tons of data here } Payload* _p; //fonctions } Ref counted is implemented that way. most of the containers are also implemented that way. associative arrays are also implemented that way under the hood.But you have to allocate '_p' again on the heap.I see no advantage over classes, except that these structures are just not null by default.Actually, (and IMO, this is a very big problem), these structures are *always* null by default. There is no easy way to "default initialize" structs in D :(Is that the advantage?The advantage is deterministic RAII. With classes you only get non-deterministic RAII. For example: File. File will close the underlying FILE* when the last File is destroyed. But no later.
Feb 14 2013
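A compact sketch of that pattern, in the spirit of std.stdio.File but not
its actual implementation (Handle and Payload are hypothetical names):
copying the wrapper only bumps a counter, and the resource is released
exactly when the last copy is destroyed.

-----
import core.stdc.stdio : FILE, fopen, fclose;

struct Handle
{
    private static struct Payload
    {
        FILE* fp;
        int   refs = 1;
    }
    private Payload* _p;

    this (const(char)* name, const(char)* mode)
    {
        _p = new Payload(fopen(name, mode));
    }

    this (this)                 // copying is just a pointer copy plus a count bump
    {
        if (_p) ++_p.refs;
    }

    ~this ()                    // deterministic: runs when each copy dies
    {
        if (_p && --_p.refs == 0 && _p.fp)
            fclose(_p.fp);      // closes exactly once, at the last copy
    }
}
-----

A wrapper built this way releases its handle at the end of the scope of
its last copy, which a class destructor run by the GC cannot guarantee.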
On Thursday, February 14, 2013 13:27:30 monarch_dodra wrote:

> Actually, (and IMO, this is a very big problem), these structures are
> *always* null by default. There is no easy way to "default initialize"
> structs in D :(

You mean there's no way to default-construct them. They're always
default-initialized to their init value if you don't explicitly
initialize them.

- Jonathan M Davis
Feb 14 2013
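In code, the distinction looks like this (Wrapper is an illustrative
struct): every struct gets its .init value, but it cannot have a default
constructor that would, say, allocate its payload.

-----
struct Wrapper
{
    int* payload;
    int  flags = 7;            // per-field initializers are fine ...

    // this() { payload = new int; }   // ... but a default constructor is
                                       // not allowed for structs (compile error)
}

void main ()
{
    Wrapper w;                 // default-initialized to Wrapper.init
    assert (w.flags == 7);
    assert (w.payload is null); // the pointer starts out null, not allocated
}
-----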
On 14-Feb-2013 03:22, Ivan Kazmenko wrote:

> Hi!
> I'm learning to use D collections properly, and I'm looking for a sorted
> data structure with logarithmic access time (i.e., a binary search tree
> will do, but a hash table would not help).
> [snip]
> And the results are:
> D2 (DMD 2.059, -O): 11,389,556

I'd add: -release -inline, or it may not inline away temporary copies.

--
Dmitry Olshansky
Feb 14 2013
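For reference, the full set of flags on the dmd command line (bench.d
standing in for the test source file):

-----
dmd -O -release -inline bench.d
-----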
Dmitry Olshansky wrote:Thank you for the suggestion. However, the result after "-O -release -inline" is still 11,389,556.D2 (DMD 2.059, -O): 11,389,556I'd add : -release -inline or it may not inline away temporary copies.
Feb 14 2013
If I use (as you do) ---- this(ref this) { ---- I get 0 as output. (dmd 2.062 beta1) If I remove the 'ref' I get 11389556.
Feb 14 2013
On Thu, 14 Feb 2013 09:47:31 -0500, Namespace <rswhite4 googlemail.com> wrote:If I use (as you do) ---- this(ref this) { ---- I get 0 as output. (dmd 2.062 beta1) If I remove the 'ref' I get 11389556.this(ref this) is not a postblit. That it would even compile is a bug. -Steve
Feb 14 2013
If I use (as you do) ---- this(ref this) { ---- I get 0 as output. (dmd 2.062 beta1) If I remove the 'ref' I get 11389556.Ouch, sorry! That's a copy & paste bug I introduced when posting here. Actually, I use this (this) of course. I tried this (ref this) at some point, it does indeed compile surprisingly, but it's not what gets called on copying (and thus is of no use).
Feb 14 2013