digitalmars.D.learn - Efficient way to pass struct as parameter
- Tim Hsu (13/13) Jan 02 2018 I am creating Vector3 structure. I use struct to avoid GC.
- Jonathan M Davis (36/49) Jan 02 2018 When passing a struct to a function, if the argument is an rvalue, it wil...
- Igor Shirkalin (4/5) Jan 02 2018 Smart optimizer should think for you without any "auto" private
- Jonathan M Davis (9/14) Jan 02 2018 A smart optimizer may very well optimize out a number of copies. The fac...
- Seb (12/25) Jan 02 2018 If you want the compiler to ensure that a struct doesn't get
- Johan Engelen (18/31) Jan 02 2018 Pass the Vector3f by value.
- Adam D. Ruppe (3/4) Jan 02 2018 This is very frequently the correct answer to these questions!
- Jonathan M Davis (11/15) Jan 02 2018 It also makes for much cleaner code if you pretty much always pass by va...
- Tim Hsu (8/12) Jan 02 2018 However speed really matters for me. I am writing a path tracing
- H. S. Teoh (9/23) Jan 03 2018 That's why you need to use a profiler to find out where the hotspots
- Jacob Carlborg (5/6) Jan 03 2018 This is the correct answer. Never assume anything about performance
- H. S. Teoh (30/35) Jan 02 2018 +1.
- Patrick Schluter (3/8) Jan 03 2018 That's why I always tell that C++ is premature optimization
- Ali Çehreli (28/36) Jan 03 2018 In my earlier C++ days I've embarrassed myself by insisting that strings...
- Marco Leise (7/29) Jan 28 2018 May I add, this is also optimal performance-wise. The result
- Cecil Ward (25/38) Mar 15 2018 This isn't a question for you. it's a question for the compiler,
- Cecil Ward (2/5) Mar 15 2018 or even 'I' will be delighted to take a look.
- Cecil Ward (9/15) Mar 15 2018 Also link time optimisation and whole program optimisation might
I am creating a Vector3 structure. I use a struct to avoid the GC. However, a struct will be copied when passed as a parameter to a function.

    struct Ray {
        Vector3f origin;
        Vector3f dir;

        @nogc @system
        this(Vector3f *origin, Vector3f *dir) {
            this.origin = *origin;
            this.dir = *dir;
        }
    }

How can I pass a struct more efficiently?
Jan 02 2018
On Tuesday, January 02, 2018 18:21:13 Tim Hsu via Digitalmars-d-learn wrote:
> I am creating a Vector3 structure. I use a struct to avoid the GC. However, a struct will be copied when passed as a parameter to a function.
>
>     struct Ray {
>         Vector3f origin;
>         Vector3f dir;
>
>         @nogc @system
>         this(Vector3f *origin, Vector3f *dir) {
>             this.origin = *origin;
>             this.dir = *dir;
>         }
>     }
>
> How can I pass a struct more efficiently?

When passing a struct to a function, if the argument is an rvalue, it will be moved rather than copied, but if it's an lvalue, it will be copied. If the parameter is marked with ref, then the lvalue will be passed by reference and not copied, but rvalues will not be accepted (and unlike with C++, tacking on const doesn't affect that).

Alternatively, if the function is templated (and you can add empty parens to templatize a function if you want to), then an auto ref parameter will result in different template instantiations depending on whether the argument is an lvalue or rvalue. If it's an lvalue, then the template will be instantiated with that parameter as ref, so the argument will be passed by ref and no copy will be made, whereas if it's an rvalue, then the parameter will end up without having ref, so the argument will be moved.

If the function isn't templated and can't be templated (e.g. if it's a member function of a class and you want it to be virtual), then you'd need to overload the function with overloads that have ref and don't have ref in order to get the same effect (though the non-ref overload can simply forward to the ref overload). That does get a bit tedious though if you have several parameters.

If you want to guarantee that no copy will ever be made, then you will have to use either ref or a pointer, which could get annoying with rvalues (since you'd have to assign them to a variable) and could actually result in more copies, because it would restrict the compiler's ability to use moves instead of copies.

In general, the best way is likely going to be to use auto ref where possible and overload functions where not. Occasionally, there is talk of adding something similar to C++'s const& to D, but Andrei does not want to add rvalue references to the language, and D's const is restrictive enough that requiring const to avoid the copy would arguably be overly restrictive. It may be that someone will eventually propose a feature with semantics that Andrei will accept that acts similarly to const&, but it has yet to happen. auto ref works for a lot of cases though, and D's ability to do moves without a move constructor definitely reduces the number of unnecessary copies.

See also: https://stackoverflow.com/questions/35120474/does-d-have-a-move-constructor

- Jonathan M Davis
Jan 02 2018
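A minimal sketch of the options described in the post above: an auto ref template and a pair of ref / non-ref overloads. Vector3f is assumed to be three floats (the thread never shows its definition), and passedByRef and lengthSquared are hypothetical names used only for illustration.

```d
struct Vector3f
{
    float x, y, z; // assumed layout; not shown in the thread
}

// Templated with empty parens so that `auto ref` is allowed.
// `__traits(isRef, v)` reports whether this instantiation takes `v` by ref.
bool passedByRef()(auto ref const Vector3f v)
{
    return __traits(isRef, v);
}

// Non-templated alternative: overload on ref.
float lengthSquared(ref const Vector3f v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

// The non-ref overload accepts rvalues and forwards to the ref overload.
// Inside the body `v` is an lvalue, so the ref overload is selected.
float lengthSquared(const Vector3f v)
{
    return lengthSquared(v);
}

unittest
{
    auto v = Vector3f(1, 2, 3);
    assert(passedByRef(v));                  // lvalue: ref instantiation
    assert(!passedByRef(Vector3f(1, 2, 3))); // rvalue: non-ref instantiation
    assert(lengthSquared(v) == 14);
    assert(lengthSquared(Vector3f(1, 2, 3)) == 14);
}
```

The unittest block can be run with something like dmd -unittest -main -run on the file. With several parameters the overload approach needs one overload per ref / non-ref combination, which is the tedium mentioned in the post.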
On Tuesday, 2 January 2018 at 18:45:48 UTC, Jonathan M Davis wrote:
> [...]

A smart optimizer should think for you, without any "auto" private words, if the function is inlined. I mean the LDC compiler first of all.
Jan 02 2018
On Tuesday, January 02, 2018 19:27:50 Igor Shirkalin via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 18:45:48 UTC, Jonathan M Davis wrote:
>> [...]
>
> A smart optimizer should think for you, without any "auto" private words, if the function is inlined. I mean the LDC compiler first of all.

A smart optimizer may very well optimize out a number of copies. The fact that D requires that structs be moveable opens up all kinds of optimization opportunities - even more so when stuff gets inlined. However, if you want to guarantee that unnecessary copies aren't happening, you have to ensure that ref gets used with lvalues and does not get used with rvalues, and that tends to mean either using auto ref or overloading functions on ref.

- Jonathan M Davis
Jan 02 2018
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
> I am creating a Vector3 structure. I use a struct to avoid the GC. However, a struct will be copied when passed as a parameter to a function.
>
>     struct Ray {
>         Vector3f origin;
>         Vector3f dir;
>
>         @nogc @system
>         this(Vector3f *origin, Vector3f *dir) {
>             this.origin = *origin;
>             this.dir = *dir;
>         }
>     }
>
> How can I pass a struct more efficiently?

If you want the compiler to ensure that a struct doesn't get copied, you can disable its postblit:

    @disable this(this);

Now, there are a couple of goodies in std.typecons like RefCounted or Unique that allow you to pass a struct around without needing to worry about memory allocation:

https://dlang.org/phobos/std_typecons.html#RefCounted
https://dlang.org/phobos/std_typecons.html#Unique

Example: https://run.dlang.io/is/3rbqpn

Of course, you can always roll your own allocator: https://run.dlang.io/is/uNmn0d
Jan 02 2018
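A small sketch of what disabling the postblit does in practice, again assuming a three-float Vector3f; sum and makeUnit are hypothetical helpers. Copies are rejected at compile time, while moving an rvalue out of a function still works.

```d
struct Vector3f
{
    float x, y, z;       // assumed layout; not shown in the thread
    @disable this(this); // no postblit: copying becomes a compile-time error
}

// Lvalues now have to be passed by ref (or by pointer).
float sum(ref const Vector3f v)
{
    return v.x + v.y + v.z;
}

// Returning an rvalue is a move, so this still compiles.
Vector3f makeUnit()
{
    return Vector3f(1, 0, 0);
}

unittest
{
    auto v = makeUnit(); // moved, not copied
    assert(sum(v) == 1);

    // An attempted copy does not compile:
    static assert(!__traits(compiles, { Vector3f a; Vector3f b = a; }));
}
```

This gives a guarantee rather than a speedup: whether avoiding the copy is actually faster is a separate question.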
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
> I am creating a Vector3 structure. I use a struct to avoid the GC. However, a struct will be copied when passed as a parameter to a function.
>
>     struct Ray {
>         Vector3f origin;
>         Vector3f dir;
>
>         @nogc @system
>         this(Vector3f *origin, Vector3f *dir) {
>             this.origin = *origin;
>             this.dir = *dir;
>         }
>     }
>
> How can I pass a struct more efficiently?

Pass the Vector3f by value.

There is not one best solution here: it depends on what you are doing with the struct, and how large the struct is. It depends on whether the function will be inlined. It depends on the CPU. And probably 10 other things.

Vector3f is a small struct (I'm guessing it's 3 floats?), pass it by value and it will be passed in registers. This "copy" costs nothing on x86, the CPU will have to load the floats from memory and store them in a register anyway, before it can write it to the target Vector3f, regardless of how you pass the Vector3f.

You can play with some code here: https://godbolt.org/g/w56jmA

Passing by pointer (ref is the same) has large downsides and is certainly not always fastest. For small structs and if copying is not semantically wrong, just pass by value.

More important: measure what bottlenecks your program has and optimize there.

- Johan
Jan 02 2018
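To make the two calling styles from the post above concrete, here is a sketch with a hypothetical dot product written both ways. Vector3f as three floats is an assumption, as in the post; whether the by-value arguments really end up in registers depends on the target ABI and the optimizer, so comparing the generated assembly of both versions is the way to find out.

```d
struct Vector3f
{
    float x, y, z; // assumed layout: three floats
}

static assert(Vector3f.sizeof == 12); // small enough that a copy is cheap

// By value: the callee works on its own copy of each argument.
float dotByValue(Vector3f a, Vector3f b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// By ref: the callee receives addresses and loads through them.
float dotByRef(ref const Vector3f a, ref const Vector3f b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

unittest
{
    auto a = Vector3f(1, 2, 3);
    auto b = Vector3f(4, 5, 6);
    assert(dotByValue(a, b) == 32);
    assert(dotByRef(a, b) == 32);
    assert(dotByValue(Vector3f(1, 2, 3), Vector3f(4, 5, 6)) == 32); // rvalues are fine by value
}
```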
On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
> Pass the Vector3f by value.

This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
Jan 02 2018
On Tuesday, January 02, 2018 22:49:20 Adam D. Ruppe via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>> Pass the Vector3f by value.
>
> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.

It also makes for much cleaner code if you pretty much always pass by value and then only start dealing with ref or auto ref when you know that you need it - especially if you're going to need to manually overload the function on refness.

But for better or worse, a lot of this sort of thing ultimately depends on what the optimizer does to a particular piece of code, and that's far from easy to predict given everything that an optimizer can do these days - especially if you're using ldc rather than dmd.

- Jonathan M Davis
Jan 02 2018
On Tuesday, 2 January 2018 at 22:49:20 UTC, Adam D. Ruppe wrote:
> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>> Pass the Vector3f by value.
>
> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.

However, speed really matters for me. I am writing a path tracing program. A Ray will be constructed millions of times during computation, and will be passed to functions to test intersection billions of times. After reading the comments here, it seems a ray should be passed by value to the intersection testing function. I am not sure if a ray is small enough to be passed by value. It needs some experiment.
Jan 02 2018
On Wed, Jan 03, 2018 at 07:02:28AM +0000, Tim Hsu via Digitalmars-d-learn wrote:
> On Tuesday, 2 January 2018 at 22:49:20 UTC, Adam D. Ruppe wrote:
>> On Tuesday, 2 January 2018 at 22:17:14 UTC, Johan Engelen wrote:
>>> Pass the Vector3f by value.
>>
>> This is very frequently the correct answer to these questions! Never assume ref is faster if speed matters - it may not be.
>
> However, speed really matters for me.

That's why you need to use a profiler to find out where the hotspots are. It may not be where you think it is.

> I am writing a path tracing program. A Ray will be constructed millions of times during computation, and will be passed to functions to test intersection billions of times. After reading the comments here, it seems a ray should be passed by value to the intersection testing function. I am not sure if a ray is small enough to be passed by value. It needs some experiment.

With modern CPUs with advanced caching, it may not always be obvious whether passing by value or passing by reference is better. Always use a profiler to be sure.

T

--
If blunt statements had a point, they wouldn't be blunt...
Jan 03 2018
On 2018-01-03 08:02, Tim Hsu wrote:
> It needs some experiment.

This is the correct answer. Never assume anything about performance before having tested it.

--
/Jacob Carlborg
Jan 03 2018
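One way to run the experiment mentioned above is a micro-benchmark with std.datetime.stopwatch. This is only a sketch: the Ray layout (two three-float vectors) is assumed, and hitByValue, hitByRef and compareCallStyles are hypothetical stand-ins for a real intersection test. A micro-benchmark complements a profiler run on the actual path tracer; it does not replace one.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;

struct Vector3f { float x, y, z; }      // assumed layout
struct Ray { Vector3f origin, dir; }    // 24 bytes under that assumption

// Placeholder "intersection tests"; a real test would do far more work.
float hitByValue(Ray r) { return r.origin.x + r.dir.z; }
float hitByRef(ref const Ray r) { return r.origin.x + r.dir.z; }

// Times `iterations` calls of each style and returns both durations:
// index 0 is by value, index 1 is by ref.
Duration[2] compareCallStyles(uint iterations)
{
    auto ray = Ray(Vector3f(0, 0, 0), Vector3f(0, 0, 1));
    float sink = 0; // accumulate results so the calls are harder to discard

    void byValue() { sink += hitByValue(ray); }
    void byRef()   { sink += hitByRef(ray); }

    return benchmark!(byValue, byRef)(iterations);
}
```

Calling compareCallStyles from main and printing both durations with writeln gives a first impression. The numbers only mean something in an optimized build, and with functions this small the optimizer may inline both versions into identical code.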
On Tue, Jan 02, 2018 at 10:17:14PM +0000, Johan Engelen via Digitalmars-d-learn wrote:
[...]
> Passing by pointer (ref is the same) has large downsides and is certainly not always fastest. For small structs and if copying is not semantically wrong, just pass by value.

+1.

> More important: measure what bottlenecks your program has and optimize there.
[...]

It cannot be said often enough: premature optimization is the root of all evil. It makes your code less readable, less maintainable, more bug-prone, and makes you spend far too much time and energy fiddling with details that ultimately may not even matter, and worst of all, it may not even be a performance win in the end, e.g., if you end up with CPU cache misses / excessive RAM roundtrips because of too much indirection, where you could have passed the entire struct in registers.

When it comes to optimization, there are 3 rules: profile, profile, profile. I used to heavily hand-"optimize" my code a lot (I come from a strong C/C++ background -- premature optimization seems to be a common malady among us in that crowd). Then I started using a profiler, and I suddenly had that sinking realization that all those countless hours of tweaking my code to be "optimal" were wasted, because the *real* bottleneck was somewhere else completely.

From many such experiences, I've learned that (1) the real bottleneck is rarely where you predict it to be, and (2) most real bottlenecks can be fixed with very simple changes (sometimes even a 1-line change) with very big speed gains, whereas (3) fixing supposed "inefficiencies" that aren't founded on real evidence (i.e., using a profiler) usually costs many hours of time, adds tons of complexity, and rarely gives you more than 1-2% speedups (and sometimes can actually make your code perform *worse*: your code can become so complicated the compiler's optimizer is unable to generate optimal code for it).

T

--
MSDOS = MicroSoft's Denial Of Service
Jan 02 2018
On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
> When it comes to optimization, there are 3 rules: profile, profile, profile. I used to heavily hand-"optimize" my code a lot (I come from a strong C/C++ background -- premature optimization seems to be a common malady among us in that crowd).

That's why I always say that C++ is premature optimization oriented programming, aka POOP.
Jan 03 2018
On 01/03/2018 10:40 AM, Patrick Schluter wrote:
> On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
>> When it comes to optimization, there are 3 rules: profile, profile, profile. I used to heavily hand-"optimize" my code a lot (I come from a strong C/C++ background -- premature optimization seems to be a common malady among us in that crowd).
>
> That's why I always say that C++ is premature optimization oriented programming, aka POOP.

In my earlier C++ days I've embarrassed myself by insisting that strings should be passed by reference for performance reasons. (No, I had not profiled.) Then I learned more and always returned vectors (and maps) by value from producer functions:

    vector<int> makeInts(some param) {
        // ...
    }

That's how it should be! :)

I used the same function when interviewing candidates (apologies to all; I don't remember good things about my interviewing other people; I hope I will never interview people like that anymore). They would invariably write a function something like this:

    void makeInts(vector<int> & result, some param) {
        // ...
    }

And that's wrong because it raises big questions: what do you require of, or do with, the reference parameter 'result'? Would you clear it first? If not, shouldn't the function be named appendInts? If you cleared it upfront, would you still be happy if an exception was thrown inside the function, etc.

That's why I like producer functions that return values:

    vector<int> makeInts(some param) {
        // ...
    }

And if they can be 'pure', D allows them to be used to initialize immutable variables as well. Pretty cool! :)

Ali
Jan 03 2018
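A sketch of the last point in D: a pure producer function whose result initializes an immutable variable. The name makeInts follows the post; the size_t parameter and the array contents are made up for illustration. Note that this version allocates with the GC, so it demonstrates the language rule rather than a GC-free design.

```d
// A pure producer: the result depends only on `n`, and the returned array
// is freshly allocated, so nothing else can hold a mutable reference to it.
int[] makeInts(size_t n) pure
{
    auto result = new int[](n);
    foreach (i, ref e; result)
        e = cast(int) i * 2; // fill with 0, 2, 4, ...
    return result;
}

unittest
{
    // The mutable int[] result converts implicitly to immutable because
    // the function is pure and its parameter has no mutable indirections.
    immutable int[] evens = makeInts(4);
    assert(evens == [0, 2, 4, 6]);

    // The same function still serves callers that want a mutable array.
    int[] scratch = makeInts(2);
    scratch[0] = 42;
    assert(scratch == [42, 2]);
}
```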
On Wed, 3 Jan 2018 10:57:13 -0800, Ali Çehreli <acehreli yahoo.com> wrote:
> On 01/03/2018 10:40 AM, Patrick Schluter wrote:
>> On Tuesday, 2 January 2018 at 23:27:22 UTC, H. S. Teoh wrote:
>>> When it comes to optimization, there are 3 rules: profile, profile, profile. I used to heavily hand-"optimize" my code a lot (I come from a strong C/C++ background -- premature optimization seems to be a common malady among us in that crowd).
>>
>> That's why I always say that C++ is premature optimization oriented programming, aka POOP.
>
> […]
> That's why I like producer functions that return values:
>
>     vector<int> makeInts(some param) {
>         // ...
>     }
>
> And if they can be 'pure', D allows them to be used to initialize immutable variables as well. Pretty cool! :)
>
> Ali

May I add, this is also optimal performance-wise. The result variable will be allocated on the caller's stack and the callee writes directly to it. So even POOPs like me do it.

--
Marco
Jan 28 2018
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
> I am creating a Vector3 structure. I use a struct to avoid the GC. However, a struct will be copied when passed as a parameter to a function.
>
>     struct Ray {
>         Vector3f origin;
>         Vector3f dir;
>
>         @nogc @system
>         this(Vector3f *origin, Vector3f *dir) {
>             this.origin = *origin;
>             this.dir = *dir;
>         }
>     }
>
> How can I pass a struct more efficiently?

This isn't a question for you, it's a question for the compiler; let the compiler do its thing. Stick in a -O3 if you are using GCC or LDC and build in release mode, not debug mode. Make sure that the compiler can see the source code of the implementation of the constructor wherever it is used, and it should just be inlined away to nonexistence. Your constructor should not even exist; if it does, then you are looking at a false picture or a mistake where optimisation has been turned off for the sake of easy source-level debugging.

Post up the assembler language output for a routine where this code is used in some critical situation, and then we can help make sure that the code _generation_ is optimal.

I reiterate, unless something is badly wrong or you are seeing a red herring, no 'call' to the constructor code should even exist in the cases where it is actually 'called'. You may well see a useless copy of the constructor code, because the compilers seem to generate one even though it is never called and so is a waste of space. The compiler will analyse the constructor's instructions and just copy-and-paste them as assignment statements, with that then getting thoroughly optimised down into something which may just be a memory write or a register-register copy that costs zero.

If you post up snippets of generated asm then U will be delighted to take a look.
Mar 15 2018
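For anyone who wants to follow the advice above and look at the generated code, here is a small test case to compile. It is a sketch under assumptions: Vector3f is taken to be three floats, the constructor takes its arguments by value instead of by pointer, and originPlusDirX is a made-up function that exists only so there is a call site to inspect.

```d
struct Vector3f { float x, y, z; } // assumed layout

struct Ray
{
    Vector3f origin;
    Vector3f dir;

    // By-value variant of the constructor from the original post.
    @nogc @safe pure nothrow
    this(Vector3f origin, Vector3f dir)
    {
        this.origin = origin;
        this.dir = dir;
    }
}

// Hypothetical call site: look for (the absence of) a call to Ray's
// constructor in the assembly generated for this function.
float originPlusDirX(Vector3f o, Vector3f d) @nogc
{
    auto r = Ray(o, d);
    return r.origin.x + r.dir.x;
}

// With LDC, an optimized build that also emits assembly can be produced
// with something like (ray.d is a placeholder file name):
//   ldc2 -O3 -release -output-s ray.d
```

If the constructor has been inlined as described, the assembly for originPlusDirX should contain no call instruction targeting it, although a standalone copy of the constructor may still be emitted.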
On Thursday, 15 March 2018 at 23:14:14 UTC, Cecil Ward wrote:
> On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
>> [...]
>
> U

or even 'I' will be delighted to take a look.
Mar 15 2018
On Thursday, 15 March 2018 at 23:15:47 UTC, Cecil Ward wrote:
> On Thursday, 15 March 2018 at 23:14:14 UTC, Cecil Ward wrote:
>> On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote:
>>> [...]
>>
>> U
>
> or even 'I' will be delighted to take a look.

Also, link-time optimisation and whole-program optimisation might be your friends if you are having problems because module boundaries mean that the compiler cannot see the source code of the constructor implementation in order to inline it and fully optimise it away to nothing. You certainly should have no problems if the code that uses the struct can see the struct definition's actual source text directly.
Mar 15 2018