digitalmars.D.learn - Taking arguments by value or by reference


Anonymouse <zorael gmail.com> writes:

I'm passing structs around (collections of strings) whose .sizeof 
returns 432.

The readme for 2.094.0 includes the following:

 This release reworks the meaning of `in` to properly support all
 those use cases. `in` parameters will now be passed by reference
 when optimal, [...]

 * Otherwise, if the type's size requires it, it will be passed
 by reference.
 Currently, types which are over twice the machine word size
 will be passed by reference, however this is controlled by the
 backend and can be changed based on the platform's ABI.
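
To make the size rule in that excerpt concrete, here is a minimal sketch. `Record` and `exceedsThreshold` are hypothetical names made up for illustration, and the check only mirrors the "over twice the machine word size" wording; the real decision is made by the backend and also depends on other properties of the type.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a struct that is a collection of strings.
struct Record
{
    string[27] fields; // each string is a pointer plus a length
}

// Size rule as worded in the changelog excerpt (an approximation only;
// the backend has the final say and may differ per platform).
enum bool exceedsThreshold(T) = T.sizeof > 2 * size_t.sizeof;

void main()
{
    writeln(Record.sizeof);           // 432 on a 64-bit target
    writeln(exceedsThreshold!Record); // true
    writeln(exceedsThreshold!int);    // false
}
```

By that wording, a 432-byte struct is far above the threshold, so an `in` parameter of such a type would presumably be passed by reference under `-preview=in`.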

However, I asked in #d a while ago and was told to always pass by 
value until it breaks, and only then resort to ref.

 [18:32:16] <zorael> at what point should I start passing my 
 structs by ref rather than by value? some are nested in others, 
 so sizeofs range between 120 and 620UL
 [18:33:43] <Herringway> when you start getting stack overflows
 [18:39:09] <zorael> so if I don't need ref for the references, 
 there's no inherent merit to it unless I get in trouble without 
 it?
 [18:39:20] <Herringway> pretty much
 [18:40:16] <Herringway> in many cases the copying is merely 
 theoretical and doesn't actually happen when optimized

I've so far just been using const parameters. What should I be 
using?

Oct 03 2020

Max Haughton <maxhaton gmail.com> writes:

On Saturday, 3 October 2020 at 23:00:46 UTC, Anonymouse wrote:
 I'm passing structs around (collections of strings) whose 
 .sizeof returns 432.

 The readme for 2.094.0 includes the following:

 This release reworks the meaning of in to properly support all 
 those use cases. in parameters will now be passed by reference 
 when optimal, [...]

 * Otherwise, if the type's size requires it, it will be passed 
 by reference.
 Currently, types which are over twice the machine word size 
 will be passed by
 reference, however this is controlled by the backend and can 
 be changed based
 on the platform's ABI.

 However, I asked in #d a while ago and was told to always pass 
 by value until it breaks, and only then resort to ref.

 [18:32:16] <zorael> at what point should I start passing my 
 structs by ref rather than by value? some are nested in 
 others, so sizeofs range between 120 and 620UL
 [18:33:43] <Herringway> when you start getting stack overflows
 [18:39:09] <zorael> so if I don't need ref for the references, 
 there's no inherent merit to it unless I get in trouble 
 without it?
 [18:39:20] <Herringway> pretty much
 [18:40:16] <Herringway> in many cases the copying is merely 
 theoretical and doesn't actually happen when optimized

 I've so far just been using const parameters. What should I be 
 using?

Firstly, the new `in` semantics are very new and possibly subtly
broken (take a look at the current thread in General).

Secondly, as to the more specific question of how to pass a big
struct around, it may be helpful to look at this quick godbolt
example (https://d.godbolt.org/z/nPvTWz). Pay attention to the
instructions writing to stack memory (or not). A struct that big
will be passed around on the stack; whether it gets copied or not
depends on the semantics of the struct etc.

The guiding principle for your function parameters should be
correctness. If I am passing a big struct around and want to take
ownership of it, I probably want to take it by value; if I want to
modify it, I should take it by reference (or by pointer, but don't
overcomplicate; notice that in the previous example they lower to
the same thing). If I just want to look at it, it should be taken
by const ref if possible (D const isn't the same as C++ const,
which may catch you out).
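
As a rough illustration of those three intents, here is a minimal sketch. The `Config` struct and the function names are hypothetical and not taken from the godbolt example.

```d
// Hypothetical struct standing in for "a big struct".
struct Config
{
    string name;
    string[8] tags;
}

// Takes ownership: the callee gets its own copy to keep or change.
Config renamed(Config c, string newName)
{
    c.name = newName; // only the local copy changes
    return c;
}

// Modifies the caller's struct in place.
void rename(ref Config c, string newName)
{
    c.name = newName;
}

// Read-only view: no copy, and no mutation through this reference.
size_t nameLength(const ref Config c)
{
    return c.name.length;
}

void main()
{
    auto original = Config("alpha");
    auto copy = renamed(original, "beta");
    assert(original.name == "alpha" && copy.name == "beta");

    rename(original, "gamma");
    assert(original.name == "gamma");
    assert(nameLength(original) == 5);
}
```

One difference from C++ worth keeping in mind here: D's `const` is transitive, so everything reachable through `c` in `nameLength` is read-only as well.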

Const-correctness is a rule to live by, especially with a big
unwieldy struct.

I would avoid the new `in` for now, but I would go with const ref
from what you've described so far.

Oct 03 2020

Anonymouse <zorael gmail.com> writes:

On Saturday, 3 October 2020 at 23:47:32 UTC, Max Haughton wrote:
 The guiding principle to your function parameters should be 
 correctness - if I am passing a big struct around, if I want to 
 take ownership of it I probably want to take it by value but if 
 I want to modify it I should take it by reference (or by 
 pointer but don't overcomplicate, notice in the previous 
 example they lower to the same thing). If I just want to look 
 at it, it should be taken by const ref if possible (D const 
 isn't the same as C++ const, this may catch you out).

 Const-correctness is a rule to live by especially with a big 
 unwieldy struct.

 I would avoid the new in for now, but I would go with const ref 
 from what you've described so far.

I mostly just want a read-only view of the struct, and whether a
copy was done or not is academic. However, profiling showed (what
I interpret as) a lot of copying being done in release builds
specifically.

https://i.imgur.com/JJzh4Zc.jpg

Naturally a situation where I need ref I'd use ref, and in the 
rare cases where it actually helps to have a mutable copy 
directly I take it mutable. But if I understand what you're 
saying, and ignoring --preview=in, you'd recommend I use const 
ref where I would otherwise use const?

Are there criteria I can go by when making this decision, or
does it always reduce to looking at the disassembly?

Oct 04 2020

Mathias LANG <geod24 gmail.com> writes:

On Sunday, 4 October 2020 at 14:26:43 UTC, Anonymouse wrote:
 [...]

 I mostly really only want a read-only view of the struct, and 
 whether a copy was done or not is academic. However, profiling 
 showed (what I interpret as) a lot of copying being done in 
 release builds specifically.

 https://i.imgur.com/JJzh4Zc.jpg

 Naturally a situation where I need ref I'd use ref, and in the 
 rare cases where it actually helps to have a mutable copy 
 directly I take it mutable. But if I understand what you're 
 saying, and ignoring --preview=in, you'd recommend I use const 
 ref where I would otherwise use const?

 Are there criteria I can go by when making this decision, 
 or does it always reduce to looking at the disassembly?

If the struct is expensive to copy, use `const ref`. But if you
do, you might end up with another set of problems. Aliasing is
one of them, and its dangers are discussed at length in the
thread about `-preview=in` in General. The other issue is that
`const ref` means you cannot pass rvalues. This is when people
usually turn towards `auto ref`. Unfortunately, it requires you
to use templates, which is not always possible.

So, in short: `auto ref const` if it's a template and aliasing is 
not a concern, `const ref` if the copy adds overhead, and add a 
`const` non-`ref` overload to deal with rvalues if needed. If you 
want to be a bit more strict, throwing `scope` in the mix is good 
practice, too.
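
A minimal sketch of those options follows. `Big`, `makeBig`, `boundByRef` and `payloadSize` are hypothetical names used only for illustration.

```d
// Hypothetical large struct.
struct Big
{
    ubyte[432] payload;
}

// Helper that produces an rvalue.
Big makeBig()
{
    return Big();
}

// Template: `auto ref` binds lvalues by reference and rvalues by value.
bool boundByRef(T)(auto ref const T value)
{
    return __traits(isRef, value);
}

// Non-template: `const ref` (with `scope`) for lvalues ...
size_t payloadSize(scope const ref Big value)
{
    return value.payload.length;
}

// ... plus a non-`ref` overload so that rvalues are accepted too.
size_t payloadSize(const Big value)
{
    // `value` is an lvalue here, so this forwards to the `ref` overload.
    return payloadSize(value);
}

void main()
{
    Big b;
    assert(boundByRef(b));          // lvalue: taken by reference
    assert(!boundByRef(makeBig())); // rvalue: taken by value

    assert(payloadSize(b) == 432);         // picks the `ref` overload
    assert(payloadSize(makeBig()) == 432); // picks the non-`ref` overload
}
```

The overload pair costs a little boilerplate per function, which is part of what makes `auto ref` attractive when a template is acceptable.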

----------

Now, about `-preview=in`: The aim of this switch is to address 
*exactly* this use case. While it is still experimental and I 
don't recommend using it in critical projects just yet, giving it 
a try should be straightforward and any feedback is appreciated.

What I mean by "should be straightforward" is that the only
thing `-preview=in` will complain about is `in ref` (it triggers
an error).

The main issue at the moment is that, if you use `dub`, you need
to have control over the dependencies to add a configuration, or
use `DFLAGS="-preview=in" dub` in order for it to work. I am
working on a fix for that right now.

For reference, this is what adapting code to use `-preview=in`
feels like in my project:
https://github.com/Geod24/agora/commit/a52419851a7e6e4ef241c4617ebe0c8cc0ebe5cc
You can see that I added it pretty much everywhere the type
`Hash` was used, because `Hash` is a 64-byte struct but I needed
to support rvalues.
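
As a small sketch of what an `in` parameter looks like at the call site: the `Hash` below is a hypothetical 64-byte stand-in, not the type from the linked project, and `isZero` and `makeHash` are made-up helpers. The code should compile with or without `-preview=in`; the switch only changes how the argument may be passed.

```d
// Hypothetical 64-byte struct, not the real `Hash` from the project above.
struct Hash
{
    ubyte[64] bytes;
}

static assert(Hash.sizeof == 64);

// `in` gives a read-only parameter; with `-preview=in` the compiler
// may pass it by reference, and rvalues are still accepted.
bool isZero(in Hash h)
{
    foreach (b; h.bytes)
    {
        if (b != 0)
            return false;
    }
    return true;
}

// Helper that returns an rvalue.
Hash makeHash(ubyte fill)
{
    Hash h;
    h.bytes[] = fill;
    return h;
}

void main()
{
    Hash zero;
    assert(isZero(zero));         // lvalue argument
    assert(!isZero(makeHash(1))); // rvalue argument works as well
}
```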

Oct 04 2020

Max Haughton <maxhaton gmail.com> writes:

On Sunday, 4 October 2020 at 14:26:43 UTC, Anonymouse wrote:
 On Saturday, 3 October 2020 at 23:47:32 UTC, Max Haughton wrote:
 The guiding principle to your function parameters should be 
 correctness - if I am passing a big struct around, if I want 
 to take ownership of it I probably want to take it by value 
 but if I want to modify it I should take it by reference (or 
 by pointer but don't overcomplicate, notice in the previous 
 example they lower to the same thing). If I just want to look 
 at it, it should be taken by const ref if possible (D const 
 isn't the same as C++ const, this may catch you out).

 Const-correctness is a rule to live by especially with a big 
 unwieldy struct.

 I would avoid the new in for now, but I would go with const 
 ref from what you've described so far.

 I mostly really only want a read-only view of the struct, and 
 whether a copy was done or not is academic. However, profiling 
 showed (what I interpret as) a lot of copying being done in 
 release builds specifically.

 https://i.imgur.com/JJzh4Zc.jpg

 Naturally a situation where I need ref I'd use ref, and in the 
 rare cases where it actually helps to have a mutable copy 
 directly I take it mutable. But if I understand what you're 
 saying, and ignoring --preview=in, you'd recommend I use const 
 ref where I would otherwise use const?

 Are there criteria I can go by when making this decision, 
 or does it always reduce to looking at the disassembly?

This is a skill you only really hone with experience, but it's
not too bad once you're used to it.

For a big struct, I would just stick to expressing what you want
it to *do* rather than how you want it to perform. If you want to
take ownership you basically have to take by value, but if you
(as you said) want a read-only view, definitely use const ref. If
I was reading your code, ref immediately tells me not to think
about ownership and const ref immediately tells me you just want
to look at the goods.

One thing I haven't mentioned so far is that some types have
non-trivial semantics when it comes to passing them around by
value (a postblit or copy constructor, for example), so if you are
writing generic code it is often best to avoid passing by value.
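
To illustrate what non-trivial semantics can look like, here is a minimal sketch with a hypothetical `Tracked` struct whose postblit counts copies. The generic helpers `byValue` and `byRef` are made up for this example.

```d
// Hypothetical type with a non-trivial copy: the postblit counts copies.
struct Tracked
{
    static int copies;
    int value;

    this(this)
    {
        ++copies;
    }
}

// Generic code taking its argument by value copies lvalue arguments.
int byValue(T)(T t)
{
    return t.value;
}

// Taking it by `const ref` does not copy.
int byRef(T)(const ref T t)
{
    return t.value;
}

void main()
{
    auto t = Tracked(7);
    Tracked.copies = 0;

    assert(byValue(t) == 7);
    assert(Tracked.copies == 1); // the postblit ran once for the copy

    assert(byRef(t) == 7);
    assert(Tracked.copies == 1); // no further copy was made
}
```

For a plain struct the copy is just a block of bytes, but in generic code `T` may be a type like this one, where every copy runs user code.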

Oct 04 2020

IGotD- <nise nise.com> writes:

On Saturday, 3 October 2020 at 23:00:46 UTC, Anonymouse wrote:
 I'm passing structs around (collections of strings) whose 
 .sizeof returns 432.

 The readme for 2.094.0 includes the following:

 This release reworks the meaning of in to properly support all 
 those use cases. in parameters will now be passed by reference 
 when optimal, [...]

 * Otherwise, if the type's size requires it, it will be passed 
 by reference.
 Currently, types which are over twice the machine word size 
 will be passed by
 reference, however this is controlled by the backend and can 
 be changed based
 on the platform's ABI.

 However, I asked in #d a while ago and was told to always pass 
 by value until it breaks, and only then resort to ref.

 [18:32:16] <zorael> at what point should I start passing my 
 structs by ref rather than by value? some are nested in 
 others, so sizeofs range between 120 and 620UL
 [18:33:43] <Herringway> when you start getting stack overflows
 [18:39:09] <zorael> so if I don't need ref for the references, 
 there's no inherent merit to it unless I get in trouble 
 without it?
 [18:39:20] <Herringway> pretty much
 [18:40:16] <Herringway> in many cases the copying is merely 
 theoretical and doesn't actually happen when optimized

 I've so far just been using const parameters. What should I be 
 using?

I don't agree with this, especially if the struct is 432 bytes.
It takes time and memory to copy such a structure. I always use
"const ref" when I pass structures because that's only a pointer.
Classes are references by themselves, so it's not applicable
there. I only use "ref" when I want to modify the contents.

However, there are some exceptions to this rule in D, as D
supports slice parameters. In this case you want a copy of the
slice of the array, often because the slice is cast from
something else. Basically, the array slice parameter becomes an
lvalue.
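
A minimal sketch of why passing a slice by value is cheap and usually what you want. The function name `fill` is hypothetical.

```d
// A slice is just a pointer and a length, so copying it is cheap.
static assert((int[]).sizeof == 2 * size_t.sizeof);

// The slice itself is copied; the elements it refers to are shared.
void fill(int[] data, int value)
{
    data[] = value;      // writes through to the caller's elements
    data = data[0 .. 1]; // rebinding the local slice is invisible outside
}

void main()
{
    int[] numbers = [1, 2, 3];
    fill(numbers, 9);
    assert(numbers == [9, 9, 9]);
    assert(numbers.length == 3);

    // Slicing a static array yields an rvalue slice, which a `ref`
    // parameter would not accept but a by-value parameter does.
    int[3] fixed;
    fill(fixed[], 4);
    assert(fixed == [4, 4, 4]);
}
```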

This copying of parameters to the stack is an abomination in
computer science, useful only in some cases but mostly not. It
would be best if the compiler itself could determine what is most
efficient. Nim does this, and it was suggested not long ago that
the "in" keyword should have a new life as such an optimization.
Is that the change that has entered in 2.094.0? Why wasn't this a
DIP?

I even see this in some C++ code where strings are passed by
value, which means that the string is copied, including a
possible memory allocation, which certainly slows things down.

Do not listen to people who say "pass everything by value",
because that is in general not ideal in imperative languages.

Oct 04 2020

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 4 October 2020 at 15:30:48 UTC, IGotD- wrote:
 I don't agree with this, especially if the struct is 432 bytes. 
 It takes time and memory to copy such a structure.

If the compiler chooses to inline the function (which happens 
quite frequently with optimizations turned on), no copy takes 
place regardless of how you write it if the compiler can see it 
is unnecessary.

Returning a struct by value rarely means a copy either, since the
compiler actually passes a pointer to where it wants it up front,
so it is constructed in place.
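
One way to observe this is a sketch with a hypothetical `Tracked` struct whose postblit counts copies. It assumes the simple case of a function that returns a single named local, where the compiler can build the result directly in the caller's storage.

```d
// Hypothetical struct whose postblit counts how often it is copied.
struct Tracked
{
    static int copies;
    int[100] data;

    this(this)
    {
        ++copies;
    }
}

// Returns a single named local by value.
Tracked build()
{
    Tracked t;
    t.data[] = 1;
    return t;
}

void main()
{
    auto result = build();
    assert(result.data[99] == 1);
    assert(Tracked.copies == 0); // no postblit ran for the return
}
```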

So "pass by value" in the language does not necessarily mean big
copies in the generated binary. That's why the IRC folks were
advising not to worry about it unless you see a problem coming up
that the profiler points to here.

Oct 04 2020
