digitalmars.D.learn - guidelines for parameter types

Dan (213/213) Dec 17 2012 Assume V is a non-template parameter type and v is a parameter of

Dan (3/3) Dec 17 2012 On Monday, 17 December 2012 at 20:46:27 UTC, Dan wrote:
bearophile (4/4) Dec 17 2012 Are in, out, scope, std.typecons.Nullable,

Dan (6/10) Dec 17 2012 Not at all - I cringe at the thought of dealing with those as

Ali Çehreli (109/172) Dec 17 2012 Thank you very much for doing the hard work on this. I find this kind of...

Dan (64/108) Dec 18 2012 Thanks - I will study it. I see that you have covered also in,

Ali Çehreli (181/257) Dec 18 2012 For convenience, here are the chapters and guidelines that are relevant:
H. S. Teoh (57/69) Dec 18 2012 It's not just about whether the function mutates something or not.

Dan (9/25) Dec 18 2012 Thanks! You and Ali have presented good examples where 'ref

"Dan" <dbdavidson yahoo.com> writes:

Assume V is a non-template parameter type and v is a parameter of 
that type for any function. Also assume T is a template parameter 
type and t is a parameter of that type for any function. Is the 
following table and set of guidelines below reasonable? What 
other guidelines do you use or would make sense to follow? I 
apologize if this is obvious/well known and I consider myself new 
to D, so please address any of my misconceptions. I'm finding 
that on the surface D sounds much simpler than it is, but if I 
can get a good set of guidelines it should all work out.

Thanks,
Dan

(COW means Copy on Write)

| convention              | what it means/when to use                 |
|-------------------------+-------------------------------------------|
| V v                     | V is primitive, dynamic arr, assoc array, |
|                         | COW, or known copy cheap (< 16 bytes)     |
|                         |                                           |
| const(V) v              | same as (V v), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |
|                         |                                           |
| immutable(V) v          | No need - for ensuring no local changes   |
|                         | prefer 'const(V) v'                       |
|                         |                                           |
| ref V v                 | Use only when mutation of v is required   |
|                         |                                           |
| ref const(V) v          | Indicate v will not be changed, accepts   |
|                         | {V, const(V), immutable(V)}               |
|                         |                                           |
| ref immutable(V) v      | No need - restrictive with no benefit     |
|                         | over 'ref const(V) v'                     |
|                         |                                           |
| V* v                    | Use only when mutation of v is required.  |
|                         | Prefer 'ref V v' unless null significant  |
|                         | or unsafe manipulations desired           |
|                         |                                           |
| const(V)* v             | Indicate v will not be changed,           |
|                         | accepts {V*, const(V)*, immutable(V)*}    |
|                         | still prefer ref unless null significant  |
|                         | or unsafe manipulations desired           |
|                         |                                           |
| immutable(V)* v         | No need - restrictive with no benefit     |
|                         | over 'const(V)* v'                        |
|                         |                                           |
| T t                     | T is primitive, dynamic array, or assoc   |
|                         | array (i.e. cheap/shallow copies). For    |
|                         | generic code no knowledge of COW or       |
|                         | cheapness so prefer 'ref T t'             |
|                         |                                           |
| const(T) t              | same as (T t), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |
|                         |                                           |
| immutable(T) t          | No need - for ensuring no local changes   |
|                         | prefer 'const(T) t'                       |
|                         |                                           |
| ref T t                 | Use only when mutation of t is required   |
|                         | prefer 'ref const(T) t' if mutation not   |
|                         | required                                  |
|                         |                                           |
| ref const(T) t          | Indicate t will not be changed, accepts   |
|                         | {T, const(T), immutable(T)} without copy  |
|                         |                                           |
| ref immutable(T) t      | No need - restrictive with no benefit     |
|                         | over 'ref const(T) t'                     |
|                         |                                           |
| auto ref T t            | Use only when mutation of t required and  |
|                         | want support of by value for rvalues      |
|                         | (May be obviated in the long run)         |
|                         |                                           |
| auto ref const(T) t     | Indicate t will not be changed, accepts   |
|                         | [lr]value {T, const(T), immutable(T)}     |
|                         | (May be obviated in the long run)         |
|                         |                                           |
| auto ref immutable(T) t | No need - restrictive with no benefit     |
|                         | over 'auto ref const(T) t'                |
|                         |                                           |
| T* t                    | Use only when mutation of t is required.  |
|                         | Prefer 'ref T t' unless null is           |
|                         | significant or dealing with unsafe code.  |
|                         |                                           |
| const(T)* t             | Prefer 'ref const(T) t' unless            |
|                         | null is significant or dealing with       |
|                         | unsafe code                               |
|                         |                                           |
| immutable(T)* t         | No need - restrictive with no benefit     |
|                         | over 'const(T)* t'                        |


*** Parameter Type Guidelines ***

  - Use pointers when null has specific intended meaning or the 
function wants unsafe code, otherwise prefer ref

  - Prefer const(T|V) to immutable(T|V) because const(T|V) is 
accepting of mutables and it ensures they are not mutated. 
immutable(T|V), on the other hand, only accepts immutables for 
types with aliasing and therefore makes the function less 
applicable. This eliminates 7 rows from consideration.

  - Always use const(T|V) when passing by ref if referred to 
instance is not mutated. For the non template case (i.e. V) not 
using const(V) means const and immutables can not be used as 
arguments. This unnecessarily reduces the application of the 
function. This is a debatable guideline for template types T, 
since the T being parameterized could be T = const(S), so it does 
not prevent the function from being called with const(S) or 
immutable(S). But the real problem with using just T instead of 
const(T) in the signature of a function that does not mutate t is 
that the developer reading the signature has no way of knowing that t 
will not be mutated without compiling and seeing if it breaks. It 
is as if important user information is missing. So 'foo(T)(T t)' 
or 'foo(T)(ref T t)' may both accept 'const(S) s', but only if 
the compiled code does not mutate s. But without showing that 
guarantee to the compiler and developer with signature like 
'foo(T)(const(T) t)' or 'foo(T)(ref const(T) t)' you can be 
setting yourself up for future problems. For example, if you go 
with 'foo(T)(ref T t)', in the future you might (accidentally) 
add a mutating call on t. Then all existing code that passed in 
const(S) would break and if you test with only mutables you might 
not see the errors. An example from phobos that violates this is 
formatValue when passing in a struct. Even though the argument is 
not modified (why would it be?), the signature has 'auto ref T 
val' instead of 'auto ref const(T) val'.

  - Prefer 'ref' to by value on all template parameters that are 
not primitives, dynamic arrays or associative arrays - since 
there is no knowledge of how expensive the copy will be. (this is 
a guideline that is violated by SortedRange.(lowerBound, 
upperBound, trisect)).

  - When to use 'auto ref' template parameters: 'auto ref T t' as 
a parameter says - make one or two functions with different 
signatures and same body based on how it is called. If called 
with rvalue use 'T t', if called with lvalue use 'ref T t'. From 
this thread: 
http://forum.dlang.org/thread/4F84D6DD.5090405 digitalmars.com?page=1 
it sounds like this will no longer be necessary since in the 
future a single signature of 'ref T t' will support both lvalue 
and rvalue. So, for now it is best to start with 'ref T' instead 
of 'auto ref T' unless there is really an interim need for 
support of passing literals/rvalues into the function. One 
downside to 'auto ref' is it has the power to combinatorially 
increase the number of instantiations of each function. The 
upside is it allows rvalues to be passed in until the ultimate 
solution is implemented. In generic code, if you don't mind 
requiring lvalues of users, don't bother with auto parameters at 
all and stick with 'ref'.
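
As a rough sketch of the 'auto ref' behavior described in the last guideline: the struct S and the function passedByRef below are made-up names for illustration, not anything from Phobos. The function only reports which of the two generated versions was picked, using __traits(isRef, ...).

```d
struct S { int x; }

// 'auto ref' yields a by-reference instantiation for lvalue arguments
// and a by-value instantiation for rvalue arguments.
// __traits(isRef, t) reports which one this instantiation is.
bool passedByRef(T)(auto ref const(T) t)
{
    return __traits(isRef, t);
}

void main()
{
    S s = S(1);
    const S cs = S(2);

    assert(passedByRef(s));       // mutable lvalue: bound by reference
    assert(passedByRef(cs));      // const lvalue: bound by reference
    assert(!passedByRef(S(3)));   // rvalue: passed by value
}
```

Each ref/non-ref combination is a separate instantiation, which is where the combinatorial growth mentioned above comes from once a function has several 'auto ref' parameters.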

Dec 17 2012

"Dan" <dbdavidson yahoo.com> writes:

On Monday, 17 December 2012 at 20:46:27 UTC, Dan wrote:
Sorry, here is the table more legible:
http://pastebin.com/0bFSL0Xz

Dec 17 2012

"bearophile" <bearophileHUGS lycos.com> writes:

Are in, out, scope, std.typecons.Nullable, 
std.typecons.Rebindable missing in your table?

Bye,
bearophile

Dec 17 2012

"Dan" <dbdavidson yahoo.com> writes:

On Monday, 17 December 2012 at 21:12:16 UTC, bearophile wrote:
 Are in, out, scope, std.typecons.Nullable, 
 std.typecons.Rebindable missing in your table?

 Bye,
 bearophile

Not at all - I cringe at the thought of dealing with those as 
well now. But if you want to give them a go as well I'd be glad 
to learn from your experience.

Thanks
Dan

Dec 17 2012

Ali Çehreli <acehreli yahoo.com> writes:

Thank you very much for doing the hard work on this. I find this kind of 
information very important.

On 12/17/2012 12:46 PM, Dan wrote:
 Assume V is a non-template parameter type and v is a parameter of that
 type for any function. Also assume T is a template parameter type and t
 is a parameter of that type for any function. Is the following table and
 set of guidelines below reasonable? What other guidelines do you use or
 would make sense to follow? I apologize if this is obvious/well known

I don't think this is well known at all. :) I have thought about these 
myself and came up with some guidelines at http://ddili.org/ders/d.en

To be honest, I already have doubts about some of those guidelines and 
I already know that some should be changed. For example, 'const ref' 
parameters may be better than 'in' parameters in many contexts.

 and I consider myself new to D, so please address any of my
 misconceptions. I'm finding that on the surface D sounds much simpler
 than it is, but if I can get a good set of guidelines it should all work
 out.

 Thanks,
 Dan

 (COW means Copy on Write)

| convention              | what it means/when to use                 |
|-------------------------+-------------------------------------------|
| V v                     | V is primitive, dynamic arr, assoc array, |
|                         | COW, or known copy cheap (< 16 bytes)     |

I don't know how practical it is but it would be nice if the price of 
copying an object could be considered by the compiler, not by the 
programmer.

According to D's philosophy structs don't have identities. If I pass a 
struct by-value, the compiler should pick the fastest method.

| const(V) v              | same as (V v), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |

That's sensible. (In practice though, it is rarely done in C++. For 
example, if V is int and v is not intended to be modified, it is still 
passed in as 'V v'.)

| immutable(V) v          | No need - for ensuring no local changes   |
|                         | prefer 'const(V) v'                       |

That makes a difference whether V is a value type or not. (It is not 
clear whether you mean V is a value type.) Otherwise, e.g. 
immutable(char[]) v has a legitimate meaning: The function requires that 
the caller provides immutable data.
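
To make the difference concrete, here is a small sketch. The function names needsImmutable and justRead are invented for this example. It only shows which arguments each signature accepts.

```d
// Hypothetical function that demands immutable data from its caller.
size_t needsImmutable(immutable(char[]) v) { return v.length; }

// Hypothetical function that only promises not to modify v.
size_t justRead(const(char[]) v) { return v.length; }

void main()
{
    char[] buffer = "abc".dup;   // mutable
    string fixed = "xyz";        // string is immutable(char)[]

    // const accepts both mutable and immutable arguments.
    assert(justRead(buffer) == 3);
    assert(justRead(fixed) == 3);

    // immutable accepts only immutable data.
    assert(needsImmutable(fixed) == 3);
    static assert(!__traits(compiles, needsImmutable(buffer)));

    // A mutable array has to be copied explicitly first.
    assert(needsImmutable(buffer.idup) == 3);
}
```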

| ref V v                 | Use only when mutation of v is required   |

Agreed.

| ref const(V) v          | Indicate v will not be changed, accepts   |
|                         | {V, const(V), immutable(V)}               |

Agreed.
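
A minimal sketch of this row and the previous one, assuming a made-up struct V and two made-up functions, readOnly and bump:

```d
struct V { int value; }

// Promises not to mutate v; binds to mutable, const and immutable lvalues.
int readOnly(ref const(V) v) { return v.value; }

// Needs to mutate v; binds only to mutable lvalues.
void bump(ref V v) { ++v.value; }

void main()
{
    V m = V(1);
    const V c = V(2);
    immutable V i = V(3);

    assert(readOnly(m) == 1);
    assert(readOnly(c) == 2);
    assert(readOnly(i) == 3);

    bump(m);
    assert(m.value == 2);

    // const and immutable lvalues cannot bind to 'ref V'.
    static assert(!__traits(compiles, bump(c)));
    static assert(!__traits(compiles, bump(i)));
}
```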

| ref immutable(V) v      | No need - restrictive with no benefit     |
|                         | over 'ref const(V) v'                     |

It still has a different meaning: You must have an immutable V and I need 
a reference to it. It may be that the identity of the object is 
important and that the function would store a reference to it.

| V* v                    | Use only when mutation of v is required.  |
|                         | Prefer 'ref V v' unless null significant  |
|                         | or unsafe manipulations desired           |

Agreed.

Also, pointers may be needed especially when interfacing with C and C++ 
libraries, but again, the D function can still take 'ref' and pass the 
address of that ref to the C function.
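
For example, something along these lines. modf is the C standard library function as declared in core.stdc.math; the wrapper name splitNumber is invented for this sketch. It relies on taking the address of a 'ref' parameter, which is allowed in ordinary (non-@safe) code.

```d
import core.stdc.math : modf;

// D-facing wrapper: callers see 'ref', the C function sees a pointer.
double splitNumber(double value, ref double wholePart)
{
    return modf(value, &wholePart);   // address of the ref parameter
}

void main()
{
    double whole;
    immutable frac = splitNumber(3.25, whole);
    assert(whole == 3.0);
    assert(frac == 0.25);
}
```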

| const(V)* v             | Indicate v will not be changed,           |
|                         | accepts {V*, const(V)*, immutable(V)*}    |
|                         | still prefer ref unless null significant  |
|                         | or unsafe manipulations desired           |

Agreed.

| immutable(V)* v         | No need - restrictive with no benefit     |
|                         | over 'const(V)* v'                        |

Again, if the function demands immutable(V), which may be null, then it 
actually has some use.

| T t                     | T is primitive, dynamic array, or assoc   |
|                         | array (i.e. cheap/shallow copies). For    |
|                         | generic code no knowledge of COW or       |
|                         | cheapness so prefer 'ref T t'             |

I am not sure about that last guideline. I think we should simply type T 
and the compiler does its magic. I don't know how practical my hope is.

Besides, we don't know whether T is primitive or not. It can be 
anything. If T is int, 'ref T t' could actually be slower due to the 
pointer indirection due to ref.

| const(T) t              | same as (T t), pedantic - makes copy and  |
|                         | guarantees no mutation in function        |

Agreed.

| immutable(T) t          | No need - for ensuring no local changes   |
|                         | prefer 'const(T) t'                       |

Agreed.

| ref T t                 | Use only when mutation of t is required   |
|                         | prefer 'ref const(T) t' if mutation not   |
|                         | required                                  |

Agreed. Again though, about the last comment, maybe 'const(T) t' is 
better than 'ref const(T) t'.

| ref const(T) t          | Indicate t will not be changed, accepts   |
|                         | {T, const(T), immutable(T)} without copy  |

Agreed.

| ref immutable(T) t      | No need - restrictive with no benefit     |
|                         | over 'ref const(T) t'                     |

Again, there may be a use case.

| auto ref T t            | Use only when mutation of t required and  |
|                         | want support of by value for rvalues      |
|                         | (May be obviated in the long run)         |

I have to remind myself about that one again.

| auto ref const(T) t     | Indicate t will not be changed, accepts   |
|                         | [lr]value {T, const(T), immutable(T)}     |
|                         | (May be obviated in the long run)         |

I have to remind myself about that one again. :)

| auto ref immutable(T) t | No need - restrictive with no benefit     |
|                         | over 'auto ref const(T) t'                |

Same. :p

| T* t                    | Use only when mutation of t is required.  |
|                         | Prefer 'ref T t' unless null is           |
|                         | significant or dealing with unsafe code.  |

Agreed.

| const(T)* t             | Prefer 'ref const(T) t' unless            |
|                         | null is significant or dealing with       |
|                         | unsafe code                               |

Agreed.

| immutable(T)* t         | No need - restrictive with no benefit     |
|                         | over 'const(T)* t'                        |

To repeat myself: The function may require immutable data that may be null.

 *** Parameter Type Guidelines ***

 - Use pointers when null has specific intended meaning or the function
 wants unsafe code, otherwise prefer ref

 - Prefer const(T|V) to immutable(T|V) because const(T|V) is accepting of
 mutables and it ensures they are not mutated. immutable(T|V), on the
 other hand, only accepts immutables for types with aliasing and
 therefore makes the function less applicable. This eliminates 7 rows
 from consideration.

 - Always use const(T|V) when passing by ref if referred to instance is
 not mutated. For the non template case (i.e. V) not using const(V) means
 const and immutables can not be used as arguments. This unnecessarily
 reduces the application of the function. This is a debatable guideline
 for template types T, since the T being parameterized could be T =
 const(S), so it does not prevent the function from being called with
 const(S) or immutable(S). But the real problem with using just T instead
 of const(T) in the signature of a function that does not mutate t is the
 developer reading the signature has no way of knowing that T will not be
 mutated without compiling and seeing if it breaks. It is as if important
 user information is missing. So 'foo(T)(T t)' or 'foo(T)(ref T t)' may
 both accept 'const(S) s', but only if the compiled code does not mutate
 s. But without showing that guarantee to the compiler and developer with
 signature like 'foo(T)(const(T) t)' or 'foo(T)(ref const(T) t)' you can
 be setting yourself up for future problems. For example, if you go with
 'foo(T)(ref T t)', in the future you might (accidentally) add a mutating
 call on t. Then all existing code that passed in const(S) would break
 and if you test with only mutables you might not see the errors. An
 example from phobos that violates this is formatValue when passing in a
 struct. Even though the argument is not modified (why would it be) the
 signature has 'auto ref T val' instead of 'auto ref const(T) val'.

 - Prefer 'ref' to by value on all template parameters that are not
 primitives, dynamic arrays or associative arrays - since there is no
 knowledge of how expensive the copy will be.

I still think that the compiler should help me with that. Similar to how 
it applies automatic move semantics to value types. (Supposedly... I 
don't know how successful it is.)

 (this is a guideline that
 is violated by SortedRange.(lowerBound, upperBound, trisect)).

 - When to use 'auto ref' template parameters: 'auto ref T t' as a
 parameter says - make one or two functions with different signatures and
 same body based on how it is called. If called with rvalue use 'T t', if
 called with lvalue use 'ref T t'. From this thread:
 http://forum.dlang.org/thread/4F84D6DD.5090405 digitalmars.com?page=1

I have to read that thread again, this time more carefully. :)

 it
 sounds like this will no longer be necessary since in the future a
 single signature of 'ref T t' will support both lvalue and rvalue.

Sounds great.

 So,
 for now it is best to start with 'ref T' instead of 'auto ref T' unless
 there is really an interim need for support of passing literals/rvalues
 into the function. One downside to 'auto ref' is it has the power to
 combinatorially increase the number of instantiations of each function.
 The upside is it allows rvalues to be passed in until the ultimate
 solution is implemented. In generic code, if you don't mind requiring
 lvalues of users, don't bother with auto parameters at all and stick
 with 'ref'.

Not much experience with that one. I hope others will chime in... :)

Ali

Dec 17 2012

"Dan" <dbdavidson yahoo.com> writes:

On Tuesday, 18 December 2012 at 06:34:55 UTC, Ali Çehreli wrote:
 I don't think this is well known at all. :) I have thought 
 about these myself and came up with some guidelines at 
 http://ddili.org/ders/d.en

Thanks - I will study it. I see that you have covered also in, 
out, inout, lazy, scope, and shared, so that should keep me busy 
for a while.

 I don't know how practical it is but it would be nice if the 
 price of copying an object could be considered by the compiler, 
 not by the programmer.

I agree - would be nice if compiler could do it but if it tried 
some would just not be happy about the choices, no matter what.

 According to D's philosophy structs don't have identities. If I 
 pass a struct by-value, the compiler should pick the fastest 
 method.

Even if there is a postblit? Maybe that would work, but say your 
object were a reference counting type. If the compiler decided to 
pass by ref sneakily for performance gain when you think it is by 
value that might be a problem. Maybe not, though, as long as you 
know how it works. I have seen that literal structs passed to a 
function will not call the postblit - but Johnathan says this was 
a bug in the way the compiler classifies literals.
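
A small sketch of why the distinction is observable when there is a postblit. The struct Counted and both functions are made up for illustration. It only covers the lvalue case, since the rvalue/literal behavior is the part in question.

```d
struct Counted
{
    static int copies;          // how many times the postblit has run
    int payload;

    this(this) { ++copies; }    // postblit: runs on every copy
}

void byValue(Counted c) {}
void byConstRef(ref const(Counted) c) {}

void main()
{
    auto c = Counted(42);

    byConstRef(c);
    assert(Counted.copies == 0);   // no copy was made

    byValue(c);
    assert(Counted.copies == 1);   // the lvalue was copied, postblit ran

    assert(c.payload == 42);       // c is still in use after both calls
}
```

With a reference-counting type, the postblit is where the count would be incremented, so silently turning the second call into the first would change what the program does.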

 That's sensible. (In practice though, it is rarely done in C++. 
 For example, if V is int and v is not intended to be modified, 
 it is still passed in as 'V v'.)

Absolutely. I read somewhere it was pedantic to do such things. 
Then I read some other articles that touted the benefit, even on 
an int, because the reader of (void foo(const int x) {...} ) 
knows x will/should not change, so it has clearer intentions for 
future maintainers.

 That makes a difference whether V is a value type or not. (It 
 is not clear whether you mean V is a value type.) Otherwise, 
 e.g. immutable(char[]) v has a legitimate meaning: The function 
 requires that the caller provides immutable data.

When is 'immutable(char[]) v' preferable to 'const(char[]) v'? If 
you select 'const(char[]) v' instead, your function will not 
mutate v and if it is generally a useful function it will even 
accept 'char[]' that *is* mutable. I agree with the meaning you 
suggest, but under what circumstances is it important to a 
function to know that v is immutable as opposed to simply const?

 | ref immutable(V) v | No need - restrictive with no benefit|
 |                    | over 'ref const(V) v'                |

 It still has a different meaning: You must have an immutable V 
 and I need a reference to it. It may be that the identity of 
 the object is important and that the function would store a 
 reference to it.

This may be a use-case for it. You want to store a reference to v 
and save it for later - so immutable is preferred over const. I 
may be mistaken but I thought the thread on 'rvalue references' 
talks about taking away the rights to take the address of any ref 
parameter: http://forum.dlang.org/post/4F863629.6000407 erdani.com


 | V* v      | Use only when mutation of v is required.  |
 |           | Prefer 'ref V v' unless null significant  |
 |           | or unsafe manipulations desired           |

 Agreed.

 Also, pointers may be needed especially when interfacing with C 
 and C++ libraries, but again, the D function can still take 
 'ref' and pass the address of that ref to the C function.

By 'unsafe manipulations' I meant things such as low level memory 
management, interfacing with C and such. It may be that in the 
future you will not be able to take the address of any 'ref' 
parameter (see previous link). So, if you know you are going to 
interface with C or do pointer work and other non-safe code, it 
may be best to just take 'V* v'.

 Again, if the function demands immutable(V), which may be null, 
 then it actually has some use.

I agree - I just don't know yet when a function would demand 
'immutable(V)' over 'const(V)'.

 | T t     | T is primitive, dynamic array, or  assoc   |
 |         | array (i.e. cheap/shallow copies). For     |
 |         | generic code no knowledge of COW or        |
 |         | cheapness so prefer 'ref T t'              |

 I am not sure about that last guideline. I think we should 
 simply type T and the compiler does its magic. I don't know how 
 practical my hope is.

 Besides, we don't know whether T is primitive or not. It can be 
 anything. If T is int, 'ref T t' could actually be slower due 
 to the pointer indirection due to ref.

Agreed. In a separate thread 
http://forum.dlang.org/thread/opufykfxwkkjchqcwgrg forum.dlang.org 
I included some timings of passing a struct as 'in S', 'in ref 
S', and 'const ref S'. The very small sizes, matching up to sizes 
of primitives, showed little if any benefit of by value over ref. 
Maybe the test/benchmark was flawed? But for big sizes, the by 
reference clearly won by a large margin. The problem with 
template code is you don't have any knowledge and the cost of 'by 
value' is unbounded, whereas difference between 'int t' and 'ref 
const(int) t' might be small. For instance, I don't like that 
SortedRange.lowerBound is creating many copies of the input 
object while doing its binary search on the data. Well, I suppose 
you could do:

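import std.traits : isScalarType, isDynamicArray, isAssociativeArray;

// isScalarType stands in for "primitive" (std.traits has no isPrimitive).
// Cheap-to-copy types: take by value.
void foo(T)(T t) if(isScalarType!T ||
                     isDynamicArray!T ||
                     isAssociativeArray!T) {
    // ...
}

// Everything else: take by const reference to avoid the copy.
void foo(T)(ref const(T) t) if(!isScalarType!T &&
                                !isDynamicArray!T &&
                                !isAssociativeArray!T) {
    // ...
}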

You are right that compiler magic could help here.
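
A self-contained sketch of how such a pair of overloads gets selected. The names how, cheapToCopy and Big are invented for this example, and std.traits.isScalarType is used as an approximation of "primitive":

```d
import std.traits : isScalarType, isDynamicArray, isAssociativeArray;

// Assumption: "cheap to copy" means scalar, dynamic array or associative array.
enum cheapToCopy(T) = isScalarType!T || isDynamicArray!T || isAssociativeArray!T;

// Chosen for cheap types: the argument is copied.
string how(T)(T t) if (cheapToCopy!T) { return "by value"; }

// Chosen for everything else: the argument is bound by const reference.
string how(T)(ref const(T) t) if (!cheapToCopy!T) { return "by const ref"; }

struct Big { int[64] data; }

void main()
{
    Big big;
    int[] arr = [1, 2, 3];

    assert(how(1) == "by value");
    assert(how(arr) == "by value");
    assert(how(big) == "by const ref");
}
```

One limitation of this approach is that the 'ref const(T)' overload still rejects rvalues, so a call like how(Big()) does not compile.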

 I still think that the compiler should help me with that. 
 Similar to how it applies automatic move semantics to value 
 types. (Supposedly... I don't know how successful it is.)

Agreed.

Thanks,
Dan

Dec 18 2012

Ali Çehreli <acehreli yahoo.com> writes:

On 12/18/2012 04:51 AM, Dan wrote:
 On Tuesday, 18 December 2012 at 06:34:55 UTC, Ali Çehreli wrote:
 I don't think this is well known at all. :) I have thought about these
 myself and came up with some guidelines at http://ddili.org/ders/d.en

 Thanks - I will study it. I see that you have covered also in, out,
 inout, lazy, scope, and shared, so that should keep me busy for a while.

For convenience, here are the chapters and guidelines that are relevant:

1) Immutability:

   http://ddili.org/ders/d.en/const_and_immutable.html

Quoting:

* As a general rule, prefer immutable variables over mutable
   ones.

* Define constant values as enum if their values can be
   calculated at compile time. For example, the constant value of
   seconds per minute can be an enum:

         enum int secondsPerMinute = 60;

* There is no need to specify the type explicitly if it can be
   inferred from the right hand side:

         enum secondsPerMinute = 60;

* Consider the hidden cost of enum arrays and enum associative
   arrays. Define them as immutable variables if the arrays are
   large and they are used more than once in the program.  Specify
   variables as immutable if their values will never change but
   cannot be known at compile time. Again, the type can be
   inferred:

         immutable guess = read_int("What is your guess");

* If a function does not modify a parameter, specify that
   parameter as const. This would allow both mutable and immutable
   variables to be passed as arguments:

     void foo(const char[] s)
     {
         // ...
     }

     void main()
     {
         char[] mutableString;
         string immutableString;

         foo(mutableString);      // ← compiles
         foo(immutableString);    // ← compiles
     }

* Following from the previous guideline, consider that const
   parameters cannot be passed to functions taking immutable. See
   the section titled "Should a parameter be const or immutable?"
   above.

* If the function modifies a parameter, leave that parameter as
   mutable (const or immutable would not allow modifications
   anyway):

     import std.stdio;

     void reverse(dchar[] s)
     {
         foreach (i; 0 .. s.length / 2) {
             immutable temp = s[i];
             s[i] = s[$ - 1 - i];
             s[$ - 1 - i] = temp;
         }
     }

     void main()
     {
         dchar[] salutation = "hello"d.dup;
         reverse(salutation);
         writeln(salutation);
     }

     The output:

     olleh


2) const ref Parameters and const Member Functions:

   http://ddili.org/ders/d.en/const_member_functions.html

Quoting:

* To give the guarantee that a parameter is not modified by the
   function, mark that parameter as in, const, or const ref.

* Mark member functions that do not modify the object as const:

     struct TimeOfDay
     {
     // ...
         string toString() const
         {
             return format("%02s:%02s", hour, minute);
         }
     }

  This would make the struct (or class) more useful by removing an
  unnecessary limitation. The examples in the rest of the book
  will observe this guideline.


3) Constructor and Other Special Functions:

   http://ddili.org/ders/d.en/special_functions.html

Quoting:

Immutability of constructor parameters

   In the Immutability chapter we have seen that it is not easy to
   decide whether parameters of reference types should be defined
   as const or immutable. Although the same considerations apply
   for constructor parameters as well, immutable is usually a
   better choice for constructor parameters.

   The reason is, it is common to assign the parameters to members
   to be used at a later time. When a parameter is not immutable,
   there is no guarantee that the original variable will not
   change by the time the member gets used.
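
A short sketch of that risk, with two made-up structs (Log and RiskyLog are illustrative names, not from the book):

```d
struct Log
{
    string fileName;   // immutable(char)[]: the characters can never change

    this(string fileName) { this.fileName = fileName; }
}

struct RiskyLog
{
    const(char)[] fileName;   // may alias a buffer the caller can still modify

    this(const(char)[] fileName) { this.fileName = fileName; }
}

void main()
{
    char[] buffer = "app.log".dup;

    auto safe = Log(buffer.idup);     // caller must hand over immutable data
    auto risky = RiskyLog(buffer);    // accepts the mutable buffer as-is

    buffer[0] = 'X';                  // caller changes the original later

    assert(safe.fileName == "app.log");    // unaffected
    assert(risky.fileName == "Xpp.log");   // changed under the object
}
```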

 I don't know how practical it is but it would be nice if the price of
 copying an object could be considered by the compiler, not by the
 programmer.

 I agree - would be nice if compiler could do it but if it tried some
 would just not be happy about the choices, no matter what.

 According to D's philosophy structs don't have identities. If I pass a
 struct by-value, the compiler should pick the fastest method.

 Even if there is a postblit? Maybe that would work, but say your object
 were a reference counting type. If the compiler decided to pass by ref
 sneakily for performance gain when you think it is by value that might
 be a problem. Maybe not, though, as long as you know how it works. I
 have seen that literal structs passed to a function will not call the
 postblit - but Johnathan says this was a bug in the way the compiler
 classifies literals.

I am also keeping in mind that struct objects are supposed to be treated 
as simple values without identities:

   http://dlang.org/struct.html

Quoting:

   A struct is defined to not have an identity; that is, the
   implementation is free to make bit copies of the struct as
   convenient.

 That's sensible. (In practice though, it is rarely done in C++. For
 example, if V is int and v is not intended to be modified, it is still
 passed in as 'V v'.)

 Absolutely. I read somewhere it was pedantic to do such things. Then I
 read some other articles that touted the benefit, even on an int,
 because the reader of (void foo(const int x) {...} ) knows x will/should
 not change, so it has clearer intentions for future maintainers.

Yeah. In C++, it is funny that all of my local variables are const as 
much as possible, but all of the by-value parameters are left non-const. 
I think part of the reason is that top-level const is seen as leaking an 
implementation detail into the signature. It also has the potential to 
confuse newer users.

 That makes a difference whether V is a value type or not. (It is not
 clear whether you mean V is a value type.) Otherwise, e.g.
 immutable(char[]) v has a legitimate meaning: The function requires
 that the caller provides immutable data.

 When is 'immutable(char[]) v' preferable to 'const(char[]) v'? If you
 select 'const(char[]) v' instead, your function will not mutate v and if
 it is generally a useful function it will even accept 'char[]' that *is*
 mutable. I agree with the meaning you suggest, but under what
 circumstances is it important to a function to know that v is immutable
 as opposed to simply const?

Yes, const(char)[] is more welcoming as you state. On the other hand, 
immutable is a requirement on the user: The function demands immutable 
data. This may be so if that string should be used later unchanged. 
Imagine a constructor that takes the file name as 'string' (i.e. 
immutable(char)[]). Then the object is assured that the file name can be 
used later and that it will be the same as when the object was constructed.

Assuming that the object (or a function) needs the string to not change 
ever, let's enumerate the cases:

If the function signature is const(char)[], the function must make an 
.idup of it because it cannot rely on the user not changing it.

If the function signature is immutable(char)[], then the function is 
leaking out an implementation detail: It is communicating the fact to 
the user, saying "I need an immutable string, if you have one, great; if 
not, *you* make an immutable copy to give me." By that analysis, I see 
'string' parameters as an optimization: Yes, immutable data is 
needed. If the user already has it, the immutable copy is elided.
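
The two cases above could be sketched as follows. The struct names CopiesName and DemandsImmutable are hypothetical and only serve to contrast the two signatures:

```d
// Case 1: accepts any character data, so it must copy to be safe.
struct CopiesName
{
    string name;

    this(const(char)[] fileName)
    {
        name = fileName.idup;   // copies even if the caller had a string
    }
}

// Case 2: demands immutable data, so it can store the slice as is.
struct DemandsImmutable
{
    string name;

    this(string fileName)
    {
        name = fileName;        // no copy: the data can never change
    }
}

void main()
{
    char[] buffer = "log.txt".dup;

    auto a = CopiesName(buffer);
    auto b = DemandsImmutable(buffer.idup);  // the *caller* makes the copy

    buffer[0] = 'b';                         // later mutation by the caller
    assert(a.name == "log.txt");
    assert(b.name == "log.txt");

    string literal = "data.bin";
    auto c = DemandsImmutable(literal);
    assert(c.name.ptr is literal.ptr);       // same memory, copy elided
}
```

Either way the stored name is safe from later changes; the difference is only who pays for the copy and whether it can be skipped.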

A solution that I have for the above is to make the function a template, 
and use a 'static if' to decide whether the object was mutable, and make 
an immutable copy if needed:

import std.stdio;
import std.conv;

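// 'auto ref': returns by reference when param is handed back unchanged and
// by value when a fresh immutable copy is made.
auto ref immutableOf(T)(ref T param)
{
     static if (is(typeof(param[0]) == immutable)) {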
         return param;

     } else {
         writeln("Duplicating mutable " ~ T.stringof);
         return to!(immutable(T))(param);
     }
}

void foo(T)(T s)
{
     immutable imm_s = immutableOf(s);
     writefln("s.ptr: %s, imm_s.ptr: %s", s.ptr, imm_s.ptr);
}

void main()
{
     char[] m = "hello".dup;
     immutable(char)[] s = "world";

     foo(m);
     foo(s);
}

The output shows that an immutable copy is made only when the user's data 
was mutable to begin with:

Duplicating mutable char[]
s.ptr: 7F6E216E8FD0, imm_s.ptr: 7F6E216E8FC0
s.ptr: 482240, imm_s.ptr: 482240

The above works but obviously is very cumbersome.
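
For character data specifically, the same idea can perhaps be written more compactly. The following is only a sketch under the assumption that the parameter is some kind of char slice; toImmutable is a hypothetical name, not a library function:

```d
// Hypothetical helper: returns an immutable string for any char slice,
// copying only when the argument is not already immutable.
string toImmutable(T)(T s)
    if (is(T : const(char)[]))
{
    static if (is(T : string))
        return s;           // already immutable: hand it back, no copy
    else
        return s[].idup;    // mutable or const: make one immutable copy
}

void main()
{
    char[] m = "hello".dup;
    string s = "world";

    auto fromMutable = toImmutable(m);
    auto fromString = toImmutable(s);

    assert(fromMutable == "hello");
    assert(fromMutable.ptr !is m.ptr);   // a copy was made
    assert(fromString.ptr is s.ptr);     // no copy was made
}
```

The caller is still free to pass either kind of data, and the copy happens only in the instantiation that needs it.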

There is a similar analysis for return value types: Why should I ever 
return a string from a function that produces one? Why restrict my 
users? I should return char[] so that they can further modify it if they 
want to.

Later I learned that mutable return values of pure functions can be 
implicitly converted to immutable; so yes, it makes more sense to 
return the mutable type:

char[] foo() pure     // <-- returns mutable
{
     char[] result;
     return result;
}

void main()
{
     char[] m = foo();  // <-- works
     string s = foo();  // <-- works
}

 | ref immutable(V) v | No need - restrictive with no benefit|
 | | over 'ref const(V) v' |

 It still has a different meaning: You must have an immutable V and I
 need a reference to it. It may be that the identity of the object is
 important and that the function would store a reference to it.

 This may be a use-case for it. You want to store a reference to v and
 save it for later - so immutable is preferred over const. I may be
 mistaken but I thought the thread on 'rvalue references' talks about
 taking away the rights to take the address of any ref parameter:
 http://forum.dlang.org/post/4F863629.6000407 erdani.com

I am behind with my reading. I remember that thread but I must study it 
again. :)

 Again, if the function demands immutable(V), which may be null, then
 it actually has some use.

 I agree - I just don't know yet when a function would demand
 'immutable(V)' over 'const(V)'.

It makes sense only for by-reference I think. At the risk of repeating 
myself, the function wants to store a file name to be used later.

 | T t | T is primitive, dynamic array, or assoc |
 | | array (i.e. cheap/shallow copies). For |
 | | generic code no knowledge of COW or |
 | | cheapness so prefer 'ref T t' |

 I am not sure about that last guideline. I think we should simply type
 T and the compiler does its magic. I don't know how practical my hope is.
 Besides, we don't know whether T is primitive or not. It can be
 anything. If T is int, 'ref T t' could actually be slower due to the
 pointer indirection due to ref.

 Agreed. In a separate thread
 http://forum.dlang.org/thread/opufykfxwkkjchqcwgrg forum.dlang.org I
 included some timings of passing a struct as 'in S', 'in ref S', and
 'const ref S'. The very small sizes, matching up to sizes of primitives,
 showed little if any benefit of by value over ref. Maybe the
 test/benchmark was flawed?

I must read that too. :)

I wonder whether the compiler applied optimizations and was able to keep 
lots of stuff in registers. If the code is complex enough perhaps then 
by-value may be faster. (?)

 But for big sizes, the by reference clearly
 won by a large margin. The problem with template code is you don't have
 any knowledge and the cost of 'by value' is unbounded, whereas
 difference between 'int t' and 'ref const(int) t' might be small.

Right. I hope others bring their experiences. We must understand these 
details. :)

I was fortunate enough to meet with deadalnix and Denis Koroskin last 
week. I told deadalnix about this very topic and how important it is to 
have a talk on this at DConf. He said he might be willing to give that 
talk. (Unless of course you make your submission for DConf 2013 first. ;) )

Ali

Dec 18 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Dec 18, 2012 at 01:51:31PM +0100, Dan wrote:
 On Tuesday, 18 December 2012 at 06:34:55 UTC, Ali Çehreli wrote:

[...]
That makes a difference whether V is a value type or not. (It is
not clear whether you mean V is a value type.) Otherwise, e.g.
immutable(char[]) v has a legitimate meaning: The function
requires that the caller provides immutable data.

 
 When is 'immutable(char[]) v' preferable to 'const(char[]) v'? If
 you select 'const(char[]) v' instead, your function will not mutate
 v and if it is generally a useful function it will even accept
 'char[]' that *is* mutable. I agree with the meaning you suggest,
 but under what circumstances is it important to a function to know
 that v is immutable as opposed to simply const?

It's not just about whether the function mutates something or not.
Sometimes the function counts on the data not changing, ever. For
example, if you're implementing a library AA type, you'd want the key to
be immutable so that whatever hash value you computed for the bucket
will not suddenly become invalid just because the user changed it from
underneath you:

	// Problematic example
	struct AA {
		struct Bucket {
			const(char)[] key;
			uint  hash;
			const(char)[] value;

			Bucket* next;
		}
		Bucket*[] htable;

		void addEntry(const(char)[] key, const(char)[] value) {
			// N.B.: Bucket now stores a reference to key
			auto buck = new Bucket(key, hashof(key), value);

			// The validity of this depends on the
			// referenced key not changing, ever.
			htable[buck.hash % htable.length] = buck;
		}

		const(char)[] opIndex(const(char)[] key) {
			auto hash = hashof(key);
			auto buck = htable[hash % htable.length];
			while (buck) {
				if (buck.key == key)
					return buck.value;
				buck = buck.next;
			}
			// throw out of bounds error here
		}
	}
	void main() {
		AA aa;
		char[] myKey = "abc".dup;	// .dup: literals are immutable

		// OK: char[] implicitly converts to const(char)[].
		aa.addEntry(myKey, "some value");

		myKey[0] = 'c';	// <--- oops! now the entry's Bucket is wrong!

		auto v = aa["abc"];	// this will throw, 'cos the
					// right slot is found but the
					// entry's .key value has
					// changed, so it won't match

		auto u = aa["cbc"];	// in all likelihood, this will
					// also throw because the hash
					// of "cbc" is unlikely to be
					// equal to the hash of "abc" so
					// we won't find the right slot
	}

In this case, the key passed to .addEntry *must* be immutable. That's
the only way to guarantee that the AA's internal structures won't get
invalidated by outside code.
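
For contrast with the problematic example above, here is a rough sketch of the same structure with a string key. It is not a real AA implementation; simpleHash, the fixed table size of 16, and the plain Exception are assumptions made to keep it short and self-contained:

```d
// Hypothetical stand-in for a real hash function.
ulong simpleHash(const(char)[] s)
{
    ulong h = 5381;
    foreach (c; s)
        h = h * 33 + c;
    return h;
}

struct AA
{
    static struct Bucket
    {
        string key;     // immutable contents: cannot change after insertion
        ulong hash;
        string value;
        Bucket* next;
    }

    Bucket*[] htable;

    // The key *must* be immutable because a reference to it is stored.
    void addEntry(string key, string value)
    {
        if (htable.length == 0)
            htable = new Bucket*[](16);
        immutable h = simpleHash(key);
        immutable slot = cast(size_t)(h % htable.length);
        htable[slot] = new Bucket(key, h, value, htable[slot]);
    }

    // Lookup stores nothing, so const(char)[] is enough here.
    string opIndex(const(char)[] key)
    {
        if (htable.length != 0)
        {
            immutable slot = cast(size_t)(simpleHash(key) % htable.length);
            for (auto b = htable[slot]; b !is null; b = b.next)
                if (b.key == key)
                    return b.value;
        }
        throw new Exception("key not found");
    }
}

void main()
{
    AA aa;
    char[] myKey = "abc".dup;

    // aa.addEntry(myKey, "some value");   // rejected: char[] is not string
    aa.addEntry(myKey.idup, "some value");  // caller supplies an immutable copy

    myKey[0] = 'c';                         // no longer affects the table
    assert(aa["abc"] == "some value");
}
```

Note the asymmetry: addEntry keeps the key and therefore asks for string, while opIndex only reads its argument and can stay with const(char)[].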


T

-- 
Skill without imagination is craftsmanship and gives us many useful objects
such as wickerwork picnic baskets.  Imagination without skill gives us modern
art. -- Tom Stoppard

Dec 18 2012

"Dan" <dbdavidson yahoo.com> writes:

On Tuesday, 18 December 2012 at 18:08:18 UTC, H. S. Teoh wrote:
 It's not just about whether the function mutates something or not.
 Sometimes the function counts on the data not changing, ever. For
 example, if you're implementing a library AA type, you'd want the key to
 be immutable so that whatever hash value you computed for the bucket
 will not suddenly become invalid just because the user changed it from
 underneath you:

[snip]
 In this case, the key passed to .addEntry *must* be immutable. That's
 the only way to guarantee that the AA's internal structures won't get
 invalidated by outside code.

Thanks! You and Ali have presented good examples where 'ref 
immutable(T|V) t|v' trumps 'ref const(T|V) t|v' and if I 
understand them correctly it is for whenever the instance method 
(in the case of a member function) will hold onto the argument 
for later use. I'll refine my selection process accordingly.

Thanks,
Dan

Dec 18 2012
