
digitalmars.D.learn - phobos by ref or by value

"Dan" <dbdavidson yahoo.com> writes:
Is there a general philosophy in Phobos on when to pass by value and
when to pass by reference? For instance, finding a slice with
lowerBound makes many copies of the target item, as well as copies of
items in the collection (see the code example below). This seems
unnecessary. Why not change functions like:

     auto lowerBound(...)(V value)

be:

     auto lowerBound(...)(ref V value)

or:

     auto lowerBound(...)(auto ref V value)

Is this one source of the desire for no postblits, that is, shallow
semantics on copy/assignment with additional logic for copy-on-write
semantics? If libraries in general are coded to make many copies of
parameters, it might be a big improvement not to have postblits. A
general-purpose library accessing ranges will not know the semantics
of V (deep or shallow), so why incur the cost of copies? Surely finding
a lowerBound on a range of V can be done with zero copies of elements?

Is there an established philosophy?

Thanks
Dan
-------
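   import std.datetime : DateTime;
   import std.stdio : writeln;

   struct S {
     DateTime date;
     double val;
     // Postblit: runs on every copy of S, so each copy is logged.
     this(this) { writeln("copying S ", &this, ' ', date, ',', val); }
   }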
---------------
   auto before(ref const(ValueType) vt) const {
     auto ass = assumeSorted!orderingPred(_history[]);
     writeln("Before lb");
     auto lb = ass.lowerBound(vt);
     writeln("After lb");
     return History!(V, orderingPred)(_history[0 .. lb.length]);
   }

---------------
Before lb
copying S 7FFF622CE110 2001-Nov-01 00:00:00,0
copying S 7FFF622CE090 2001-Nov-01 00:00:00,0
copying S 7FFF622CE020 2001-Jan-01 00:00:00,100
copying S 7FFF622CE030 2001-Nov-01 00:00:00,0
copying S 7FFF622CDF90 2001-Jan-01 00:00:00,100
copying S 7FFF622CDFA0 2001-Nov-01 00:00:00,0
copying S 7FFF622CE020 2002-Jan-01 00:00:00,200
copying S 7FFF622CE030 2001-Nov-01 00:00:00,0
copying S 7FFF622CDF90 2002-Jan-01 00:00:00,200
copying S 7FFF622CDFA0 2001-Nov-01 00:00:00,0
copying S 7FFF622CE020 2001-Jan-01 00:00:00,200
copying S 7FFF622CE030 2001-Nov-01 00:00:00,0
copying S 7FFF622CDF90 2001-Jan-01 00:00:00,200
copying S 7FFF622CDFA0 2001-Nov-01 00:00:00,0
After lb
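
A self-contained reduction of the report above, offered only as a sketch. The struct `Item`, its `copies` counter and the helper `countBefore` are hypothetical stand-ins for S and History, and an integer key replaces DateTime. How many postblit calls occur depends on the Phobos version, so the count is printed rather than asserted.

```d
import std.range : assumeSorted;
import std.stdio : writeln;

struct Item
{
    int key;
    double val;
    static int copies;          // hypothetical counter, not part of the original code
    this(this) { ++copies; }    // postblit: runs each time an Item is copied
}

// Number of elements ordered strictly before target.
size_t countBefore(Item[] data, Item target)
{
    auto sorted = assumeSorted!((a, b) => a.key < b.key)(data);
    return sorted.lowerBound(target).length;
}

void main()
{
    auto data = [Item(1, 100), Item(2, 200), Item(3, 300)];
    auto target = Item(2, 0);

    Item.copies = 0;
    immutable n = countBefore(data, target);
    assert(n == 1);             // only the element with key 1 precedes key 2
    writeln("postblit calls during the search: ", Item.copies);
}
```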
Dec 16 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, December 16, 2012 16:09:45 Dan wrote:
 Is there a general philosophy in D phobos on when to pass by value or
 reference?  For instance, to find a slice using lowerBound many copies
 of the target item, as well as copies of items in the collection are
 made (see code example below). This seems unnecessary - why not have
 functions like:
 
      auto lowerBound(...)(V value)
 
 be:
 
      auto lowerBound(...)(ref V value)
 
 or:
 
      auto lowerBound(...)(auto ref V value)
 
 Is this a source for desire for no postblits, shallow semantics on
 copy/assignment with additional logic for copy on write semantics. If
 libraries in general are coded to make many copies of parameters it
 might be a big improvement to not have postblits. A general purpose
 library accessing ranges will not know the semantics of V (deep or
 shallow), so why incur the cost of copies? Certainly finding a
 lowerBound on a range of V can be done with 0 copies of elements?
 
 Is there an established philosophy?

You _don't_ take ranges by ref unless you want to alter the original, which is almost never the case. Functions like popFrontN are the exception. And since you _are_ going to mutate the parameter (since ranges iterate via mutation), something like const ref would never make sense, even if it had C++'s semantics. I'm not sure if auto ref screams at you if you try to mutate the original, but if it doesn't, then you get problems when passing it lvalue ranges, because they'd be passed by ref and mutated, which you don't want. So, auto ref makes no sense either. You pretty much always pass ranges by value.

And a range which does a deep copy when it's copied is a fundamentally broken range anyway. It has the wrong semantics and won't function correctly with many range-based functions. Ranges are supposed to be a view into a range of values (possibly in a container), and copying the view shouldn't copy the actual elements. Otherwise, you'd be doing the equivalent of passing around a container by value, which is almost always a horrible idea.

As for types which aren't ranges, they're almost a non-issue in Phobos. Most functions in Phobos take either a range or a primitive type. There aren't very many user-defined types in Phobos which aren't ranges (e.g. the types in std.datetime), but those that aren't ranges are generally either small enough that trying to pass by const ref or auto ref doesn't buy you much (if anything), or they're classes, in which case it's a non-issue. And almost every generic function in Phobos takes a range.

So, functions in Phobos almost always take their arguments by value. They'll use ref when it's required for the semantics of what they're doing, but auto ref on function parameters is rare.

- Jonathan M Davis
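
A minimal sketch of the "range as a view" point, assuming a hypothetical element type `Elem` with a postblit counter and a hypothetical helper `countByValue`. It only illustrates that copying a slice copies the view rather than the elements, and that popFrontN changes the caller's range because it takes that range by ref.

```d
import std.range : popFrontN;

struct Elem
{
    int id;
    static int copies;          // hypothetical counter, for illustration only
    this(this) { ++copies; }    // postblit: runs whenever an Elem is copied
}

// Takes the range by value: only the view (pointer + length) is copied,
// so consuming r here does not affect the caller's slice.
size_t countByValue(Elem[] r)
{
    size_t n;
    for (; r.length != 0; r = r[1 .. $])
        ++n;
    return n;
}

void main()
{
    auto data = [Elem(1), Elem(2), Elem(3)];
    Elem.copies = 0;

    auto view = data;                 // copies the slice, not the elements
    assert(countByValue(view) == 3);
    assert(view.length == 3);         // the caller's view is untouched
    assert(Elem.copies == 0);         // no element postblit was run

    popFrontN(view, 2);               // takes its range by ref, so view changes
    assert(view.length == 1);
    assert(data.length == 3);         // the underlying array is unaffected
    assert(Elem.copies == 0);
}
```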
Dec 16 2012
"Dan" <dbdavidson yahoo.com> writes:
On Sunday, 16 December 2012 at 23:02:30 UTC, Jonathan M Davis wrote:
 You _don't_ take ranges by ref unless you want to alter the original,
 which is almost never the case. Functions like popFrontN are the
 exception. And since you _are_ going to mutate the parameter (since
 ranges iterate via mutation), something like const ref would never
 make sense, even if it had C++'s semantics. I'm not sure if auto ref
 screams at you if you try and mutate the original, but if it doesn't,
 then you get problems when passing it lvalue ranges, because they'd be
 being passed by ref and mutated, which you don't want. So, auto ref
 makes no sense either. You pretty much always pass ranges by value.
 And a range which does a deep copy when it's copied is a fundamentally
 broken range anyway. It has the wrong semantics and won't function
 correctly with many range-based functions. Ranges are supposed to be a
 view into a range of values (possibly in a container), and copying the
 view shouldn't copy the actual elements. Otherwise, you'd be doing the
 equivalent of passing around a container by value, which is almost
 always a horrible idea.

 As for types which aren't ranges, they're almost a non-issue in
 Phobos. Most functions in Phobos take either a range or a primitive
 type. There aren't very many user-defined types in Phobos which aren't
 ranges (e.g. the types in std.datetime), but those that aren't ranges
 are generally either small enough that trying to pass by const ref or
 auto ref doesn't buy you much (if anything), or they're classes, in
 which case, it's a non-issue. And almost every generic function in
 Phobos takes a range. So, functions in Phobos almost always take their
 arguments by value.
I assume you are talking about functions other than lowerBound, upperBound, trisect.
 They'll use ref when it's required for the semantics of what they're
 doing, but auto ref on function parameters is rare.
When would ref be required for semantics? I am asking this to learn the D way, so any guidelines are helpful. We have the language spec and TDPL. Maybe we need another book or three in the vein of Meyers' "50 Effective Ways".

Sorry, but I don't understand the focus on ranges. I know ranges are involved because lowerBound is a method on SortedRange. But I am asking why a member function of a range (i.e. lowerBound) takes its argument by value. I don't mind copies of ranges being made when needed, as I think they are "light copies" of pointers. But taking type V by value in lowerBound performs an unnecessary copy of an element of unknown size/complexity. The library cannot know the cost of that *and* it can be avoided (I think).

I thought ranges were a refinement of, or improvement on, a pair of iterators. So I have a range of items already existing in memory and I want to find all elements in the range less than some value of type V. I don't understand the choice of V as opposed to 'ref const(V)'. What this does is fire postblits again and again on a non-Phobos user-defined struct, and I think they are needless. *find* or *lower_bound* in C++, for example, take the element to be found as 'const &' so copies are not made. Why is that not done here?

If it is not an oversight, I have more to learn about how things work in D and therefore want a broader set of guidelines. I would think a guideline like: "In generic code always take generic types that are not known to be primitives or very small collections of pointers (like dynamic array, associative array) by reference, since you cannot know the cost of copying".

Usually the best place to learn the way of a language is studying its standard libraries, so that is what I am after: the whys of it.

Thanks
Dan
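
To make the comparison concrete, here is a minimal sketch that is not Phobos code. `S`, its `copies` counter, and the two hand-written binary searches `countLessByRef` and `countLessByValue` are hypothetical names used only to show where a postblit fires when the searched-for value is taken by value versus by 'ref const'.

```d
struct S
{
    int key;
    static int copies;          // hypothetical counter, for illustration only
    this(this) { ++copies; }    // postblit: runs on every copy of S
}

// Binary search taking the target by reference: no S is copied.
size_t countLessByRef(S[] sorted, ref const(S) value)
{
    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (sorted[mid].key < value.key)   // indexing yields a reference
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

// Same search taking the target by value: an lvalue argument is copied once.
size_t countLessByValue(S[] sorted, S value)
{
    return countLessByRef(sorted, value);
}

void main()
{
    auto data = [S(10), S(20), S(30)];
    auto target = S(25);
    S.copies = 0;

    assert(countLessByRef(data, target) == 2);
    assert(S.copies == 0);      // nothing copied

    assert(countLessByValue(data, target) == 2);
    assert(S.copies == 1);      // the by-value parameter cost one postblit
}
```

The trade-off is that the 'ref const' version cannot be called with an rvalue such as `S(25)` directly, which is part of why the question has no single answer in D.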
Dec 16 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Dan:

 Usually the best place to learn the way of a language is studying its
 standard libraries,
Then I suggest you not study std.random, because it currently contains known flaws regarding what you are saying.

Bye,
bearophile
Dec 16 2012
"Dan" <dbdavidson yahoo.com> writes:
On Monday, 17 December 2012 at 03:23:13 UTC, bearophile wrote:
 Then I suggest you not study std.random, because it currently
 contains known flaws regarding what you are saying.
Fine, thanks. But which would be recommended to study?
Dec 16 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, December 17, 2012 04:06:52 Dan wrote:
 They'll use ref when it's required for the semantics of what they're
 doing, but auto ref on function parameters is rare.

 When would ref be required for semantics? I am asking this to learn
 the D way - so any guidelines are helpful. We have language spec and
 TDPL. Maybe we need another book or three in the vein of Meyers "50
 Effective Ways".

ref is required when you want the argument you're passing in to be altered rather than a copy of it. That's the same as in C++.

Ranges in general don't do deep copies when they're passed around, for basically the same reasons that pointers don't. If you want to know more about ranges, this is probably the best resource at this point:

http://ddili.org/ders/d.en/ranges.html

There are probably plenty of cases in D where the equivalent of C++'s const& would be desirable, but D doesn't really have that at this point. The closest is auto ref, which only works with templated functions, and it doesn't prevent the argument from being mutated (though auto ref const would). Also, const in D is far more restrictive than it is in C++, so forcing const on function parameters can be highly restrictive and annoying. How to solve that is an ongoing debate, as emulating C++'s const& and having const ref take rvalues has been rejected.

So, in most cases, the issue is completely ignored at this point in Phobos. And since most functions in Phobos take either ranges or built-in types (where passing by value is not a problem), in most cases it's not an issue at all. Long term, it's something that should probably be addressed, but until the const ref situation is sorted out, it probably won't be.

Functions which take the element of a range rather than a range probably should do something to avoid unnecessary copies, and auto ref may be the solution to that at the moment, but it's not clear how that's going to be sorted out in the long run.

- Jonathan M Davis
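
A small sketch of the auto ref behaviour described above, under stated assumptions: `Item`, `boundByRef` and `bump` are hypothetical names, and the payload field exists only to make copies non-trivial. It uses `__traits(isRef, ...)` to report how the argument was bound.

```d
struct Item
{
    int key;
    int[32] payload;   // makes copies non-trivial; purely illustrative
}

// `auto ref` is only available on templated functions. Lvalues bind by
// reference and rvalues by value; __traits(isRef, ...) reports which.
// The const qualifier prevents the function from mutating the argument.
bool boundByRef(T)(auto ref const(T) value)
{
    return __traits(isRef, value);
}

// Without const, an lvalue argument can be mutated through `auto ref`.
void bump(T)(auto ref T value)
{
    ++value.key;
}

void main()
{
    auto item = Item(1);
    assert(boundByRef(item));        // lvalue: passed by reference
    assert(!boundByRef(Item(2)));    // rvalue: passed by value

    bump(item);
    assert(item.key == 2);           // the caller's object was changed
    bump(Item(5));                   // rvalues are accepted too
}
```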
Dec 16 2012