www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 1596] New: op*Assign should return void

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596

           Summary: op*Assign should return void
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: andrei metalanguage.com


D's assignment operators opAddAssign, opMulAssign etc. have taken part of C++'s
behavior in the area. In order to behave similarly to a primitive type, a
user-defined operator typically returns *this as an rvalue (in C++ they'd
usually return an lvalue).

There are a few problems with this approach:

1. Extra busywork done for no good reason. The code:

a += b;

translates to:

typeof(a) __unused = void;
a.opAddAssign(b, __unused);

So there is one extra memcpy performed for no reason (the copy is done on the
caller site, which has no idea whether the result of the operation will be used
or not). This problem will be exacerbated when copy construction semantics will
be introduced.

2. Extra work for the programmer, and relying on convention instead of language
rules.

Experience with C++ shows that user-defined implementations of assignment
operators always return *this, and that there is absolutely no reason to return
anything else. 

Fix:

The rule belongs to the language. Require all op*Assign to return void. If a
value is required on the caller side, pass the left-hand side of the operation.


-- 
Oct 19 2007
next sibling parent Don Clugston <dac nospam.com.au> writes:
d-bugmail puremagic.com wrote:
 http://d.puremagic.com/issues/show_bug.cgi?id=1596
 
            Summary: op*Assign should return void
            Product: D
            Version: unspecified
           Platform: All
         OS/Version: All
             Status: NEW
           Severity: enhancement
           Priority: P2
          Component: DMD
         AssignedTo: bugzilla digitalmars.com
         ReportedBy: andrei metalanguage.com
 
 
 D's assignment operators opAddAssign, opMulAssign etc. have taken part of C++'s
 behavior in the area. In order to behave similarly to a primitive type, a
 user-defined operator typically returns *this as an rvalue (in C++ they'd
 usually return an lvalue).
 
 There are a few problems with this approach:
 
 1. Extra busywork done for no good reason. The code:
 
 a += b;
 
 translates to:
 
 typeof(a) __unused = void;
 a.opAddAssign(b, __unused);
 
 So there is one extra memcpy performed for no reason (the copy is done on the
 caller site, which has no idea whether the result of the operation will be used
 or not). This problem will be exacerbated when copy construction semantics will
 be introduced.
 
 2. Extra work for the programmer, and relying on convention instead of language
 rules.
 
 Experience with C++ shows that user-defined implementations of assignment
 operators always return *this, and that there is absolutely no reason to return
 anything else.
 
 Fix:
 
 The rule belongs to the language. Require all op*Assign to return void. If a
 value is required on the caller side, pass the left-hand side of the operation.
I think this is right. But it's worth mentioning that this would break some of my code, since I'm using the return type of opXXAssign() operations in expression templates. However, since D expression templates don't work for comparison operators (because opCmp() returns an int), changing this is likely to force us to a better expression template solution.
Oct 23 2007
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596


yebblies <yebblies gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yebblies gmail.com



Is this still valid?
I'd imaging opXXXAssign would return ref typeof(this) these days.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 04 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com



PDT ---
It's valid in that Andrei seems to be requesting that op*Assign be _required_
to return ref typeof(this), whereas currently, you can make it return whatever
you want.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 04 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596





 It's valid in that Andrei seems to be requesting that op*Assign be _required_
 to return ref typeof(this), whereas currently, you can make it return whatever
 you want.
Well, reason 1 doesn't apply to returning ref typeof(this), and one valid use I can think of for returning something else would be expression templates. As it's four years old, I'd close this if it was my report, but I'll leave it to Andrei to make a decision on this. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 05 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com



05:23:14 PDT ---
D should not be forcing behavior on normal functions, which operators are. 
They are special in that if defined in a specific way, they will be used when
using operators, but other than that, they are ordinary functions.  If you want
to force behavior, make the operators keywords (like C++).  Otherwise, you have
special rules for ordinary symbols.  Lowering is supposed to work by simply
rewriting code, not by requiring certain signatures.

This relates to Andrei's later bug regarding opEquals - the compiler currently
requires a certain signature for opEquals, and it simply gets in the way of
writing efficient or usable code.

Besides, I think Andrei's original point is that it was impossible to do the
most efficient thing (return the lvalue of this), which is no longer the case.
So this was an attempt to solve that problem using a language rule instead of
adding ref returns.

I think this bug is very obsolete.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 05 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596


Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug yahoo.com.au




 D should not be forcing behavior on normal functions, which operators are. 
 They are special in that if defined in a specific way, they will be used when
 using operators, but other than that, they are ordinary functions.  If you want
 to force behavior, make the operators keywords (like C++).  Otherwise, you have
 special rules for ordinary symbols.  Lowering is supposed to work by simply
 rewriting code, not by requiring certain signatures.
??? The existing language doesn't obey these rules, which you've apparently just made up.
 This relates to Andrei's later bug regarding opEquals - the compiler currently
 requires a certain signature for opEquals, and it simply gets in the way of
 writing efficient or usable code.
Actually the problem with opEquals is that there is NO option which works properly. Giving you more bad options wouldn't help.
 Besides, I think Andrei's original point is that it was impossible to do the
 most efficient thing (return the lvalue of this), which is no longer the case.
 So this was an attempt to solve that problem using a language rule instead of
 adding ref returns.
No. The point is this: There is only ONE correct choice for the return value: it must be some form of 'return this'. The compiler is better equipped to work out the most efficient way to do it, than the programmer is. Giving freedom to choose the return value does only two things: (1) it gives you an opportunity to make a mistake. (2) it adds extra noise in the code. The one legitimate use of a return value is in expression templates, but as already mentioned, they don't work with opCmp. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 05 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596




09:02:07 PDT ---


 D should not be forcing behavior on normal functions, which operators are. 
 They are special in that if defined in a specific way, they will be used when
 using operators, but other than that, they are ordinary functions.  If you want
 to force behavior, make the operators keywords (like C++).  Otherwise, you have
 special rules for ordinary symbols.  Lowering is supposed to work by simply
 rewriting code, not by requiring certain signatures.
??? The existing language doesn't obey these rules, which you've apparently just made up.
What rules? There should be no "rules" of what an ordinary function should be or should not return. I believe the current compiler works this way, at least in terms of opAddAssign. When the compiler sees a += b, it should rewrite as a.opAddAssign(b) (or whatever the equivalent template should be). Why does it matter what opAddAssign returns? If it returns a C, who cares? There is nothing special about opAddAssign except that it's the vehicle to implement += via lowering.
 This relates to Andrei's later bug regarding opEquals - the compiler currently
 requires a certain signature for opEquals, and it simply gets in the way of
 writing efficient or usable code.
Actually the problem with opEquals is that there is NO option which works properly. Giving you more bad options wouldn't help.
Of course there is: struct T { bool opEquals(T other) {...} } The issue with opEquals (and this is different from opAddAssign) is the compiler will use opEquals to populate the xequals function pointer in the TypeInfo, if it is the right signature. The error in the current compiler is that opEquals does not even compile unless you use a useless signature. This "feature" was added to fix something else, but I think it was an error to add it. I should be able to do: bool opEquals(int other) if I so wish, and it just won't populate the xequals function. It's a normal symbol, and should be a normal function. opEquals is not a special token, it's a normal symbol.
 Besides, I think Andrei's original point is that it was impossible to do the
 most efficient thing (return the lvalue of this), which is no longer the case.
 So this was an attempt to solve that problem using a language rule instead of
 adding ref returns.
No. The point is this: There is only ONE correct choice for the return value: it must be some form of 'return this'. The compiler is better equipped to work out the most efficient way to do it, than the programmer is.
opAddAssign probably should return this. But it could also return void. Or it could return int. It shouldn't matter to the compiler.
 Giving freedom to choose the return value does only two things: 
 (1) it gives you an opportunity to make a mistake. 
 (2) it adds extra noise in the code.
(3) it allows some future use we do not foresee, people can be inventive. (4) it makes the compiler simpler and more consistent. BTW, what is the extra noise in the code? The specification of the return value? I find this a weak argument. What if you want to return by ref or by value? Don't get me wrong, I think in most cases, it should return this by reference. But there could be cases where returning something else would be advantageous. For example, what if you have a struct parameterized on constancy, and you want to return a different const form? There are some crazy amazing things I've seen in phobos and other code that nobody thought of before, and some of it is based on unintuitive uses of operators. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 05 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596







 D should not be forcing behavior on normal functions, which operators are. 
 They are special in that if defined in a specific way, they will be used when
 using operators, but other than that, they are ordinary functions.  If you want
 to force behavior, make the operators keywords (like C++).  Otherwise, you have
 special rules for ordinary symbols.  Lowering is supposed to work by simply
 rewriting code, not by requiring certain signatures.
??? The existing language doesn't obey these rules, which you've apparently just made up.
What rules? There should be no "rules" of what an ordinary function should be or should not return. I believe the current compiler works this way, at least in terms of opAddAssign.
The rules you made up about lowering not involving signatures. Almost every case where the compiler recognizes particular names, it requires a particular signature. The existing opAddAssign is one of the few cases where the requirements are loose.
 When the compiler sees a += b, it should rewrite as a.opAddAssign(b) (or
 whatever the equivalent template should be).  Why does it matter what
 opAddAssign returns?  If it returns a C, who cares?  There is nothing special
 about opAddAssign except that it's the vehicle to implement += via lowering.
Basically, the more you can enforce semantics of operators, the more useful they are. In C++, the STL is based entirely on agreed semantics of operators.
 
 This relates to Andrei's later bug regarding opEquals - the compiler currently
 requires a certain signature for opEquals, and it simply gets in the way of
 writing efficient or usable code.
Actually the problem with opEquals is that there is NO option which works properly. Giving you more bad options wouldn't help.
Of course there is: struct T { bool opEquals(T other) {...} }
That doesn't work with const(T).
 Besides, I think Andrei's original point is that it was impossible to do the
 most efficient thing (return the lvalue of this), which is no longer the case.
 So this was an attempt to solve that problem using a language rule instead of
 adding ref returns.
No. The point is this: There is only ONE correct choice for the return value: it must be some form of 'return this'. The compiler is better equipped to work out the most efficient way to do it, than the programmer is.
opAddAssign probably should return this. But it could also return void. Or it could return int. It shouldn't matter to the compiler.
 Giving freedom to choose the return value does only two things: 
 (1) it gives you an opportunity to make a mistake. 
 (2) it adds extra noise in the code.
(3) it allows some future use we do not foresee, people can be inventive. (4) it makes the compiler simpler and more consistent. BTW, what is the extra noise in the code? The specification of the return value? I find this a weak argument. What if you want to return by ref or by value? Don't get me wrong, I think in most cases, it should return this by reference. But there could be cases where returning something else would be advantageous. For example, what if you have a struct parameterized on constancy, and you want to return a different const form? There are some crazy amazing things I've seen in phobos and other code that nobody thought of before, and some of it is based on unintuitive uses of operators.
(4) is untrue. (3) was discussed by Andrei. Although it sounds plausible, and it's the reason it existed in C++, experience in C++ has shown that this is NOT true. It has been thoroughly explored, and it is not a useful feature. So (3) isn't true either. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 05 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596


Stewart Gordon <smjg iname.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com




 Experience with C++ shows that user-defined implementations of 
 assignment operators always return *this, and that there is 
 absolutely no reason to return anything else.
What's C++ to do with anything? This is D. You can't just blindly apply principles learned from one language to another without understanding how the principles depend on characteristics of the language. In this instance, from what I can make out it is because C++ isn't garbage collected, so you're bound to want to implement such operations by modifying the object in-place rather than creating a new one. Arithmetical classes (which should be immutable, so that they behave as values) are an example of something that can benefit from the latter in a garbage collected language such as D, if only the small change I describe below is made.
 The rule belongs to the language. Require all op*Assign to return void. If a
 value is required on the caller side, pass the left-hand side of the operation.
I agree that the design of op*Assign is flawed, but disagree with this change. Here's how it should work IMO: - op*Assign (including plain opAssign) may return either void or typeof(this). - If it's void, the AssignExpression calls this op*Assign and then returns its lvalue. - If it's typeof(this), the AssignExpression calls this op*Assign, sets the lvalue to what op*Assign returns and then returns it. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 08 2011
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596




PDT ---
- If it's typeof(this), the AssignExpression calls this op*Assign, sets the
lvalue to what op*Assign returns and then returns it.
That would be a _bad_ idea. On some level, it defeats the purpose of op*Assign in the first place. There are cases where it is more efficient to do the operation and assign in one operation. Your suggestion basically relegates to op*Assign to being another opBinary except that it allows for the compiler to replace a op= b with a = a op b. Assignment _needs_ to be part of the implementation of op*Assign. That's part of the point. Yes, the return value of op*Assign should be the same as the value as the lvalue, but getting the return value and then doing the assignment makes op*Assign pretty much pointless. If that were the proper way to do it, we wouldn't have op*Assign in the first place. We'd just create op= from opBinary and opAssign. And that's _not_ what we want. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 08 2011
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=1596




I'm not sure you get the point.  It's to give the class designer the choice
between in-place modification and creating a new object as the implementation
of op=.

OK, so a op= b ought to be automatically equivalent to a = a op b in the
absence of an opOpAssign.  So the rule would be to define opOpAssign iff you
want it to modify in-place.  But this still doesn't cover cases where whether
the modification is in-place isn't a compile-time constant.  Look at how ~=
works for instance.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 08 2011