digitalmars.D - Overloading operators by operator symbol

Bill Baxter <wbaxter gmail.com> writes:
I'm not a big fan of magic operator method names.  Python has its 
__add__ etc methods, Lua has very similar, D has opAdd etc.
Personally I prefer C++'s way of just using the syntax itself.  I find 
it a lot easier to remember and it looks less "magical".

I started wondering if it might be possible to accomplish something like 
that using mixins.  Here's an example of what I've gotten to work so far:

import std.stdio;  // for writefln

class AClass
{
     // Look ma! I'm overloading operators by symbols!
     mixin Operator!("+", myPlus);
     mixin Operator!("+=", myPlusEq);
     mixin Operator!("-", myMinus);
     mixin Operator!("-=", myMinusEq);

     // the actual operator overload implementations
     int myPlus(int v){ return m_value + v; };
     int myMinus(int v){ return m_value - v; };
     int myPlusEq(int v){ return m_value += v; };
     int myMinusEq(int v){ return m_value -= v; };

     int m_value = 0;
}

void main()
{
     // example use
     AClass a = new AClass();
     a += 3;
     writefln("a is: ", a.m_value);
     writefln("a+5 is: ", a + 5);
     a -= 10;
     writefln("a-=10; a is now: ", a.m_value);
     writefln("a-5 is: ", a - 5);
}

// The guy who makes it happen
template Operator(char[] op, alias OpFn )
{
     // todo: actually derive these types from OpFn
     alias int RetType;
     alias int ArgType;
     static if(op=="+") {
         RetType opAdd(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="-") {
         RetType opSub(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="+=") {
         RetType opAddAssign(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="-=") {
         RetType opSubAssign(ArgType v) {
             return OpFn(v);
         }
     }
}

This is pretty simplistic and not very complete.  Ideally the syntax 
would look more like

     mixin Operator!("-",
        int(int v){ return m_value - v; }
     );

or best

     int Operator!("-")(int v){ return m_value - v; }

But I couldn't figure out any way to make those work. :-)
Can anyone do better?
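
For comparison, here is a rough sketch of the C++ spelling mentioned at the top of this post, where the operator symbol itself names the method. It is only an illustration: the class is a hypothetical C++ counterpart of AClass, and exampleUse() is a made-up helper that mirrors main() above. Returning int from += and -= follows the D code rather than the usual C++ convention of returning a reference.

```cpp
#include <cassert>

// Hypothetical C++ counterpart of the AClass above; names mirror the D code.
class AClass {
public:
    int m_value = 0;

    // Binary forms return the computed int, as myPlus/myMinus do above.
    int operator+(int v) const { return m_value + v; }
    int operator-(int v) const { return m_value - v; }

    // Compound forms update the object and return the new value.
    int operator+=(int v) { return m_value += v; }
    int operator-=(int v) { return m_value -= v; }
};

// Mirrors the sequence of operations in main() above.
inline int exampleUse() {
    AClass a;
    a += 3;                 // m_value is now 3
    assert(a + 5 == 8);
    a -= 10;                // m_value is now -7
    assert(a - 5 == -12);
    return a.m_value;
}
```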

--bb
Oct 28 2006
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Bill Baxter" <wbaxter gmail.com> wrote in message 
news:ehv8kj$1d4$1 digitaldaemon.com...

 class AClass
 {
     // Look ma! I'm overloading operators by symbols!
     mixin Operator!("+", myPlus);
     mixin Operator!("+=", myPlusEq);
     mixin Operator!("-", myMinus);
     mixin Operator!("-=", myMinusEq);

     // the actual operator overload implementations
     int myPlus(int v){ return m_value + v; };
     int myMinus(int v){ return m_value - v; };
     int myPlusEq(int v){ return m_value += v; };
     int myMinusEq(int v){ return m_value -= v; };

     int m_value = 0;
 }
Basically you've just replaced "opAdd" etc. with "myPlus" etc. ;) I know what you're getting at but..
 This is pretty simplistic and not very complete.  Ideally the syntax would 
 look more like

     mixin Operator!("-",
        int(int v){ return m_value - v; }
     );

 or best

     int Operator!("-")(int v){ return m_value - v; }

 But I couldn't figure out any way to make those work. :-)
 Can anyone do better?
I think we'd have to have the ability to dynamically generate symbols with templates (i.e. some form of token pasting) in order for this to be possible. But then, of course, you have the problem of not being able to declare delegates at a class level, which would make it hard to pass the implementation into the Operator template..
Oct 28 2006
Mike Parker <aldacron71 yahoo.com> writes:
Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".
 
 I started wondering if it might be possible to accomplish something like 
 that using mixins.  Here's an example of what I've gotten to work so far:
The idea behind opAdd and friends is that they establish an explicit contract. In C++, what exactly is operator+ supposed to be used for? It doesn't always mean 'addition'. Even in the standard library, std::string uses '+' to mean 'concatenation'. In some C++ vector math libraries, '*' is used to calculate the dot product of two vectors, since there's no such thing as the multiplication of two vectors (that is, it's not uniquely defined), while at the same time being used to multiply a vector by a scalar.

opAdd is an explicit interface saying that "this operator does addition." Programmers can still abuse it, just as they can abuse the contract established by any interface. But when a library implementor uses opAdd to do concatenation or something other than addition, you can now point at them and say they are breaking the contract. operator+ doesn't allow you to do that since '+' by itself does not necessarily equate to 'addition'.
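
To make the two C++ examples above concrete, here is a small sketch. std::string's operator+ is the real standard library one; Vec2 and its two operator* overloads are hypothetical, standing in for the kind of vector math library described, and concatExample() is a made-up helper.

```cpp
#include <cassert>
#include <string>

// Hypothetical 2D vector type, for illustration only.
struct Vec2 {
    double x, y;
};

// '*' between two vectors: this imagined library chose the dot product.
inline double operator*(const Vec2& a, const Vec2& b) {
    return a.x * b.x + a.y * b.y;
}

// '*' between a vector and a scalar: component-wise scaling.
inline Vec2 operator*(const Vec2& a, double s) {
    return Vec2{a.x * s, a.y * s};
}

inline std::string concatExample() {
    // std::string's operator+ concatenates; nothing is "added".
    return std::string("foo") + "bar";
}
```

The same symbol ends up with three unrelated meanings here, and nothing in the declarations says which one a reader should expect.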
Oct 28 2006
Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Mike Parker wrote:
 Bill Baxter wrote:
 
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".

 I started wondering if it might be possible to accomplish something like 
 that using mixins.  Here's an example of what I've gotten to work so far:
The idea behind opAdd and friends is that they establish an explicit contract. In C++, what exactly is operator+ supposed to be used for? It doesn't always mean 'addition'. Even in the standard library, std::string uses '+' to mean 'concatenation'. In some C++ vector math libraries, '*' is used to calculate the dot product of two vectors, since there's no such thing as the multiplication of two vectors (that is, it's not uniquely defined), while at the same time being used to multiply a vector by a scalar. opAdd is an explicit interface saying that "this operator does addition." Programmers can still abuse it, just as they can abuse the contract established by any interface. But when a library implementor uses opAdd to do concatenation or something other than addition, you can now point at them and say they are breaking the contract. operator+ doesn't allow you to do that since '+' by itself does not necessarily equate to 'addition'.
Additionally, it allows for some minimizing. For example, instead of an 'operator<', an 'operator>', and an 'operator==' to make a class comparable, we merely define an 'opCmp' that does the work of all three. There could yet be some more minimizing (I feel that opApply(dg)/opApplyReverse(dg) could, for example, become opApply(dg,reverse?) instead).

-- Chris Nicholson-Sauls
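
As a sketch of what "one function does the work of all three" looks like when written out by hand in C++: the type Version and its compare() member are hypothetical, with compare() standing in for the role opCmp plays in D. In D the forwarding is done by the compiler; here each operator has to be spelled out.

```cpp
#include <cassert>

// Hypothetical type; compare() plays the role that opCmp plays in D.
struct Version {
    int major_, minor_;

    // Negative if *this < o, zero if equal, positive if *this > o.
    int compare(const Version& o) const {
        if (major_ != o.major_) return major_ < o.major_ ? -1 : 1;
        if (minor_ != o.minor_) return minor_ < o.minor_ ? -1 : 1;
        return 0;
    }
};

// Each operator is a one-line forward to the single comparison function.
inline bool operator<(const Version& a, const Version& b)  { return a.compare(b) < 0; }
inline bool operator>(const Version& a, const Version& b)  { return a.compare(b) > 0; }
inline bool operator==(const Version& a, const Version& b) { return a.compare(b) == 0; }
```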
Oct 28 2006
rm <roel.mathys gmail.com> writes:
Mike Parker wrote:
 Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find
 it a lot easier to remember and it looks less "magical".
Personally, I think binary operators should be static functions or non-member functions. I understand magic to be something that is momentarily beyond my horizon of understanding, so whatever syntax is used ... I think it's necessary to have operator overloading to not have to "learn" two ways of doing operations, one for built-in types and one for user-defined types.
 I started wondering if it might be possible to accomplish something like
 that using mixins.  Here's an example of what I've gotten to work so far:
The idea behind opAdd and friends is that they establish an explicit contract. In C++, what exactly is operator+ supposed to be used for? It doesn't always mean 'addition'. Even in the standard library, std::string uses '+' to mean 'concatenation'. In some C++ vector math libraries, '*' is used to calculate the dot product of two vectors, since there's no such thing as the multiplication of two vectors (that is, it's not uniquely defined), while at the same time being used to multiply a vector by a scalar.
Well, with opMul you can do exactly the same; that's just syntax. So what conclusion can be drawn from the fact that in C++ you have to overload operator* and in D you have to overload opMul ... ?
 opAdd is an explicit interface saying that "this operator does
 addition." Programmers can still abuse it, just as they can abuse the
 contract established by any interface. But when a library implementor
 uses opAdd to do concatenation or something other than addition, you can
 now point at them and say they are breaking the contract. operator+
 doesn't allow you to do that since '+' by itself does not necessarily
 equate to 'addition'.
Where are these definitions given? Or should I "interpret" opAdd as "operator for addition"? And then again, if I wonder what Webster has to say about addition: e.g. you have to add 100 grams of sugar, then you have to mix-in :-) 500 ml of milk ...

The more important question is, is all that is needed provided in D? Maybe the first question should be, what is needed?

roel
Oct 28 2006
Walter Bright <newshound digitalmars.com> writes:
Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".
The reasons for "opAdd" instead of "operator+" are:

1) opAdd is eminently more greppable. Try grepping for operator+:

     operator /* comment */ + (T t)
     operator +\
     +()
     operator+(T t)

Oops, I found operator/ instead! I thought operator++ was operator+! Is that a unary + or a binary +? You practically need a full C++ front end to do the job correctly. D can do tolerably well with simple grep.

2) opAdd looks like "opAdd" in the object symbol table rather than "?H" (I am not making up ?H, it really is that) giving one a clue without needing a decoder ring.

3) it encourages the use of operator overloading for arithmetic purposes, rather than "parse this predicate once", which happens with C++ operator overloading.

4) operators that are mathematically related can be derived from each other: in C++ the == and != are separately overloadable. Anyone who wants to do mathematical overloading has to do both and take care that one actually is the not of the other. With opEquals, one function can serve both. This makes more of a difference with <, <=, >, >= where 4 overloads are replaced by opCmp.

5) Note C++'s inability to distinguish operator[] as an lvalue and as an rvalue. D has opIndexAssign and opIndex.

6) Note the kludge-o-matic C++ overloading of operator++ and its different meanings. I can never remember which is which without looking it up. D has opAddAssign and opPostInc.
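
For readers who have not run into point 6, this is roughly what the two C++ forms of operator++ look like. Counter and counterExample() are hypothetical names made up for the sketch; the dummy int parameter on the postfix form is standard C++.

```cpp
#include <cassert>

// Hypothetical counter type, for illustration only.
struct Counter {
    int n = 0;

    // Prefix form (++c): no parameter; returns the updated object.
    Counter& operator++() {
        ++n;
        return *this;
    }

    // Postfix form (c++): the unused int parameter is the only thing
    // that tells the two overloads apart; returns the old value.
    Counter operator++(int) {
        Counter old = *this;
        ++n;
        return old;
    }
};

inline void counterExample() {
    Counter c;
    Counter before = c++;   // postfix: 'before' holds the old value
    assert(before.n == 0 && c.n == 1);
    Counter& after = ++c;   // prefix: refers to c itself
    assert(after.n == 2 && &after == &c);
}
```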
Oct 30 2006
Bill Baxter <dnewsgroup billbaxter.com> writes:
Walter Bright wrote:
 Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".
 The reasons for "opAdd" instead of "operator+" are:

 1) opAdd is eminently more greppable. Try grepping for operator+:

      operator /* comment */ + (T t)
      operator +\
      +()
      operator+(T t)

 Oops, I found operator/ instead! I thought operator++ was operator+! Is that a unary + or a binary +? You practically need a full C++ front end to do the job correctly. D can do tolerably well with simple grep.
Most of those are just perversities that don't exist in real code. /operator\s*\+[^+]/ would find you 99% of all real use cases.

On the other hand, say I want to find all operator overloads, period. With C++ I can pretty much just grep for 'operator', whereas for D I'd have to be a little smarter, because just grepping for 'op' is likely to turn up lots of cruft. OK, /\Wop[A-Z]/ would probably do a decent job, where \W is the 'not a word character' pattern.

Either way I think this one is pretty much a wash. It's not that hard to grep for either one.
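
As a rough illustration of how far a pattern like that gets (with the '+' escaped so it is a literal plus), here is a small C++ sketch using std::regex. The function name mentionsOperatorPlus is made up for the example, and the whole thing is a heuristic, not a parser.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True if the line appears to mention operator+, using the pattern
// operator\s*\+[^+]. This is a text heuristic and knows nothing about
// comments, line splicing, or the preprocessor.
inline bool mentionsOperatorPlus(const std::string& line) {
    static const std::regex pattern(R"(operator\s*\+[^+])");
    return std::regex_search(line, pattern);
}
```

It accepts "operator+(T t)" and "operator + (T t)" and rejects "operator++()", but it misses the form with a comment between the keyword and the symbol, and it also accepts "operator+=".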
 
 2) opAdd looks like "opAdd" in the object symbol table rather than "?H" 
 (I am not making up ?H, it really is that) giving one a clue without 
 needing a decoder ring.
I guess that's nice for the compiler writer. Does it affect the user somehow too? Because I'm not usually so concerned about how things look in the symbol table given all the name mangling going on everywhere. Besides, couldn't one arrange things so that 'operator+' appeared in the symbol table as something like '__operator_plus' if one so desired?
 3) it encourages the use of operator overloading for arithmetic 
 purposes, rather than "parse this predicate once", which happens with 
 C++ operator overloading.
I suppose. But I suspect programmers will likely see it as a feature they can use no matter what you call it. C++ books generally recommend not overloading + for things that are semantically unrelated to adding, but people do it anyway. Similarly people use static opCall in D as a constructor. If the programmer really wants a succinct syntax for some common operation, then they're going to consider operator overloading as one of their design choices, no matter what those methods are called.
 4) operators that are mathematically related can be derived from each 
 other: in C++ the == and != are separately overloadable. Anyone who 
 wants to do mathematical overloading has to do both and take care that 
 one actually is the not of the other. With opEquals, one function can 
 serve both. This makes more of a difference with <, <=, >, >= where 4 
 overloads are replaced by opCmp.
Well, operator < alone is used in C++, and via similar mathematical identities you can construct <=, >, >= out of it. Given an a < b operator we have:

     a >  b   ===   b < a
     a <= b   ===   !(b < a)
     a >= b   ===   !(a < b)
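
Written out as C++, those identities look something like the following. Meters is a hypothetical type used only for the sketch; the point is that the three derived operators are typed in by hand.

```cpp
#include <cassert>

// Hypothetical type with a hand-written operator< only.
struct Meters {
    double value;
};

inline bool operator<(const Meters& a, const Meters& b) { return a.value < b.value; }

// The remaining three follow from the identities above.
inline bool operator>(const Meters& a, const Meters& b)  { return b < a; }
inline bool operator<=(const Meters& a, const Meters& b) { return !(b < a); }
inline bool operator>=(const Meters& a, const Meters& b) { return !(a < b); }
```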
 5) Note C++'s inability to distinguish operator[] as an lvalue and as an 
 rvalue. D has opIndexAssign and opIndex.
Seems C++ does ok there:

     type& operator[](int i) { }        // lvalue case
     type  operator[](int i) const { }  // rvalue case
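
A small sketch of how that pair of overloads behaves in practice. Buffer, its call counters, and bufferExample() are hypothetical and exist only to show which overload runs. Note that the choice depends on whether the object is const, not on whether the expression is being read or written.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size buffer that records which overload ran.
struct Buffer {
    int data[4] = {0, 0, 0, 0};
    int nonConstCalls = 0;
    mutable int constCalls = 0;   // mutable so the const overload can count

    int& operator[](std::size_t i) {          // chosen for non-const objects
        ++nonConstCalls;
        return data[i];
    }
    int operator[](std::size_t i) const {     // chosen for const objects
        ++constCalls;
        return data[i];
    }
};

inline void bufferExample() {
    Buffer b;
    b[1] = 7;              // write: non-const overload
    int r = b[1];          // read on a non-const object: still the non-const overload
    const Buffer& cb = b;
    int c = cb[1];         // read through a const reference: const overload
    assert(r == 7 && c == 7);
    assert(b.nonConstCalls == 2 && b.constCalls == 1);
}
```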
 6) Note the kludge-o-matic C++ overloading of operator++ and its 
 different meanings. I can never remember which is which without looking 
 it up. D has opAddAssign and opPostInc.
Yeh, that is super hacky and hard to remember. Maybe C++ should have added 'loperator' to distinguish left from right.

Really this 'hard to remember' point is the main reason I think symbols for operator overloads would be superior. Something like this, though I realize totally hopeless, would nonetheless be nice:

(Let 'i' mean 'this', though 'this' could be used instead.
 Let 'x' (or any non-i letter) mean the other thing where needed.)

operator[-i]     -- opNeg
operator[+i]     -- opPos
operator[~i]     -- opCom
operator[i++]    -- opPostInc
operator[i--]    -- opPostDec
operator[i+]     -- opAdd
operator[x+i]    -- opAdd_r
operator[i==]    -- opEquals
operator[i+=]    -- opAddAssign
operator[i in]   -- opIn
operator[in i]   -- opIn_r
operator[i[]]    -- opIndex
operator[i[]=]   -- opIndexAssign
operator[i[..]]  -- opSlice
operator[i[..]=] -- opSliceAssign
etc...

Then I don't have to remember what name the language chose to represent the operator, I just have to remember the syntactical situation in which I want that operator to be invoked.

I realize it's unconventional. (I've never seen such a thing in a language before; maybe Haskell comes close.) But I've always been annoyed by operator overloading in the languages I've used. Why not just make the operator declaration show the exact use case where the operator is invoked?

The above could even be implemented as some sort of preprocessor. It's just pure syntactic sugar for the more cryptic built-in method names.

For opCmp, basically you'd only allow operator[i>] and then say it should return positive if i is greater, zero if equal, and negative if less.

The only issue is I think opApply / opApplyReverse, and there the problem is that these are not really operators to begin with, they're iterators. Unlike an operator they have no associated syntax.

--bb
Oct 30 2006
Walter Bright <newshound digitalmars.com> writes:
Bill Baxter wrote:
 Walter Bright wrote:
 Oops, I found operator/ instead! I thought operator++ was operator+! 
 Is that a unary + or a binary +? You practically need a full C++ front 
 end to do the job correctly. D can do tolerably well with simple grep.
Most of those are just perversities that don't exist in real code. /operator\s*\+[^+]/ would find you 99% of all real use cases.
Perhaps you're right, but I sure get tired of things in C++ that work only most of the time (and I didn't even get into what the preprocessor can do to any reliance on grep). I like things to work reliably. I want to make sure I found all the operator overload cases when I do a code audit. There's a thread on comp.lang.c about writing a program that can convert C++ // comments to /* */ comments. Most of the thread is about all the weird corner cases (like trigraphs, line splicing, etc.) that can happen in C++ and how doing a correct job of it is far more complicated than it looks like it should be. This is not unusual, but typical of C++ source code analysis problems.
 2) opAdd looks like "opAdd" in the object symbol table rather than 
 "?H" (I am not making up ?H, it really is that) giving one a clue 
 without needing a decoder ring.
I guess that's nice for the compiler writer. Does it affect the user somehow too? Because I'm not usually so concerned about how things look in the symbol table given all the name mangling going on everywhere.
How they look in the symbol table matters when you're having problems getting things to link properly or getting error messages from the linker or looking at exported names from a DLL or using a debugger without full debug info or using a disassembler, etc.
 Besides, couldn't one arrange things so that 'operator+' appeared in the 
 symbol table as something like '__operator_plus' if one so desired?
Yes, one could. But it's one less level of indirection to connect "opAdd" in the symbol table with "opAdd" in the source code.
 3) it encourages the use of operator overloading for arithmetic 
 purposes, rather than "parse this predicate once", which happens with 
 C++ operator overloading.
I suppose. But I suspect programmers will likely see it as a feature they can use no matter what you call it. C++ books generally recommend not overloading + for things that are semantically unrelated to adding, but people do it anyway. Similarly people use static opCall in D as a constructor. If the programmer really wants a succinct syntax for some common operation, then they're going to consider operator overloading as one of their design choices, no matter what those methods are called.
Programmers can and will do whatever they want, but it helps to encourage correct usage by following the dictum "if it looks wrong, it probably is wrong". And overloading opAdd to be "parse" is going to look wrong, wrong, wrong.
 4) operators that are mathematically related can be derived from each 
 other: in C++ the == and != are separately overloadable. Anyone who 
 wants to do mathematical overloading has to do both and take care that 
 one actually is the not of the other. With opEquals, one function can 
 serve both. This makes more of a difference with <, <=, >, >= where 4 
 overloads are replaced by opCmp.
Well, operator < alone is used in C++, and via similar mathematical identities you can construct <=, >, >= out of it. Given an a < b operator we have:

     a >  b   ===   b < a
     a <= b   ===   !(b < a)
     a >= b   ===   !(a < b)
I know you can construct those identities in C++, but the point is you have to manually construct them every time, which is tedious and a source of error. C++ won't do it for you.
 5) Note C++'s inability to distinguish operator[] as an lvalue and as 
 an rvalue. D has opIndexAssign and opIndex.
Seems C++ does ok there:

     type& operator[](int i) { }        // lvalue case
     type  operator[](int i) const { }  // rvalue case
That's by learned and commonly followed convention, not by design. Even worse, the lvalue case is restricted to only allow assignment through the reference - making it impossible to have an lvalue case where some post processing needs to be done with the new contents of the lvalue.
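
To illustrate the limitation: with a plain reference return, the container never sees the store, so it cannot run any code afterwards. One workaround commonly used in C++ (not something proposed in this thread, and offered here only as a sketch) is to return a proxy object whose assignment operator does the store and the follow-up work. Tracked, Ref, writes() and trackedExample() are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container that wants to count every write to an element.
class Tracked {
    int data_[4] = {0, 0, 0, 0};
    int writes_ = 0;

public:
    // Proxy returned by the non-const operator[]; assignment goes through
    // it, so the container gets a chance to run code after the store.
    class Ref {
        Tracked& owner_;
        std::size_t i_;
    public:
        Ref(Tracked& owner, std::size_t i) : owner_(owner), i_(i) {}
        Ref& operator=(int v) {
            owner_.data_[i_] = v;
            ++owner_.writes_;      // the "post-processing" step
            return *this;
        }
        operator int() const { return owner_.data_[i_]; }
    };

    Ref operator[](std::size_t i) { return Ref(*this, i); }
    int operator[](std::size_t i) const { return data_[i]; }
    int writes() const { return writes_; }
};

inline void trackedExample() {
    Tracked t;
    t[2] = 5;          // goes through Ref::operator=
    int v = t[2];      // goes through Ref::operator int
    assert(v == 5 && t.writes() == 1);
}
```

The amount of machinery needed for something D expresses with a separate opIndexAssign is arguably the point being made above.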
 Really this 'hard to remember' point is the main reason I think symbols 
 for operator overloads would be superior.  Something like this, though I 
 realize totally hopeless, would nonetheless be nice:
 
 (Let 'i' mean 'this', though 'this' could be used instead.
  Let 'x' (or any non-i letter) mean the other thing where needed.)
 
 operator[-i]   -- opNeg
 operator[+i]   -- opPos
 operator[~i]   -- opCom
 operator[i++]  -- opPostInc
 operator[i--]  -- opPostDec
 operator[i+]   -- opAdd
 operator[x+i]  -- opAdd_r
 operator[i==]  -- opEquals
 operator[i+=]  -- opAddAssign
 operator[i in] -- opIn
 operator[in i] -- opIn_r
 operator[i[]]  -- opIndex
 operator[i[]=] -- opIndexAssign
 operator[i[..]] -- opSlice
 operator[i[..]=] -- opSliceAssign
 etc...
 
 Then I don't have to remember what name the language chose to represent 
 the operator, I just have to remember the syntactical situation in which 
 I want that operator to be invoked.
You'd have to remember the funky syntactical oddities for each operator in the above notation (note that it's inconsistent). I don't think there's any real improvement.
 The only issue is I think opApply / opApplyReverse, and there the 
 problem is that these are not really operators to begin with, they're 
 iterators.  Unlike an operator they have no associated syntax.
Using the opXxxx convention does enable the overloading of operations that do not have an obvious operator symbol.
Oct 31 2006
Daniel Keep <daniel.keep.lists gmail.com> writes:
Bill Baxter wrote:
 ..., Lua has very similar, ...
Just pointing out that Lua's special methods are in a completely different namespace to the "normal" methods, so it isn't a problem. Special methods are attached to a table's metatable, which exists just for that purpose.
 local t = {}
 local mt = getmetatable(t) or {}

 -- metamethods are stored in the metatable under double-underscore names
 function mt:__index(k)
     return "foo"
 end

 setmetatable(t, mt)

 print(t.blah) -- prints "foo"
Apologies if any of that is incorrect; very sleepy over here :3

-- Daniel

-- 
Unlike Knuth, I have neither proven nor tried the above; it may not even make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/
Oct 30 2006