digitalmars.D - opEquals/opCmp returning other types

"Brian Palmer" <brian+d codekitchen.net> writes:
I'm working on a DSL for generating SQL queries, based loosely on 
Python's SQLAlchemy and Ruby's Sequel. One nice thing about the 
DSL is the compact syntax for specifying WHERE clauses. With some 
fiddling, I got it working for opEquals, a simplified example:

   foreach(network; db["networks"].each) {
     writefln("network: %s", network.name);
     foreach(host; db["hosts"].where(db.c.id == network.id)) {
       writefln("\thost: %s", host.address);
     }
   }

This works because db.c.id returns a struct which defines an 
opEquals which returns a "Filter" struct, rather than an int. I'm 
not positive that it should really be allowed by the compiler, 
but it works:

struct Filter { ... }
struct Column {
     Filter opEquals(T)(T rhs) { ... }
}

Then the .where call takes a filter, and uses it to output a 
snippet of sql like "id = 5"

However, this won't work for comparison operators like < and >, 
which all map to opCmp, or for != (since that's rewritten to !(a 
== b))

I guess I have two questions, one, am I going to shoot myself in 
the foot by going down this path, because it only happens to work 
due to the compiler being too lax? And is there interest in 
extending D to allow the rest of the operators to return 
non-boolean results? I'm thinking something like falling back to 
opBinary!("<"), etc, if opCmp isn't defined for a struct.
Mar 18 2012
Simen Kjærås <simen.kjaras gmail.com> writes:
On Mon, 19 Mar 2012 01:54:27 +0100, Brian Palmer <brian+d codekitchen.net>  
wrote:

 I'm working on a DSL for generating SQL queries, based loosely on  
 Python's SQLAlchemy and Ruby's Sequel. One nice thing about the DSL is  
 the compact syntax for specifying WHERE clauses. With some fiddling, I  
 got it working for opEquals, a simplified example:

    foreach(network; db["networks"].each) {
      writefln("network: %s", network.name);
      foreach(host; db["hosts"].where(db.c.id == network.id)) {
        writefln("\thost: %s", host.address);
      }
    }

 This works because db.c.id returns a struct which defines an opEquals  
 which returns a "Filter" struct, rather than an int. I'm not positive  
 that it should really be allowed by the compiler, but it works:

 struct Filter { ... }
 struct Column {
      Filter opEquals(T)(T rhs) { ... }
 }

 Then the .where call takes a filter, and uses it to output a snippet of  
 sql like "id = 5"

 However, this won't work for comparison operators like < and >, which  
 all map to opCmp, or for != (since that's rewritten to !(a == b))

 I guess I have two questions, one, am I going to shoot myself in the  
 foot by going down this path, because it only happens to work due to the  
 compiler being too lax? And is there interest in extending D to allow  
 the rest of the operators to return non-boolean results? I'm thinking  
 something like falling back to opBinary!("<"), etc, if opCmp isn't  
 defined for a struct.
I, for one, agree that having opEquals and opCmp return irregular types is awesome. Now, I see there are very good reasons to leave the behavior of all those 'derived' operators to the compiler. Except, of course, when you want to do something weird, like you're doing now.

I also think that the 'D creed' states that 'the simple should be easy, the complex possible'. I like the idea of opBinary!"<", but would perhaps argue for opBinaryComparison!"<", to distinguish it from the 'normal' operators.
Mar 18 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 19, 2012 at 01:54:27AM +0100, Brian Palmer wrote:
[...]
 However, this won't work for comparison operators like < and >, which
 all map to opCmp, or for != (since that's rewritten to !(a == b))
 
 I guess I have two questions, one, am I going to shoot myself in the
 foot by going down this path, because it only happens to work due to
 the compiler being too lax? And is there interest in extending D to
 allow the rest of the operators to return non-boolean results? I'm
 thinking something like falling back to opBinary!("<"), etc, if opCmp
 isn't defined for a struct.
IIRC, the reason D chose to go with opCmp and opEquals rather than the C++ route of operator...() for each comparison operator is so that the language can give some guarantees about the consistency of a<b, a<=b, a==b, etc.

In C++, for example, you can define operator<() and operator>() in completely arbitrary ways, which means they can be totally unrelated to each other, and return results that have nothing to do with each other. This causes inconsistency in that a<b does not necessarily imply b>a, and vice versa. Which makes for inconsistent code.

By using opCmp in D, such inconsistency is avoided, and the user is spared the tedium of having to define operator<(), operator<=(), operator>(), operator>=(), (and in D, you also have to add !<=, !>=, !<, !>, etc.), ad nauseam; a single operator opCmp() takes care of all these cases.

T

--
Prosperity breeds contempt, and poverty breeds consent. -- Suck.com
Mar 18 2012
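
A minimal sketch of the "single opCmp covers every ordering operator" point made above. The `Version` struct and its fields are hypothetical and only serve as an illustration; the rewrite of `a < b` into `a.opCmp(b) < 0` is the documented behaviour of D's operator overloading.

```d
/// Hypothetical two-part version number, used only for illustration.
struct Version
{
    int major, minor;

    /// Returns a negative value, zero, or a positive value.
    /// The compiler rewrites a < b, a <= b, a > b and a >= b
    /// in terms of this one function.
    int opCmp(const Version rhs) const
    {
        if (major != rhs.major)
            return major < rhs.major ? -1 : 1;
        return minor < rhs.minor ? -1 : minor > rhs.minor ? 1 : 0;
    }
}

unittest
{
    auto a = Version(1, 2);
    auto b = Version(1, 10);

    // All four ordering operators come from the single opCmp above.
    assert(a < b);
    assert(a <= b);
    assert(b > a);
    assert(b >= a);
    assert(!(a > b));
}
```

The block has no `main`; it can be checked with something like `dmd -unittest -main`. Because every ordering operator is derived from one function, `a < b` and `b > a` cannot disagree, which is the consistency guarantee the post describes.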
Timon Gehr <timon.gehr gmx.ch> writes:
On 03/19/2012 02:31 AM, H. S. Teoh wrote:
 On Mon, Mar 19, 2012 at 01:54:27AM +0100, Brian Palmer wrote:
 [...]
 However, this won't work for comparison operators like < and >, which
 all map to opCmp, or for != (since that's rewritten to !(a == b))

 I guess I have two questions, one, am I going to shoot myself in the
 foot by going down this path, because it only happens to work due to
 the compiler being too lax? And is there interest in extending D to
 allow the rest of the operators to return non-boolean results? I'm
 thinking something like falling back to opBinary!("<"), etc, if opCmp
 isn't defined for a struct.
IIRC, the reason D chose to go with opCmp and opEquals rather than the C++ route of operator...() for each comparison operator is so that the language can give some guarantees about the consistency of a<b, a<=b, a==b, etc..
That is fine. Iff you want it.
 In C++, for example, you can define operator<() and operator>() in
 completely arbitrary ways, which means they can be totally unrelated to
 each other, and return results that have nothing to do with each other.
I can also write a program that behaves in an arbitrary way. I usually want it to do something useful though.
 This causes inconsistency in that a<b does not necessarily imply b>a,
 and vice versa.
I suppose you meant b>=a. But even then, that is already sometimes the case for built-in types.
 Which makes for inconsistent code.
This is dependent on the context.

   foreach(network; db["networks"].each) {
     writefln("network: %s", network.name);
     foreach(host; db["hosts"].where(db.c.id == network.id)) {
       writefln("\thost: %s", host.address);
     }
   }

vs.

   foreach(network; db["networks"].each) {
     writefln("network: %s", network.name);
     foreach(host; db["hosts"].where(db.c.id.less(network.id))) {
       writefln("\thost: %s", host.address);
     }
   }

This inconsistency is merely syntactical though and could be fixed using an 'equals' method instead of opEquals.
 By using opCmp in D, such inconsistency is avoided, and the user is
 spared the tedium of having to define operator<(), operator<=(),
 operator>(), operator>=(), (and in D, you also have to add !<=, !>=, !<,
 !>, etc.), ad nauseum; a single operator opCmp() takes care of all these
 cases.


 T
The current limitations make it impossible to define (for example) a floating point type with NaN that behaves like built-in float/double/real.
Mar 19 2012
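
A minimal sketch of the named-method alternative mentioned above, in which `equals` and `less` both return a filter object, so equality and ordering look the same at the call site. `Filter`, `Column` and the method names are hypothetical, modelled on the snippets in this thread rather than on any existing library. Values are rendered with `std.conv.to` and are neither quoted nor escaped, which a real SQL library would have to handle.

```d
import std.conv : to;

/// Hypothetical filter: simply carries a rendered SQL condition.
struct Filter
{
    string sql;
}

/// Hypothetical column handle, standing in for `db.c.id` in the thread.
struct Column
{
    string name;

    /// Named replacement for ==, free to return any type.
    Filter equals(T)(T rhs)
    {
        return Filter(name ~ " = " ~ rhs.to!string);
    }

    /// Named replacement for <, which opCmp cannot express as a Filter.
    Filter less(T)(T rhs)
    {
        return Filter(name ~ " < " ~ rhs.to!string);
    }
}

unittest
{
    auto id = Column("id");
    assert(id.equals(5).sql == "id = 5");
    assert(id.less(5).sql == "id < 5");
}
```

The trade-off is purely syntactic: the call site is wordier than `db.c.id == network.id`, but nothing depends on how the compiler treats the return type of opEquals or opCmp.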
Simen Kjærås <simen.kjaras gmail.com> writes:
On Mon, 19 Mar 2012 09:29:34 +0100, Timon Gehr <timon.gehr gmx.ch> wrote:

 The current limitations make it impossible to define (for example) a  
 floating point type with NaN that behaves like built-in  
 float/double/real.
As it turns out, this is possible. opCmp can return a float, and things work just fine (!<, !>=, etc). The problem appears with opEquals. This is the generated assembly:

   fldz
   fucompp
   fnstsw ax
   sahf
   jne <somewhere>

So it compares the result to 0.0, copies status flags to the CPU, then checks if any of the flags are set. If the returned value *is* equal to 0.0, the C3 flag is set. The result is that the jump is taken only when the returned value is *less* than 0.0. I have a feeling this is wrong. Should I file this in BugZilla?

Anyways. With this newly-won knowledge, we can design an opEquals that returns a float, and behaves correctly (until the above bug [if I'm right] is fixed):

   struct MyInt {
       int n;
       float opEquals(MyInt other) const {
           if (n == int.min || other.n == int.min) {
               return float.nan;
           }
           return n == other.n ? -1.0 : 1.0; // Note workaround.
           //return n - other.n; // Should work.
       }
       float opCmp(MyInt other) const {
           if (n == int.min || other.n == int.min) {
               return float.nan;
           }
           return n - other.n;
       }
   }

   unittest {
       assert( MyInt(int.min) != MyInt(int.min) );
       assert( MyInt(0) == MyInt(0) );
       assert( MyInt(int.min) !< MyInt(0) );
       assert( MyInt(int.min) !> MyInt(0) );
       assert( MyInt(int.min) !<= MyInt(0) );
       assert( MyInt(int.min) !>= MyInt(0) );
       assert( MyInt(0) !< MyInt(int.min) );
       assert( MyInt(0) !> MyInt(int.min) );
       assert( MyInt(0) !<= MyInt(int.min) );
       assert( MyInt(0) !>= MyInt(int.min) );
       assert( MyInt(1) > MyInt(0) );
       assert( MyInt(1) >= MyInt(0) );
       assert( MyInt(0) < MyInt(1) );
       assert( MyInt(0) <= MyInt(1) );
   }
Mar 19 2012
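
For readers trying the idea above on a current compiler, here is a hedged variant of the same sketch. It makes two assumptions that differ from the post: opEquals returns a plain `bool` instead of a float, so it does not depend on the code generation discussed above, and the `!<` family of operators is replaced by `!(a < b)`, since those operators were later removed from the language as far as I know. The `isNaN` helper is a hypothetical name; `int.min` as the "not a number" sentinel is taken from the post.

```d
/// Integer wrapper with a NaN-like sentinel (int.min), after the post above.
struct MyInt
{
    int n;

    /// Hypothetical helper: true for the sentinel value.
    bool isNaN() const { return n == int.min; }

    /// NaN compares unequal to everything, including itself.
    bool opEquals(const MyInt other) const
    {
        return !isNaN() && !other.isNaN() && n == other.n;
    }

    /// Returning float.nan makes <, <=, > and >= all false,
    /// because a < b is rewritten to a.opCmp(b) < 0.
    float opCmp(const MyInt other) const
    {
        if (isNaN() || other.isNaN())
            return float.nan;
        // Explicit comparison avoids the overflow that n - other.n can hit.
        return n < other.n ? -1 : n > other.n ? 1 : 0;
    }
}

unittest
{
    auto nan = MyInt(int.min);

    assert(nan != nan);
    assert(MyInt(0) == MyInt(0));

    // Unordered: every ordering comparison involving NaN is false.
    assert(!(nan < MyInt(0)));
    assert(!(nan > MyInt(0)));
    assert(!(nan <= MyInt(0)));
    assert(!(MyInt(0) >= nan));

    // Ordinary values still order as usual.
    assert(MyInt(1) > MyInt(0));
    assert(MyInt(0) <= MyInt(1));
}
```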
Simen Kjærås <simen.kjaras gmail.com> writes:
On Mon, 19 Mar 2012 18:28:00 +0100, Simen Kjærås <simen.kjaras gmail.com> wrote:

 On Mon, 19 Mar 2012 09:29:34 +0100, Timon Gehr <timon.gehr gmx.ch> wrote:
 The current limitations make it impossible to define (for example) a
 floating point type with NaN that behaves like built-in
 float/double/real.
As it turns out, this is possible. opCmp can return a float, and things work just fine (!<, !>=, etc). The problem appears with opEquals. This is the generated assembly:

   fldz
   fucompp
   fnstsw ax
   sahf
   jne <somewhere>

So it compares the result to 0.0, copies status flags to the CPU, then checks if any of the flags are set. If the returned value *is* equal to 0.0, the C3 flag is set. The result is that the jump is taken only when the returned value is *less* than 0.0. I have a feeling this is wrong. Should I file this in BugZilla?

I went ahead and filed it: http://d.puremagic.com/issues/show_bug.cgi?id=7734
Mar 19 2012
"Brian Palmer" <brian+d codekitchen.net> writes:
On Monday, 19 March 2012 at 01:29:50 UTC, H. S. Teoh wrote:

 In C++, for example, you can define operator<() and operator>() 
 in
 completely arbitrary ways, which means they can be totally 
 unrelated to
 each other, and return results that have nothing to do with 
 each other.
 This causes inconsistency in that a<b does not necessarily 
 imply b>a,
 and vice versa. Which makes for inconsistent code.
While I totally get that concern, I've never really seen it become a real issue in any of the large C++ systems I've worked on. Maybe I've just been lucky? Ruby also allows these arbitrary operator redefinitions, and it's never been an issue in the large Ruby systems I've worked on, either. Python also allows them, though I don't have much Python experience. In fact, a lot of the most useful DSLs in Ruby rely heavily on being able to do these overrides. I think D today is missing out on a lot of those possibilities.
Mar 19 2012
"Paulo Pinto" <pjmlp progtools.org> writes:
I think the only people who have issues with operator definitions don't
think of them as function names, in the abstract mathematics sense.

+, < and so on are just symbolic names for some operation.

If I want to know what + does, I have to do the same as when I see Add, and
look to the definition being called.

--
Paulo

"Brian Palmer"  wrote in message 
news:bafsgxezdurhidyvveaz forum.dlang.org...

On Monday, 19 March 2012 at 01:29:50 UTC, H. S. Teoh wrote:

 In C++, for example, you can define operator<() and operator>() in
 completely arbitrary ways, which means they can be totally unrelated to
 each other, and return results that have nothing to do with each other.
 This causes inconsistency in that a<b does not necessarily imply b>a,
 and vice versa. Which makes for inconsistent code.
While I totally get that concern, I've never really seen it become a real issue in any of the large C++ systems I've worked on. Maybe I've just been lucky? Ruby also allows these arbitrary operator redefinitions, and it's never been an issue in the large Ruby systems I've worked on, either. Python also allows them, though I don't have much Python experience. In fact, a lot of the most useful DSLs in Ruby rely heavily on being able to do these overrides. I think D today is missing out on a lot of those possibilities.
Mar 19 2012
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/19/12 10:17 AM, Brian Palmer wrote:
 On Monday, 19 March 2012 at 01:29:50 UTC, H. S. Teoh wrote:

 In C++, for example, you can define operator<() and operator>() in
 completely arbitrary ways, which means they can be totally unrelated to
 each other, and return results that have nothing to do with each other.
 This causes inconsistency in that a<b does not necessarily imply b>a,
 and vice versa. Which makes for inconsistent code.
While I totally get that concern, I've never really seen it become a real issue in any of the large C++ systems I've worked on. Maybe I've just been lucky? Ruby also allows these arbitrary operator redefinitions, and it's never been an issue in the large Ruby systems I've worked on, either. Python also allows them, though I don't have much Python experience.
I think the concern is not bugs in the implementation, but instead a lot of boilerplate text (e.g. in the C++ standard) and code (all of those mind-numbing one-liners that swap parameter order etc).
 In fact, a lot of the most useful DSLs in Ruby rely heavily on being
 able to do these overrides. I think D today is missing out on a lot of
 those possibilities.
For D, hooking built-in operators is a poor way to implement DSLs. Best is to use strings, CTFE, and mixin.

Andrei
Mar 19 2012
"Brian Palmer" <brian+d codekitchen.net> writes:
On Monday, 19 March 2012 at 15:33:33 UTC, Andrei Alexandrescu
wrote:

 For D, hooking built-in operators is a poor way to implement 
 DSLs. Best is to use strings, CTFE, and mixin.


 Andrei
Hmm, I have to say I'm not (yet) convinced. But I'll try to drink the kool-aid and see if it converts me. I'd love some help, as I'm still no D expert.

My initial gut reaction, since I already plan to support arbitrary SQL snippets in where clauses, is to modify the where() method to something like:

   foreach(network; db["networks"].each) {
     writefln("network: %s", network.name);
     foreach(host; db["hosts"].where("hosts.id = ?", network.id)) {
       writefln("\thost: %s", host.address);
     }
   }

In a lot of ways, it's very cool that I can parse that at compile time. However, I see a few problems already.

First of all, where can a reasonable line be drawn between arbitrary SQL, and snippets where we parse and actually understand the snippet? On one extreme, I could implement a full SQL parser and parse any given snippet to pull out identifiers and such. Sounds painful, but potentially awesome. The other extreme is to just treat *all* where clauses as SQL snippets with no extra "smarts", but if I do that, suddenly I've lost a whole lot of functionality and my library is less useful (it can't auto-quote identifiers, intelligently combine conditionals and add aliases or scopes to identifiers, a whole slew of useful functionality is gone).

A middle ground would be to just parse a simple subset of possible snippets, and treat any we don't understand as raw SQL. But that seems like a surprising, frustrating experience for users of the library.

Anyway, I'm going to play around with the idea, see if I can convince myself. Any advice on how to convince myself would be appreciated. :)
Mar 19 2012
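
A minimal sketch of what "parsing the snippet at compile time" could look like at the simplest end of the spectrum described above: the SQL text is a template argument, and a CTFE-evaluated helper checks that the number of `?` placeholders matches the number of arguments. `whereClause` and `countPlaceholders` are hypothetical names, not part of any library in this thread. Substituting values directly into the string is done only to keep the example short; a real library would pass parameters to the database driver instead of splicing text.

```d
import std.conv : to;

/// Counts '?' placeholders. A plain function, so it also runs in CTFE.
size_t countPlaceholders(string sql)
{
    size_t n;
    foreach (c; sql)
        if (c == '?')
            ++n;
    return n;
}

/// Hypothetical helper: the snippet is known at compile time, so a
/// placeholder/argument mismatch is a compile error, not a runtime one.
string whereClause(string sql, Args...)(Args args)
{
    static assert(countPlaceholders(sql) == Args.length,
        "number of '?' placeholders must match the number of arguments");

    // Render each argument to text (no quoting or escaping in this sketch).
    string[] rendered;
    foreach (arg; args)
        rendered ~= arg.to!string;

    // Replace placeholders left to right.
    string result;
    size_t next;
    foreach (c; sql)
    {
        if (c == '?')
            result ~= rendered[next++];
        else
            result ~= c;
    }
    return result;
}

unittest
{
    assert(whereClause!"hosts.id = ?"(5) == "hosts.id = 5");

    // Too few arguments is rejected while compiling.
    static assert(!__traits(compiles, whereClause!"hosts.id = ?"()));
}
```

This sits near the "no extra smarts" extreme: it understands placeholders and nothing else. Anything beyond that, such as recognising identifiers for auto-quoting, would need a real parser for the snippet.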
Jacob Carlborg <doob me.com> writes:
On 2012-03-19 01:54, Brian Palmer wrote:
 I'm working on a DSL for generating SQL queries, based loosely on
 Python's SQLAlchemy and Ruby's Sequel. One nice thing about the DSL is
 the compact syntax for specifying WHERE clauses. With some fiddling, I
 got it working for opEquals, a simplified example:

 foreach(network; db["networks"].each) {
 writefln("network: %s", network.name);
 foreach(host; db["hosts"].where(db.c.id == network.id)) {
 writefln("\thost: %s", host.address);
 }
 }
I've been playing with the exact same idea:

   Foo.where(x => x.name == "foo");

Generates this SQL:

   select foos.* from foos where foos.name = 'foo'

I was hoping this would work for the other operators as well.

--
/Jacob Carlborg
Mar 19 2012
Don Clugston <dac nospam.com> writes:
On 19/03/12 01:54, Brian Palmer wrote:
 I'm working on a DSL for generating SQL queries, based loosely on
 Python's SQLAlchemy and Ruby's Sequel. One nice thing about the DSL is
 the compact syntax for specifying WHERE clauses. With some fiddling, I
 got it working for opEquals, a simplified example:

 foreach(network; db["networks"].each) {
 writefln("network: %s", network.name);
 foreach(host; db["hosts"].where(db.c.id == network.id)) {
 writefln("\thost: %s", host.address);
 }
 }

 This works because db.c.id returns a struct which defines an opEquals
 which returns a "Filter" struct, rather than an int. I'm not positive
 that it should really be allowed by the compiler, but it works:

 struct Filter { ... }
 struct Column {
 Filter opEquals(T)(T rhs) { ... }
 }

 Then the .where call takes a filter, and uses it to output a snippet of
 sql like "id = 5"

 However, this won't work for comparison operators like < and >, which
 all map to opCmp, or for != (since that's rewritten to !(a == b))

 I guess I have two questions, one, am I going to shoot myself in the
 foot by going down this path, because it only happens to work due to the
 compiler being too lax? And is there interest in extending D to allow
 the rest of the operators to return non-boolean results? I'm thinking
 something like falling back to opBinary!("<"), etc, if opCmp isn't
 defined for a struct.
I don't think this is EVER what you really want. I believe that if you think you want this feature, 100% of the time, what you really want is a syntax tree of the entire expression. That is, either you want ">" to be a comparison, and "+" to be an addition, OR you want a syntax tree.
Mar 21 2012
"Brian Palmer" <brian+d codekitchen.net> writes:
On Wednesday, 21 March 2012 at 11:05:41 UTC, Don Clugston wrote:

 I don't think this is EVER what you really want.
 I believe that if you think you want this feature, 100% of the 
 time, what you really want is a syntax tree of the entire 
 expression. That is, either you want ">" to be a comparison, 
 and "+" to be an addition, OR you want a syntax tree.
Well yes, the whole point is to build up a syntax tree that can be manipulated before being outputted as a raw sql string. The operator overloading is a convenient way to do that, that has turned out to be intuitive and easy for developers to use in many other DSLs. To say that's not EVER what you really want seems a bit silly, considering all the libraries in other languages that utilize this technique. That said, if this isn't the D way, it's not the D way, I'm certainly not going to try and shoe-horn it in based on undefined behavior or something.
Mar 21 2012
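
A minimal sketch of building a small syntax tree with the operators that D does let you overload freely. Comparisons use named methods because `<` and `==` go through opCmp and opEquals, while `&` and `|` go through opBinary, which may return any type. `Expr`, `col`, `eq`, `lt` and `toSql` are hypothetical names chosen for this illustration, and literal values are rendered without quoting or escaping.

```d
import std.conv : to;
import std.format : format;

/// Hypothetical expression-tree node: a leaf holds its text,
/// an inner node holds an SQL operator and two children.
final class Expr
{
    string text;
    Expr left, right; // both null for a leaf

    this(string text, Expr left = null, Expr right = null)
    {
        this.text = text;
        this.left = left;
        this.right = right;
    }

    /// Named comparison methods build inner nodes.
    Expr eq(T)(T value) { return new Expr("=", this, new Expr(value.to!string)); }
    Expr lt(T)(T value) { return new Expr("<", this, new Expr(value.to!string)); }

    /// & and | are ordinary binary operators, so they can return a tree node.
    Expr opBinary(string op)(Expr rhs)
        if (op == "&" || op == "|")
    {
        return new Expr(op == "&" ? "AND" : "OR", this, rhs);
    }

    /// Walks the tree and renders it as SQL text.
    string toSql()
    {
        if (left is null)
            return text;
        return format("(%s %s %s)", left.toSql(), text, right.toSql());
    }
}

/// Hypothetical shorthand for a column leaf.
Expr col(string name)
{
    return new Expr(name);
}

unittest
{
    auto filter = col("id").lt(5) & col("name").eq("'John'");
    assert(filter.toSql() == "((id < 5) AND (name = 'John'))");
}
```

Because the result is a tree rather than a string, it can still be inspected or rewritten (for example to add table aliases) before `toSql` is called, which is the manipulation step described in the post above.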
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Mar 21, 2012 at 03:29:21PM +0100, Brian Palmer wrote:
 On Wednesday, 21 March 2012 at 11:05:41 UTC, Don Clugston wrote:
 
I don't think this is EVER what you really want.
I believe that if you think you want this feature, 100% of the
time, what you really want is a syntax tree of the entire
expression. That is, either you want ">" to be a comparison, and
"+" to be an addition, OR you want a syntax tree.
Well yes, the whole point is to build up a syntax tree that can be manipulated before being outputted as a raw sql string. The operator overloading is a convenient way to do that, that has turned out to be intuitive and easy for developers to use in many other DSLs. To say that's not EVER what you really want seems a bit silly, considering all the libraries in other languages that utilize this technique. That said, if this isn't the D way, it's not the D way, I'm certainly not going to try and shoe-horn it in based on undefined behavior or something.
The "D way" is to use strings for DSELs which get evaluated at compile-time, or a custom set of methods that you can build expressions out of. Operator overloading really should be limited to arithmetic types (for numerical classes) and built-in operations like array lookups and stuff.

Trying to shoehorn language-level operators to do something they weren't intended to do only leads to problems. (C++'s overloading of << and >> for I/O is a very bad design decision IMO.)

T

--
Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? -- Michael Beibl
Mar 21 2012
Jacob Carlborg <doob me.com> writes:
On 2012-03-21 17:44, H. S. Teoh wrote:

 The "D way" is to use strings for DSELs which get evaluated at
 compile-time, or a custom set of methods that you can build expressions
 out of. Operator overloading really should be limited to arithmetic
 types (for numerical classes) and built-in operations like array lookups
 and stuff.

 Trying to shoehorn language-level operators to do something they weren't
 intended to do only leads to problems. (C++'s overloading of << and >>
 for I/O is a very bad design decision IMO.)
"find", "map" and similar functions can be used on arrays. What's wrong in being able to use the same syntax for accessing a database? I think the following would be a great syntax:

   Person.where(x => x.name == "John");

Where "Person" is a class connected to a database table.

--
/Jacob Carlborg
Mar 21 2012
Don Clugston <dac nospam.com> writes:
On 21/03/12 21:53, Jacob Carlborg wrote:
 On 2012-03-21 17:44, H. S. Teoh wrote:

 The "D way" is to use strings for DSELs which get evaluated at
 compile-time, or a custom set of methods that you can build expressions
 out of. Operator overloading really should be limited to arithmetic
 types (for numerical classes) and built-in operations like array lookups
 and stuff.

 Trying to shoehorn language-level operators to do something they weren't
 intended to do only leads to problems. (C++'s overloading of << and >>
 for I/O is a very bad design decision IMO.)
"find", "map" and similar functions can be used on arrays. What's wrong in being able to use the same syntax for accessing a database. I think the following would be a great syntax: Person.where(x => x.name == "John"); Where "Person" is a class connected to a database table.
Indeed, it may be possible to use a new-style delegate literal instead of a string, for defining your DSL. My point was that we don't need to be able to return arbitrary types from operators. Instead, we might need some syntax sugar or library support for getting syntax trees from expressions.
Mar 22 2012
"David Nadlinger" <see klickverbot.at> writes:
On Thursday, 22 March 2012 at 10:06:46 UTC, Don Clugston wrote:
 Indeed, it may be possible to use a new-style delegate literal 
 instead of a string, for defining your DSL.
 My point was that we don't need to be able to return arbitrary 
 types from operators. Instead, we might need some syntax sugar 
 or library support for getting syntax trees from expressions.
And how exactly do you plan to implement _library_ support for that in the case of expressions containing comparison or equality operators?

David
Mar 22 2012
"Daniel Murphy" <yebblies nospamgmail.com> writes:
"David Nadlinger" <see klickverbot.at> wrote in message 
news:ajuozpunkzfawcapczcg forum.dlang.org...
 And how exactly do you plan to implement _library_ support for that in the 
 case of expressions containing comparison or equality operators?

 David
With a D parser in phobos.
Mar 22 2012