www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - recursive equal, and firstDifference functions

reply "timotheecour" <timothee.cour2 gmail.com> writes:
we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that 
compares recursively a and b;

its behavior should be:

if opEqual is defined, call it
else, if its a range, call std.algorithm.equal (ie compare nb 
elements, then each element for equality)
else, if it's a class/struct, make sure types are same and call 
it recursively on each field.
else if it's a numerical type, call "=="
else (is there an else?)

just as std.algorithm.equal, we should have 
equalRecurse([1],[1.0]);

Currently, "==" works in a non-intuitive way, breaking principle 
of least surprise:
----
struct A{	int[]b=[1]; }
void main(){
	auto a1=A();
	auto a2=A();
	assert(a1!=a2);
}
----
we should have:
assert(equalRecurse(a1,a2));

What are your thoughts on those:

A)
Can we just extend std.algorithm.equal (which only works on 
ranges) to arbitrary types instead of introducing a new function 
equalRecurse ?

B)
Should "==" / opEqual behave like that (but it might be too late 
to change)?

C)
in the example above, why assert(a1!=a2); but 
assert(A.init==A.init); ?

D)
Additionally, we would need a sister function firstDifference(T1, 
T2)(T1 a, T2 b) that finds the first difference among a and b. 
How to represent that in a generic way is not clear (should for 
example, find a particular field being different, etc). How about 
a string output:
assert(firstDifference([1,2,3],[1,2,3]) == null);
assert(firstDifference([1,2,3],[1,3]) == "a.length");
assert(firstDifference([1,2,3],[1,4,3]) == "a[1]");
A a1; A a2; a1.x=1; a2.x=2;
assert(firstDifference(a1,a2) == "a.x");
assert(firstDifference(a1,void) == "typeid(a)");

Both would be very useful for debugging.
Mar 19 2013
next sibling parent "simendsjo" <simendsjo gmail.com> writes:
On Tuesday, 19 March 2013 at 08:25:45 UTC, timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that 
 compares recursively a and b;

 its behavior should be:

 if opEqual is defined, call it
 else, if its a range, call std.algorithm.equal (ie compare nb 
 elements, then each element for equality)
 else, if it's a class/struct, make sure types are same and call 
 it recursively on each field.
 else if it's a numerical type, call "=="
 else (is there an else?)
Not recursively, but how about something like this? 1) One type, one other -> false 2) Both types -> is(A == B) 3) Both symbols && __traits(isSame) -> true 4) A symbol && __traits(compiles, (A.opEquals(B))) -> A.opEquals(B) 5) B symbol && __traits(compiles, (B.opEquals(A))) -> B.opEquals(A) 6) A symbol && __traits(compiles, (A.equals(B))) -> A.equals(B) 7) B symbol && __traits(compiles, (B.equals(A))) -> B.equals(A) 8) Both symbols or values && __traits(compiles, A == B) -> A == B 9) assert(0)
Mar 19 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 09:25:43 timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that
 compares recursively a and b;
 
 its behavior should be:
 
 if opEqual is defined, call it
 else, if its a range, call std.algorithm.equal (ie compare nb
 elements, then each element for equality)
 else, if it's a class/struct, make sure types are same and call
 it recursively on each field.
 else if it's a numerical type, call "=="
 else (is there an else?)
 
 just as std.algorithm.equal, we should have
 equalRecurse([1],[1.0]);
If you want recursive equal, then do equal!equal. Granted, that's only one level of recursion, but how many levels deep are you really going to have your ranges? And you have to get to == eventually anyway in order to compare the deepest elements. Going beyond a range of ranges is likely to be quite rare, and when it does happen, you can simply nest equal as many times as you need. - Jonathan M Davis
Mar 19 2013
next sibling parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 19 March 2013 at 10:08:43 UTC, Jonathan M Davis wrote:
 On Tuesday, March 19, 2013 09:25:43 timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that
 compares recursively a and b;
 
 its behavior should be:
 
 if opEqual is defined, call it
 else, if its a range, call std.algorithm.equal (ie compare nb
 elements, then each element for equality)
 else, if it's a class/struct, make sure types are same and call
 it recursively on each field.
 else if it's a numerical type, call "=="
 else (is there an else?)
 
 just as std.algorithm.equal, we should have
 equalRecurse([1],[1.0]);
If you want recursive equal, then do equal!equal. Granted, that's only one level of recursion, but how many levels deep are you really going to have your ranges? And you have to get to == eventually anyway in order to compare the deepest elements. Going beyond a range of ranges is likely to be quite rare, and when it does happen, you can simply nest equal as many times as you need. - Jonathan M Davis
"equal!equal(RoR1, RoR2)" That looks cute, but I think it says something about how powerful and expressive D can be, while being compile-time optimized. It's those little things that still amaze me about D.
Mar 19 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 11:46:14 UTC, monarch_dodra wrote:
 On Tuesday, 19 March 2013 at 10:08:43 UTC, Jonathan M Davis 
 wrote:
 On Tuesday, March 19, 2013 09:25:43 timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that
 compares recursively a and b;
 
 its behavior should be:
 
 if opEqual is defined, call it
 else, if its a range, call std.algorithm.equal (ie compare nb
 elements, then each element for equality)
 else, if it's a class/struct, make sure types are same and 
 call
 it recursively on each field.
 else if it's a numerical type, call "=="
 else (is there an else?)
 
 just as std.algorithm.equal, we should have
 equalRecurse([1],[1.0]);
If you want recursive equal, then do equal!equal. Granted, that's only one level of recursion, but how many levels deep are you really going to have your ranges? And you have to get to == eventually anyway in order to compare the deepest elements. Going beyond a range of ranges is likely to be quite rare, and when it does happen, you can simply nest equal as many times as you need. - Jonathan M Davis
"equal!equal(RoR1, RoR2)" That looks cute, but I think it says something about how powerful and expressive D can be, while being compile-time optimized. It's those little things that still amaze me about D.
and then: template NumDimensions (T) { static if(is(ElementType!T == void)) const NumDimensions = 0; else const NumDimensions = 1 + NumDimensions!(ElementType!T); } bool rec_equal(R0, R1)(R0 r0, R1 r1) if(NumDimensions!R0 == NumDimensions!R1) { mixin("return " ~ replicate("equal!", NumDimensions!(R0)-1) ~ "equal(r0, r1);"); } obviously it requires some more checks, but it works nicely (except if you feed it two integer literals, in which case the compiler throws an out of memory error!).
Mar 19 2013
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 12:16:54 UTC, John Colvin wrote:
 On Tuesday, 19 March 2013 at 11:46:14 UTC, monarch_dodra wrote:
 On Tuesday, 19 March 2013 at 10:08:43 UTC, Jonathan M Davis 
 wrote:
 On Tuesday, March 19, 2013 09:25:43 timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that
 compares recursively a and b;
 
 its behavior should be:
 
 if opEqual is defined, call it
 else, if its a range, call std.algorithm.equal (ie compare nb
 elements, then each element for equality)
 else, if it's a class/struct, make sure types are same and 
 call
 it recursively on each field.
 else if it's a numerical type, call "=="
 else (is there an else?)
 
 just as std.algorithm.equal, we should have
 equalRecurse([1],[1.0]);
If you want recursive equal, then do equal!equal. Granted, that's only one level of recursion, but how many levels deep are you really going to have your ranges? And you have to get to == eventually anyway in order to compare the deepest elements. Going beyond a range of ranges is likely to be quite rare, and when it does happen, you can simply nest equal as many times as you need. - Jonathan M Davis
"equal!equal(RoR1, RoR2)" That looks cute, but I think it says something about how powerful and expressive D can be, while being compile-time optimized. It's those little things that still amaze me about D.
and then: template NumDimensions (T) { static if(is(ElementType!T == void)) const NumDimensions = 0; else const NumDimensions = 1 + NumDimensions!(ElementType!T); } bool rec_equal(R0, R1)(R0 r0, R1 r1) if(NumDimensions!R0 == NumDimensions!R1) { mixin("return " ~ replicate("equal!", NumDimensions!(R0)-1) ~ "equal(r0, r1);"); } obviously it requires some more checks, but it works nicely (except if you feed it two integer literals, in which case the compiler throws an out of memory error!).
correction: bool rec_equal(R0, R1)(R0 r0, R1 r1) if(NumDimensions!R0 == NumDimensions!R1) { static if(NumDimensions!R0 == 0) return r0 == r1; else mixin("return " ~ replicate("equal!", NumDimensions!(R0)-1) ~ "equal(r0, r1);"); }
Mar 19 2013
prev sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 12:16:54 UTC, John Colvin wrote:
 (except if you feed it two integer literals, in which case the 
 compiler throws an out of memory error!).
Which is a legitimate error, seeing as -1 as a size_t is size_t.max which meant replicate was requesting size_t.max * int.sizeof bytes. See my correction to the code in my other reply.
Mar 19 2013
prev sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 Going beyond a range of ranges is likely to be quite rare,
I agree.
 and when it does happen, you can simply nest equal as many 
 times as you need.
A similar function seems useful for Phobos because it's automatic, you don't need to tell it how many levels there are. Bye, bearophile
Mar 19 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 19 March 2013 at 12:11:50 UTC, bearophile wrote:
 Jonathan M Davis:

 Going beyond a range of ranges is likely to be quite rare,
I agree.
 and when it does happen, you can simply nest equal as many 
 times as you need.
A similar function seems useful for Phobos because it's automatic, you don't need to tell it how many levels there are. Bye, bearophile
But the compiler can't tell how many levels deep you want to go. There are legitimate cases of "Range of Stuff" where Stuff just so happens to be a range, but you don't want range comparison. For example: class MyClassRange; MyClassRange a, b; SuperEqual(a, b); //? I think using equal!equal is not that much more verbose, but safer/explicit.
Mar 19 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 12:34:04 UTC, monarch_dodra wrote:
 On Tuesday, 19 March 2013 at 12:11:50 UTC, bearophile wrote:
 Jonathan M Davis:

 Going beyond a range of ranges is likely to be quite rare,
I agree.
 and when it does happen, you can simply nest equal as many 
 times as you need.
A similar function seems useful for Phobos because it's automatic, you don't need to tell it how many levels there are. Bye, bearophile
But the compiler can't tell how many levels deep you want to go. There are legitimate cases of "Range of Stuff" where Stuff just so happens to be a range, but you don't want range comparison. For example: class MyClassRange; MyClassRange a, b; SuperEqual(a, b); //? I think using equal!equal is not that much more verbose, but safer/explicit.
I think if your calling a recursive version of equal, then you assume it will go as deep as possible. If you want control as to how deep, then you can either use equal manually, or there could be a maximum depth parameter to the recursive equal.
Mar 19 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 12:36:34 UTC, John Colvin wrote:
 On Tuesday, 19 March 2013 at 12:34:04 UTC, monarch_dodra wrote:
 On Tuesday, 19 March 2013 at 12:11:50 UTC, bearophile wrote:
 Jonathan M Davis:

 Going beyond a range of ranges is likely to be quite rare,
I agree.
 and when it does happen, you can simply nest equal as many 
 times as you need.
A similar function seems useful for Phobos because it's automatic, you don't need to tell it how many levels there are. Bye, bearophile
But the compiler can't tell how many levels deep you want to go. There are legitimate cases of "Range of Stuff" where Stuff just so happens to be a range, but you don't want range comparison. For example: class MyClassRange; MyClassRange a, b; SuperEqual(a, b); //? I think using equal!equal is not that much more verbose, but safer/explicit.
I think if your calling a recursive version of equal, then you assume it will go as deep as possible. If you want control as to how deep, then you can either use equal manually, or there could be a maximum depth parameter to the recursive equal.
i.e. import std.array : replicate; import std.algorithm : min; import std.range : ElementType, equal; bool rec_equal(size_t max_depth = size_t.max, R0, R1)(R0 r0, R1 r1) if(NumDims!R0 == NumDims!R1) { static if(NumDims!R0 == 0 || max_depth == 0) return r0 == r1; else { mixin("return " ~ replicate("equal!", min(max_depth-1, NumDims!(R0)-1)) ~ "equal(r0, r1);" ); } } template NumDims (T) { static if(is(ElementType!T == void)) const NumDims = 0; else const NumDims = 1 + NumDims!(ElementType!T); }
Mar 19 2013
parent reply "timotheecour" <timothee.cour2 gmail.com> writes:
I think none of you got the point of equalRecurse, so let me 
clarify.

equalRecurse should work on any type, not just inputRanges, see 
my example in the OT with a struct containing an array: currently 
std.algorithm.equal has isInputRange conditions on the arguments.

So, yes, it can go deep in practice (struct containing an array 
of struct, etc), unlike nesting ranges where it's not common in 
practice.

I wouldn't think it's that hard to do, at least for the common 
cases in practice.
Maybe what was confusing is that I wanted it in std.algorithm, 
but all functions in there operate on ranges, so maybe somewhere 
else, but I don't see a relevant package. Maybe a new 
std.algorithm2 for non-ranges?

Also, the OT's firstDifference would go there too, and I have a 
recursive (to specified level) toStringRecurse that would belong 
there too.
Mar 19 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 18:08:53 timotheecour wrote:
 I think none of you got the point of equalRecurse, so let me
 clarify.
 
 equalRecurse should work on any type, not just inputRanges,
Which is fundamentally wrong. == and equal are two very different operations, even if their intent may be similar. And aside from ranges, unless you're creating your own function for checking for some other type of equality beyond what ==, is, or equal check for, all you'd ever use is == or is anyway. So, if you want to compare ranges with equal, then use equal. Otherwise use ==. Expecting the compiler to automagically figure out how to compare two types for you is just begging for trouble (especially when you add stuff like alias this into the mix). - Jonathan M Davis
Mar 19 2013
prev sibling next sibling parent reply "timotheecour" <timothee.cour2 gmail.com> writes:
 somewhere else, but I don't see a relevant package. Maybe a new 
 std.algorithm2 for non-ranges?

 Also, the OT's firstDifference would go there too, and I have a 
 recursive (to specified level) toStringRecurse that would 
 belong there too.
Also, I'd add to that list copyRecurse and some more, that operate on arbitrary types, not just ranges, so we have: equalRecurse copyRecurse (deep copy) toStringRecurse firstDifference (see OT) toHashRecurse (should compare equal with a data structure serialized and then deserialized via a serialization function, eg std.orange) I'm sure there's more. that seems a starting point for a new package that operates on any type recursively (not just ranges), no? std.deep?std.recurse? Some of those could have a depth level compile time parameter that stops recursion at that level, which would be infinity by default.
Mar 19 2013
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 18:26:16 timotheecour wrote:
 somewhere else, but I don't see a relevant package. Maybe a new
 std.algorithm2 for non-ranges?
 
 Also, the OT's firstDifference would go there too, and I have a
 recursive (to specified level) toStringRecurse that would
 belong there too.
Also, I'd add to that list copyRecurse and some more, that operate on arbitrary types, not just ranges, so we have: equalRecurse copyRecurse (deep copy) toStringRecurse firstDifference (see OT) toHashRecurse (should compare equal with a data structure serialized and then deserialized via a serialization function, eg std.orange) I'm sure there's more. that seems a starting point for a new package that operates on any type recursively (not just ranges), no? std.deep?std.recurse? Some of those could have a depth level compile time parameter that stops recursion at that level, which would be infinity by default.
And how do you even have the concept of recursion without some sort of range or container to recursively iterate through? - Jonathan M Davis
Mar 19 2013
parent reply "timotheecour" <timothee.cour2 gmail.com> writes:
 And how do you even have the concept of recursion without some 
 sort of range or container to recursively iterate through?
It's not hard: I've done it for a serialization library I've been working on (which has some advantages over std.orange, such as serializing to json/binary etc. more on this later), as well as for my toStringRecurse: it works on any data type roughly as follows: auto equalRecurse(T)(T a) { static if isAssociativeArray!(T)) { } else static if(isArray!(T)) { } else static if(is(T == struct) ){ foreach(i, member; a.tupleof) { ... } else static if... }
 Expecting the compiler to automagically figure out how to 
 compare two types for you is just begging for trouble
I beg to differ. I like to be as lazy as possible when writing user code (as opposed to library code). The compiler can do a lot of stuff automagically in D thanks to CT reflection. I didn't run into problems when using the serialization on rather complex nested objects.
Mar 19 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 19:13:14 timotheecour wrote:
 And how do you even have the concept of recursion without some
 sort of range or container to recursively iterate through?
It's not hard: I've done it for a serialization library I've been working on (which has some advantages over std.orange, such as serializing to json/binary etc. more on this later), as well as for my toStringRecurse: it works on any data type roughly as follows: auto equalRecurse(T)(T a) { static if isAssociativeArray!(T)) { } else static if(isArray!(T)) { } else static if(is(T == struct) ){ foreach(i, member; a.tupleof) { ... } else static if... }
 Expecting the compiler to automagically figure out how to
 compare two types for you is just begging for trouble
I beg to differ. I like to be as lazy as possible when writing user code (as opposed to library code). The compiler can do a lot of stuff automagically in D thanks to CT reflection. I didn't run into problems when using the serialization on rather complex nested objects.
Serialization is completely different from comparing the equality of two objects. That's opEquals' job. It deals with recursive comparisons like you're describing here just fine. There's no reason to define that externallly to the type. equal's job, on the other hand, is to compare the elements of two ranges - _not_ the ranges themselves - which is fundamentally different from ==. - Jonathan M Davis
Mar 19 2013
parent reply "timotheecour" <timothee.cour2 gmail.com> writes:
 That's opEquals' job. It deals with recursive comparisons like 
 you're describing here just fine.
Except the functionality is very different, as opEquals is blind to range semantics. I want equalRecurse to work as follows: struct A{ int[]b=[1]; } void main(){ auto a1=A(); auto a2=A(); assert(a1!=a2); //opEquals says it's different assert(equalRecurse(a1,a2)); //equalRecurse says it's the same // assert(equal(a1,a2)); //equal doesn't compile, only works for ranges. }
 There's no reason to define that externallly to the type.
Yes there is: it can be done automagically to work on vast majority of cases (see what several ppl have posted), and it avoids overly redundant boilerplate code. In my example here, I don't want to redefine an opEquals to make the "assert(a1==a2);" above pass, because: -it's boilerplate -equalRecurse has different semantics anyways -meaning of equalRecurse should be clear to anyone.
Mar 19 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 21:13:21 timotheecour wrote:
 That's opEquals' job. It deals with recursive comparisons like
 you're describing here just fine.
Except the functionality is very different, as opEquals is blind to range semantics.
opEquals defines equality however it chooses to define equality. It's trivial for it to use equal on any member variables which are ranges. opEquals is the way that the language defines equality. equal is only necessary when trying to compare the elements of two ranges rather than compare two objects. == is what is supposed to be used to compare two objects. You seem to basically be trying to work your way around the language rather than using it the way it's designed, which is just asking for trouble. And if the problem is that you don't want to write boilerplate code, and you want to define an opEquals that compares ranges with equal and everything else with ==, then it would be fairly trivial to write a string or template mixin to mix in opEquals for you. If you avoid defining == properly, then you're going to have problems wiith all kinds of stuff which assumes that == is used to define equality (including stuff like the built-in AAs). Avoiding boilerplate code is poor excuse to give your types odd semantics by not defining opEquals and using an external function to compare for equality. - Jonathan M Davis
Mar 19 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 19, 2013 at 01:48:43PM -0400, Jonathan M Davis wrote:
 On Tuesday, March 19, 2013 18:26:16 timotheecour wrote:
 somewhere else, but I don't see a relevant package. Maybe a new
 std.algorithm2 for non-ranges?
 
 Also, the OT's firstDifference would go there too, and I have a
 recursive (to specified level) toStringRecurse that would
 belong there too.
Also, I'd add to that list copyRecurse and some more, that operate on arbitrary types, not just ranges, so we have: equalRecurse copyRecurse (deep copy) toStringRecurse firstDifference (see OT) toHashRecurse (should compare equal with a data structure serialized and then deserialized via a serialization function, eg std.orange) I'm sure there's more. that seems a starting point for a new package that operates on any type recursively (not just ranges), no? std.deep?std.recurse? Some of those could have a depth level compile time parameter that stops recursion at that level, which would be infinity by default.
And how do you even have the concept of recursion without some sort of range or container to recursively iterate through?
[...] One can iterate over every member of a struct/class and recursively invoke equalRecurse on them. Something like this: bool recursiveEqual(T)(T a, T b) { static if (isAtomic!T) { return a == b; } else { foreach (attr; __traits(getAllMembers, T)) { // TBD: need to skip stuff like member // functions or internal stuff like // compiler-generated attributes, hidden // context ptrs, etc. alias attr1 = __traits(getMember, a, attr); alias attr2 = __traits(getMember, b, attr); if (!recursiveEqual(attr, attr2)) return false; } } return true; } T -- The best way to destroy a cause is to defend it poorly.
Mar 19 2013
prev sibling parent reply "Dan" <dbdavidson yahoo.com> writes:
On Tuesday, 19 March 2013 at 17:08:54 UTC, timotheecour wrote:
 I think none of you got the point of equalRecurse, so let me 
 clarify.
I think I understand what you are after and I like the idea. I've got a few: -typesDeepEqual -typesDeepCmp -deepHash Have a look at: https://github.com/patefacio/d-help/blob/master/d-help/opmix/mix.d Thanks Dan
Mar 19 2013
next sibling parent "timotheecour" <timothee.cour2 gmail.com> writes:
 I think I understand what you are after and I like the idea. 
 I've got a few:
 -typesDeepEqual
 -typesDeepCmp
 -deepHash
approxEqualRecurse could also be useful for numerical applications (same as equalRecurse, except uses approxEqual for floating point comparisons)
 https://github.com/patefacio/d-help/blob/master/d-help/opmix/mix.d
Thanks for the link, I'll take a look.
Mar 19 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 20:36:09 Dan wrote:
 On Tuesday, 19 March 2013 at 17:08:54 UTC, timotheecour wrote:
 I think none of you got the point of equalRecurse, so let me
 clarify.
I think I understand what you are after and I like the idea. I've got a few: -typesDeepEqual -typesDeepCmp -deepHash
Those are what opEquals, opCmp, and toHash are for. It might make sense to define mixins which implement them for you (dealing with whatever recursive semantics are necessary), but using external functions for those just isn't going to fly. The language and standard library are designed around them being part of the types themselves. Stuff like AAs won't work if those functions aren't defined on the types themselves. - Jonathan M Davis
Mar 19 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Tuesday, 19 March 2013 at 20:28:09 UTC, Jonathan M Davis wrote:
 Those are what opEquals, opCmp, and toHash are for. It might 
 make sense to
 define mixins which implement them for you (dealing with 
 whatever recursive
 semantics are necessary), but using external functions for 
 those just isn't
 going to fly.
Understand the sentiment. But you can easily have those external functions call the member equivalents if they exist, and if not still proceed and work. This is what they do, in fact. For instance, it is nice to have the mixin for opCmp that actually calls member opCmp if they exist. This opens the door a bit. It allows you to have a reasonable opCmp for a type with a member that has none. The same approach can be used for dup/gdup - which we have discussed a few times now.
 The language and standard library are designed around them being
 part of the types themselves. Stuff like AAs won't work if 
 those functions
 aren't defined on the types themselves.
Sorry, I don't understand the last statement.
Mar 19 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 21:50:55 Dan wrote:
 On Tuesday, 19 March 2013 at 20:28:09 UTC, Jonathan M Davis wrote:
 The language and standard library are designed around them being
 part of the types themselves. Stuff like AAs won't work if
 those functions
 aren't defined on the types themselves.
Sorry, I don't understand the last statement.
Lots of stuff uses ==, toHash, etc. That's the way the language is designed. Defining your types without defining those properly just isn't going to work for a _lot_ of stuff. Creating an external function to compare objects or generate hashes or anything like that is just going to cause problems, because only your stuff would use it. The built-in stuff wouldn't, and the standard library wouldn't. For instance, AAs require that opEquals and toHash be defined, and no matter how clever your external functions for comparing objects or generating hashes are, they're not going to work with the built-in AAs. Any type which is going to work with the built-in AAs must define opEquals and toHash. So, if the problem is boilerplate code, mixins can be used for opEquals, opCmp, etc. in order to define those for you, but creating external functions to do those same operations is not how the language was designed and isn't how anything is going to expect types to work. Anything that wants to be comparable, should define ==. Anything that wants to be hashable should define toHash. Etc. Something like equalRecurse ultimately really makes no sense at all. - Jonathan M Davis
Mar 19 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Tuesday, 19 March 2013 at 21:11:12 UTC, Jonathan M Davis wrote:
 Lots of stuff uses ==, toHash, etc. That's the way the language 
 is designed.
Agreed
 Defining your types without defining those properly just isn't 
 going to work for
 a _lot_ of stuff. Creating an external function to compare 
 objects or generate
 hashes or anything like that is just going to cause problems, 
 because only
 your stuff would use it.
Ok - you are making strong statement and I don't see why it is the case. Here is an example that does work for hashing. Note: R implements its own hash and just prints to show nothing up the sleeve. Other classes (S,T) have no toHash implemented directly at all - yet you can see u1 and u3 resolve to the same hash and u2 a different, as you would expect. ------------------------- import std.stdio; import opmix.mix; import pprint.pp; struct R { string r; hash_t toHash() const nothrow { try { writeln("toHash called on r"); } catch(Exception) { } return typeid(string).getHash(&r); } } struct S { R r; string s; string[int] si; } struct T { S s; } struct U { mixin ToHash; T t; } void main() { U u1 = U(T(S(R("a"), "foo", [1:"goo"]))); U u2 = U(T(S(R("a"), "foo".idup))); U u3 = U(T(S(R("a"), "foo".idup, [1:"goo".idup]))); writeln(deepHash(u1)); writeln(deepHash(u2)); writeln(deepHash(u3)); writeln(pp(u1)); int[U] aa; aa[u1] = 100; writeln("Is u1 in aa ", u1 in aa); writeln("Is u2 in aa ", u2 in aa); aa[u2] = 100; writeln("Is u2 in now aa ", u2 in aa); } ----------------------- OUTPUT -------------------------- toHash called on r 614624 toHash called on r 582446 toHash called on r 614624 { (U).t = { (T).s = { (S).r = { (R).r = "a" } (S).s = "foo" (S).si = { (K(1)[0] => V("goo")), } } } } toHash called on r toHash called on r Is u1 in aa 7F2C7E556FC0 toHash called on r Is u2 in aa null toHash called on r toHash called on r Is u2 in now aa 7F2C7E556F40 ----------------------------------------------------------------------
 The built-in stuff wouldn't, and the standard library
 wouldn't. For instance, AAs require that opEquals and toHash be 
 defined, and no
 matter how clever your external functions for comparing objects 
 or generating
 hashes are, they're not going to work with the built-in AAs. 
 Any type which is
 going to work with the built-in AAs must define opEquals and 
 toHash.
The above works with the built-in AAs. Please offer an example. Thanks Dan
Mar 19 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 22:43:10 Dan wrote:
 The above works with the built-in AAs.
 Please offer an example.
It works because the outer type defines toHash. Without toHash, the built-in AAs won't work. If you're dealing with member variables which don't have toHash, then doing something like you did is an option, but the outer type still needs toHash, and if any of the member variables define an opEquals but not a toHash, then you risk having your hash change when a member variable changes even when the new value of the member variable is considered equal to the old one (e.g. because a cached value or some other member variable in that type is not considered to be part of the equality of an object but _would_ be considered to be part of the hash if you use reflection to generate a hash from all of that type's member variables). So, using reflection to generate hashes for arbitrary types can be risky. Whether it works on not depends on how those types are defined and what you're doing with them. But the main problem that I'm pointing out is that you can't define your own, non-standard functions for equality or hashing or whatever and expect your types to play nicely with other stuff. If your stuff is wrapped in types that do define the proper functions for that (like in your example), then it can work, but the types which were wrapped won't play nice outside of the wrapper. - Jonathan M Davis
Mar 19 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Tuesday, 19 March 2013 at 23:13:19 UTC, Jonathan M Davis wrote:
 On Tuesday, March 19, 2013 22:43:10 Dan wrote:
 The above works with the built-in AAs.
 Please offer an example.
It works because the outer type defines toHash. Without toHash, the built-in AAs won't work. If you're dealing with member variables which don't have toHash, then doing something like you did is an option, but the outer type still needs toHash, and if any of the member variables define an opEquals but not a toHash, then you risk having your hash change when a member variable changes even when the new value of the member variable is considered equal to the old one (e.g. because a cached value or some other member variable in that type is not considered to be part of the equality of an object but _would_ be considered to be part of the hash if you use reflection to generate a hash from all of that type's member variables). So, using reflection to generate hashes for arbitrary types can be risky. Whether it works on not depends on how those types are defined and what you're doing with them.
For hashing it is not foolproof as you mentioned. It is important to understand that opEquals and toHash should share the same semantics (i.e. if they are equal they should hash the same). But, in general, when you control the code you don't need toHash or opEquals at every level of composition and it will still work fine (compile-time recursively) as a key in a AA.
 But the main problem that I'm pointing out is that you can't 
 define your own,
 non-standard functions for equality or hashing or whatever and 
 expect your
 types to play nicely with other stuff. If your stuff is wrapped 
 in types that do
 define the proper functions for that (like in your example), 
 then it can work,
 but the types which were wrapped won't play nice outside of the 
 wrapper.
This is true, but then my code is by definition not standard. However, theoretically, the language writers could. For example, any '==' could be lowered to a 'standard' function, probably better named 'intancesDeepEqual(a,b)' and that function could use reflection to provide equal when not available else call opEquals if it is available. Similar with opCmp, dup, idup, ... In other words, in the vein of the original poster, why not allow all of these nice goodies (equality comparison, opCmp comparison, dup) without requiring boilerplate code while still honoring/using it when it is provided. Thanks Dan
Mar 19 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, March 20, 2013 01:17:13 Dan wrote:
 This is true, but then my code is by definition not standard.
 However, theoretically, the language writers could. For example,
 any '==' could be lowered to a 'standard' function, probably
 better named 'intancesDeepEqual(a,b)' and that function could use
 reflection to provide equal when not available else call opEquals
 if it is available. Similar with opCmp, dup, idup, ...
 In other words, in the vein of the original poster, why not allow
 all of these nice goodies (equality comparison, opCmp comparison,
 dup) without requiring boilerplate code while still
 honoring/using it when it is provided.
There _are_ defaults for ==, and it really does quite a lot for you (e.g. recursively using == on each of a type's member variables), so you don't always have to define opEquals, but as soon as you have a layer of indirection (aside from arrays), you do have to define opEquals, or you'll generally end up checking for reference/pointer equality rather than equality of the actual objects. And in some cases, that's exactly the right thing to do. In others, it's not. It all depends on the context. As for comparing ranges, it would actually be very bad for performance if they defaulted to what equal does (particularly with regards to strings), and the language really doesn't do much to support ranges directly. Aside from foreach, they're completely a library construct. And really, equal exists with the idea that ranges of different types (but the same element type) can be compared, which isn't what you're generally trying to do with opEquals at all. If you want to avoid boilerplate, then use mixins. It should be fairly trivial to define a mixin such that you can do something like mixin(defineOpEquals!(typeof(this))); which defines an opEquals which does deep comparison. But also remember that what is necessary for deep comparison often depends on the types being compared (e.g. a range may or may not require equal to get the comparison that you want), so while it may be possible to define a mixin which creates an opEquals which will work in most situations, there are plenty where it won't. For instance, given that the standard library supports ranges, it's likely that something like defineOpEquals would know about them and do what is most likely to be the correct thing for them, but if ranges weren't part of the standard library, thhen defineOpEquals wouldn't know about them and couldn't handle them properly. The same will be true for any user-defined type that requires a particular idiom in order to be compared correctly. - Jonathan M Davis
Mar 19 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Mar 20, 2013 at 01:17:13AM +0100, Dan wrote:
 On Tuesday, 19 March 2013 at 23:13:19 UTC, Jonathan M Davis wrote:
[...]
But the main problem that I'm pointing out is that you can't define
your own, non-standard functions for equality or hashing or whatever
and expect your types to play nicely with other stuff. If your stuff
is wrapped in types that do define the proper functions for that
(like in your example), then it can work, but the types which were
wrapped won't play nice outside of the wrapper.
This is true, but then my code is by definition not standard. However, theoretically, the language writers could. For example, any '==' could be lowered to a 'standard' function, probably better named 'intancesDeepEqual(a,b)' and that function could use reflection to provide equal when not available else call opEquals if it is available. Similar with opCmp, dup, idup, ... In other words, in the vein of the original poster, why not allow all of these nice goodies (equality comparison, opCmp comparison, dup) without requiring boilerplate code while still honoring/using it when it is provided.
[...] I like this idea. By default, provide something that recursively compares struct/class members, array elements, etc.. But if at any level an opEquals is defined, then that is used instead. This maximizes convenience for those cases where you *do* just want a literal equality of all sub-structures, but also allows you to override that behaviour if your class/struct needs some other special processing. T -- "How are you doing?" "Doing what?"
Mar 19 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 18:20:35 H. S. Teoh wrote:
 I like this idea. By default, provide something that recursively
 compares struct/class members, array elements, etc.. But if at any level
 an opEquals is defined, then that is used instead. This maximizes
 convenience for those cases where you *do* just want a literal equality
 of all sub-structures, but also allows you to override that behaviour if
 your class/struct needs some other special processing.
We already get this. That's what == does by default. It's just that it uses == on each member, so if == doesn't work for a particular member variable and the semantics you want for == on the type it's in, you need to override opEquals. - Jonathan M Davis
Mar 19 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis 
wrote:
 We already get this. That's what == does by default. It's just 
 that it uses ==
 on each member, so if == doesn't work for a particular member 
 variable and the
 semantics you want for == on the type it's in, you need to 
 override opEquals.
Really? string is one most people would like == to just work for. This writes true then false. This certainly takes getting used to. It alone is a good reason for the mixins and potentially a non-member instancesDeepEqual. import std.stdio; struct S { string s; } void main() { writeln("foo" == "foo".idup); writeln(S("foo") == S("foo".idup)); }
Mar 19 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, March 20, 2013 03:48:38 Dan wrote:
 On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis
 
 wrote:
 We already get this. That's what == does by default. It's just
 that it uses ==
 on each member, so if == doesn't work for a particular member
 variable and the
 semantics you want for == on the type it's in, you need to
 override opEquals.
Really? string is one most people would like == to just work for. This writes true then false. This certainly takes getting used to. It alone is a good reason for the mixins and potentially a non-member instancesDeepEqual. import std.stdio; struct S { string s; } void main() { writeln("foo" == "foo".idup); writeln(S("foo") == S("foo".idup)); }
That's a bug: http://d.puremagic.com/issues/show_bug.cgi?id=3789 - Jonathan M Davis
Mar 19 2013
parent "Dan" <dbdavidson yahoo.com> writes:
On Wednesday, 20 March 2013 at 02:54:23 UTC, Jonathan M Davis 
wrote:
 On Wednesday, March 20, 2013 03:48:38 Dan wrote:
 On Wednesday, 20 March 2013 at 02:03:31 UTC, Jonathan M Davis
 
 wrote:
 We already get this. That's what == does by default. It's 
 just
 that it uses ==
 on each member, so if == doesn't work for a particular member
 variable and the
 semantics you want for == on the type it's in, you need to
 override opEquals.
Really? string is one most people would like == to just work for. This writes true then false. This certainly takes getting used to. It alone is a good reason for the mixins and potentially a non-member instancesDeepEqual. import std.stdio; struct S { string s; } void main() { writeln("foo" == "foo".idup); writeln(S("foo") == S("foo".idup)); }
That's a bug: http://d.puremagic.com/issues/show_bug.cgi?id=3789
From Feb 2010. Maybe by now it is so understood how it works that at some point fixing it could be a problem. For some the language is better defined by how the compiler treats your code than what is listed in bugzilla. Even in looking through the history of that bug I could not find any definitive - some say its a bug, others say its not. You refer to TDPL which is a good source but if it is not viewed as a bug by Walter...
Mar 19 2013
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, March 20, 2013 01:17:13 Dan wrote:
 This is true, but then my code is by definition not standard.
 However, theoretically, the language writers could. For example,
 any '==' could be lowered to a 'standard' function, probably
 better named 'intancesDeepEqual(a,b)' and that function could use
 reflection to provide equal when not available else call opEquals
 if it is available. Similar with opCmp, dup, idup, ...
 In other words, in the vein of the original poster, why not allow
 all of these nice goodies (equality comparison, opCmp comparison,
 dup) without requiring boilerplate code while still
 honoring/using it when it is provided.
Okay. I'm going to take a second stab at replying to this, because I think that I can explain it much better. The standard function that == lowers to is opEquals. Nothing else is needed. The way that == works is integral types, char types, pointers, and bool: bitwise comparison floating point types: equality which is similar to bitwise comparison but takes NaN into account arrays: The ptr and length attributes are checked, and if the lengths are equal but the ptrs are not, then element-wise comparison is used, where each element is compared with == (whereas if the lengths are different or if the ptr and length attributes are identical, no further comparison is needed). structs and classes: Their opEquals is used. If a struct or class does not define opEquals, then one is generated where each member variable is compared in turn with ==. So, they are compared recursively. They _only_ times that you need you need to worry about defining opEquals are when 1. the default comparison is comparing pointers, and you want to compare what they point to rather than the pointers themselves. 2. you want equality to mean something other than calling == on each member variable (including doing something other than == on a particular member variable as might be the case with a member variable which is a range). The way == is defined is very clean and straightforward. There _are_ bugs which complicate things (e.g. http://d.puremagic.com/issues/show_bug.cgi?id=3789 ), but those can and will be fixed. The design itself is solid. It's true that equal is often need for comparing ranges of the same type, but that's arguably a defect in the implementations of those ranges. equal is specifically designed to compare ranges which are different types but have the same element type. == is what's for comparing objects of the same type. But most ranges don't currently define opEquals, even when they need it in order for it to do a proper comparison. That's arguably something that should be fixed, but it's a design issue of ranges, not ==. But if you want to ensure that equal is used, you can always use a mixin to define opEquals that way (though you actually risk making the comparison less efficient than it would be if the ranges themselves defined opEquals). So, I can see a definite argument that ranges should make sure that they define opEquals (which wouldn't negate the general need for equal as that's specifically for comparing ranges of different types which have the same element type), but there's no need to change how == works. The design is actually very clean. - Jonathan M Davis
Mar 19 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Wednesday, 20 March 2013 at 03:10:41 UTC, Jonathan M Davis 
wrote:
 The way == is defined is very clean and straightforward. There 
 _are_ bugs which
 complicate things (e.g. 
 http://d.puremagic.com/issues/show_bug.cgi?id=3789 ),
 but those can and will be fixed. The design itself is solid.
Thanks for the detailed explanation. If 3789 covers strings, dynamic arrays and associative arrays and it were fixed then I agree, it is clean and straightforward. If I read it correctly with this fix it would change the current opEquals from semantically shallow to semantically deep by default. Semantically deep equality is comfortable to me, but I would imagine by now there is a fair amount of code that might rely on a false result from opEquals if the members (slices, associative arrays) are not bitwise the same. Thanks Dan
Mar 20 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, March 20, 2013 12:47:38 Dan wrote:
 I would
 imagine by now there is a fair amount of code that might rely on
 a false result from opEquals if the members (slices, associative
 arrays) are not bitwise the same.
I think that it's more likely that there's a lot of code that expects strings and the like to have their elements checked for equality when they're in a struct and is therefore buggy at present due to the fact that they aren't. As for AAs (which I forgot about), I don't remember what they're supposed to do with ==. They might just do a pointer comparison - or they may do a deeper comparison; I don't know. But the big problem there is that structs need to compare each of their members with ==, and that's not working properly at the moment, so that throws a major wrench in how == works. - Jonathan M Davis
Mar 20 2013
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Tuesday, 19 March 2013 at 08:25:45 UTC, timotheecour wrote:
 we need a std.algorithm.equalRecurse(T1,T2)(T1 a, T2 b) that 
 compares recursively a and b;

 its behavior should be:

 if opEqual is defined, call it
The problem is, Object defines opEqual as "bool opEquals(Object o){return this is o;}", so that ruins a lot of class comparisons. An example implementation of what I think you're getting at: http://dpaste.dzfl.pl/a7889060 bool deep_equal(T)(T a, T b) { static if(isInputRange!T) return rec_equal(a, b); else static if(__traits(isScalar, T) /*|| __traits(hasMember, T, "opEquals")*/) return a == b; else { foreach(i, a_field; a.tupleof) if(!deep_equal(a_field, b.tupleof[i])) return false; return true; } } bool rec_equal(size_t max_depth = size_t.max, R0, R1)(R0 r0, R1 r1) if(NumDims!R0 == NumDims!R1) { static if(NumDims!R0 == 0 || max_depth == 0) return r0 == r1; else { mixin("return " ~ replicate("equal!", min(max_depth-1, NumDims!(R0)-1)) ~ "equal(r0, r1);" ); } } template NumDims (T) { static if(is(ElementType!T == void)) const NumDims = 0; else const NumDims = 1 + NumDims!(ElementType!T); }
Mar 19 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, March 19, 2013 11:12:38 H. S. Teoh wrote:
 On Tue, Mar 19, 2013 at 01:48:43PM -0400, Jonathan M Davis wrote:
 On Tuesday, March 19, 2013 18:26:16 timotheecour wrote:
 somewhere else, but I don't see a relevant package. Maybe a new
 std.algorithm2 for non-ranges?
 
 Also, the OT's firstDifference would go there too, and I have a
 recursive (to specified level) toStringRecurse that would
 belong there too.
Also, I'd add to that list copyRecurse and some more, that operate on arbitrary types, not just ranges, so we have: equalRecurse copyRecurse (deep copy) toStringRecurse firstDifference (see OT) toHashRecurse (should compare equal with a data structure serialized and then deserialized via a serialization function, eg std.orange) I'm sure there's more. that seems a starting point for a new package that operates on any type recursively (not just ranges), no? std.deep?std.recurse? Some of those could have a depth level compile time parameter that stops recursion at that level, which would be infinity by default.
And how do you even have the concept of recursion without some sort of range or container to recursively iterate through?
[...] One can iterate over every member of a struct/class and recursively invoke equalRecurse on them. Something like this: bool recursiveEqual(T)(T a, T b) { static if (isAtomic!T) { return a == b; } else { foreach (attr; __traits(getAllMembers, T)) { // TBD: need to skip stuff like member // functions or internal stuff like // compiler-generated attributes, hidden // context ptrs, etc. alias attr1 = __traits(getMember, a, attr); alias attr2 = __traits(getMember, b, attr); if (!recursiveEqual(attr, attr2)) return false; } } return true; }
True, but that's completely different from what equal does, and it would be very dangerous IMHO. opEquals is supposed to be handling comparing types recursively like that. Deciding on your own that a type's member variables should be compared with equal instead of == risk doing some nasty things to the semantics of the types that you're dealing with. If a type's member variables need to be compared with equal, then its opEquals should do that. equal is for handling the case where you're comparing two arbitrary ranges of elements. You don't care what the types of the ranges are. You're comparing the elements, not the ranges. You would only need to recursively compare the elements with anything other than == if they were ranges, and you wanted to ultimately compare the elements of the deepest ranges. == already exists to compare the elements themselves recursively. I'd strongly argue that a function like the one that you just gave is an extremely bad idea. It's trying to implement == for a type. Let the type define that for itself like it's supposed to. - jonathan M Davis
Mar 19 2013