digitalmars.D - Ordering comparisons
- Andrei Alexandrescu (17/17) Mar 06 2017 While reviewing work on array comparisons, Vladimir found an odd issue:
- Jonathan M Davis via Digitalmars-d (10/21) Mar 06 2017 If we can, we should probably make structs that have opCmp use opCmp lik...
- Dominikus Dittes Scherkl (3/10) Mar 07 2017 +1
- Nick Sabalausky (Abscissa) (3/5) Mar 06 2017 This one is very surprising. How is that so, if both structs being
- ketmar (14/21) Mar 06 2017 if we're talking about equality, yes. but opCmp is "less/greater"
- Jacob Carlborg (7/9) Mar 06 2017 The structs for a given run will be of the same endian-ness. But if you
- Andrei Alexandrescu (7/14) Mar 07 2017 On a big endian system, comparing integers with memcmp works correctly
- deadalnix (4/9) Mar 07 2017 There should be no assumption that structs are comparable, so the
- H. S. Teoh via Digitalmars-d (41/54) Mar 07 2017 Couple of points:
- Andrei Alexandrescu (5/11) Mar 07 2017 We currently do memcmp.
- Stefan Koch (5/21) Mar 07 2017 We should deprecate that array comparison behavior immediately.
- H. S. Teoh via Digitalmars-d (9/29) Mar 07 2017 Unfortunately, because strings are arrays, deprecating array comparisons
- Andrei Alexandrescu (4/6) Mar 07 2017 Misunderstanding, string comparisons are legit because character
While reviewing work on array comparisons, Vladimir found an odd issue: https://issues.dlang.org/show_bug.cgi?id=17244 Investigation reveals that during array comparison for inequality, structs are compared by means of memcmp - even if they don't define opCmp, and regardless of the types of their fields. This has obvious unpleasant consequences: * Floating point values don't compare well with memcmp * A struct S { int x; } compares differently on little endian and big endian system (!) * A struct containing a field that defines opCmp ignores it * ... and many other nefarious issues The question is what to do to minimize breakage yet "break the bad code". The most backward-compatible solution is to define opCmp automatically to do a field-by-field lexicographical comparison. The most radical solution is disable ordering comparisons unless explicitly implemented by the user. Andrei
Mar 06 2017
On Monday, March 06, 2017 20:27:56 Andrei Alexandrescu via Digitalmars-d wrote:While reviewing work on array comparisons, Vladimir found an odd issue: https://issues.dlang.org/show_bug.cgi?id=17244 Investigation reveals that during array comparison for inequality, structs are compared by means of memcmp - even if they don't define opCmp, and regardless of the types of their fields. This has obvious unpleasant consequences:The question is what to do to minimize breakage yet "break the bad code". The most backward-compatible solution is to define opCmp automatically to do a field-by-field lexicographical comparison. The most radical solution is disable ordering comparisons unless explicitly implemented by the user.If we can, we should probably make structs that have opCmp use opCmp like they should, and those that don't continue to use memcmp for the moment but get a deprecation message. Then after whatever amount of time we think is appropriate, we disable the comparisons. So, we don't break anything immediately, but ultimately end up with the correct situation of comparisons not working unless opCmp being defined. That being said, depending on how all of that is implemented, I don't know how easy that approach is. - Jonathan M Davis
Mar 06 2017
On Tuesday, 7 March 2017 at 02:51:37 UTC, Jonathan M Davis wrote:If we can, we should probably make structs that have opCmp use opCmp like they should, and those that don't continue to use memcmp for the moment but get a deprecation message. Then after whatever amount of time we think is appropriate, we disable the comparisons. So, we don't break anything immediately, but ultimately end up with the correct situation of comparisons not working unless opCmp being defined.+1 This is the only sensible approach.
Mar 07 2017
On 03/06/2017 08:27 PM, Andrei Alexandrescu wrote:* A struct S { int x; } compares differently on little endian and big endian system (!)This one is very surprising. How is that so, if both structs being compared are of the same endian-ness?
Mar 06 2017
Nick Sabalausky (Abscissa) wrote:On 03/06/2017 08:27 PM, Andrei Alexandrescu wrote:if we're talking about equality, yes. but opCmp is "less/greater" comparison. so, 0x290a will be: 0x0a 0x29 on little-endian, and: 0x29 0x0a on big-endian. let's take 0x1930, and compare it with 0x290a, using memcmp: for big-endian, first byte: 0x29:0x19: hit, 0x290a is greater. for little-endian, first byte: 0x0a:0x30: hit, 0x290a is *lesser* than 0x1930. oops.* A struct S { int x; } compares differently on little endian and big endian system (!)This one is very surprising. How is that so, if both structs being compared are of the same endian-ness?
Mar 06 2017
On 2017-03-07 04:44, Nick Sabalausky (Abscissa) wrote:This one is very surprising. How is that so, if both structs being compared are of the same endian-ness?The structs for a given run will be of the same endian-ness. But if you run the same code on two different systems, one with little endian and one with big endian, you will get different results when sorting an array, for example. If I understand this correctly. -- /Jacob Carlborg
Mar 06 2017
On 03/06/2017 10:44 PM, Nick Sabalausky (Abscissa) wrote:On 03/06/2017 08:27 PM, Andrei Alexandrescu wrote:On a big endian system, comparing integers with memcmp works correctly because higher-rank bytes are compared before lower-rank bytes. On a little endian system, lower-rank bytes are compared first, which would make e.g. 256 less than 1. Porting code that relies on comparing arrays of structs for ordering is likely to be a nightmare. We really need to fix this. -- Andrei* A struct S { int x; } compares differently on little endian and big endian system (!)This one is very surprising. How is that so, if both structs being compared are of the same endian-ness?
Mar 07 2017
On Tuesday, 7 March 2017 at 01:27:56 UTC, Andrei Alexandrescu wrote:The question is what to do to minimize breakage yet "break the bad code". The most backward-compatible solution is to define opCmp automatically to do a field-by-field lexicographical comparison. The most radical solution is disable ordering comparisons unless explicitly implemented by the user.There should be no assumption that structs are comparable, so the later.
Mar 07 2017
On Mon, Mar 06, 2017 at 08:27:56PM -0500, Andrei Alexandrescu via Digitalmars-d wrote:While reviewing work on array comparisons, Vladimir found an odd issue: https://issues.dlang.org/show_bug.cgi?id=17244 Investigation reveals that during array comparison for inequality, structs are compared by means of memcmp - even if they don't define opCmp, and regardless of the types of their fields. This has obvious unpleasant consequences:Couple of points: (1) I may be remembering wrong, but I thought structs had always been intended to be compared field-wise? I remember when working on AA's that the compiler would emit a default implementation of opEquals that did member-wise comparisons. I had always assumed that something similar was done with inequalities... or was that just unfounded extrapolation? (2) I didn't realize that arrays allowed inequalities by default, though in retrospect it does make sense (since we do define string inequalities, and string are arrays). But I would have expected that structs would *not* allow inequalities by default, because it's not clear that a default ordering relation makes sense for every struct. Consider for example, a struct representation of a complex number: struct Complex { float re, im; } It would be wrong to assume a default ordering relation on this struct because the complex numbers do not have a linear ordering. [...]The question is what to do to minimize breakage yet "break the bad code". The most backward-compatible solution is to define opCmp automatically to do a field-by-field lexicographical comparison. The most radical solution is disable ordering comparisons unless explicitly implemented by the user.[...] I wouldn't say disabling ordering comparisons is "radical". In fact, I think it makes the most sense by assuming the least about the user's intended semantics for the struct. Lexicographical ordering by default makes sense for arrays (e.g., strings), but I don't think it makes sense for general structs. Without the user stating what his intents are, it seems unfounded to presume lexicographic ordering by default. If the user has a struct of orthogonal information, e.g., name, phone number, address, there is no reason why changing the order of fields (during code refactoring, for example) should result in a completely different ordering just because the language assumes lexicographical ordering by default. It's better to make it an error to involve structs in inequalities if the user didn't explicitly define opCmp, than to silently accept such comparisons, which are likely to be buggy, and then have unusual behaviour later when the user reorders fields and a seemingly unrelated part of the code suddenly begins producing different output. I'd say allowing inequalities on structs by default is surprising behaviour, whereas asking the user to explicitly specify an ordering is more expected. T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
Mar 07 2017
On 03/07/2017 12:54 PM, H. S. Teoh via Digitalmars-d wrote:(1) I may be remembering wrong, but I thought structs had always been intended to be compared field-wise? I remember when working on AA's that the compiler would emit a default implementation of opEquals that did member-wise comparisons. I had always assumed that something similar was done with inequalities... or was that just unfounded extrapolation?We currently do memcmp. Equality by memberwise comparison is almost always meaningful; ordering by lexicographic comparison of members is not. Andrei
Mar 07 2017
On Tuesday, 7 March 2017 at 19:40:53 UTC, Andrei Alexandrescu wrote:On 03/07/2017 12:54 PM, H. S. Teoh via Digitalmars-d wrote:We should deprecate that array comparison behavior immediately. It should not break much code, otherwise the issue would have popped up before.(1) I may be remembering wrong, but I thought structs had always been intended to be compared field-wise? I remember when working on AA's that the compiler would emit a default implementation of opEquals that did member-wise comparisons. I had always assumed that something similar was done with inequalities... or was that just unfounded extrapolation?We currently do memcmp. Equality by memberwise comparison is almost always meaningful; ordering by lexicographic comparison of members is not. Andrei
Mar 07 2017
On Tue, Mar 07, 2017 at 07:53:37PM +0000, Stefan Koch via Digitalmars-d wrote:On Tuesday, 7 March 2017 at 19:40:53 UTC, Andrei Alexandrescu wrote:Unfortunately, because strings are arrays, deprecating array comparisons *will* break a *lot* of code. I'm all for deprecating comparisons between arrays of structs, though. The best way to do this is to deprecate implicit struct comparisons (the user has to define opCmp if he wants to compare structs). T -- What doesn't kill me makes me stranger.On 03/07/2017 12:54 PM, H. S. Teoh via Digitalmars-d wrote:We should deprecate that array comparison behavior immediately. It should not break much code, otherwise the issue would have popped up before.(1) I may be remembering wrong, but I thought structs had always been intended to be compared field-wise? I remember when working on AA's that the compiler would emit a default implementation of opEquals that did member-wise comparisons. I had always assumed that something similar was done with inequalities... or was that just unfounded extrapolation?We currently do memcmp. Equality by memberwise comparison is almost always meaningful; ordering by lexicographic comparison of members is not. Andrei
Mar 07 2017
On 03/07/2017 03:49 PM, H. S. Teoh via Digitalmars-d wrote:Unfortunately, because strings are arrays, deprecating array comparisons *will* break a *lot* of code.Misunderstanding, string comparisons are legit because character comparisons are legit. This is about deprecating array comparisons that don't have a corresponding element comparison. -- Andrei
Mar 07 2017