digitalmars.D - [Vote] Should defining opCmp to be inconsistent with opEquals be
- Stewart Gordon (24/24) Aug 06 2005 It is possible to define the opCmp and opEquals methods so that unequal
- Regan Heath (49/53) Aug 06 2005 I see in your first example/link you're defining a character class for
- Stewart Gordon (34/68) Aug 08 2005 Yes. But my example was on the contrived side, but maybe the idea can
- Regan Heath (17/65) Aug 08 2005 Sure. But in the case of an AA we need to assign an
-
Stewart Gordon
(11/15)
Aug 09 2005
- Regan Heath (29/38) Aug 09 2005 opCmp
- Stewart Gordon (18/49) Aug 09 2005 Adding or looking up a key in an AA isn't comparing either.
- Regan Heath (13/44) Aug 09 2005 In which case it should use opRank or opSort, no? (if it exists)
-
Stewart Gordon
(14/29)
Aug 10 2005
- Regan Heath (42/64) Aug 10 2005 Sure.
- Ben Hinkle (9/15) Aug 06 2005 My first preference would be 1 since I think it would be a maintenance
- Burton Radons (16/17) Aug 07 2005 2, I don't care what incorrect code does and I don't want any speed
- Ben Hinkle (34/50) Aug 07 2005 True. I was talking about <>= in the context of opCmp for objects. Since...
- Manfred Nowak (22/24) Aug 07 2005 I already mentioned, that the AA-implementation is broken:
- Stewart Gordon (12/35) Aug 08 2005 That's why it talks about calling .rehash after populating the AA. But
- Manfred Nowak (18/21) Aug 08 2005 Optimal number of hash slots under which target function? And if you
-
Stewart Gordon
(9/11)
Aug 08 2005
- Sean Kelly (12/31) Aug 08 2005 That's because they do match, at least as far as an ordered binary tree ...
- Sean Kelly (6/11) Aug 08 2005 My response was poorly worded (one of these days I'll learn not to post ...
-
Stewart Gordon
(8/12)
Aug 08 2005
- Sean Kelly (7/15) Aug 08 2005 I don't have writefln available in my current setup :p. But it looks li...
- Stewart Gordon (10/19) Aug 09 2005 Update your compiler, or change the example to use printf.
- Sean Kelly (4/10) Aug 09 2005 I disagree. DMD is a reference implementation, so where the spec is vag...
- Manfred Nowak (5/8) Aug 09 2005 Then: what is the bugs group for? Only for deviations of dmd to the
- Sean Kelly (11/18) Aug 09 2005 Mostly, yes. Though I've posted things there in the past that were more...
- Derek Parnell (12/21) Aug 09 2005 Do you try to be thick?
- Stewart Gordon (14/26) Aug 09 2005 This would mean that:
- Sean Kelly (8/30) Aug 09 2005 See my other post for clarification. I didn't mean to imply that DMD sh...
- Derek Parnell (13/41) Aug 09 2005 Absolutely true!
- Sean Kelly (7/12) Aug 08 2005 Oh, I should mention that all my prior experience with hash maps has bee...
-
Manfred Nowak
(25/30)
Aug 22 2005
Stewart Gordon
wrote in
It is possible to define the opCmp and opEquals methods so that unequal objects can rank equally in order. This is allowed by the Java specs, and indeed one or two Java API classes do this. However, when this is done in D, the current implementation of associative arrays falls apart. http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4556 The problem is that the AA implementation assumes that if opCmp returns zero then the keys match. As far as I can see, the spec doesn't in any way forbid behaviour to the contrary. But there seems to be room for debate on what the best solution is. Possibilities: 1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true. 2. Keep it as an acceptable practice, but warn people properly that such a type may not work correctly as an AA key. 3. Keep it as an acceptable practice, and fix the AA implementation to work on such types. Before you dismiss this idea as inefficient, please read http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4710 Cast your votes now! Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 06 2005
On Sat, 06 Aug 2005 23:10:26 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:It is possible to define the opCmp and opEquals methods so that unequal objects can rank equally in order.I see in your first example/link you're defining a character class for which opEquals does a case-sensitive compare, but opCmp does a case insensitive comparrison. The goal, I assume, is to have characters/strings that sort/rank in a case insensitive fashion but are only exactly equal if the case is the same. Am I right? A recent thread on the array 'sort' operator has made me think that perhaps we can apply the same idea to AA's. That idea is the ability to assign a 'sort'/'opCmp' method so the array. So, we pick:1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.and instead allow the following... class IChar { static int opCmpCase(IChar lhs, IChar rhs) {} ..etc.. } int[IChar] list; list.opCmp = &IChar.opCmpCase; ..add items to 'list'.. and/or allow assignment of a delegate, eg. class ICharSort { int[] sortOrder; //by name, then by date, then by .. int opCmp(IChar lhs, IChar rhs) {} } ICharSort s = new ICharSort(); ..define sortOrder in 's'.. int[IChar] list; list.opCmp = &s.opCmp; ..add items to 'list'.. to allow us to sort/rank in a more complex fashion. For an AA it could either be illegal, or would cause a rehash to assign the delegate when it contains items. For an array it wouldn't matter (except that it would sort of invalidate the state i.e. if you sort, then assign it's not technically 'sorted' by it's current operator any more) The advantage here is that we have only one array/AA implementation and there should be little (no?) performance penalty using types that do or don't have this requirement. Of course one argument against this idea may be that the class ('IChar' in this example) should never be sorted/ranked in any other way. In which case... Why can't we have a 3rd operator opRank (or opSort) - definable in IChar, and make the the array/AA smart. The array/AA could, upon creation, assign a comparrison delegate to this new operator if present and opCmp otherwise. In this way there is no additional performance penalty (as it only checks once, upon creation, not every comparrison) Thoughts? Regan
Aug 06 2005
Regan Heath wrote:On Sat, 06 Aug 2005 23:10:26 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:Yes. But my example was on the contrived side, but maybe the idea can be put to practical use.It is possible to define the opCmp and opEquals methods so that unequal objects can rank equally in order.I see in your first example/link you're defining a character class for which opEquals does a case-sensitive compare, but opCmp does a case insensitive comparrison. The goal, I assume, is to have characters/strings that sort/rank in a case insensitive fashion but are only exactly equal if the case is the same. Am I right?A recent thread on the array 'sort' operator has made me think that perhaps we can apply the same idea to AA's. That idea is the ability to assign a 'sort'/'opCmp' method so the array.I personally prefer the idea of having array.sort(cmp) available here. Not only is writing array.sort(&cmpByDate); shorter than array.opCmp = &cmpByDate; array.sort; but it also doesn't bloat memory requirements.So, we pick:I'm not sure opCmp is the best name - it looks like you're setting the function that will be used to compare this AA with another AA. Maybe call it comparator? That said, what's the point of this? To allow some AAs to be case-sensitive and others to be case-insensitive? If you're going to do this, do we gain much over doing assignments and lookups to aa[toupper(key)]? If we solved the problem by giving every AA a comparator property, then every AA that uses such a keytype would need to set the comparator anyway. So why not go the whole hog and let the type define the AA comparator? <snip>1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.and instead allow the following... class IChar { static int opCmpCase(IChar lhs, IChar rhs) {} ..etc.. } int[IChar] list; list.opCmp = &IChar.opCmpCase; ..add items to 'list'..Why can't we have a 3rd operator opRank (or opSort) - definable in IChar, and make the the array/AA smart. The array/AA could, upon creation, assign a comparrison delegate to this new operator if present and opCmp otherwise.<snip> But calling it opSort would imply using it for .sort as well. This is pointless IMO - normal opCmp can be used for this. If you define opCmp such that unequal objects can rank equally, then you're stating that when sorting, you don't care what order equally-ranking objects end up in. Maybe call it opStrictCmp. AAs (and any other applications relying on a comparator consistent with opEquals) can use this, and the default .sort can use plain opCmp. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 08 2005
On Mon, 08 Aug 2005 12:31:59 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:Sure. But in the case of an AA we need to assign an operator/function/delegate for it to use in it's internal ordering on every insert/lookup/etc.A recent thread on the array 'sort' operator has made me think that perhaps we can apply the same idea to AA's. That idea is the ability to assign a 'sort'/'opCmp' method so the array.I personally prefer the idea of having array.sort(cmp) available here. Not only is writing array.sort(&cmpByDate); shorter than array.opCmp = &cmpByDate; array.sort; but it also doesn't bloat memory requirements.We can call it whatever.So, we pick:I'm not sure opCmp is the best name - it looks like you're setting the function that will be used to compare this AA with another AA. Maybe call it comparator?1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.and instead allow the following... class IChar { static int opCmpCase(IChar lhs, IChar rhs) {} ..etc.. } int[IChar] list; list.opCmp = &IChar.opCmpCase; ..add items to 'list'..That said, what's the point of this? To allow some AAs to be case-sensitive and others to be case-insensitive? If you're going to do this, do we gain much over doing assignments and lookups to aa[toupper(key)]?To allow you to store your weird objects for which opEquals and opCmp disagree.If we solved the problem by giving every AA a comparator property, then every AA that uses such a keytype would need to set the comparator anyway. So why not go the whole hog and let the type define the AA comparator?Sure, that's exactly what I suggest below.<snip>But that's _exactly_ what it is used for, ranking/sorting the object. I'm suggesting that opEquals and opCmp should _always_ agree and opSort may disagree with opCmp. AA's would use opCmp, sort would use opSort. This lets you use them in AA's and sort them how you want in normal arrays. You suggest the opposite below, either way a 3rd operator to handle your unusual type with AA's and arrays automagically picking up and using the new operator when poresent is the solution I like best.Why can't we have a 3rd operator opRank (or opSort) - definable in IChar, and make the the array/AA smart. The array/AA could, upon creation, assign a comparrison delegate to this new operator if present and opCmp otherwise.<snip> But calling it opSort would imply using it for .sort as well.This is pointless IMO - normal opCmp can be used for this. If you define opCmp such that unequal objects can rank equally, then you're stating that when sorting, you don't care what order equally-ranking objects end up in. Maybe call it opStrictCmp. AAs (and any other applications relying on a comparator consistent with opEquals) can use this, and the default .sort can use plain opCmp.Regan
Aug 08 2005
Regan Heath wrote: <snip>I'm suggesting that opEquals and opCmp should _always_ agree and opSort may disagree with opCmp. AA's would use opCmp, sort would use opSort. This lets you use them in AA's and sort them how you want in normal arrays.<snip> Which would the < <= > >= operators use under your proposal? Since this is akin to sorting, it would make sense to use opSort, but then it becomes confusing that there is a method called opCmp but it doesn't do this. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 09 2005
On Tue, 09 Aug 2005 12:06:53 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:Regan Heath wrote: <snip>opCmpI'm suggesting that opEquals and opCmp should _always_ agree and opSort may disagree with opCmp. AA's would use opCmp, sort would use opSort. This lets you use them in AA's and sort them how you want in normal arrays.<snip> Which would the < <= > >= operators use under your proposal?Since this is akin to sorting, it would make sense to use opSortI disagree. You're comparing not sorting thus opCmp makes sense. More important however is _how_ you want to compare in the general case with this special type, eg. IChar a = 'a'; IChar b = 'B'; if (a < b) { //true or false? } Assume for the sake of argument: 'a' = 0x61 or 91 decimal 'B' = 0x42 or 66 decimal Is the statement true or false? Does the answer depend on context (if so, which is more common case or no case?) or will this type always be compared in one way? If you generally want to do a case-insensitive compare then you're defining an opCmp that AA's cannot use. If you want to do case-sensitive compare generally then you're defining an opCmp that AA's can use. I suspect you want the former, right? In which case the solution might be to have another operator (as you suggested) opCmpStrict or something used by AA's. This is what I referred to when I said "either way" earlier. This operator can be found and used automagically by the AA without requiring a change to the 'balanced binary tree' part of the implementation. Regan p.s. My intention in starting this part of the thread was to explore solutions that don't involve changing the AA implementation. I'm not saying that's not the solution, I'm just exploring...
Aug 09 2005
Regan Heath wrote:On Tue, 09 Aug 2005 12:06:53 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip>Adding or looking up a key in an AA isn't comparing either. Under the way I had it, one comparator is used externally and one internally. It also remains possible to define < <= > >= to have a non-strict comparator. (OK, so there would remain inconsistency between qwert <= yuiop and qwert < yuiop || qwert == yuiop but is this really an excuse?)Which would the < <= > >= operators use under your proposal?opCmpSince this is akin to sorting, it would make sense to use opSortI disagree. You're comparing not sorting thus opCmp makes sense.More important however is _how_ you want to compare in the general case with this special type, eg. IChar a = 'a'; IChar b = 'B'; if (a < b) { //true or false? } Assume for the sake of argument: 'a' = 0x61 or 91 decimal 'B' = 0x42 or 66 decimal Is the statement true or false? Does the answer depend on context (if so, which is more common case or no case?) or will this type always be compared in one way?My thought was that it will always be compared in this way.If you generally want to do a case-insensitive compare then you're defining an opCmp that AA's cannot use. If you want to do case-sensitive compare generally then you're defining an opCmp that AA's can use. I suspect you want the former, right?<snip> The idea of my example is that they rank case-insensitively but match case-sensitively. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 09 2005
On Tue, 09 Aug 2005 17:14:42 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:In which case it should use opRank or opSort, no? (if it exists) Comparing is required for sorting, but not vice-versa. What AA's do is akin to sorting, not comparing. Right?On Tue, 09 Aug 2005 12:06:53 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip>Adding or looking up a key in an AA isn't comparing either.Which would the < <= > >= operators use under your proposal?opCmpSince this is akin to sorting, it would make sense to use opSortI disagree. You're comparing not sorting thus opCmp makes sense.Under the way I had it, one comparator is used externally and one internally.externally: case-insensitive (opCmp) internally: case-sensitive (opRank/opSort/opAA ..) Right?Always compared in a case insensitive fashion? Except.. when compared during sorting/ranking.More important however is _how_ you want to compare in the general case with this special type, eg. IChar a = 'a'; IChar b = 'B'; if (a < b) { //true or false? } Assume for the sake of argument: 'a' = 0x61 or 91 decimal 'B' = 0x42 or 66 decimal Is the statement true or false? Does the answer depend on context (if so, which is more common case or no case?) or will this type always be compared in one way?My thought was that it will always be compared in this way.By 'match' do you mean only '==' or '<' '<=' etc as well? ReganIf you generally want to do a case-insensitive compare then you're defining an opCmp that AA's cannot use. If you want to do case-sensitive compare generally then you're defining an opCmp that AA's can use. I suspect you want the former, right?<snip> The idea of my example is that they rank case-insensitively but match case-sensitively.
Aug 09 2005
Regan Heath wrote:On Tue, 09 Aug 2005 17:14:42 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip> [Regan:]Yes, except that an AA is sorted only internally. OTOH array.sort is external sorting, i.e. the application, and possibly even the user, concerns him/her/itself with the order generated by array.sort. <snip>In which case it should use opRank or opSort, no? (if it exists) Comparing is required for sorting, but not vice-versa. What AA's do is akin to sorting, not comparing. Right?I disagree. You're comparing not sorting thus opCmp makes sense.Adding or looking up a key in an AA isn't comparing either.By 'rank' I mean what < <= > >= do - maybe 'order' is a better word. By 'match' I mean ==, and hence what one would naturally expect of AA key matching. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.The idea of my example is that they rank case-insensitively but match case-sensitively.By 'match' do you mean only '==' or '<' '<=' etc as well?
Aug 10 2005
On Wed, 10 Aug 2005 18:12:48 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:Regan Heath wrote:Sure. It doesn't matter how the AA's sort works provided it does not call two unequal items equal (the problem with your special type and it's opCmp operator). Whereas the array.sort matters, and the user may want to change/define it differently in each call of array.sort.On Tue, 09 Aug 2005 17:14:42 +0100, Stewart Gordon <smjg_1998 yahoo.com> wrote:<snip> [Regan:]Yes, except that an AA is sorted only internally. OTOH array.sort is external sorting, i.e. the application, and possibly even the user, concerns him/her/itself with the order generated by array.sort.In which case it should use opRank or opSort, no? (if it exists) Comparing is required for sorting, but not vice-versa. What AA's do is akin to sorting, not comparing. Right?I disagree. You're comparing not sorting thus opCmp makes sense.Adding or looking up a key in an AA isn't comparing either.<snip>I would have said that < <= etc "compare" not "rank" or "order", and == was "compare" also, but perhaps "match". Basically, in my head < <= etc and = are all "compare" operations. To "rank" or "order" you need some sort of context. i.e. inside an AA or when calling array.sort. You can "compare" any 2 items, you can "rank" or "order" a set of x (where x is > 1), ranking 2 items is the same as comparing, to decide which you need the context. Anyway.. I think the solution is another operator which you can define for your types and is magically found and used by the AA implementation. So you'd have: class IChar { char data; int opCmp(IChar rhs) { return toupper(data) - toupper(rhs.data); } int opEquals(IChar rhs) { return data == rhs.data; } int opAA(IChar rhs) { return data - rhs.data; } } Giving you a type that you can compare with < <= etc (opCmp), compare with = (opEquals) and store in an AA (opAA - or whatever we want to call it) The AA implementation needs only one minor change, the ability to find and use the opAA operator. Essentially instead of calling opCmp it has a delegate which it assigns upon creation to opAA or opCmp then uses for all it's internal workings. ReganBy 'rank' I mean what < <= > >= do - maybe 'order' is a better word. By 'match' I mean ==, and hence what one would naturally expect of AA key matching.The idea of my example is that they rank case-insensitively but match case-sensitively.By 'match' do you mean only '==' or '<' '<=' etc as well?
Aug 10 2005
1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true. 2. Keep it as an acceptable practice, but warn people properly that such a type may not work correctly as an AA key. 3. Keep it as an acceptable practice, and fix the AA implementation to work on such types.My first preference would be 1 since I think it would be a maintenance nightmare to allow values such that x<=y and y<=x but x!=y. It takes me back to my TA days in freshman calculus and the algebra abuse that students are capable of. But given that D has the <> and !<> operators presumably to allow users to distinguish those concepts from != and == I'll vote for 2. So either I say go with 1 and look hard at !<> and <> or go with 2 and say that if you design a type to use <> and !<> that you should give big blinking warnings. ps - what does <>= do? It is a fun way of checking if an opCmp throws?
Aug 06 2005
2, I don't care what incorrect code does and I don't want any speed penalty from confirming that I'm writing correct code when I always test every line I write. Object.opCmp should be changed to return float, so that it can use the semantics the compiler actually applies to the return value of comparison operators. In that case 0 does mean equality - float.nan means unordered. This change should've been made years ago, and the operator overloading page should be fixed up as well. Someone send Walter a patch. Ben Hinkle wrote:ps - what does <>= do? It is a fun way of checking if an opCmp throws?The throw semantics are irreproducible, and anyway either the standard or the implementation is wrong - it doesn't throw and I don't think there's a Phobos way to enable FP exceptions. But directly related to your question, it returns whether the values are not unordered. So "float.nan <>= float.nan" is false, while "4 <>= 4" is true.
Aug 07 2005
"Burton Radons" <burton-radons smocky.com> wrote in message news:dd6alf$17kj$1 digitaldaemon.com...2, I don't care what incorrect code does and I don't want any speed penalty from confirming that I'm writing correct code when I always test every line I write. Object.opCmp should be changed to return float, so that it can use the semantics the compiler actually applies to the return value of comparison operators. In that case 0 does mean equality - float.nan means unordered. This change should've been made years ago, and the operator overloading page should be fixed up as well. Someone send Walter a patch. Ben Hinkle wrote:True. I was talking about <>= in the context of opCmp for objects. Since opCmp currently returns an int <>= only tests if the opCmp throws. I agree if opCmp returned a float then we can start giving <>= meaning for objects. Note also that nan behavior is different from opCmp and opEqual implementing different notions of equality. To see what I mean concretely about <>= and the semantics of <> and != try running class X { int val; int opCmp(Object y) { X y2 = cast(X)y; int diff = val - y2.val; if (diff < 5 && -diff < 5) return 0; return diff; } int opEqual(Object y) { X y2 = cast(X)y; return val == y2.val; } } int main() { X a = new X; X b = new X; a.val = 10; b.val = 12; assert( a <= b ); assert( a >= b ); assert( a != b ); assert( a !<> b ); assert( a <>= b ); // should always be true return 0; }ps - what does <>= do? It is a fun way of checking if an opCmp throws?The throw semantics are irreproducible, and anyway either the standard or the implementation is wrong - it doesn't throw and I don't think there's a Phobos way to enable FP exceptions. But directly related to your question, it returns whether the values are not unordered. So "float.nan <>= float.nan" is false, while "4 <>= 4" is true.
Aug 07 2005
Stewart Gordon <smjg_1998 yahoo.com> wrote: My vote is:3. Keep it as an acceptable practice, and fix the AA implementation to work on such types.I already mentioned, that the AA-implementation is broken: http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/22170 | Hashing needs at least one hash function, an _equal_ operator | and one of these: an upper bound on the number of elements or a | rehash-operation. The opCmp is not needed for hashing. The current Implementation uses opCmp for traversing its unbalanced binary trees, which can degenerate to linear lists, depending on the quality of toHash and the insertion sequence. Degenerated AA's will be terrible slow, because in the average case a lookup will traverse half of the elements in the AA, whereas a pure hash with conflict resolution by jumping can be guaranteed to bind the lookup-time by a constant, i.e. independent of the number of elements stored. Because only opEqual is needed there is no necessity to restrict or warn. The drawback is, that if removing must be supported the space requirements are temporarily increased instead of decreased. So clearly choice 3 with a clear preference for a pure hash. -manfred
Aug 07 2005
Manfred Nowak wrote:Stewart Gordon <smjg_1998 yahoo.com> wrote: My vote is:That's why it talks about calling .rehash after populating the AA. But yes, it doesn't specify how thoroughly it should reorganise the array. I guess a sensible approach (under the current AA imp) is to recalculate all the hash values, allocate the optimum number of hash slots and balance the binary trees.3. Keep it as an acceptable practice, and fix the AA implementation to work on such types.I already mentioned, that the AA-implementation is broken: http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/22170 | Hashing needs at least one hash function, an _equal_ operator | and one of these: an upper bound on the number of elements or a | rehash-operation. The opCmp is not needed for hashing. The current Implementation uses opCmp for traversing its unbalanced binary trees, which can degenerate to linear lists, depending on the quality of toHash and the insertion sequence.Degenerated AA's will be terrible slow, because in the average case a lookup will traverse half of the elements in the AA, whereas a pure hash with conflict resolution by jumping can be guaranteed to bind the lookup-time by a constant, i.e. independent of the number of elements stored.<snip> You have a point. But what do you mean by jumping, exactly? Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 08 2005
Stewart Gordon <smjg_1998 yahoo.com> wrote: [...]allocate the optimum number of hash slots and balance the binary trees.Optimal number of hash slots under which target function? And if you have to press any balancing algorithm into code for the sake of collision resolution, that would imply, that you do not trust the hashing part. Then: why use hashing at all? [...]But what do you mean by jumping, exactly?If you find that the hash slot provided by toHash is already occupied, you choose the next slot to probe for occupation by increasing the index of the current hash slot by a given amount, i.e. jump to the next hash slot. All you have to guarantee is that if there is an empty hash slot you can find it by jumping. Up to a fill rate of about 0.9 thoretical results show, that the length of the jump distance is unimportant and the length of the average probing sequnence in search for an empty hash slot can be bound by the reciprocal of 1 minus the fill rate, i.e. 10 in the case of a fill rate of 0.9. -manfred
Aug 08 2005
Stewart Gordon wrote: <snip>1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.<snip> Oops, I meant if opEquals returns false. But I'm guessing that anyone who voted for this option misread it as this anyway.... Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 08 2005
In article <dd3cea$26u7$1 digitaldaemon.com>, Stewart Gordon says...It is possible to define the opCmp and opEquals methods so that unequal objects can rank equally in order. This is allowed by the Java specs, and indeed one or two Java API classes do this.Yes. I do this in C++ all the time and consider it an important feature.However, when this is done in D, the current implementation of associative arrays falls apart. http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4556 The problem is that the AA implementation assumes that if opCmp returns zero then the keys match. As far as I can see, the spec doesn't in any way forbid behaviour to the contrary.That's because they do match, at least as far as an ordered binary tree are concerned. The C++ standard defines this as equivalence, ie. two objects have the same sort order.But there seems to be room for debate on what the best solution is. Possibilities: 1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.I'm not sure I like this. While I generally make equivalence a looser match than equality, it's reasonable that an application might violate this condition.2. Keep it as an acceptable practice, but warn people properly that such a type may not work correctly as an AA key.Why wouldn't it work properly?3. Keep it as an acceptable practice, and fix the AA implementation to work on such types. Before you dismiss this idea as inefficient, please read http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4710If hash maps are going to change to use equality rather than equivalence then it has to happen soon. But frankly, what I really want is a way to supply a comparison delegate on declaration. Sean
Aug 08 2005
In article <dd7sau$2ktr$1 digitaldaemon.com>, Sean Kelly says...In article <dd3cea$26u7$1 digitaldaemon.com>, Stewart Gordon says...My response was poorly worded (one of these days I'll learn not to post right after waking up). But now this has me confused. Are you saying that the spec should forbid two objects from being equivalent if they are equal? Or did you mean "if opEquals returns false?" Either way, I don't like it. Sean1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.I'm not sure I like this. While I generally make equivalence a looser match than equality, it's reasonable that an application might violate this condition.
Aug 08 2005
Sean Kelly wrote: <snip><snip> Try the example in the post I linked to above and you'll see. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.2. Keep it as an acceptable practice, but warn people properly that such a type may not work correctly as an AA key.Why wouldn't it work properly?
Aug 08 2005
In article <dd7udk$2n7b$1 digitaldaemon.com>, Stewart Gordon says...Sean Kelly wrote: <snip>I don't have writefln available in my current setup :p. But it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation. Whether or not hash maps should be changed to use unordered slists for collisions so opEquals can be used instead is a separate issue IMO. Sean<snip> Try the example in the post I linked to above and you'll see.2. Keep it as an acceptable practice, but warn people properly that such a type may not work correctly as an AA key.Why wouldn't it work properly?
Aug 08 2005
Sean Kelly wrote:In article <dd7udk$2n7b$1 digitaldaemon.com>, Stewart Gordon says...<snip>Update your compiler, or change the example to use printf.Try the example in the post I linked to above and you'll see.I don't have writefln available in my current setup :p.But it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation.Correct behaviour is determined by the spec, not by some implementaion. The natural assumption is that keys match by equality, not equivalence.Whether or not hash maps should be changed to use unordered slists for collisions so opEquals can be used instead is a separate issue IMO.Indeed, separate from whether they should be fixed to use opEquals as well. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.
Aug 09 2005
In article <dda28h$21m5$1 digitaldaemon.com>, Stewart Gordon says...Sean Kelly wrote:I disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior. SeanBut it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation.Correct behaviour is determined by the spec, not by some implementaion.
Aug 09 2005
Sean Kelly <sean f4.ca> wrote: [...]I disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.Then: what is the bugs group for? Only for deviations of dmd to the specs? -manfred
Aug 09 2005
In article <ddajcc$2p86$1 digitaldaemon.com>, Manfred Nowak says...Sean Kelly <sean f4.ca> wrote: [...]Mostly, yes. Though I've posted things there in the past that were more of a feature change request than a bug report. Thing is, Walter is not a tech writer, and he tends to spend time on compiler changes over documentation. So in instances where the spec is unclear or just plain doesn't say anything, you have to make an educated guess about what's intended, and DMD is a useful reference for this. As far as this hash issue is concerned--I consider the use of opCmp to be a requirement simply because a program wouldn't be portable between compilers if one were implemented using opEquals and the other were implemented using opCmp, and DMD uses opCmp. SeanI disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.Then: what is the bugs group for? Only for deviations of dmd to the specs?
Aug 09 2005
On Tue, 9 Aug 2005 15:50:36 +0000 (UTC), Manfred Nowak wrote:Sean Kelly <sean f4.ca> wrote: [...]Do you try to be thick? DMD is not perfect - it has mistakes that even we simpletons can see. And in spite of that, it is still further along the track than the official documentation. This is not how it should be, I know, but that's the reality we have to put up with. And complaining about that aspect has got even less chance of an improvement than some of the weirdo things on our wishlists. Its simply not Walter's development methodology to document before coding. -- Derek Parnell Melbourne, Australia 10/08/2005 7:46:42 AMI disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.Then: what is the bugs group for? Only for deviations of dmd to the specs?
Aug 09 2005
Sean Kelly wrote:In article <dda28h$21m5$1 digitaldaemon.com>, Stewart Gordon says...This would mean that: 1. All undefined testcases in DStress should be either pass or xfail, defeating the point. 2. If some undefined code crashes the compiler, then crashing the compiler is correct behaviour. 3. Third-party compiler writers would have to hop back and forth between the spec and the reference implementation. This strikes me as absurd. A language should be implementable from the spec alone. Stewart. -- My e-mail is valid but not my primary mailbox. Please keep replies on on the 'group where everyone may benefit.Sean Kelly wrote:I disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.But it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation.Correct behaviour is determined by the spec, not by some implementaion.
Aug 09 2005
In article <ddal12$2sda$1 digitaldaemon.com>, Stewart Gordon says...Sean Kelly wrote:See my other post for clarification. I didn't mean to imply that DMD should be law so much as a reference for intended behavior when the spec is lacking. But some extrapolation is required as well.In article <dda28h$21m5$1 digitaldaemon.com>, Stewart Gordon says...This would mean that: 1. All undefined testcases in DStress should be either pass or xfail, defeating the point. 2. If some undefined code crashes the compiler, then crashing the compiler is correct behaviour.Sean Kelly wrote:I disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.But it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation.Correct behaviour is determined by the spec, not by some implementaion.3. Third-party compiler writers would have to hop back and forth between the spec and the reference implementation. This strikes me as absurd. A language should be implementable from the spec alone.I agree completely. But at the moment, the spec is a bit sparse. In some respects, I rather like this, as it makes me feel like some features I don't like aren't yet set in stone. Sean
Aug 09 2005
On Tue, 09 Aug 2005 17:20:07 +0100, Stewart Gordon wrote:Sean Kelly wrote:Absolutely true! But that's not how it is. Reality supersedes theory. DMD is both the best reference we have to date and it is not complete. It contains mistakes and 'prototype' code. The official D Language specification is not as up-to-date as the implementation of Walter's ideas - DMD. Walter does not follow best practices when it comes to software development; get over it. -- Derek Parnell Melbourne, Australia 10/08/2005 7:54:39 AMIn article <dda28h$21m5$1 digitaldaemon.com>, Stewart Gordon says...This would mean that: 1. All undefined testcases in DStress should be either pass or xfail, defeating the point. 2. If some undefined code crashes the compiler, then crashing the compiler is correct behaviour. 3. Third-party compiler writers would have to hop back and forth between the spec and the reference implementation. This strikes me as absurd. A language should be implementable from the spec alone.Sean Kelly wrote:I disagree. DMD is a reference implementation, so where the spec is vague DMD should serve as an authoritative refernce on expected behavior.But it looks like the point you were trying to make is that 'C' and 'c' are considered equivalent keys, which I would consider correct behavior given the current implementation.Correct behaviour is determined by the spec, not by some implementaion.
Aug 09 2005
In article <dd3cea$26u7$1 digitaldaemon.com>, Stewart Gordon says...It is possible to define the opCmp and opEquals methods so that unequal objects can rank equally in order. This is allowed by the Java specs, and indeed one or two Java API classes do this. However, when this is done in D, the current implementation of associative arrays falls apart.Oh, I should mention that all my prior experience with hash maps has been that they use equality for key comparisons, not equivalence. I was a bit surprised when I learned D's implementation doesn't work this way. While binary trees might be a tad faster in some cases, I'm not sure I like that anything I hash must be sortable. Sean
Aug 08 2005
Stewart Gordon <smjg_1998 yahoo.com> wrote in news:dd3cea$26u7$1 digitaldaemon.com [...]But there seems to be room for debate on what the best solution is. Possibilities: 1. State in the spec that it is an incorrect practice to define opCmp to return 0 if opEquals returns true.[...] Introducing alternative 4: Redesign of overloading operators. 1. In case of singletons also an `opEqual' is senseless and should be accompanied by an implementation with an `assert( 0);' 2. When an `opEqual' makes sense, then it is so basic, that it should be executed prior to the other cases except for performance reasons. 2.a. Therefore an `opNotEqual' should accompany the `opEqual' and the `opCmp' should be synthesized from both along the line: eq= opEqual(); if( eq) return 0; neq= opNotEqual(); if( eq != neq) return neq; else throw new Exception( "Aequivalence not allowed"); 2.b. If the above template is not suitable out of performance reasons the `opCmp' should not be synthesized and replaced by the definition of an `opCmpPerf'. Its then up to the adult programmer to assure the consistency of the design. -manfred
Aug 22 2005