digitalmars.D - AA implementation algo should be mentioned
- Davidl 126.com (5/5) Mar 17 2007 without looking into phobos or join irc channel people would
- Sean Kelly (9/14) Mar 17 2007 I personally love the C++ spec in this respect. Everything in the
- Frits van Bommel (9/23) Mar 17 2007 I agree it would be nice if this were specified.
- Frits van Bommel (4/7) Mar 17 2007 Oh sorry, forgot about resizing. What does that make it, average
- Sean Kelly (3/11) Mar 17 2007 I think so.
- Sean Kelly (7/32) Mar 17 2007 Oops, you're right. They're typically O(N) with a best case (omega?) of...
- Frits van Bommel (9/21) Mar 17 2007 I don't think so. AFAIK, it's just a technique that attempts to avoid
- Sean Kelly (7/17) Mar 17 2007 Could be, I suppose. Though I'd have to tell a Data Structures
- Dejan Lekic (1/1) Mar 17 2007 You can always get the source, David...
- Derek Parnell (9/14) Mar 17 2007 Isn't the alogrithm used an implementation detail of the compiler and no...
- Tom (5/16) Mar 17 2007 I agree, but I think that time/space complexity IS mandatory in the
without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks better
Mar 17 2007
Davidl 126.com wrote:without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks betterI personally love the C++ spec in this respect. Everything in the library is treated as an algorithm and complexity information is defined for all operations. However, I'm not sure that specifics about the algorithm should be put into the D spec for AAs beyond that they must be O(1) for lookups, O(N) for iteration via foreach, and possibly some constraints on insert/remove performance (possibly amortized O(1) for inserts and maybe O(N) for removals?). Sean
Mar 17 2007
Sean Kelly wrote:Davidl 126.com wrote:I agree it would be nice if this were specified. However, I don't think inserts are currently amortized O(1), are they? IIRC the Phobos implementation uses a hash table with binary trees for collision handling, which would mean /average/ O(1) for "good" toHash() implementations, but worst-case O(log N) (for "bad" toHash()s). Also, that would mean removals are currently O(log N) as well, but specifying O(N) would allow other standard library implementations to use linked lists instead of binary trees so perhaps that's for the best.without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks betterI personally love the C++ spec in this respect. Everything in the library is treated as an algorithm and complexity information is defined for all operations. However, I'm not sure that specifics about the algorithm should be put into the D spec for AAs beyond that they must be O(1) for lookups, O(N) for iteration via foreach, and possibly some constraints on insert/remove performance (possibly amortized O(1) for inserts and maybe O(N) for removals?).
Mar 17 2007
Frits van Bommel wrote: [about inserts]IIRC the Phobos implementation uses a hash table with binary trees for collision handling, which would mean /average/ O(1) for "good" toHash() implementations, but worst-case O(log N) (for "bad" toHash()s).Oh sorry, forgot about resizing. What does that make it, average amortized O(1)? :P
Mar 17 2007
Frits van Bommel wrote:Frits van Bommel wrote: [about inserts]I think so. SeanIIRC the Phobos implementation uses a hash table with binary trees for collision handling, which would mean /average/ O(1) for "good" toHash() implementations, but worst-case O(log N) (for "bad" toHash()s).Oh sorry, forgot about resizing. What does that make it, average amortized O(1)? :P
Mar 17 2007
Frits van Bommel wrote:Sean Kelly wrote:Oops, you're right. They're typically O(N) with a best case (omega?) of 1. Double hashing is O(1) for inserts I think, but removals tend to be somewhat messy.Davidl 126.com wrote:I agree it would be nice if this were specified. However, I don't think inserts are currently amortized O(1), are they?without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks betterI personally love the C++ spec in this respect. Everything in the library is treated as an algorithm and complexity information is defined for all operations. However, I'm not sure that specifics about the algorithm should be put into the D spec for AAs beyond that they must be O(1) for lookups, O(N) for iteration via foreach, and possibly some constraints on insert/remove performance (possibly amortized O(1) for inserts and maybe O(N) for removals?).IIRC the Phobos implementation uses a hash table with binary trees for collision handling, which would mean /average/ O(1) for "good" toHash() implementations, but worst-case O(log N) (for "bad" toHash()s). Also, that would mean removals are currently O(log N) as well, but specifying O(N) would allow other standard library implementations to use linked lists instead of binary trees so perhaps that's for the best.Yup, which is correct IMO. There should be no restriction on implementation other than that some form of hashing is desired :-) Sean
Mar 17 2007
Sean Kelly wrote:Double hashing is O(1) for inserts I think, but removals tend to be somewhat messy.I don't think so. AFAIK, it's just a technique that attempts to avoid the clustering effect that linear probing suffers from. However, in a full table it can still take O(N) probes before an empty slot is found. Wikipedia seems to back me up on this: "Like all other forms of open addressing, double hashing becomes linear as the hash table approaches maximum capacity" (http://en.wikipedia.org/wiki/Double_hashing)Frits van Bommel wrote:Right, I just wanted to mention that. (IIRC that's also the philosophy the C++ standard follows on this point)IIRC the Phobos implementation uses a hash table with binary trees for collision handling, which would mean /average/ O(1) for "good" toHash() implementations, but worst-case O(log N) (for "bad" toHash()s). Also, that would mean removals are currently O(log N) as well, but specifying O(N) would allow other standard library implementations to use linked lists instead of binary trees so perhaps that's for the best.Yup, which is correct IMO. There should be no restriction on implementation other than that some form of hashing is desired :-)
Mar 17 2007
Frits van Bommel wrote:Sean Kelly wrote:Could be, I suppose. Though I'd have to tell a Data Structures professor I know that his proof is incorrect. I suppose I could be mis-remembering the conclusion as well though. Hm... it may have been based on particular growth conditions, like never letting the table get more than half full before a rehash. SeanDouble hashing is O(1) for inserts I think, but removals tend to be somewhat messy.I don't think so. AFAIK, it's just a technique that attempts to avoid the clustering effect that linear probing suffers from. However, in a full table it can still take O(N) probes before an empty slot is found. Wikipedia seems to back me up on this: "Like all other forms of open addressing, double hashing becomes linear as the hash table approaches maximum capacity" (http://en.wikipedia.org/wiki/Double_hashing)
Mar 17 2007
On Sat, 17 Mar 2007 23:29:18 +0800, Davidl 126.com wrote:without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks betterIsn't the alogrithm used an implementation detail of the compiler and not the language? I would see that any description on the AA alogrithm should be placed in the docs for the compiler and not mandated into the language. -- Derek Parnell Melbourne, Australia "Justice for David Hicks!" skype: derek.j.parnell
Mar 17 2007
Derek Parnell escribió:On Sat, 17 Mar 2007 23:29:18 +0800, Davidl 126.com wrote:I agree, but I think that time/space complexity IS mandatory in the language specification. -- Tom;without looking into phobos or join irc channel people would never know what algo is used by AA. and that would make people feel AA is not reliable, why not mention the algo used by AA and the complexity on the page? i think that would make the doc looks betterIsn't the alogrithm used an implementation detail of the compiler and not the language? I would see that any description on the AA alogrithm should be placed in the docs for the compiler and not mandated into the language.
Mar 17 2007