digitalmars.D - Imprecise running time for topN?
- Magnus Lie Hetland (17/17) Feb 01 2011 I was reading the docs for std.algorithm, when I came across topN. This
- Andrei Alexandrescu (5/19) Feb 01 2011 You're right (and randomization should be there, too). Could you please
- Magnus Lie Hetland (6/15) Feb 01 2011 Will do. :)
I was reading the docs for std.algorithm, when I came across topN. This is, of course, a highly useful problem, with several solutions; I was a bit surprised to see the claim that it runs in linear time. As far as I know, the only ways of achieving that would be (1) using the super-elegant, but highly inefficient, algorithm of Blum, Floyd, Pratt, Rivest and Tarjan, often known as Select, or (2) using soft heaps. (The latter, I know less about.) Checking the source, I found that -- as I suspected -- it uses the more common Randomized-Select (without actual randomization here, though), which only has an *expected* (or average-case) linear running time. It suffers the same worst-case problems as Quicksort. I'm not objecting to the use of algorithm -- it's a good choice in practice -- but the docs should probably specify that the linear guarantee does not hold in the worst case? -- Magnus Lie Hetland http://hetland.org
Feb 01 2011
On 2/1/11 8:12 AM, Magnus Lie Hetland wrote:I was reading the docs for std.algorithm, when I came across topN. This is, of course, a highly useful problem, with several solutions; I was a bit surprised to see the claim that it runs in linear time. As far as I know, the only ways of achieving that would be (1) using the super-elegant, but highly inefficient, algorithm of Blum, Floyd, Pratt, Rivest and Tarjan, often known as Select, or (2) using soft heaps. (The latter, I know less about.) Checking the source, I found that -- as I suspected -- it uses the more common Randomized-Select (without actual randomization here, though), which only has an *expected* (or average-case) linear running time. It suffers the same worst-case problems as Quicksort. I'm not objecting to the use of algorithm -- it's a good choice in practice -- but the docs should probably specify that the linear guarantee does not hold in the worst case?You're right (and randomization should be there, too). Could you please add a bugzilla entry so we don't forget about this? http://d.puremagic.com/issues. Thanks! Andrei
Feb 01 2011
On 2011-02-01 16:29:56 +0100, Andrei Alexandrescu said:On 2/1/11 8:12 AM, Magnus Lie Hetland wrote:[snip]Will do. :)I'm not objecting to the use of algorithm -- it's a good choice in practice -- but the docs should probably specify that the linear guarantee does not hold in the worst case?You're right (and randomization should be there, too). Could you please add a bugzilla entry so we don't forget about this? http://d.puremagic.com/issues. Thanks!Andrei-- Magnus Lie Hetland http://hetland.org
Feb 01 2011