digitalmars.D - Heap: container or range?
- Andrei Alexandrescu (15/15) Jan 29 2009 A "computer science heap" is a structure that offers fast access to the
- Daniel Keep (6/27) Jan 29 2009 I suppose it comes down to whether or not the typical case is to either
- Bill Baxter (12/26) Jan 29 2009 In response to your previous question of how to distinguish from a
- Andrei Alexandrescu (10/41) Jan 29 2009 Ah, never mind all that. I realized that I can't have a heap range. This...
- Bill Baxter (13/62) Jan 29 2009 Messed 'h' up or messed 'a' up?
- Andrei Alexandrescu (8/31) Jan 29 2009 'a' is messed up (mutated) by definition. The problem is that after h
- Bill Baxter (5/38) Jan 29 2009 Ok. great. I thought that the slicing notation to make a sequence
- Christopher Wright (3/22) Jan 29 2009 I'm not seeing that. take should iterate through _h_, which will pop
- Jason House (3/48) Jan 29 2009 I would have expected index take(h,5)[0] to be 16...
- Andrei Alexandrescu (7/11) Jan 29 2009 Both ways have advantages and disadvantages. My early experiments
- Sean Kelly (11/21) Jan 29 2009 Hm... so assuming this is a min heap, I have:
- Sean Kelly (6/30) Jan 29 2009 Oh, I would also expect:
- Andrei Alexandrescu (4/39) Jan 29 2009 Yah, that's what I meant. h still thinks its length is 10 and that those...
- Brad Roberts (5/46) Jan 29 2009 Since take is a mutator, that implies that h needs to be a reference
- Andrei Alexandrescu (46/48) Jan 30 2009 That's a great question, and one that I don't have a definitive answer
- Denis Koroskin (3/7) Jan 30 2009 Why is a not [2, 3] now?
- Sean Kelly (7/7) Jan 29 2009 I don't know if it matters, but I added the heap routines to Tango
- Andrei Alexandrescu (7/13) Jan 30 2009 A great use for heaps is topNCopy: given an input range (!), give me
- Sergey Gromov (11/49) Jan 30 2009 Actually, if ranges are by value, I'd expect
- Andrei Alexandrescu (13/62) Jan 30 2009 Me too. It does work indeed BUT NOT FOR INPUT RANGES. So in short you
- Denis Koroskin (3/13) Jan 30 2009 Same here. I feel that take (and some other adaptors) should accept rang...
- Andrei Alexandrescu (23/35) Jan 30 2009 While I'm not 100% clear on what take should do, this particular
- Brad Roberts (6/32) Jan 30 2009 I hate to come back to naming, but I think that's a major part of what's...
- Andrei Alexandrescu (3/38) Jan 30 2009 Names are important. "take" is taken from Haskell.
- Derek Parnell (6/7) Jan 30 2009 No it hasn't. Haskell still has it so its been "copied" from Haskell. :-...
- BCS (2/6) Jan 29 2009 what would be the problem of making it (optionaly) both?
- Yigal Chripun (3/18) Jan 29 2009 regarding your previous question:
- Bill Baxter (6/8) Jan 29 2009 I the distinction is that a Heap is one way to implement a Priority
- Yigal Chripun (2/11) Jan 30 2009 Ah. right.. :)
- Don (16/37) Jan 30 2009 I think this depends on whether heap operations are required (and
A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.). I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range? Andrei
Jan 29 2009
Andrei Alexandrescu wrote:A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.). I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range? AndreiI suppose it comes down to whether or not the typical case is to either have a heap or heap-view of another container. I'd suspect the first, to be honest. You could always just have Heap and HeapView. :) -- Daniel
Jan 29 2009
On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.).In response to your previous question of how to distinguish from a memory heap, you can use the specific name for the kind of heap it is, like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range?Both sound useful depending on circumstances. One provides the all-in-one convenient solution, the other is more of a "non-invasive heap". If you emphasize the non-invasive version, then creating the all-in-one version out of that should be pretty easy. But you can't take an all-in-one and turn it non-invasive so easily. --bb
Jan 29 2009
Bill Baxter wrote:On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:BinaryHeap it is. Thanks!A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.).In response to your previous question of how to distinguish from a memory heap, you can use the specific name for the kind of heap it is, like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!! AndreiI'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range?Both sound useful depending on circumstances. One provides the all-in-one convenient solution, the other is more of a "non-invasive heap". If you emphasize the non-invasive version, then creating the all-in-one version out of that should be pretty easy. But you can't take an all-in-one and turn it non-invasive so easily.
Jan 29 2009
On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bill Baxter wrote:Messed 'h' up or messed 'a' up? Anyway, in that case it would be nice if you can easily run the heap indirectly on an index or pointer buffer. Will something like this work? HeavyElement[] arrayOfHeavies; auto h = Heap!(int[], (int i, int j){return arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length ) Now arrayOfHeavies[h.pop()] should give you the smallest element of the array of heavies, without ever having to copy/swap anything but ints. --bbOn Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:BinaryHeap it is. Thanks!A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.).In response to your previous question of how to distinguish from a memory heap, you can use the specific name for the kind of heap it is, like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range?Both sound useful depending on circumstances. One provides the all-in-one convenient solution, the other is more of a "non-invasive heap". If you emphasize the non-invasive version, then creating the all-in-one version out of that should be pretty easy. But you can't take an all-in-one and turn it non-invasive so easily.
Jan 29 2009
Bill Baxter wrote:On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu'a' is messed up (mutated) by definition. The problem is that after h was copied to do take's deed, it got messed up (i.e., doesn't respect the heap property anymore).Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Messed 'h' up or messed 'a' up?Anyway, in that case it would be nice if you can easily run the heap indirectly on an index or pointer buffer. Will something like this work? HeavyElement[] arrayOfHeavies; auto h = Heap!(int[], (int i, int j){return arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length ) Now arrayOfHeavies[h.pop()] should give you the smallest element of the array of heavies, without ever having to copy/swap anything but ints.Yah, that should work. You just need to initially fill the heap with the numbers 0 to arrayOfHeavies.length. I presume that's what you meant with the illegal slicing notation :o). Andrei
Jan 29 2009
On Fri, Jan 30, 2009 at 1:17 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:Bill Baxter wrote:Ok. great. I thought that the slicing notation to make a sequence worked everywhere now in D2. I guess it's only inside a foreach. --bbOn Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu'a' is messed up (mutated) by definition. The problem is that after h was copied to do take's deed, it got messed up (i.e., doesn't respect the heap property anymore).Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Messed 'h' up or messed 'a' up?Anyway, in that case it would be nice if you can easily run the heap indirectly on an index or pointer buffer. Will something like this work? HeavyElement[] arrayOfHeavies; auto h = Heap!(int[], (int i, int j){return arrayOfHeavies[i]<arrayOfHeavies[j];} )( 0..arrayOfHeavies.length ) Now arrayOfHeavies[h.pop()] should give you the smallest element of the array of heavies, without ever having to copy/swap anything but ints.Yah, that should work. You just need to initially fill the heap with the numbers 0 to arrayOfHeavies.length. I presume that's what you meant with the illegal slicing notation :o).
Jan 29 2009
Andrei Alexandrescu wrote:Bill Baxter wrote:I'm not seeing that. take should iterate through _h_, which will pop elements from _a_ (or a copy thereof) while maintaining the heap invariant.On Fri, Jan 30, 2009 at 12:14 PM, Andrei Alexandrescu'a' is messed up (mutated) by definition. The problem is that after h was copied to do take's deed, it got messed up (i.e., doesn't respect the heap property anymore).Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Messed 'h' up or messed 'a' up?
Jan 29 2009
Andrei Alexandrescu Wrote:Bill Baxter wrote:I would have expected index take(h,5)[0] to be 16... I still have yet to come to terms with passing ranges by value. I would expect take to modify my range. I naturally expect heaps to be destructive as elements are taken out. I also expect ranges to shrink. I don't see any issue...On Fri, Jan 30, 2009 at 11:49 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:BinaryHeap it is. Thanks!A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.).In response to your previous question of how to distinguish from a memory heap, you can use the specific name for the kind of heap it is, like BinaryHeap, Binomial Heap, Fibbonocci Heap, etc.Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range?Both sound useful depending on circumstances. One provides the all-in-one convenient solution, the other is more of a "non-invasive heap". If you emphasize the non-invasive version, then creating the all-in-one version out of that should be pretty easy. But you can't take an all-in-one and turn it non-invasive so easily.
Jan 29 2009
I still have yet to come to terms with passing ranges by value. I would expect take to modify my range. I naturally expect heaps to be destructive as elements are taken out. I also expect ranges to shrink. I don't see any issue...Both ways have advantages and disadvantages. My early experiments involved passing by reference and it was an absolute mess: I'd forget a "ref" here and there with weird results, or I'd forget to save a copy of my range and I'd find is shrunk like an dehydrated fig in no time. One advantage of pass by value is that code is shorter and nicer - no need for a lot of temporaries. Andrei
Jan 29 2009
Andrei Alexandrescu wrote:Ah, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct? Sean
Jan 29 2009
Sean Kelly wrote:Andrei Alexandrescu wrote:Oh, I would also expect: a = [8,10,9,16,14, garbage]; Since it isn't aware of the length adjustment. Perhaps this is what you meant? SeanAh, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct?
Jan 29 2009
Sean Kelly wrote:Sean Kelly wrote:Yah, that's what I meant. h still thinks its length is 10 and that those 10 elements respect the heap property. AndreiAndrei Alexandrescu wrote:Oh, I would also expect: a = [8,10,9,16,14, garbage]; Since it isn't aware of the length adjustment. Perhaps this is what you meant? SeanAh, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct?
Jan 29 2009
Andrei Alexandrescu wrote:Sean Kelly wrote:Since take is a mutator, that implies that h needs to be a reference type or a by ref value, right? So, why not make it so? Later, BradSean Kelly wrote:Yah, that's what I meant. h still thinks its length is 10 and that those 10 elements respect the heap property. AndreiAndrei Alexandrescu wrote:Oh, I would also expect: a = [8,10,9,16,14, garbage]; Since it isn't aware of the length adjustment. Perhaps this is what you meant? SeanAh, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct?
Jan 29 2009
Brad Roberts wrote:Since take is a mutator, that implies that h needs to be a reference type or a by ref value, right? So, why not make it so?That's a great question, and one that I don't have a definitive answer to. There are several reasons for which by-ref manipulation is dicey: 1. Composition is harder. Consider: foreach (e; take(10, map!("a*a")(range))) ... In this case, take is passed an rvalue. Rvalues can't and shouldn't bind to references (I think C++'s rule of binding rvalues to const references was a subtle disaster that necessitated endless contortions in defining rvalue references). So then we need to explicitate: auto t = map!("a*a")(range); foreach (e; take(10, t)) ... 2. Difficulties in coding. My first attempt at ranges was in some formatting code. A by-ref doctrine requires that "ref" is added religiously to the trafficked values. That's not much of a problem except that, if you forget, the code still compiles and runs as told, except that the way it's told is not the way you meant. 3. More difficulties in coding Coding with more non-local dependencies is harder than coding that creates new, relatively independent entities. The examples given so far were as simple as: int[] a = [ 1, 2, 3 ]; take(1, a); // WHY IS A NOT [2, 3] NOW??? In reality, code is almost never that simple and the reality is that it becomes exceedingly harder to assess the state of a range after several other lazy ranges mutate it at their leisure. That's what I have so far. On the other hand, other functions are a tougher call. Consider "drop". drop(n, range) throws away at most n elements from the range. There are a few possible ways to define drop: a) By reference int[] a = [ 1, 2, 3 ]; drop(1, a); assert(a == [ 2, 3]); b) By value int[] a = [ 1, 2, 3 ]; drop(1, a); assert(a == [ 1, 2, 3]); a = drop(1, a); assert(a == [ 2, 3]); Unlike take, drop is eager, hence its by-ref implementation doesn't raise as many problems. I'm not sure what's the best choice for drop. Andrei
Jan 30 2009
Andrei Alexandrescu Wrote: [snip]int[] a = [ 1, 2, 3 ]; take(1, a);Why is a not [2, 3] now?
Jan 30 2009
I don't know if it matters, but I added the heap routines to Tango because I wanted heapsort, so they kind of came for free. Perhaps a key distinction between ranges and algorithms is that algorithms may mutate the underlying data, but ranges may not? I've been trying to think of another example of a mutating range and I haven't come up with one yet... unless you consider input ranges mutating, I suppose. Sean
Jan 29 2009
Sean Kelly wrote:I don't know if it matters, but I added the heap routines to Tango because I wanted heapsort, so they kind of came for free. Perhaps a key distinction between ranges and algorithms is that algorithms may mutate the underlying data, but ranges may not? I've been trying to think of another example of a mutating range and I haven't come up with one yet... unless you consider input ranges mutating, I suppose.A great use for heaps is topNCopy: given an input range (!), give me back the top N items in it, fast. The way that works is by maintaining a heap in the return value. I'm sure interesting cases will arise with ranges that mutate their host, but heap doesn't look like a good candidate. Andrei
Jan 30 2009
Thu, 29 Jan 2009 21:26:39 -0800, Andrei Alexandrescu wrote:Sean Kelly wrote:Actually, if ranges are by value, I'd expect assert(equal(take(5, h) == take(5, h))); OTOH, I'd expect something like the following to also work: auto f = new File("blah.txt"); auto top = take(10, f.lines); I.e. I'd expect some ranges to be Translators, and others to be Mutators. Translators are read-only views into underlying anything which may involve some expensive bookkeeping. Mutators are basically references into a particular container and make sure that the container and other Mutators stay consistent.Sean Kelly wrote:Yah, that's what I meant. h still thinks its length is 10 and that those 10 elements respect the heap property.Andrei Alexandrescu wrote:Oh, I would also expect: a = [8,10,9,16,14, garbage]; Since it isn't aware of the length adjustment. Perhaps this is what you meant? SeanAh, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct?
Jan 30 2009
Sergey Gromov wrote:Thu, 29 Jan 2009 21:26:39 -0800, Andrei Alexandrescu wrote:Me too. It does work indeed BUT NOT FOR INPUT RANGES. So in short you are showing yet another reason why heaps can't be good ranges.Sean Kelly wrote:Actually, if ranges are by value, I'd expect assert(equal(take(5, h) == take(5, h)));Sean Kelly wrote:Yah, that's what I meant. h still thinks its length is 10 and that those 10 elements respect the heap property.Andrei Alexandrescu wrote:Oh, I would also expect: a = [8,10,9,16,14, garbage]; Since it isn't aware of the length adjustment. Perhaps this is what you meant? SeanAh, never mind all that. I realized that I can't have a heap range. This is because heaps must mutate the underlying range in-place and any copy will mess the heap up. Here's the example that clarify it for me: int[] a = [ 4, 1, 3, 2, 16, 9, 10, 14, 8, 7 ]; auto h = Heap!(int[])(a); assert(equal(take(5, h) == [ 8, 9, 10, 14, 16 ]); At this point the mutations done by take messed up h already!!Hm... so assuming this is a min heap, I have: a = [4,1,3,2,16,9,10,14,8,7]; h = [1,2,3,4,7,9,10,14,8,16]; take(5, h); h = [8,10,9,16,14]; Shouldn't take call next repeatedly on the heap, which will in turn perform a popHeap operation? It's difficult to be sure what the result will be after take(5,h) occurs, but it should still be a valid heap, correct?OTOH, I'd expect something like the following to also work: auto f = new File("blah.txt"); auto top = take(10, f.lines);That also works (just that in the new std.stdio File is a struct, so it's not dynamically allocated). However, f.lines is understandably an input range so this will NOT work: auto f = new File("blah.txt"); assert(equal(take(5, f.lines) == take(5, f.lines))); The two f.lines instances operate on the same underlying file handle, so they are input ranges: advancing one advances all others.I.e. I'd expect some ranges to be Translators, and others to be Mutators. Translators are read-only views into underlying anything which may involve some expensive bookkeeping. Mutators are basically references into a particular container and make sure that the container and other Mutators stay consistent.I think the distinction is input vs. forward and mutable vs. immutable ranges (the latter distinction is not checked in yet). Andrei
Jan 30 2009
Sergey Gromov Wrote: [snip]OTOH, I'd expect something like the following to also work: auto f = new File("blah.txt"); auto top = take(10, f.lines);Same here. I feel that take (and some other adaptors) should accept range by reference. One of the advantages of ranges over opApply is that you can start foreach, break somewhere (throw an exception, recover) and then continue. It also means that the range is mutated inside foreach. If foreach mutates the range, so can other algorithms that operate on ranges do.I.e. I'd expect some ranges to be Translators, and others to be Mutators. Translators are read-only views into underlying anything which may involve some expensive bookkeeping. Mutators are basically references into a particular container and make sure that the container and other Mutators stay consistent.
Jan 30 2009
Denis Koroskin wrote:Sergey Gromov Wrote: [snip]While I'm not 100% clear on what take should do, this particular argument is not that compelling. I really mean not compelling at all. Consider: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; a) { if (e == 3) break; } assert(a == [4, 5]); Do you seriously expect that to be the case? Of course not. However, you would inconsistently expect this to happen: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; take(4, a)) { if (e == 3) break; } assert(a == [4, 5]); Ranges are modeled to correspond to slices. Slices have few operations that manipulate them by reference (notably the controversial ~=) and range are made to be unsurprising when compared to slices. They are really extensions of slices. AndreiOTOH, I'd expect something like the following to also work: auto f = new File("blah.txt"); auto top = take(10, f.lines);Same here. I feel that take (and some other adaptors) should accept range by reference. One of the advantages of ranges over opApply is that you can start foreach, break somewhere (throw an exception, recover) and then continue. It also means that the range is mutated inside foreach. If foreach mutates the range, so can other algorithms that operate on ranges do.
Jan 30 2009
On Fri, 30 Jan 2009, Andrei Alexandrescu wrote:Consider: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; a) { if (e == 3) break; } assert(a == [4, 5]); Do you seriously expect that to be the case? Of course not. However, you would inconsistently expect this to happen: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; take(4, a)) { if (e == 3) break; } assert(a == [4, 5]); Ranges are modeled to correspond to slices. Slices have few operations that manipulate them by reference (notably the controversial ~=) and range are made to be unsurprising when compared to slices. They are really extensions of slices. AndreiI hate to come back to naming, but I think that's a major part of what's causing grief here. Take implies mutation of the input range. How about 'subrange' or 'subset' or 'leftmost' anything that doesn't imply removal. Later, Brad
Jan 30 2009
Brad Roberts wrote:On Fri, 30 Jan 2009, Andrei Alexandrescu wrote:Names are important. "take" is taken from Haskell. AndreiConsider: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; a) { if (e == 3) break; } assert(a == [4, 5]); Do you seriously expect that to be the case? Of course not. However, you would inconsistently expect this to happen: int[] a = [ 1, 2, 3, 4, 5]; foreach (e; take(4, a)) { if (e == 3) break; } assert(a == [4, 5]); Ranges are modeled to correspond to slices. Slices have few operations that manipulate them by reference (notably the controversial ~=) and range are made to be unsurprising when compared to slices. They are really extensions of slices. AndreiI hate to come back to naming, but I think that's a major part of what's causing grief here. Take implies mutation of the input range. How about 'subrange' or 'subset' or 'leftmost' anything that doesn't imply removal. Later, Brad
Jan 30 2009
On Fri, 30 Jan 2009 14:46:28 -0800, Andrei Alexandrescu wrote:Names are important. "take" is taken from Haskell.No it hasn't. Haskell still has it so its been "copied" from Haskell. :-) -- Derek Parnell Melbourne, Australia skype: derek.j.parnell
Jan 30 2009
Hello Andrei,What do you think? Should I make Heap a container or a range? Andreiwhat would be the problem of making it (optionaly) both?
Jan 29 2009
Andrei Alexandrescu wrote:A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.). I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range? Andreiregarding your previous question: isn't a Heap also called a Priority Queue?
Jan 29 2009
On Fri, Jan 30, 2009 at 4:01 PM, Yigal Chripun <yigal100 gmail.com> wrote:regarding your previous question: isn't a Heap also called a Priority Queue?I the distinction is that a Heap is one way to implement a Priority Queue. You could implement a priority queue by scanning a linked list for the smallest item and it would still be a (really inefficient) Priority Queue. --bb
Jan 29 2009
Bill Baxter wrote:On Fri, Jan 30, 2009 at 4:01 PM, Yigal Chripun<yigal100 gmail.com> wrote:Ah. right.. :)regarding your previous question: isn't a Heap also called a Priority Queue?I the distinction is that a Heap is one way to implement a Priority Queue. You could implement a priority queue by scanning a linked list for the smallest item and it would still be a (really inefficient) Priority Queue. --bb
Jan 30 2009
Andrei Alexandrescu wrote:A "computer science heap" is a structure that offers fast access to the largest element and fast extraction of it (which in turn provides access to the next largest element etc.). I'm just done working on the heap in std.algorithm. Now, it turns out that heap supports both a meaningful definition as a full-fledged container, and a beautiful definition as a range. If Heap is a range, you initiate it with another range, which Heap organizes in the heap manner. Then, successive calls to next() nicely extract elements starting from the largest. If the underlying range supports put(), then Heap also supports put() to insert into the heap. Heap as a container would offer similar primitives but in addition would "own" its data (would call destructors upon destruction, and would support value copying). What do you think? Should I make Heap a container or a range? AndreiI think this depends on whether heap operations are required (and usable) in places other than in a heap container. For some applications (certain graph algorithms), you want two ways to access the data. You use an IndexedPriorityQueue structure, which contains both a heap of (key, value) pairs ordered by value, allowing you to pop the item with minimum value in O(log n) time, and also an array allowing you to access any item by key in O(1) time. Whenever you move items in the heap, you update the array at the same time. It's impossible to implement such a data stucture using STL heap primitives, and likewise, my guess is that it would be impossible in both a range and container Heaps (basically, you need to be able to hook in a user-defined function to be called whenever items are swapped in the heap). But if it is possible in one but not the other implementation, it should be favoured.
Jan 30 2009