digitalmars.D.learn - InputRange for data structure without order?
- Andy Valencia (11/11) Jun 03 I have a Set data structure which has no concept of order; its
- monkyyy (50/61) Jun 03 ```d
- =?UTF-8?Q?Ali_=C3=87ehreli?= (19/24) Jun 03 front() does not imply an order. It will work as long as you can provide...
- Andy Valencia (8/11) Jun 03 Ok, but now the data structure is told to popFront(). I
- H. S. Teoh (15/26) Jun 03 [...]
- Andy Valencia (18/24) Jun 03 So, concretely, my container has an opApply method, and because
- Jonathan M Davis (35/57) Jun 04 chain chains two ranges with the same element type. opApply doesn't real...
- Andy Valencia (4/9) Jun 04 Brilliant, thank you. Now I know what I'm looking for, and I see
- Monkyyy (4/8) Jun 04 Not "general". It's a safety vs speed tradeoff. Immutable
- =?UTF-8?Q?Ali_=C3=87ehreli?= (8/17) Jun 04 I don't understand. Like H. S. Teoh said, a range over a container
- H. S. Teoh (7/18) Jun 04 What has this got to do with immutable? I never said the container
- monkyyy (12/35) Jun 04 Iota should mutate the underlining data and does. Unjustifiably
- =?UTF-8?Q?Ali_=C3=87ehreli?= (28/35) Jun 04 There is no underlying data for some ranges like iota because they are
- monkyyy (6/17) Jun 04 Nullable is a 1 or 0 length container which did not need to have
- Andy Valencia (46/48) Jun 04 As an exercise in chaining opApply based containers, I tried:
I have a Set data structure which has no concept of order; its members are stored and can be searched efficiently based on their hash. I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront(). There really _isn't_ a front, and creating a new Set with an element removed would be unpleasant. What would be the best way to iterate across a list of such Set's? For now I'm just going to write the explicit loop. Thanks! Andy
Jun 03
On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:I have a Set data structure which has no concept of order; its members are stored and can be searched efficiently based on their hash. I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront(). There really _isn't_ a front, and creating a new Set with an element removed would be unpleasant. What would be the best way to iterate across a list of such Set's? For now I'm just going to write the explicit loop. Thanks! Andy```d import std; struct myset{ bool has3; bool has5; auto opBinary(string s:"+")(typeof(this) a){ return myset(has3||a.has3,has5||a.has5); } alias chain=opBinary!"+"; int front(){ if(has3){ return 3;} if(has5){ return 5;} assert(0); } void popFront(){ if(has3){has3=false; return;} has5=false; } bool empty()=>( ! has3) && ( ! has5); } myset toset(R)(R r){ myset o; foreach(i;r){ if(i==3){o.has3=true;} if(i==5){o.has5=true;} } return o; } unittest{ myset a,b; a.has3=true; b.has5=true; (a+b).writeln; a.chain(a).chain(a).writeln; b.chain(b).chain(b).writeln; a.chain(b).writeln; auto c=a+b; c.map!(a=>a*2).writeln; iota(10).filter!(a=>a==3).toset.chain( iota(10).filter!(a=>a!=3).toset) .writeln; } ``` You can overload the names inside the datastructure and eagerly lower it as needed if you need it in something nested I have some more workarounds, but they start to suck.
Jun 03
On 6/3/25 9:03 AM, Andy Valencia wrote:I have a Set data structure which has no concept of order; its members are stored and can be searched efficiently based on their hash. I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront(). There really _isn't_ a frontfront() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example: import std.stdio; import std.range; void main() { int[int] squares; foreach (i; 0 .. 10) { squares[i] = i * i; } writeln("keys : ", squares.byKey); writeln("values: ", squares.byValue); } The elements are not in any particular order, yet associative arrays provide InputRanges (byKey and byValue return InputRange objects.): keys : [0, 6, 7, 2, 3, 1, 8, 5, 4, 9] values: [0, 36, 49, 4, 9, 1, 64, 25, 16, 81] Ali
Jun 03
On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:front() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example:Ok, but now the data structure is told to popFront(). I completely see how this would work for, say, slices. But when hashed, arbitrarily large, and unordered? Not so much. (I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.) Thanks, Andy
Jun 03
On Tue, Jun 03, 2025 at 11:54:20PM +0000, Andy Valencia via Digitalmars-d-learn wrote:On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:[...] In general, it's a bad idea to conflate the container with the range over its elements. Built-in arrays are a bad example in this case, because they are both, and it just so happened that they work OK as such. But in general, containers should NOT be conflated with ranges. That only leads to wrong design. The correct way to do this is to create a method in the container that returns a range over its contents. Popping the range should NOT mutate the container. It should be regarded as something separate from the container. Only then will you get sane semantics. Trying to conflate an unordered container with a range will only lead to pain and bugs. T -- You have to expect the unexpected. -- RLfront() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example:Ok, but now the data structure is told to popFront(). I completely see how this would work for, say, slices. But when hashed, arbitrarily large, and unordered? Not so much. (I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.)
Jun 03
On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:The correct way to do this is to create a method in the container that returns a range over its contents. Popping the range should NOT mutate the container. It should be regarded as something separate from the container. Only then will you get sane semantics. Trying to conflate an unordered container with a range will only lead to pain and bugs.So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents. Imagine that I now want to chain() across several of these, along the lines of: ```d foreach(val; s1) { ...; } // But now: foreach(val; chain(s1, s2, s3)) { ...; } ``` What API do I add to my container so that chain will do its magic? I don't see the infrastructure under std.range.chain using it. Or is there a non-Range chain I haven't found yet? Thank you again, Andy
Jun 03
On Tuesday, June 3, 2025 9:39:35 PM Mountain Daylight Time Andy Valencia via Digitalmars-d-learn wrote:On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:chain chains two ranges with the same element type. opApply doesn't really have anything to do with ranges. opApply was the way to give a type the ability to be used with foreach in D1, and it was left in for D2. You can still use it, but in general, the preferred approach in D2 is to use ranges. Typically, a container will implement opSlice (or opIndex, since confusingly, both can be used in this case) and have it return a range over the container. So, then you can do auto range = myContainer[]; and then iterate over the range. You can then also do foreach(e; myContainer[]) { ... } but in addition to foreach understanding the range API, it also will slice what it's given if that type can be sliced with no arguments. So, foreach(e; myContainer) { ... } will have the same semantics if myContainer implements opSlice with no arguments, and its opSlice returns a range. For the range API to work, a type needs to implement bool empty(); T front(); void popFront(); and ideally it would also implement typeof(this) save(); so that you can get a copy of the range and thus save the current iteration state. If you look at std.container, each container type follows this pattern. It implements opSlice which then returns a range over the container. You can then use the range over the container with range-based functions such as chain. - Jonathan M DavisThe correct way to do this is to create a method in the container that returns a range over its contents. Popping the range should NOT mutate the container. It should be regarded as something separate from the container. Only then will you get sane semantics. Trying to conflate an unordered container with a range will only lead to pain and bugs.So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents. Imagine that I now want to chain() across several of these, along the lines of: ```d foreach(val; s1) { ...; } // But now: foreach(val; chain(s1, s2, s3)) { ...; } ``` What API do I add to my container so that chain will do its magic? I don't see the infrastructure under std.range.chain using it. Or is there a non-Range chain I haven't found yet?
Jun 04
On Wednesday, 4 June 2025 at 10:47:24 UTC, Jonathan M Davis wrote:Typically, a container will implement opSlice (or opIndex, since confusingly, both can be used in this case) and have it return a range over the container. So, then you can do auto range = myContainer[]; ...Brilliant, thank you. Now I know what I'm looking for, and I see the relevance of all the bits and pieces I'll need. Andy
Jun 04
On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:. But in general, containers should NOT be conflated with ranges. That only leads to wrong design. range should NOT mutate the container. It should be regarded as something separate from the container.Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
Jun 04
On 6/4/25 4:10 AM, Monkyyy wrote:On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:I don't understand. Like H. S. Teoh said, a range over a container should not alter the container. I'm not aware of any data structure where that's not the case. If a container has a way of accessing the elements (which it definitely has to because otherwise it's not really a container), then a light-weight range can be specified over the elements. Ali. But in general, containers should NOT be conflated with ranges. That only leads to wrong design. range should NOT mutate the container. It should be regarded as something separate from the container.Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
Jun 04
On Wed, Jun 04, 2025 at 12:00:57PM -0700, Ali Çehreli via Digitalmars-d-learn wrote:On 6/4/25 4:10 AM, Monkyyy wrote:What has this got to do with immutable? I never said the container should be immutable, only that range over its contents should not mutate it. T -- Being forced to write comments actually improves code, because it is easier to fix a crock than to explain it. -- G. SteeleOn Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:. But in general, containers should NOT be conflated with ranges. That only leads to wrong design. range should NOT mutate the container. It should be regarded as something separate from the container.Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
Jun 04
On Wednesday, 4 June 2025 at 19:00:57 UTC, Ali Çehreli wrote:On 6/4/25 4:10 AM, Monkyyy wrote:Iota should mutate the underlining data and does. Unjustifiably slices are in the weird limbo, the autodecoding's popFront is raw mutation. I think the some of the rng primitives also just mutate. Nullable is a range for some God-Forsaken reason, it would have to mutate data. And thats just whats in phoboes rn, not even stuff Id write(op said he had a set and was worried about the speed). The purist "view of data" came after, and stuff like sorting *absolutely shouldnt* be held to it. Trade offs. 80% of datastructures and 90% of algorthims should be functional-y; but dont go full purist.On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:ranges. That. But in general, containers should NOT be conflated withregarded asonly leads to wrong design. range should NOT mutate the container. It should bewhatever'ssomething separate from the container.Not "general". It's a safety vs speed tradeoff. Immutabledata structure do just make allot of copies or allot ofpointer overheadand indirection.I don't understand. Like H. S. Teoh said, a range over a container should not alter the container. I'm not aware of any data structure where that's not the case. If a container has a way of accessing the elements (which it definitely has to because otherwise it's not really a container), then a light-weight range can be specified over the elements. Ali
Jun 04
On 6/4/25 12:53 PM, monkyyy wrote:Iota should mutate the underlining data and does.There is no underlying data for some ranges like iota because they are generators. Yes, it's a complication that D uses the term "range" for generators as well.Unjustifiably slices are in the weird limbo,My mental model of slices differ from many old-timer D people here. There is no limbo in my mental model: Slices are always a lightweight interface (i.e. the range) to contained elements. What is missing in D language is that there is always a container of elements that the slice provides access to. The fact that the container (the array) not having a name causes the confusion. People necessarily say "add an element to the slice". That is not accurate in my mental model: The element is always added to the container, the slice is just an interface to it. Yes, it's weird that an element accessor is capable of adding elements to the unnamed array. I accept my mental model despite this weirdness.the autodecoding's popFront is raw mutation.As the following example shows, there is no mutation to the contained elements though. Only the slice (the range) changes: import std.range : popFront; void main() { auto slice1 = [ 1, 10, 42 ]; auto slice2 = slice1; slice2.popFront(); assert(slice1 == [ 1, 10, 42 ]); // <-- No mutation }I think the some of the rng primitives also just mutate. Nullable is a range for some God-Forsaken reason, it would have to mutate data.I wasn't aware of that. I wonder whether someone thought it would improve generality? But that topic is beside the point here, which was the separation of containers and ranges. Ali
Jun 04
On Wednesday, 4 June 2025 at 20:34:30 UTC, Ali Çehreli wrote:generators My mental model of slices differNullable is a 1 or 0 length container which did not need to have a [] called on it; and I dont really want to hear you call it a generator. Your already making an exception for the "ranges are views of data"; its just not surviving contact with all counter examples.I think the some of the rng primitives also just mutate.Nullable is arange for some God-Forsaken reason, it would have to mutatedata. I wasn't aware of that. I wonder whether someone thought it would improve generality? But that topic is beside the point here, which was the separation of containers and ranges. Ali
Jun 04
On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:I have a situation where chain()'ing them together would be convenient, but InputRange requires front() and popFront().As an exercise in chaining opApply based containers, I tried: ```d // Chain across containers; their type is T and what they hold is type B class Chain(T,B) { private T[] members; this(T...)(T args) { foreach(m; args) { this.members ~= *m; } } int opApply(int delegate(ref B) ops) const { int res = 0; foreach(m; this.members) { foreach(val; m) { res = ops(val); if (res) { break; } } } return res; } } ``` I'm a little iffy on the walk of the args and the *m deref. But it tests out OK in practice. Using one looks like: ```d import tiny.set : Set; alias SetInt = Set!int; auto s1 = new SetInt(); s1.add(1); s1.add(2); s1.add(3); auto s2 = new SetInt(); s2.add(4); s2.add(5); s2.add(6); auto s3 = new SetInt(); foreach(x; new Chain!(SetInt,int)(s1, s2)) { s3.add(x); } ``` Andy
Jun 04