www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - InputRange for data structure without order?

reply Andy Valencia <dont spam.me> writes:
I have a Set data structure which has no concept of order; its 
members are stored and can be searched efficiently based on their 
hash.

I have a situation where chain()'ing them together would be 
convenient, but InputRange requires front() and popFront().  
There really _isn't_ a front, and creating a new Set with an 
element removed would be unpleasant.

What would be the best way to iterate across a list of such 
Set's?  For now I'm just going to write the explicit loop.

Thanks!
Andy
Jun 03
next sibling parent monkyyy <crazymonkyyy gmail.com> writes:
On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:
 I have a Set data structure which has no concept of order; its 
 members are stored and can be searched efficiently based on 
 their hash.

 I have a situation where chain()'ing them together would be 
 convenient, but InputRange requires front() and popFront().  
 There really _isn't_ a front, and creating a new Set with an 
 element removed would be unpleasant.

 What would be the best way to iterate across a list of such 
 Set's?  For now I'm just going to write the explicit loop.

 Thanks!
 Andy
```d import std; struct myset{ bool has3; bool has5; auto opBinary(string s:"+")(typeof(this) a){ return myset(has3||a.has3,has5||a.has5); } alias chain=opBinary!"+"; int front(){ if(has3){ return 3;} if(has5){ return 5;} assert(0); } void popFront(){ if(has3){has3=false; return;} has5=false; } bool empty()=>( ! has3) && ( ! has5); } myset toset(R)(R r){ myset o; foreach(i;r){ if(i==3){o.has3=true;} if(i==5){o.has5=true;} } return o; } unittest{ myset a,b; a.has3=true; b.has5=true; (a+b).writeln; a.chain(a).chain(a).writeln; b.chain(b).chain(b).writeln; a.chain(b).writeln; auto c=a+b; c.map!(a=>a*2).writeln; iota(10).filter!(a=>a==3).toset.chain( iota(10).filter!(a=>a!=3).toset) .writeln; } ``` You can overload the names inside the datastructure and eagerly lower it as needed if you need it in something nested I have some more workarounds, but they start to suck.
Jun 03
prev sibling next sibling parent reply =?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:
On 6/3/25 9:03 AM, Andy Valencia wrote:
 I have a Set data structure which has no concept of order; its members
 are stored and can be searched efficiently based on their hash.

 I have a situation where chain()'ing them together would be convenient,
 but InputRange requires front() and popFront(). There really _isn't_ a
 front
front() does not imply an order. It will work as long as you can provide the elements in a sequence. The built-in associative array feature is an example: import std.stdio; import std.range; void main() { int[int] squares; foreach (i; 0 .. 10) { squares[i] = i * i; } writeln("keys : ", squares.byKey); writeln("values: ", squares.byValue); } The elements are not in any particular order, yet associative arrays provide InputRanges (byKey and byValue return InputRange objects.): keys : [0, 6, 7, 2, 3, 1, 8, 5, 4, 9] values: [0, 36, 49, 4, 9, 1, 64, 25, 16, 81] Ali
Jun 03
parent reply Andy Valencia <dont spam.me> writes:
On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:
 front() does not imply an order. It will work as long as you 
 can provide the elements in a sequence. The built-in 
 associative array feature is an example:
Ok, but now the data structure is told to popFront(). I completely see how this would work for, say, slices. But when hashed, arbitrarily large, and unordered? Not so much. (I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.) Thanks, Andy
Jun 03
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Jun 03, 2025 at 11:54:20PM +0000, Andy Valencia via Digitalmars-d-learn
wrote:
 On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:
 front() does not imply an order. It will work as long as you can
 provide the elements in a sequence. The built-in associative array
 feature is an example:
Ok, but now the data structure is told to popFront(). I completely see how this would work for, say, slices. But when hashed, arbitrarily large, and unordered? Not so much. (I know how to code this walker using generators, but that does not play with the Range oriented API's so far as I can tell.)
[...] In general, it's a bad idea to conflate the container with the range over its elements. Built-in arrays are a bad example in this case, because they are both, and it just so happened that they work OK as such. But in general, containers should NOT be conflated with ranges. That only leads to wrong design. The correct way to do this is to create a method in the container that returns a range over its contents. Popping the range should NOT mutate the container. It should be regarded as something separate from the container. Only then will you get sane semantics. Trying to conflate an unordered container with a range will only lead to pain and bugs. T -- You have to expect the unexpected. -- RL
Jun 03
next sibling parent reply Andy Valencia <dont spam.me> writes:
On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 The correct way to do this is to create a method in the 
 container that returns a range over its contents.  Popping the 
 range should NOT mutate the container.  It should be regarded 
 as something separate from the container.  Only then will you 
 get sane semantics.  Trying to conflate an unordered container 
 with a range will only lead to pain and bugs.
So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents. Imagine that I now want to chain() across several of these, along the lines of: ```d foreach(val; s1) { ...; } // But now: foreach(val; chain(s1, s2, s3)) { ...; } ``` What API do I add to my container so that chain will do its magic? I don't see the infrastructure under std.range.chain using it. Or is there a non-Range chain I haven't found yet? Thank you again, Andy
Jun 03
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, June 3, 2025 9:39:35 PM Mountain Daylight Time Andy Valencia via
Digitalmars-d-learn wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 The correct way to do this is to create a method in the
 container that returns a range over its contents.  Popping the
 range should NOT mutate the container.  It should be regarded
 as something separate from the container.  Only then will you
 get sane semantics.  Trying to conflate an unordered container
 with a range will only lead to pain and bugs.
So, concretely, my container has an opApply method, and because of that foreach works to iterate its contents. Imagine that I now want to chain() across several of these, along the lines of: ```d foreach(val; s1) { ...; } // But now: foreach(val; chain(s1, s2, s3)) { ...; } ``` What API do I add to my container so that chain will do its magic? I don't see the infrastructure under std.range.chain using it. Or is there a non-Range chain I haven't found yet?
chain chains two ranges with the same element type. opApply doesn't really have anything to do with ranges. opApply was the way to give a type the ability to be used with foreach in D1, and it was left in for D2. You can still use it, but in general, the preferred approach in D2 is to use ranges. Typically, a container will implement opSlice (or opIndex, since confusingly, both can be used in this case) and have it return a range over the container. So, then you can do auto range = myContainer[]; and then iterate over the range. You can then also do foreach(e; myContainer[]) { ... } but in addition to foreach understanding the range API, it also will slice what it's given if that type can be sliced with no arguments. So, foreach(e; myContainer) { ... } will have the same semantics if myContainer implements opSlice with no arguments, and its opSlice returns a range. For the range API to work, a type needs to implement bool empty(); T front(); void popFront(); and ideally it would also implement typeof(this) save(); so that you can get a copy of the range and thus save the current iteration state. If you look at std.container, each container type follows this pattern. It implements opSlice which then returns a range over the container. You can then use the range over the container with range-based functions such as chain. - Jonathan M Davis
Jun 04
parent Andy Valencia <dont spam.me> writes:
On Wednesday, 4 June 2025 at 10:47:24 UTC, Jonathan M Davis wrote:
 Typically, a container will implement opSlice (or opIndex, 
 since confusingly, both can be used in this case) and have it 
 return a range over the container. So, then you can do

     auto range = myContainer[];
...
Brilliant, thank you. Now I know what I'm looking for, and I see the relevance of all the bits and pieces I'll need. Andy
Jun 04
prev sibling parent reply Monkyyy <crazymonkyyy gmail.com> writes:
On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with 
 ranges. That only leads to wrong design.

 range should NOT mutate the container.  It should be regarded 
 as something separate from the container.
Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
Jun 04
parent reply =?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:
On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with ranges. That
 only leads to wrong design.

 range should NOT mutate the container.  It should be regarded as
 something separate from the container.
Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
I don't understand. Like H. S. Teoh said, a range over a container should not alter the container. I'm not aware of any data structure where that's not the case. If a container has a way of accessing the elements (which it definitely has to because otherwise it's not really a container), then a light-weight range can be specified over the elements. Ali
Jun 04
next sibling parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Wed, Jun 04, 2025 at 12:00:57PM -0700, Ali Çehreli via Digitalmars-d-learn
wrote:
 On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with ranges.
 That only leads to wrong design.

 range should NOT mutate the container.  It should be regarded as
 something separate from the container.
Not "general". It's a safety vs speed tradeoff. Immutable whatever's data structure do just make allot of copies or allot of pointer overhead and indirection.
What has this got to do with immutable? I never said the container should be immutable, only that range over its contents should not mutate it. T -- Being forced to write comments actually improves code, because it is easier to fix a crock than to explain it. -- G. Steele
Jun 04
prev sibling parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Wednesday, 4 June 2025 at 19:00:57 UTC, Ali Çehreli wrote:
 On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with
ranges. That
 only leads to wrong design.

 range should NOT mutate the container.  It should be
regarded as
 something separate from the container.
Not "general". It's a safety vs speed tradeoff. Immutable
whatever's
 data structure do just make allot of copies or allot of
pointer overhead
 and indirection.
I don't understand. Like H. S. Teoh said, a range over a container should not alter the container. I'm not aware of any data structure where that's not the case. If a container has a way of accessing the elements (which it definitely has to because otherwise it's not really a container), then a light-weight range can be specified over the elements. Ali
Iota should mutate the underlining data and does. Unjustifiably slices are in the weird limbo, the autodecoding's popFront is raw mutation. I think the some of the rng primitives also just mutate. Nullable is a range for some God-Forsaken reason, it would have to mutate data. And thats just whats in phoboes rn, not even stuff Id write(op said he had a set and was worried about the speed). The purist "view of data" came after, and stuff like sorting *absolutely shouldnt* be held to it. Trade offs. 80% of datastructures and 90% of algorthims should be functional-y; but dont go full purist.
Jun 04
parent reply =?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:
On 6/4/25 12:53 PM, monkyyy wrote:

 Iota should mutate the underlining data and does.
There is no underlying data for some ranges like iota because they are generators. Yes, it's a complication that D uses the term "range" for generators as well.
 Unjustifiably slices
 are in the weird limbo,
My mental model of slices differ from many old-timer D people here. There is no limbo in my mental model: Slices are always a lightweight interface (i.e. the range) to contained elements. What is missing in D language is that there is always a container of elements that the slice provides access to. The fact that the container (the array) not having a name causes the confusion. People necessarily say "add an element to the slice". That is not accurate in my mental model: The element is always added to the container, the slice is just an interface to it. Yes, it's weird that an element accessor is capable of adding elements to the unnamed array. I accept my mental model despite this weirdness.
 the autodecoding's popFront is raw mutation.
As the following example shows, there is no mutation to the contained elements though. Only the slice (the range) changes: import std.range : popFront; void main() { auto slice1 = [ 1, 10, 42 ]; auto slice2 = slice1; slice2.popFront(); assert(slice1 == [ 1, 10, 42 ]); // <-- No mutation }
 I
 think the some of the rng primitives also just mutate. Nullable is a
 range for some God-Forsaken reason, it would have to mutate data.
I wasn't aware of that. I wonder whether someone thought it would improve generality? But that topic is beside the point here, which was the separation of containers and ranges. Ali
Jun 04
parent monkyyy <crazymonkyyy gmail.com> writes:
On Wednesday, 4 June 2025 at 20:34:30 UTC, Ali Çehreli wrote:
 generators

 My mental model of slices differ
 I
 think the some of the rng primitives also just mutate.
Nullable is a
 range for some God-Forsaken reason, it would have to mutate
data. I wasn't aware of that. I wonder whether someone thought it would improve generality? But that topic is beside the point here, which was the separation of containers and ranges. Ali
Nullable is a 1 or 0 length container which did not need to have a [] called on it; and I dont really want to hear you call it a generator. Your already making an exception for the "ranges are views of data"; its just not surviving contact with all counter examples.
Jun 04
prev sibling parent Andy Valencia <dont spam.me> writes:
On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:
 I have a situation where chain()'ing them together would be 
 convenient, but InputRange requires front() and popFront().
As an exercise in chaining opApply based containers, I tried: ```d // Chain across containers; their type is T and what they hold is type B class Chain(T,B) { private T[] members; this(T...)(T args) { foreach(m; args) { this.members ~= *m; } } int opApply(int delegate(ref B) ops) const { int res = 0; foreach(m; this.members) { foreach(val; m) { res = ops(val); if (res) { break; } } } return res; } } ``` I'm a little iffy on the walk of the args and the *m deref. But it tests out OK in practice. Using one looks like: ```d import tiny.set : Set; alias SetInt = Set!int; auto s1 = new SetInt(); s1.add(1); s1.add(2); s1.add(3); auto s2 = new SetInt(); s2.add(4); s2.add(5); s2.add(6); auto s3 = new SetInt(); foreach(x; new Chain!(SetInt,int)(s1, s2)) { s3.add(x); } ``` Andy
Jun 04