digitalmars.D.learn - InputRange for data structure without order?

Andy Valencia (11/11) Jun 03 I have a Set data structure which has no concept of order; its

monkyyy (50/61) Jun 03 ```d
=?UTF-8?Q?Ali_=C3=87ehreli?= (19/24) Jun 03 front() does not imply an order. It will work as long as you can provide...

Andy Valencia (8/11) Jun 03 Ok, but now the data structure is told to popFront(). I

H. S. Teoh (15/26) Jun 03 [...]

Andy Valencia (18/24) Jun 03 So, concretely, my container has an opApply method, and because

Jonathan M Davis (35/57) Jun 04 chain chains two ranges with the same element type. opApply doesn't real...

Andy Valencia (4/9) Jun 04 Brilliant, thank you. Now I know what I'm looking for, and I see

Monkyyy (4/8) Jun 04 Not "general". It's a safety vs speed tradeoff. Immutable

=?UTF-8?Q?Ali_=C3=87ehreli?= (8/17) Jun 04 I don't understand. Like H. S. Teoh said, a range over a container

H. S. Teoh (7/18) Jun 04 What has this got to do with immutable? I never said the container
monkyyy (12/35) Jun 04 Iota should mutate the underlining data and does. Unjustifiably

=?UTF-8?Q?Ali_=C3=87ehreli?= (28/35) Jun 04 There is no underlying data for some ranges like iota because they are

monkyyy (6/17) Jun 04 Nullable is a 1 or 0 length container which did not need to have

Andy Valencia (46/48) Jun 04 As an exercise in chaining opApply based containers, I tried:

Andy Valencia <dont spam.me> writes:

I have a Set data structure which has no concept of order; its 
members are stored and can be searched efficiently based on their 
hash.

I have a situation where chain()'ing them together would be 
convenient, but InputRange requires front() and popFront().  
There really _isn't_ a front, and creating a new Set with an 
element removed would be unpleasant.

What would be the best way to iterate across a list of such 
Set's?  For now I'm just going to write the explicit loop.

Thanks!
Andy

Jun 03

monkyyy <crazymonkyyy gmail.com> writes:

On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:
 I have a Set data structure which has no concept of order; its 
 members are stored and can be searched efficiently based on 
 their hash.

 I have a situation where chain()'ing them together would be 
 convenient, but InputRange requires front() and popFront().  
 There really _isn't_ a front, and creating a new Set with an 
 element removed would be unpleasant.

 What would be the best way to iterate across a list of such 
 Set's?  For now I'm just going to write the explicit loop.

 Thanks!
 Andy

```d
import std;
struct myset{
	bool has3;
	bool has5;
	auto opBinary(string s:"+")(typeof(this) a){
		return myset(has3||a.has3,has5||a.has5);
	}
	alias chain=opBinary!"+";
	int front(){
		if(has3){ return 3;}
		if(has5){ return 5;}
		assert(0);
	}
	void popFront(){
		if(has3){has3=false; return;}
		has5=false;
	}
	bool empty()=>( ! has3) && ( ! has5);
}
myset toset(R)(R r){
	myset o;
	foreach(i;r){
		if(i==3){o.has3=true;}
		if(i==5){o.has5=true;}
	}
	return o;
}
unittest{
	myset a,b;
	a.has3=true;
	b.has5=true;
	(a+b).writeln;
	
	a.chain(a).chain(a).writeln;
	b.chain(b).chain(b).writeln;
	a.chain(b).writeln;
	
	auto c=a+b;
	c.map!(a=>a*2).writeln;
	iota(10).filter!(a=>a==3).toset.chain(
		iota(10).filter!(a=>a!=3).toset)
		.writeln;
}
```

You can overload the names inside the datastructure and eagerly 
lower it as needed
if you need it in something nested I have some more workarounds, 
but they start to suck.

Jun 03

=?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:

On 6/3/25 9:03 AM, Andy Valencia wrote:
 I have a Set data structure which has no concept of order; its members
 are stored and can be searched efficiently based on their hash.

 I have a situation where chain()'ing them together would be convenient,
 but InputRange requires front() and popFront(). There really _isn't_ a
 front

front() does not imply an order. It will work as long as you can provide 
the elements in a sequence. The built-in associative array feature is an 
example:

import std.stdio;
import std.range;

void main() {
     int[int] squares;
     foreach (i; 0 .. 10) {
         squares[i] = i * i;
     }

     writeln("keys  : ", squares.byKey);
     writeln("values: ", squares.byValue);
}

The elements are not in any particular order, yet associative arrays 
provide InputRanges (byKey and byValue return InputRange objects.):

keys  : [0, 6, 7, 2, 3, 1, 8, 5, 4, 9]
values: [0, 36, 49, 4, 9, 1, 64, 25, 16, 81]

Ali

Jun 03

Andy Valencia <dont spam.me> writes:

On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali Çehreli wrote:
 front() does not imply an order. It will work as long as you 
 can provide the elements in a sequence. The built-in 
 associative array feature is an example:

Ok, but now the data structure is told to popFront().  I 
completely see how this would work for, say, slices.  But when 
hashed, arbitrarily large, and unordered?  Not so much.

(I know how to code this walker using generators, but that does 
not play with the Range oriented API's so far as I can tell.)

Thanks,
Andy

Jun 03

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Tue, Jun 03, 2025 at 11:54:20PM +0000, Andy Valencia via Digitalmars-d-learn
wrote:
 On Tuesday, 3 June 2025 at 20:51:05 UTC, Ali �ehreli wrote:
 front() does not imply an order. It will work as long as you can
 provide the elements in a sequence. The built-in associative array
 feature is an example:

 
 Ok, but now the data structure is told to popFront().  I completely
 see how this would work for, say, slices.  But when hashed,
 arbitrarily large, and unordered?  Not so much.
 
 (I know how to code this walker using generators, but that does not
 play with the Range oriented API's so far as I can tell.)

[...]

In general, it's a bad idea to conflate the container with the range
over its elements.  Built-in arrays are a bad example in this case,
because they are both, and it just so happened that they work OK as
such. But in general, containers should NOT be conflated with ranges.
That only leads to wrong design.

The correct way to do this is to create a method in the container that
returns a range over its contents.  Popping the range should NOT mutate
the container.  It should be regarded as something separate from the
container.  Only then will you get sane semantics.  Trying to conflate
an unordered container with a range will only lead to pain and bugs.


T

-- 
You have to expect the unexpected. -- RL

Jun 03

Andy Valencia <dont spam.me> writes:

On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 The correct way to do this is to create a method in the 
 container that returns a range over its contents.  Popping the 
 range should NOT mutate the container.  It should be regarded 
 as something separate from the container.  Only then will you 
 get sane semantics.  Trying to conflate an unordered container 
 with a range will only lead to pain and bugs.

So, concretely, my container has an opApply method, and because 
of that foreach works to iterate its contents.  Imagine that I 
now want to chain() across several of these, along the lines of:

```d
     foreach(val; s1) {
         ...;
     }

     // But now:
     foreach(val; chain(s1, s2, s3)) {
         ...;
     }
```

What API do I add to my container so that chain will do its 
magic?  I don't see the infrastructure under std.range.chain 
using it.  Or is there a non-Range chain I haven't found yet?

Thank you again,
Andy

Jun 03

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, June 3, 2025 9:39:35 PM Mountain Daylight Time Andy Valencia via
Digitalmars-d-learn wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 The correct way to do this is to create a method in the
 container that returns a range over its contents.  Popping the
 range should NOT mutate the container.  It should be regarded
 as something separate from the container.  Only then will you
 get sane semantics.  Trying to conflate an unordered container
 with a range will only lead to pain and bugs.

 So, concretely, my container has an opApply method, and because
 of that foreach works to iterate its contents.  Imagine that I
 now want to chain() across several of these, along the lines of:

 ```d
      foreach(val; s1) {
          ...;
      }

      // But now:
      foreach(val; chain(s1, s2, s3)) {
          ...;
      }
 ```

 What API do I add to my container so that chain will do its
 magic?  I don't see the infrastructure under std.range.chain
 using it.  Or is there a non-Range chain I haven't found yet?

chain chains two ranges with the same element type. opApply doesn't really
have anything to do with ranges. opApply was the way to give a type the
ability to be used with foreach in D1, and it was left in for D2. You can
still use it, but in general, the preferred approach in D2 is to use ranges.

Typically, a container will implement opSlice (or opIndex, since
confusingly, both can be used in this case) and have it return a range over
the container. So, then you can do

    auto range = myContainer[];

and then iterate over the range. You can then also do

    foreach(e; myContainer[])
    {
        ...
    }

but in addition to foreach understanding the range API, it also will slice
what it's given if that type can be sliced with no arguments. So,

    foreach(e; myContainer)
    {
        ...
    }

will have the same semantics if myContainer implements opSlice with no
arguments, and its opSlice returns a range.

For the range API to work, a type needs to implement

    bool empty();
    T front();
    void popFront();

and ideally it would also implement

    typeof(this) save();

so that you can get a copy of the range and thus save the current iteration
state.

If you look at std.container, each container type follows this pattern. It
implements opSlice which then returns a range over the container.

You can then use the range over the container with range-based functions
such as chain.

- Jonathan M Davis

Jun 04

Andy Valencia <dont spam.me> writes:

On Wednesday, 4 June 2025 at 10:47:24 UTC, Jonathan M Davis wrote:
 Typically, a container will implement opSlice (or opIndex, 
 since confusingly, both can be used in this case) and have it 
 return a range over the container. So, then you can do

     auto range = myContainer[];
...

Brilliant, thank you.  Now I know what I'm looking for, and I see 
the relevance of all the bits and pieces I'll need.

Andy

Jun 04

Monkyyy <crazymonkyyy gmail.com> writes:

On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with 
 ranges. That only leads to wrong design.

 range should NOT mutate the container.  It should be regarded 
 as something separate from the container.

Not "general". It's a safety vs speed tradeoff. Immutable 
whatever's data structure do just make allot of copies or allot 
of pointer overhead and indirection.

Jun 04

=?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:

On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with ranges. That
 only leads to wrong design.

 range should NOT mutate the container.  It should be regarded as
 something separate from the container.

 Not "general". It's a safety vs speed tradeoff. Immutable whatever's
 data structure do just make allot of copies or allot of pointer overhead
 and indirection.

I don't understand. Like H. S. Teoh said, a range over a container 
should not alter the container. I'm not aware of any data structure 
where that's not the case.

If a container has a way of accessing the elements (which it definitely 
has to because otherwise it's not really a container), then a 
light-weight range can be specified over the elements.

Ali

Jun 04

"H. S. Teoh" <hsteoh qfbox.info> writes:

On Wed, Jun 04, 2025 at 12:00:57PM -0700, Ali �ehreli via Digitalmars-d-learn
wrote:
 On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with ranges.
 That only leads to wrong design.

 range should NOT mutate the container.  It should be regarded as
 something separate from the container.

 Not "general". It's a safety vs speed tradeoff. Immutable whatever's
 data structure do just make allot of copies or allot of pointer
 overhead and indirection.


What has this got to do with immutable?  I never said the container
should be immutable, only that range over its contents should not mutate
it.


T

-- 
Being forced to write comments actually improves code, because it is easier to
fix a crock than to explain it. -- G. Steele

Jun 04

monkyyy <crazymonkyyy gmail.com> writes:

On Wednesday, 4 June 2025 at 19:00:57 UTC, Ali Çehreli wrote:
 On 6/4/25 4:10 AM, Monkyyy wrote:
 On Wednesday, 4 June 2025 at 02:11:18 UTC, H. S. Teoh wrote:
 . But in general, containers should NOT be conflated with


 ranges. That
 only leads to wrong design.

 range should NOT mutate the container.  It should be


 regarded as
 something separate from the container.

 Not "general". It's a safety vs speed tradeoff. Immutable

 whatever's
 data structure do just make allot of copies or allot of

 pointer overhead
 and indirection.

 I don't understand. Like H. S. Teoh said, a range over a 
 container should not alter the container. I'm not aware of any 
 data structure where that's not the case.

 If a container has a way of accessing the elements (which it 
 definitely has to because otherwise it's not really a 
 container), then a light-weight range can be specified over the 
 elements.

 Ali

Iota should mutate the underlining data and does. Unjustifiably 
slices are in the weird limbo, the autodecoding's popFront is raw 
mutation. I think the some of the rng primitives also just 
mutate. Nullable is a range for some God-Forsaken reason, it 
would have to mutate data.

And thats just whats in phoboes rn, not even stuff Id write(op 
said he had a set and was worried about the speed).

The purist "view of data" came after, and stuff like sorting 
*absolutely shouldnt* be held to it.

Trade offs. 80% of datastructures and 90% of algorthims should be 
functional-y; but dont go full purist.

Jun 04

=?UTF-8?Q?Ali_=C3=87ehreli?= <acehreli yahoo.com> writes:

On 6/4/25 12:53 PM, monkyyy wrote:

 Iota should mutate the underlining data and does.

There is no underlying data for some ranges like iota because they are 
generators. Yes, it's a complication that D uses the term "range" for 
generators as well.

 Unjustifiably slices
 are in the weird limbo,

My mental model of slices differ from many old-timer D people here. 
There is no limbo in my mental model: Slices are always a lightweight 
interface (i.e. the range) to contained elements.

What is missing in D language is that there is always a container of 
elements that the slice provides access to. The fact that the container 
(the array) not having a name causes the confusion. People necessarily 
say "add an element to the slice". That is not accurate in my mental 
model: The element is always added to the container, the slice is just 
an interface to it.

Yes, it's weird that an element accessor is capable of adding elements 
to the unnamed array. I accept my mental model despite this weirdness.

 the autodecoding's popFront is raw mutation.

As the following example shows, there is no mutation to the contained 
elements though. Only the slice (the range) changes:

import std.range : popFront;

void main() {
     auto slice1 = [ 1, 10, 42 ];

     auto slice2 = slice1;
     slice2.popFront();

     assert(slice1 == [ 1, 10, 42 ]);    // <-- No mutation
}

 I
 think the some of the rng primitives also just mutate. Nullable is a
 range for some God-Forsaken reason, it would have to mutate data.

I wasn't aware of that. I wonder whether someone thought it would 
improve generality? But that topic is beside the point here, which was 
the separation of containers and ranges.

Ali

Jun 04

monkyyy <crazymonkyyy gmail.com> writes:

On Wednesday, 4 June 2025 at 20:34:30 UTC, Ali Çehreli wrote:
 generators

 My mental model of slices differ
 I
 think the some of the rng primitives also just mutate.

 Nullable is a
 range for some God-Forsaken reason, it would have to mutate

 data.

 I wasn't aware of that. I wonder whether someone thought it 
 would improve generality? But that topic is beside the point 
 here, which was the separation of containers and ranges.

 Ali

Nullable is a 1 or 0 length container which did not need to have 
a [] called on it; and I dont really want to hear you call it a 
generator.

Your already making an exception for the "ranges are views of 
data"; its just not surviving contact with all counter examples.

Jun 04

Andy Valencia <dont spam.me> writes:

On Tuesday, 3 June 2025 at 16:03:15 UTC, Andy Valencia wrote:
 I have a situation where chain()'ing them together would be 
 convenient, but InputRange requires front() and popFront().

As an exercise in chaining opApply based containers, I tried:

```d
// Chain across containers; their type is T and what they hold is 
type B
class Chain(T,B) {
     private T[] members;
     this(T...)(T args) {
         foreach(m; args) {
             this.members ~= *m;
         }
     }
     int
     opApply(int delegate(ref B) ops) const {
         int res = 0;
         foreach(m; this.members) {
             foreach(val; m) {
                 res = ops(val);
                 if (res) {
                     break;
                 }
             }
         }
         return res;
     }
}
```

I'm a little iffy on the walk of the args and the *m deref.  But 
it tests out OK in practice.  Using one looks like:

```d
     import tiny.set : Set;
     alias SetInt = Set!int;

     auto s1 = new SetInt();
     s1.add(1);
     s1.add(2);
     s1.add(3);
     auto s2 = new SetInt();
     s2.add(4);
     s2.add(5);
     s2.add(6);
     auto s3 = new SetInt();
     foreach(x; new Chain!(SetInt,int)(s1, s2)) {
         s3.add(x);
     }
```

Andy

Jun 04

D Programming

C/C++ Programming

Other

digitalmars.D.learn - InputRange for data structure without order?