www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Ranges with length that cannot be named

reply berni <someone something.org> writes:
We are used to divide ranges in two: Ranges with length and 
infinite ranges. But there is something in between. Some ranges 
have a length, that does not fit in an (u)long and therefore 
cannot be named in D. The same might be the case, when the 
calculation of the length is somehow problematic.

Think of iota(-5.0,double.max,1.0) or iota(8.5300000000000011, 
438.17803623529841, 0.4296480362352984) - both given in issue 


Having thought about this for some time now, I think, there is a 
certain parallel to floatingpoint numbers: We know, that there 
are numbers between 1.0L and 1.0L+real.epsilon but we cannot name 
them.

While we learned to live with the gaps in floatingpoint numbers, 
we don't have an idea on how to cope with those ranges, yet.

When investigating on the subject I often found people claiming, 
that ranges of that length are of no use and therefore we do not 
need to care. But I think, this is not true. Infinite ranges are 
even longer and still usefull. Think of 
iota(...).drop(100).take(50) or similar or in case of random 
access ranges, slicing or accessing some elements.

As long, as the length is not queried, these ranges are usefull. 
And even if the length is queried, the information, that it 
cannot be named, might be sufficient. For me, the question 
therefore is mainly, how to tell the caller of length(), that it 
cannot be named.

Some ideas came to my mind:

a) return 0, but empty()==false
b) throw an Exception
c) return ulong.max (or size_t.max or long.max or whatever type 
is used for the length of ranges)
d) using BigInt

My thoughts on that:

a) Has the advantage, that length() can be nothrow. This solution 
might break existing code, but I think, this affects only code 
that's broken anyway or that uses length instead of empty to 
decide if the range is empty, which is somewhat bad style.

b) Is a clean solution but might add the need for exception 
handling, where in lots of cases no exception is to be expected.

c) On first sight, this doesn't feel good. But meanwhile I think, 
that it might work quite well in practice. (But I'm not sure on 
this and might be completely wrong.)

d) I dislike this approach. Most usecases work with much smaller 
numbers and BigInt would slow them down all. And it still would 
not solve the problem, when calculation is difficult.

Im curious on your oppinion or better ideas. :-)
Sep 18 2019
next sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Wednesday, 18 September 2019 at 19:10:43 UTC, berni wrote:
 We are used to divide ranges in two: Ranges with length and 
 infinite ranges. But there is something in between. Some ranges 
 have a length, that does not fit in an (u)long and therefore 
 cannot be named in D. The same might be the case, when the 
 calculation of the length is somehow problematic.

 [...]
Maybe I'm misunderstanding, but are you missing that you can just not define the length member? Then your range doesn't have a defined length (it might be infinite, it might be length 0, it might 2*ulong.max).
Sep 18 2019
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Wednesday, 18 September 2019 at 19:21:23 UTC, John Colvin 
wrote:
 Maybe I'm misunderstanding, but are you missing that you can 
 just not define the length member? Then your range doesn't have 
 a defined length (it might be infinite, it might be length 0, 
 it might 2*ulong.max).
There is also quite a nice thing that Rust has for its iterators, where it defines the bounds that the length could be. This means that for example one can define cases with a clear upper bound on length but no well defined lower bound (e.g. the results of applying a filter to a range), but also cases which have a clear _lower_ bound without defining an upper bound (which could have several uses, including the case described here where the length is guaranteed finite but larger than size_t.max).
Sep 18 2019
prev sibling parent reply berni <someone something.org> writes:
On Wednesday, 18 September 2019 at 19:21:23 UTC, John Colvin 
wrote:
 Maybe I'm misunderstanding, but are you missing that you can 
 just not define the length member? Then your range doesn't have 
 a defined length (it might be infinite, it might be length 0, 
 it might 2*ulong.max).
Yes and no. We, in the real world, know, that the range is finite. And in most cases, we also know, that the range has more than 0 elements. But either it's too long (e.g. 2*ulong.max as you write) or the computation is too complex to calculate the length in reasonable time. The problem is, that we cannot just leave length undefined, making it a non random access range, because in most situations length is needed. An example:
 iota(0.0,10.0,1.0)
 iota(0.0,1000000000.0,1.0)
 iota(0.0,100000000000000000000.0,1.0)
 iota(0.0,1000000000000000000000000000000000000000000000000.0,1.0)
Should iota have length or not? Obviously, the first three versions should have length, and it's more likely, that they will appear in applications. But the fourth is too long. It wouldn't justify to remove the length property from iota, just because this fourth one has no "defined length". And the question is, what to do.
Sep 18 2019
parent berni <someone something.org> writes:
On Thursday, 19 September 2019 at 05:30:23 UTC, berni wrote:
 iota(0.0,100000000000000000000.0,1.0)
Sorry, should have been one zero less...
Sep 18 2019
prev sibling next sibling parent Dominikus Dittes Scherkl <dominikus.scherkl continental-corporation.com> writes:
On Wednesday, 18 September 2019 at 19:10:43 UTC, berni wrote:
 We are used to divide ranges in two: Ranges with length and 
 infinite ranges. But there is something in between. Some ranges 
 have a length, that does not fit in an (u)long and therefore 
 cannot be named in D. The same might be the case, when the 
 calculation of the length is somehow problematic.
[...]
 Some ideas came to my mind:

 a) return 0, but empty()==false
 b) throw an Exception
 c) return ulong.max (or size_t.max or long.max or whatever type 
 is used for the length of ranges)
 d) using BigInt
I would go with (c) [better: return T.max, with T the type of length] because why would you ask for the length? Because you want to ensure that you can get enough from the stream e.g. to fill some buffer or to check if it would fit in some data structure. And to both the answer T.max would be fine: you can always get enough and it won't fit in whatever you have reserved. (d) would only be useful if the target is also a range with voldemort-length, which would very seldom the case, and even there T.max is a good starting point.
Sep 19 2019
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 9/18/19 3:10 PM, berni wrote:
 We are used to divide ranges in two: Ranges with length and infinite ranges.
This is incorrect - there are plenty of ranges with finite length that is, however, not known in O(1), such as the result of filter. The isInfinite trait describes ranges that are statically known to be infinite, and has turned out to be of moderate value. Also, there's a more productive view - instead of a classification of ranges, think of "length" as an optional attribute of a range. (This is related to the concepts vs. DbI debate. Concepts: "We define the RangeWithLength and RangeWithoutLength concept, have at it." DbI: "A range may or may not have length, I'll go about things accordingly.")`
Sep 25 2019
parent reply berni <someone something.org> writes:
On Wednesday, 25 September 2019 at 17:03:07 UTC, Andrei 
Alexandrescu wrote:
 Also, there's a more productive view - instead of a 
 classification of ranges, think of "length" as an optional 
 attribute of a range.
On a more abstract level this is indeed a nice view. In case of iota though, I'm lost a little bit with this, especially the version with "if (isFloatingPoint!(CommonType!(B, E, S)))". Depending on B, E and S this might have a calculatable length or not. Should we omit length here altogether? If not, how to cope with the cases, where we cannot calculate it?
Sep 25 2019
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 9/25/19 2:05 PM, berni wrote:
 On Wednesday, 25 September 2019 at 17:03:07 UTC, Andrei Alexandrescu wrote:
 Also, there's a more productive view - instead of a classification of 
 ranges, think of "length" as an optional attribute of a range.
On a more abstract level this is indeed a nice view. In case of iota though, I'm lost a little bit with this, especially the version with "if (isFloatingPoint!(CommonType!(B, E, S)))". Depending on B, E and S this might have a calculatable length or not. Should we omit length here altogether? If not, how to cope with the cases, where we cannot calculate it?
Yes, certain iota instantiations should not define length. For such ranges, length is not expressible as a size_t, and that's okay. Just manipulate it as any range with no known length.
Sep 25 2019