digitalmars.D - Requirements for Slicing Ranges
- Jonathan M Davis (31/31) Sep 01 One thing that I've been thinking about with regards to improving the ra...
- Richard (Rikki) Andrew Cattermole (10/11) Sep 01 Indexing and range intervals are indeed distinct things.
- Jonathan M Davis (31/42) Sep 01 The range API uses size_t for indices, and it is required that the indic...
- Richard (Rikki) Andrew Cattermole (2/56) Sep 01 Your question appears to be answered :)
- Jonathan M Davis (16/17) Sep 01 Probably. I'm just trying to make sure that there isn't some corner case
- Sebastiaan Koppe (11/22) Sep 02 I briefly wondered how it would work if the range is
- Paul Backus (16/19) Sep 02 Taking a slice does not necessarily copy the range, nor does it
- Sebastiaan Koppe (12/21) Sep 03 Apologies, I expressed myself poorly, and I have to admit it is a
- Jonathan M Davis (14/36) Sep 03 Well, with the proposed API, that would not be possible, because the abi...
- Jonathan M Davis (100/124) Sep 02 Slicing doesn't copy the underlying data. It just gives a range which re...
- Paul Backus (4/10) Sep 02 Infinite forward ranges can implement slicing, but can't be
- Jonathan M Davis (33/43) Sep 02 Well, with the current rules, infinite ranges don't require back in orde...
- monkyyy (29/72) Sep 02 I think slicing should probably be cut from that main api, Id
- Jonathan M Davis (22/49) Sep 02 As things stand, slicing finite ranges (and infinite ranges when using $...
- monkyyy (23/87) Sep 03 As thing stands, I bet that the 50ish algorthims that should work
- Quirin Schroll (21/44) Sep 05 I can answer this direction immediately and formally. If a range
One thing that I've been thinking about with regards to improving the range API which I forgot to discuss previously is the requirements for hasSlicing. Right now, for whatever reason, slicing is divorced from random-access ranges. In practice, random-access ranges have slicing (unless the programmer just didn't bother to implement it), and other ranges don't, but currently, hasSlicing is a separate trait that can theoretically pass with a forward range, and it's not guaranteed that a random-access range has slicing. So, my question is: 1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1). 2. Are there any realistic situations where a range could be a random-access range but _can't_ implement slicing in O(1)? So, it would need to be possible to access individual elements by index in O(1) and yet somehow not be possible to take slices of elements by index in O(1). As far as I can tell, the ability to index a range and the ability to slice it are inextricably linked. If you can do range[5 .. 12] or range[7 .. $] in O(1), then you should be able to do range[5] or range[12] in O(1). Similarly, if you can do range[5] in O(1), it should be possible to do range[5 .. 12] in O(1). Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be. Can anyone see something that I'm missing here? - Jonathan M Davis
Sep 01
On 02/09/2024 2:47 PM, Jonathan M Davis wrote:Can anyone see something that I'm missing here?Indexing and range intervals are indeed distinct things. An interval does not have to evaluate to a specific bit of data, but an index does. They can also use non-integral keys. Typically both ``size_t`` and ``ptrdiff_t`` will have defined sequentiality, but they may not in the case of a map. Whereby an index works but not an interval. So the question is, how many assumptions surrounding sequentiality is being made, along with the key type?
Sep 01
On Sunday, September 1, 2024 9:31:23 PM MDT Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:On 02/09/2024 2:47 PM, Jonathan M Davis wrote:The range API uses size_t for indices, and it is required that the indices be sequential. range[3] gives you the 4th element and range[3 .. 9] gives you a 6 element range starting at the 4th element. Containers may do other things with keys (particularly containers like hash tables and sorted maps), but the range API treats indices the same way that arrays do. Doing anything else would complicate things considerably, and if anything, the range API is arguably already too complex. In essence, the range API provides an abstraction based on arrays, with random-access ranges providing the full capabilities of a dynamic array / slice aside from the capabilities related to increasing their size or doing any kind allocations. Each lower category of range then loses capabilities as it moves further away from being an array, with basic input ranges just providing a way to iterate through the elements once and nothing else. Or, if you want to view the situation in reverse, we start with a basic input range being only able to iterate through a sequential list of elements once, and each category of range builds up towards the capabilities of an array. As such, the situation with keys is simple in that there are no keys in the general sense. We're just dealing with indices, and an index needs to mean the same thing that it would with an array. It's simply a question of where in the hierarchy we put each capability. Indexing requires random-access ranges. Slicing at present does not, but as far as I can tell, from a technical persective, with regards to a sequential data structure such as a range, it is not possible to implement indexing without also being able to implement slicing, and I don't see how it would be possible to implement slicing but not be possible to implement indexing. Non-sequential data structures really don't have anything to do with ranges aside from cases where such a data structure provides a sequential view into its data via a range. - Jonathan M DavisCan anyone see something that I'm missing here?Indexing and range intervals are indeed distinct things. An interval does not have to evaluate to a specific bit of data, but an index does. They can also use non-integral keys. Typically both ``size_t`` and ``ptrdiff_t`` will have defined sequentiality, but they may not in the case of a map. Whereby an index works but not an interval. So the question is, how many assumptions surrounding sequentiality is being made, along with the key type?
Sep 01
On 02/09/2024 4:35 PM, Jonathan M Davis wrote:On Sunday, September 1, 2024 9:31:23 PM MDT Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:Your question appears to be answered :)On 02/09/2024 2:47 PM, Jonathan M Davis wrote:The range API uses size_t for indices, and it is required that the indices be sequential. range[3] gives you the 4th element and range[3 .. 9] gives you a 6 element range starting at the 4th element. Containers may do other things with keys (particularly containers like hash tables and sorted maps), but the range API treats indices the same way that arrays do. Doing anything else would complicate things considerably, and if anything, the range API is arguably already too complex. In essence, the range API provides an abstraction based on arrays, with random-access ranges providing the full capabilities of a dynamic array / slice aside from the capabilities related to increasing their size or doing any kind allocations. Each lower category of range then loses capabilities as it moves further away from being an array, with basic input ranges just providing a way to iterate through the elements once and nothing else. Or, if you want to view the situation in reverse, we start with a basic input range being only able to iterate through a sequential list of elements once, and each category of range builds up towards the capabilities of an array. As such, the situation with keys is simple in that there are no keys in the general sense. We're just dealing with indices, and an index needs to mean the same thing that it would with an array. It's simply a question of where in the hierarchy we put each capability. Indexing requires random-access ranges. Slicing at present does not, but as far as I can tell, from a technical persective, with regards to a sequential data structure such as a range, it is not possible to implement indexing without also being able to implement slicing, and I don't see how it would be possible to implement slicing but not be possible to implement indexing. Non-sequential data structures really don't have anything to do with ranges aside from cases where such a data structure provides a sequential view into its data via a range. - Jonathan M DavisCan anyone see something that I'm missing here?Indexing and range intervals are indeed distinct things. An interval does not have to evaluate to a specific bit of data, but an index does. They can also use non-integral keys. Typically both ``size_t`` and ``ptrdiff_t`` will have defined sequentiality, but they may not in the case of a map. Whereby an index works but not an interval. So the question is, how many assumptions surrounding sequentiality is being made, along with the key type?
Sep 01
On Sunday, September 1, 2024 10:54:02 PM MDT Richard (Rikki) Andrew Cattermole via Digitalmars-d wrote:Your question appears to be answered :)Probably. I'm just trying to make sure that there isn't some corner case that I'm missing. As far as I can tell, if you can index a range in O(1), it should be possible to implement slicing in O(1), and if you can slice a range in O(1), it should be possible to index it in O(1) (at least as long as the requirement to have slicing give you the same type remains in place), but the current range API does not make that assumption, which is kind of weird. So, I'm asking if anyone is able to see a hole in that logic or a corner case where it somehow really would make sense to be able to slice a range but not be able to also index that range. I don't _think_ that such a thing is possible, but it's possible that I'm missing something here, and someone else may see something that I don't. I'm 99% certain that I have a proper grasp of the situation, but I'd just like to be doubly sure. - Jonathan M Davis
Sep 01
On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be. Can anyone see something that I'm missing here? - Jonathan M DavisI briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime). But I remember you saying in another thread that forward ranges have to be copyable, correct? So I guess the question for me boils down to whether there exists a need for a non-copyable range that provides random access and doesn't allow copies. (Although I guess with ref and dip1000 you could make that work, but lets not go there :smile).
Sep 02
On Monday, 2 September 2024 at 09:59:04 UTC, Sebastiaan Koppe wrote:I briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime).Taking a slice does not necessarily copy the range, nor does it necessarily copy the range's elements. For example: struct DontCopyMe { this(this) { assert(0); } } void main() { auto r1 = [DontCopyMe(), DontCopyMe(), DontCopyMe()]; auto r2 = r1[1 .. $]; // No AssertError => elements not copied assert(r1 != r2); // Not equal => r2 is not a copy of r1 }
Sep 02
On Monday, 2 September 2024 at 11:53:55 UTC, Paul Backus wrote:On Monday, 2 September 2024 at 09:59:04 UTC, Sebastiaan Koppe wrote:Apologies, I expressed myself poorly, and I have to admit it is a bit of a stretch. I meant that taking a slice often creates another alias to the same data and/or state or part thereof. This is in contrast to indexing, which just returns the underlying item, possible as value. If you would want to restrict/control a range so much so that you make it non-copyable, you could argue you would want to avoiding taking a slice from it as well. It might very well have random access though, so you would end up with one that exposes indexing but not slicing.I briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime).Taking a slice does not necessarily copy the range, nor does it necessarily copy the range's elements.
Sep 03
On Tuesday, September 3, 2024 7:42:28 AM MDT Sebastiaan Koppe via Digitalmars- d wrote:On Monday, 2 September 2024 at 11:53:55 UTC, Paul Backus wrote:Well, with the proposed API, that would not be possible, because the ability to copy is what differentiates between a basic input range and a forward range (i.e. copying replaces save), in which case, if you have a non-copyable range, all you can do is iterate through the elements once. To have random access would require being a random-access range, which would require both being a bidirectional range and a forward range, so it would have to be copyable. Now, you could still give a basic input range opIndex, since you can have extra functions on your ranges, but the range API would not consider it a random-access range, and it would not work with the normal algorithms that required anything beyond a basic input range. - Jonathan M DavisOn Monday, 2 September 2024 at 09:59:04 UTC, Sebastiaan Koppe wrote:Apologies, I expressed myself poorly, and I have to admit it is a bit of a stretch. I meant that taking a slice often creates another alias to the same data and/or state or part thereof. This is in contrast to indexing, which just returns the underlying item, possible as value. If you would want to restrict/control a range so much so that you make it non-copyable, you could argue you would want to avoiding taking a slice from it as well. It might very well have random access though, so you would end up with one that exposes indexing but not slicing.I briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime).Taking a slice does not necessarily copy the range, nor does it necessarily copy the range's elements.
Sep 03
On Monday, September 2, 2024 3:59:04 AM MDT Sebastiaan Koppe via Digitalmars-d wrote:On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:Slicing doesn't copy the underlying data. It just gives a range which refers to fewer elements from the original where the new range can be iterated independently. Just think about dynamic arrays / slices (which is what the range API is attempting to create an abstraction of). If you do something like arr[5 .. 12], you get a new dynamic array which is a slice of the original dynamic array starting at index 5 and going through index 11. Each dynamic array / slice refers to the same memory, and none of the elements were actually copied. The same occurs when slicing a range. You get two ranges which can be independently iterated, but the elements themselves are not necessarily independent (they will be if the range is a value type, but in most cases, they'll refer to the same memory; it's just the iteration state which is guaranteed to be independent). Similarly, save gives you a range which is independent with regards to its iteration state. Both the copy and the original can be iterated without affecting the other, but the elements themselves are not necessarily copied (though in some cases they are), and mutating the elements in one will often mutate the elements in the other. If you want to make sure that all of the elements are copied, then you'd use something like std.array.array to copy all of the range's elements into a dynamic array. Essentially, save is equivalent to range[0 .. $], but there are plenty of ranges that can support copying their full iteration state but which can't support copying their iteration state while skipping a bunch of elements. With the proposed API, copying ranges in general then becomes essentially equivalent to save (though an implementation could use reference counting and avoid actually doing save internally until the range is mutated in order to avoid copying its state in the cases where there's only one copy). So, a non-copyable range would be equivalent to a basic input range - and in general, such a range is not a forward range precisely because it can't implement save in O(1). And if a range can't implement save in O(1), then it can't implement slicing in O(1). So, if a range has to be a basic input range, then it makes no sense for it to have slicing - and the current API does not allow such a range to have slicing, since hasSlicing checks for isForwardRange. My issue is that as far as I can tell, if a range can implement slicing in O(1), then it should be possible for it to implement indexing in O(1) (and vice-versa), in which case, we might as well simplify things and just have isRandomAccessRange require slicing and not support ranges which have slicing but not random-access with the idea that there's no reason why a range that can support one can't support the other.Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be. Can anyone see something that I'm missing here? - Jonathan M DavisI briefly wondered how it would work if the range is non-copyable, since taking a slice is effectively taking a copy of the underlying data the range captures (and its lifetime). But I remember you saying in another thread that forward ranges have to be copyable, correct?So I guess the question for me boils down to whether there exists a need for a non-copyable range that provides random access and doesn't allow copies. (Although I guess with ref and dip1000 you could make that work, but lets not go there :smile).If a range cannot implement save - or with the new API, cannot be copyable - then that's because it cannot copy its iteration state in O(1). I suppose that there could be cases where it's technically possible to implement save / copying in O(1), but the programmer doesn't want to (e.g. that would arguably make sense with random number generators, since copying those accidentally can cause result in reusing numbers). However, in the normal case, a range doesn't implement save (or isn't copyable in the new API), because it can't do it in O(1). If a range can't copy its iteration state in O(1), then slicing won't work (since that needs to be O(1), and it's copying the iteration state), and in such a case, I'm pretty sure that it's not possible to implement indexing in O(1) - though maybe someone can come up with a corner case where it is possible. But as long as the ability to implement indexing in O(1) implies the ability to implement slicing in O(1) (and vice versa), it would make sense to just tie random-access and slicing together in order to simplify the API. I guess that the question then is ranges which _could_ copy their iteration state in O(1) but where the programmer decided to not allow it for one reason or another. Sure, then it could be possible to implement random-access in O(1), since in that situation, the lack of copyability has nothing to do with the ability to implement it, but as things stand, bidirectional ranges are always forward ranges, and finite random-access ranges are always bidirectional ranges (whereas infinite random-access ranges are forward ranges but can't be bidirectional ranges for obvious reasons). So, to allow a non-copyable range to be be bidirectional or random-access would make it so that it's no longer the case that bidirectional ranges and random-access ranges are guaranteed to be forward ranges. And we _could_ technically do that, but that would certainly complicate things, and it would be for a pretty rare use case. Also, I'm pretty sure that it's the case that most algorithms which require access to the end of a range or to elements in the middle of a range also require the ability to copy the iteration state of a range. There will be exceptions, but I'm not sure that they're worth the trouble of further complicatin the range API. The question for which I created this thread was based on the assumption that a range would implement all of the functionality that it could. If it doesn't, then it's certainly the case that a range _could_ implement slicing and not be random-access - just like it's possible to have a range that could implement save but which doesn't. So, yeah, the situation changes considerably if we don't treat the range categories as being a linear hierarchy, but I'm inclined to keep the hierarchy as-is, because it's much simpler that way, and I expect that the need for something like a random-access range which can't be a forward range would be quite rare. The whole reason that I'm asking if anyone can think of an example where it would be possible to implement indexing but not slicing (or vice versa) is in order to be able to simplify the range API by getting rid of hasSlicing, whereas if we allow basic input ranges to be bidirectional or random-access, then those traits become more like hasSlicing, and the hierarchy is lost, which makes the whole situation more complicated rather than simpler. And in cases where you have something on the stack which you don't want to copy, then you have basically the same situation as static arrays, and you can always create a range that refers to that memory and use that rather than having a range be non-copyable because of where it is. Sure, that then starts getting into scope and DIP 1000 if you want the compiler to track stuff for you, but the way that range-based algorithms typically work, the dangers of escaping are pretty minimal, since either they take the argument by ref and mutate the original, or they take a copy and return a copy. - Jonathan M Davis
Sep 02
On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).Infinite forward ranges can implement slicing, but can't be random-access ranges because they don't support `r.back`.
Sep 02
On Monday, September 2, 2024 6:04:08 AM MDT Paul Backus via Digitalmars-d wrote:On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:Well, with the current rules, infinite ranges don't require back in order to be random-access ranges. You can have an infinite range which implements indexing - and therefore is a random-access range - and it is of course also a forward range, but it's not a bidirectional range, because it doesn't implement back. That corner case probably does cause occasional problems (since finite random-access ranges do have to be bidirectional ranges), because there's probably code that checks for random-access ranges without checking whether the range is infinite and then decides to use back, but you wouldn't usually try to use an infinite range with something that needed back anyway, because it wouldn't even make sense to try. Either way, I suspect that there are a variety of bugs out there with range-based code that doesn't actually take infinite ranges into account but which hasn't been a problem simply because infinite ranges aren't all that common. As for slicing an infinite range, if you slice it with two indices, you get a finite range (which probably causes problems for some code, because the type isn't the same, but it's obviously not possible to return the same type in that case), whereas if you slice it with an index and $, you get the same range type, just like you would when slicing a finite range. So, with regards to infinite ranegs, the question is whether it's possible to implement range[1 .. 5] (which gives a finite range) or range[5 .. $] (which gives the same type of range) in O(1) without being able to implement range[5] in O(1) - or if it's possible to implement range[5] in O(1), but it's not possible to implement range[1 .. 5] or range[5 .. $] in O(1). I'm pretty sure that they're inextricably linked and that if you can implement slicing, you can implement random-access (and vice-versa), but again, I could be missing something. Regardless, random-access and slicing infinite ranges gets a bit weird, because the "end" has to be treated differently, and you can't get a finite slice of the same type as the infinite range when slicing, since the lack of end is part of the type itself. - Jonathan M Davis1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).Infinite forward ranges can implement slicing, but can't be random-access ranges because they don't support `r.back`.
Sep 02
On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:One thing that I've been thinking about with regards to improving the range API which I forgot to discuss previously is the requirements for hasSlicing. Right now, for whatever reason, slicing is divorced from random-access ranges. In practice, random-access ranges have slicing (unless the programmer just didn't bother to implement it), and other ranges don't, but currently, hasSlicing is a separate trait that can theoretically pass with a forward range, and it's not guaranteed that a random-access range has slicing. So, my question is: 1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1). 2. Are there any realistic situations where a range could be a random-access range but _can't_ implement slicing in O(1)? So, it would need to be possible to access individual elements by index in O(1) and yet somehow not be possible to take slices of elements by index in O(1). As far as I can tell, the ability to index a range and the ability to slice it are inextricably linked. If you can do range[5 .. 12] or range[7 .. $] in O(1), then you should be able to do range[5] or range[12] in O(1). Similarly, if you can do range[5] in O(1), it should be possible to do range[5 .. 12] in O(1). Does anyone know of a situation where that would not be the case? I would love to just ditch the test for slicing in the updated range API and tie the ability to random-access ranges, since it would be simplify things. If I had to guess without digging through all of the history, I suspect that random-access ranges were added before slicing was and that that's why they weren't tied together, but in practice, they always seem to be, and I can't think of a reason at the moment why they can't be. Can anyone see something that I'm missing here? - Jonathan M DavisI think slicing should probably be cut from that main api, Id argue its the hardest to get correct of any of the functions while its the most fragile in meta programming in my experiments Instead there should be a collection of implict range functions given special treatment(std.range.interface, object.d, somewhere new for ranges) --- ```d auto tail(R)(R r){ r.pop; return r; } ``` It would suck to implement tail everywhere, so tail should not be in the main api; but often you need to make a poped dup. If tail doesnt exist, several range algorithms will need 3 extra lines of code repeated several time, if its in the main api, every algorthim will need to impliment a `auto tail()=>r.tail;` busy work as a specail treatment floating function it is 3 lines, users dont need to implement anything and it will reduce complexity and clarity of several algorithms(consider zip.pop being implimented with static map) likewise `sliceable` should be a floating function and it wraps ranges and adds opSlice, implements the meta programming of detecting if its implementable *once*, instead of the each and every algorthim needing to juggle the complexity of opDollar existing or not, infinite or finite, error behavoir etc.
Sep 02
On Monday, September 2, 2024 10:03:27 AM MDT monkyyy via Digitalmars-d wrote:I think slicing should probably be cut from that main api, Id argue its the hardest to get correct of any of the functions while its the most fragile in meta programming in my experiments Instead there should be a collection of implict range functions given special treatment(std.range.interface, object.d, somewhere new for ranges) --- ```d auto tail(R)(R r){ r.pop; return r; } ``` It would suck to implement tail everywhere, so tail should not be in the main api; but often you need to make a poped dup. If tail doesnt exist, several range algorithms will need 3 extra lines of code repeated several time, if its in the main api, every algorthim will need to impliment a `auto tail()=>r.tail;` busy work as a specail treatment floating function it is 3 lines, users dont need to implement anything and it will reduce complexity and clarity of several algorithms(consider zip.pop being implimented with static map) likewise `sliceable` should be a floating function and it wraps ranges and adds opSlice, implements the meta programming of detecting if its implementable *once*, instead of the each and every algorthim needing to juggle the complexity of opDollar existing or not, infinite or finite, error behavoir etc.As things stand, slicing finite ranges (and infinite ranges when using $) always results in the original type - just like it does when slicing dynamic arrays. I don't think that we want to lose that property of slicing just to avoid making it so that someone has to implement opSlice to get slicing. And if you're already implementing indexing, I don't think that it's particularly onerous to have to implement slicing as well. Maybe we could provide code to mix in which would make it easier, but ranges are supposed to be an abstraction for dynamic arrays / slices (with ranges in lower categories having fewer capabilities), and making slices have a different type breaks that abstraction. Obviously, the abstraction isn't perfect as things stand, but in general, we want code that's written to work with dynamic arrays to be able to work with random-access ranges with as few changes as possible (preferably with no changes outside of the function signature). Also, it's going to make algorithm code simpler if it doesn't have to worry about whether random-access ranges have slicing. As it is, we already have too many range-related traits that algorithms potentially have to branch on, and I expect that simplifying that buys us more than trying to make it so that someone who wants to implement a random-access range doesn't have to implement slicing. - Jonathan M Davis
Sep 02
On Tuesday, 3 September 2024 at 00:54:32 UTC, Jonathan M Davis wrote:On Monday, September 2, 2024 10:03:27 AM MDT monkyyy via Digitalmars-d wrote:I think slicing should probably be cut from that main api, Id argue its the hardest to get correct of any of the functions while its the most fragile in meta programming in my experiments Instead there should be a collection of implict range functions given special treatment(std.range.interface, object.d, somewhere new for ranges) --- ```d auto tail(R)(R r){ r.pop; return r; } ``` It would suck to implement tail everywhere, so tail should not be in the main api; but often you need to make a poped dup. If tail doesnt exist, several range algorithms will need 3 extra lines of code repeated several time, if its in the main api, every algorthim will need to impliment a `auto tail()=>r.tail;` busy work as a specail treatment floating function it is 3 lines, users dont need to implement anything and it will reduce complexity and clarity of several algorithms(consider zip.pop being implimented with static map) likewise `sliceable` should be a floating function and it wraps ranges and adds opSlice, implements the meta programming of detecting if its implementable *once*, instead of the each and every algorthim needing to juggle the complexity of opDollar existing or not, infinite or finite, error behavoir etc.As things stand, slicing finite ranges (and infinite ranges when using $) always results in the original type - just like it does when slicing dynamic arrays. I don't think that we want to lose that property of slicing just to avoid making it so that someone has to implement opSlice to get slicing. And if you're already implementing indexing, I don't think that it's particularly onerous to have to implement slicing as well. Maybe we could provide code to mix in which would make it easier, but ranges are supposed to be an abstraction for dynamic arrays / slices (with ranges in lower categories having fewer capabilities), and making slices have a different type breaks that abstraction. Obviously, the abstraction isn't perfect as things stand, but in general, we want code that's written to work with dynamic arrays to be able to work with random-access ranges with as few changes as possible (preferably with no changes outside of the function signature). Also, it's going to make algorithm code simpler if it doesn't have to worry about whether random-access ranges have slicing. As it is, we already have too many range-related traits that algorithms potentially have to branch on, and I expect that simplifying that buys us more than trying to make it so that someone who wants to implement a random-access range doesn't have to implement slicing. - Jonathan M Davis*As things stand* *just*As thing stands, I bet that the 50ish algorthims that should work correctly with slicing, dont handle the complexitys of all infinite, opDollar, nested ranges, with coherent error behavior in d2 and its had the 10 years and 100+ whatever contributors. I dont think is possible to get it correct with the old api; it always feels like a crapshoot if array like ranges work.I dont know why anyone would sign up to recreate anything resembling d2. My experiments had more ambitious plans for slicing, but it very quickly broke down. You should make a proof of concept algorithms lib with lets say 20 algorithms 10 of which are suppose to work with slicing; can you juggle all the complexity with the requirements, are you sure? It will only get worse with scaling it up to 100 people, real world use cases, changing requirements, and feature requests as you attempt to reach for 150-200 algorithms in d3. From my prospective your generating a very very **very** long wishlist on pure math theory that will not survive contact with d as a buggy compiler and templates that merely pretends to be context-free; instead of implementing and feeling it out what the compiler actually is; redoing the foundational pieces to be simple and strong as possible.
Sep 03
On Monday, 2 September 2024 at 02:47:16 UTC, Jonathan M Davis wrote:One thing that I've been thinking about with regards to improving the range API which I forgot to discuss previously is the requirements for hasSlicing. Right now, for whatever reason, slicing is divorced from random-access ranges. In practice, random-access ranges have slicing (unless the programmer just didn't bother to implement it), and other ranges don't, but currently, hasSlicing is a separate trait that can theoretically pass with a forward range, and it's not guaranteed that a random-access range has slicing. So, my question is: 1. Are there any realistic situations where a range could be a forward or bidirectional range _and_ implement slicing in O(1), but it _can't_ actually be implemented as a random-access range? So, it would need to be possible to take slices of elements of such a range by index in O(1) and yet somehow not be possible to access individual elements by index in O(1).I can answer this direction immediately and formally. If a range has O(1) slicing, it has O(1) indexing, even if an opIndex isn't implemented: If I want range[i], I can replace it by either range[i .. i + 1].front or range[i .. $].front. It's guaranteed to exist by virtue of the assumption that slicing exists and the slicing operation returns a range, and every range has a front.2. Are there any realistic situations where a range could be a random-access range but _can't_ implement slicing in O(1)? So, it would need to be possible to access individual elements by index in O(1) and yet somehow not be possible to take slices of elements by index in O(1).For the other direction, i.e. has indexing, but you want slicing, I think you're really in the practical uses category. Maybe a sparse dynamic array using a splay tree as a backing structure is a counterexample to your idea. A splay tree is an established search tree that optimizes repeated adjacent accesses of the same element, i.e. if you seek x (average O(log n), worst case O(n)) and then x again, the second lookup (and any third, fourth, ...) will be O(1) guaranteed. I don't see how this thing can implement slicing. But, again, it may not be an actual counterexample. I also remember that Judy arrays are supposedly really difficult to implement. Maybe they're a counterexample. I really don't know much about them. I looked some stuff up, but that wasn't enough to judge whether they can implement slicing or not.
Sep 05