
digitalmars.D - `drop` not `opSlice` should be in the range api

reply monkyyy <crazymonkyyy gmail.com> writes:
hot take:

Getting opSlice to act correctly with a linked list, so a 
`[0..$-1]` call works, just requires 3 operator overloads, and 
in theory *every* bidirectional algorithm optionally works on a 
doubly linked list; it's a lot of complexity for what benefit?
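
Roughly the shape of those 3 overloads (a sketch, not real container code; the point is that every one of them has to walk the list):

```d
// Sketch: the overloads (opDollar, opIndex, opSlice) that let
// r[0 .. $-1] compile for a list-backed range. All of them are O(n).
struct SNode { int value; SNode* next; }

struct SListRange
{
    SNode* head;
    size_t count;   // cached length so opDollar is cheap

    // basic input range
    bool empty() const { return count == 0; }
    int front() const { return head.value; }
    void popFront() { head = head.next; --count; }

    // the 3 overloads that make r[0 .. $-1] work
    size_t opDollar() const { return count; }

    int opIndex(size_t i) const
    {
        auto n = head;
        foreach (_; 0 .. i) n = n.next;    // O(i) walk
        return n.value;
    }

    SListRange opSlice(size_t lo, size_t hi)
    {
        auto r = this;
        foreach (_; 0 .. lo) r.popFront(); // O(lo) walk
        r.count = hi - lo;                 // truncate via the cached count
        return r;
    }
}
```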

I think this is my final range api:

```
core:
	front (technically optional now)
	pop (popFront when Phobos compatibility matters)
	empty

reference:
	data* (with opIndex(key))
	key
	keyback

length

bidirectional:
	back
	popback
	keyback

"slicable":
	drop
	dropback
```
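
For concreteness, a hypothetical sketch of how the core + "slicable" parts could look on a doubly linked list (names taken from the listing above):

```d
// Hypothetical sketch: drop/dropback replace opSlice entirely.
struct DNode { int value; DNode* next, prev; }

struct DListRange
{
    DNode* head, tail;

    // core
    bool empty() const { return head is null; }
    int front() const { return head.value; }
    void pop()
    {
        if (head is tail) head = tail = null;
        else head = head.next;
    }

    // bidirectional
    int back() const { return tail.value; }
    void popback()
    {
        if (head is tail) head = tail = null;
        else tail = tail.prev;
    }

    // "slicable": r.drop(1) plus r.dropback(1) together cover what
    // r[1 .. $-1] would have done, without any opSlice machinery
    void drop(size_t n)     { foreach (_; 0 .. n) pop(); }
    void dropback(size_t n) { foreach (_; 0 .. n) popback(); }
}
```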
Jan 08
parent reply Dukc <ajieskola gmail.com> writes:
On Wednesday, 8 January 2025 at 21:59:04 UTC, monkyyy wrote:
 Getting opSlice to act correctly with a linked list, so a 
 `[0..$-1]` call works, just requires 3 operator overloads, and 
 in theory *every* bidirectional algorithm optionally works on a 
 doubly linked list; it's a lot of complexity for what benefit?
The problem here is not implementation complexity. You are right that it would be pretty trivial to implement. The problem is the execution time of the slicing operation.

The convention is that if the run time (or memory use, if it is significant) grows faster than the logarithm of the range size, then the range should not provide a slicing operation. With a linked list, the execution time increases linearly with the number of elements to be dropped, meaning it shouldn't have a slice operation as per the convention. Unless I'm mistaken, this particular convention is inherited from the C++ Standard Template Library.

The user can still use `drop` and `take`/`takeExactly` when there's no issue with linear execution time, and it'll work with any forward range.
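
For example (a minimal sketch; any forward range would do in place of `iota`):

```d
// drop and take/takeExactly make the cost explicit at the call site
// and work on any forward range, no slicing support required.
import std.algorithm.comparison : equal;
import std.range : drop, take, takeExactly, iota;

void main()
{
    auto r = iota(0, 10);   // stand-in for any forward range
    assert(r.drop(2).take(5).equal([2, 3, 4, 5, 6]));
    assert(r.drop(2).takeExactly(5).equal([2, 3, 4, 5, 6]));
}
```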
Jan 10
parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Friday, 10 January 2025 at 16:06:18 UTC, Dukc wrote:
 
 With a linked list, the execution time increases linearly with 
 the number of elements to be dropped, meaning it shouldn't have 
 a slice operation as per the convention. Unless I'm mistaken, 
 this particular convention is inherited from the C++ Standard 
 Template Library.
That may be the theory; in practice I'm pretty sure Phobos has this very high standard for a "random access range" and all slicing is behind it, while most algorithms instantly discard one of the requirements, so nothing maintains it.

There's a reason a lot of code uses `.array` as glue.
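
The gate I mean, concretely (a sketch using Phobos' own list container):

```d
// Phobos' singly linked list range is a perfectly fine forward range,
// but it never gets r[a .. b] because it can't meet the
// random-access/slicing traits.
import std.container : SList;
import std.range.primitives : hasSlicing, isForwardRange, isRandomAccessRange;

void main()
{
    auto list = SList!int(1, 2, 3);
    alias R = typeof(list[]);
    static assert( isForwardRange!R);
    static assert(!isRandomAccessRange!R);
    static assert(!hasSlicing!R);
}
```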
Jan 10
parent reply Dukc <ajieskola gmail.com> writes:
On Friday, 10 January 2025 at 18:46:33 UTC, monkyyy wrote:
 There's a reason a lot of code uses `.array` as glue
It's as it should be. Most of the time you're dealing with relatively small amounts of data, so your code doesn't need to be fast. Nothing wrong with unoptimised code.

The result is you can quickly make a good guess whether the code you're reading is efficient for big ranges. Lots of unnecessary `.array`s, `.drop`s and `.take`s means it's probably not.
Jan 10
parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Friday, 10 January 2025 at 21:07:53 UTC, Dukc wrote:
 On Friday, 10 January 2025 at 18:46:33 UTC, monkyyy wrote:
 There's a reason a lot of code uses `.array` as glue
It's as it should be. Most of the time you're dealing with relatively small amounts of data, so your code doesn't need to be fast. Nothing wrong with unoptimised code. The result is you can quickly make a good guess whether the code you're reading is efficient for big ranges. Lots of unnecessary `.array`s, `.drop`s and `.take`s means it's probably not.
User code should do the easiest thing that gets the task done.

Lib code should be more ambitious, and I'm discussing the range API design of my lib code.
Jan 10
parent reply Dukc <ajieskola gmail.com> writes:
On Friday, 10 January 2025 at 21:21:10 UTC, monkyyy wrote:
 user code should do the easiest thing that gets the task done

 lib code should be more ambitious, and I'm discussing the range 
 API design of my lib code
Maybe you misunderstood. I wasn't writing that users don't need to care about performance. I meant that users don't care about performance _most_ of the time, but they do care about it _some_ of the time. The convention is there to help users distinguish their unoptimised and optimised code from each other.

Since you're writing about API design, it's very much a case of user code - that is, users of your library. It doesn't have anything to do with how ambitiously you optimise your library, or what range API (if any) you use internally - that is entirely up to you to decide and independent of the user API design.

Note, I wrote this assuming you meant _user-facing_ range API. If you meant an internal API, I'll have to write another reply.
Jan 10
parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Friday, 10 January 2025 at 21:54:35 UTC, Dukc wrote:
 On Friday, 10 January 2025 at 21:21:10 UTC, monkyyy wrote:
 user code should do the easiest thing that gets the task done

 lib code should be more ambitious, and I'm discussing the range 
 API design of my lib code
Maybe you misunderstood. I wasn't writing that users don't need to care about performance. I meant that users don't care about performance _most_ of the time, but they do care about it _some_ of the time. The convention is there to help users distinguish their unoptimised and optimised code from each other. Since you're writing about API design, it's very much a case of user code - that is, users of your library. It doesn't have anything to do with how ambitiously you optimise your library, or what range API (if any) you use internally - that is entirely up to you to decide and independent of the user API design.
It's also bad that users may need to spam `.array` to get something that would otherwise work.

There isn't really a benchmark (even if there should be; I'm extremely anti the upstream plan of "we will talk out APIs and the best design will appear out of thin air" rather than demos, test runs, etc.), but ideally you want some set of leetcode problems you want to solve in <5 function calls each, and you'd cross-reference how well proposed sets of functions cover those problems.

Slicing is 4 operations at once (drop(i), drop($-i), dropback(i), dropback($-i)); if a data structure breaks one of them it can't be a "random access range" anymore, and oops, I guess that's a technically unnecessary extra work and knowledge check on the user before it works.
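
Spelled out with Phobos names (a sketch; a plain array stands in for a random access range):

```d
// The four simpler operations that a full slice r[a .. b] bundles,
// expressed one at a time with drop/dropBack.
import std.algorithm.comparison : equal;
import std.range : drop, dropBack;

void main()
{
    auto r = [0, 1, 2, 3, 4, 5];
    size_t i = 2;
    assert(r[i .. $].equal(r.drop(i)));                // cut i off the front
    assert(r[0 .. $ - i].equal(r.dropBack(i)));        // cut i off the back
    assert(r[0 .. i].equal(r.dropBack(r.length - i))); // keep only the first i
    assert(r[$ - i .. $].equal(r.drop(r.length - i))); // keep only the last i
}
```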
 Note, I wrote this assuming you meant _user-facing_ range API. 
 If you meant an internal API, I'll have to write another reply.
Is there a difference? I `private` nothing; users should be able to slot in custom algorithms that call my data structures, or implement their own data structures, in a big mess of whatever happens happens.
Jan 10
parent reply Dukc <ajieskola gmail.com> writes:
On Saturday, 11 January 2025 at 00:43:44 UTC, monkyyy wrote:
 It's also bad that users may need to spam `.array` to get 
 something that would otherwise work.

 There isn't really a benchmark (even if there should be; I'm 
 extremely anti the upstream plan of "we will talk out APIs and 
 the best design will appear out of thin air" rather than demos, 
 test runs, etc.), but ideally you want some set of leetcode 
 problems you want to solve in <5 function calls each, and you'd 
 cross-reference how well proposed sets of functions cover those 
 problems.

 Slicing is 4 operations at once (drop(i), drop($-i), 
 dropback(i), dropback($-i)); if a data structure breaks one of 
 them it can't be a "random access range" anymore, and oops, I 
 guess that's a technically unnecessary extra work and knowledge 
 check on the user before it works.
Sorry, this is written so carelessly that I'm not sure what you mean. My best guess is you're opining that insisting on performance guarantees of random access ranges is not worth it because the slicing operator has just too much expressive power. If you're designing your own range API anyway, it's a reasonable choice IMO. Just don't do it for Phobos range APIs if you happen to implement any.
 Note, I wrote this assuming you meant _user-facing_ range API. 
 If you meant an internal API, I'll have to write another reply.
Is there a difference? I `private` nothing; users should be able to slot in custom algorithms that call my data structures, or implement their own data structures, in a big mess of whatever happens happens.
Even without `private`, libraries usually do make a distinction between functions that are intended for the user and functions that are intended only/mainly for the library itself. For user functions, the library maintainer makes an effort not to break the API between versions; for internal functions there is no such effort.

Now, you don't _have_ to distinguish between those, but if you don't, you'll have a hard time offering any real stability between library releases. On the other hand, there's nothing wrong with an unstable library, especially if it's still in its alpha stages. Maybe you want to leave drawing the distinction for later, in which case there's no difference and my last post applies.
Jan 12
parent monkyyy <crazymonkyyy gmail.com> writes:
On Sunday, 12 January 2025 at 17:48:48 UTC, Dukc wrote:
  libraries usually do make a distinction between functions that 
 are intended for the user and functions that are intended 
 only/mainly for the library itself.
I consider it a mistake that Phobos keeps the "meta" range algorithms separately in std.range (like `zip`) while `levenshteinDistanceAndPath` is in std.algorithm. By virtue of the input API being the output API there's no real clear flow distinction; it is just a big mess, which is the power of the template hell, but it's also why it is just unreadable when there's a minor syntax error that causes problems in some extreme edge case.

When you can (and should) write something small that implements the range API, you may be swapping back and forth between my code and yours every other function; that isn't true of some JSON parser, which may have some structure where the conversion functions are hidden away in a file and can just be isolated.
Jan 12