digitalmars.D - Feedback on Streams concept, similar to Ranges
- Andrew (34/34) May 15 2023 So I've been working on a small side project for the last few
- Sergey (9/13) May 15 2023 Thanks for sharing!
- Andrew (14/22) May 15 2023 I haven't added such a buffered wrapper type to this library yet,
- Monkyyy (3/37) May 16 2023 How would this help me with say rendering video and enforcing
- Andrew (12/14) May 17 2023 I'm not really sure how you think that an IO stream library would
- Monkyyy (8/22) May 17 2023 For ranges I could use `takeExactly` and store the range
- Jacob Shtokolov (34/39) May 17 2023 First of all, thanks for investing your time into this!
- Andrew (24/56) May 18 2023 Yeah, I talked with schveiguy on the discord server for a bit; it
- Jacob Shtokolov (12/13) May 22 2023 There is definitely a need for a good set of functions that are
- Steven Schveighoffer (20/32) May 22 2023 Yeah, this is the intention of iopipe. Once you get to a buffer, you can...
So I've been working on a small side project for the last few days, and I think it's gotten to the point where it's ready to be reviewed/critiqued. The project is available on GitHub here: https://github.com/andrewlalis/streams

It introduces the concept of **Streams**, which is anything with either of the following function signatures:

- `int read(T[] buffer)` - this is an **input stream**.
- `int write(T[] buffer)` - this is an **output stream**.

The README.md on the project's homepage describes the motivation in more detail, but in short, I'm not 100% satisfied with Phobos' ranges, and I think that streams could be introduced as a lower-level primitive that's also more familiar to programmers coming from a variety of other languages, while still trying to be as idiomatically D as possible.

Just for the sake of demonstration, here's an example of using streams to transfer the contents of a file to some arbitrary output stream. Of course pretty much anything done with streams can also be done with ranges, but I think that the simpler interface will make some things more ergonomic.

```d
import streams;

void readFileTo(S)(string filename, S stream) if (isOutputStream!(S, ubyte)) {
    import std.stdio;
    auto fIn = FileInputStream(filename);
    transferTo(fIn, stream);
}
```

So, I'd appreciate it if anyone could take a look at my project and tell me whether you think this is a good idea, whether I should introduce a DIP for this change if it were added to Phobos (I know the DIP process is closed at the moment), or if you have any other feedback.
May 15 2023
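To make the "anything with `int read(T[] buffer)`" idea concrete, here is a minimal sketch of how such a compile-time check could look. It is not taken from the streams repository: `looksLikeInputStream` and `ArrayInputStream` are hypothetical names, and the convention that `read` returns 0 once the data is exhausted is an assumption of this sketch.

```d
/// Hypothetical trait (not the library's actual isInputStream):
/// true when S offers a method callable as `int read(T[] buffer)`.
enum bool looksLikeInputStream(S, T) =
    is(typeof(S.init.read((T[]).init)) == int);

/// Hypothetical in-memory input stream over a slice.
struct ArrayInputStream(T)
{
    const(T)[] data;

    /// Copies up to buffer.length items and returns how many were copied
    /// (assumption: 0 once the data is exhausted).
    int read(T[] buffer) @nogc nothrow
    {
        const n = buffer.length < data.length ? buffer.length : data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return cast(int) n;
    }
}

unittest
{
    static assert(looksLikeInputStream!(ArrayInputStream!ubyte, ubyte));
    static assert(!looksLikeInputStream!(int, ubyte));

    const(ubyte)[] bytes = [1, 2, 3, 4, 5];
    auto stream = ArrayInputStream!ubyte(bytes);
    ubyte[3] buf;
    assert(stream.read(buf[]) == 3);   // first three items
    assert(stream.read(buf[]) == 2);   // remaining two
    assert(stream.read(buf[]) == 0);   // exhausted
}
```

Any type that passes the check can be used as a stream without inheriting from a common interface, which is the same compile-time duck typing Phobos uses for `isInputRange`.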
On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
> that it's ready to be reviewed/critiqued. The project is available on GitHub here: https://github.com/andrewlalis/streams
>
> feedback for this.

Thanks for sharing! I'll probably have a look later; just a couple of questions:

1) Does it support automatic buffering for performance purposes (something like BufReader/BufWriter in other languages, for example https://zig.news/kristoff/how-to-add-buffering-to-a-writer-reader-in-zig-7jd)?
2) While implementing it, did you consider looking into the undeaD repo? https://github.com/dlang/undeaD/blob/master/src/undead/stream.d
3) Will it be possible to connect it with things like Kafka?
May 15 2023
On Monday, 15 May 2023 at 10:26:24 UTC, Sergey wrote:
> 1) Does it support automatic buffering for performance purposes (something like BufReader/BufWriter in other languages, for example https://zig.news/kristoff/how-to-add-buffering-to-a-writer-reader-in-zig-7jd)?

I haven't added such a buffered wrapper type to this library yet, but now that I've read that article, it seems entirely doable to add that to this implementation, so I'll do that shortly!

> 2) While implementing it, did you consider looking into the undeaD repo? https://github.com/dlang/undeaD/blob/master/src/undead/stream.d

undead/stream.d is, as far as I can see, a purely OOP-style approach to IO streams, which looks like it's loosely inspired by interface (like Phobos does for ranges), the main goal is to use compile-time checks to let anything be a stream if it behaves like one.

> 3) Will it be possible to connect it with things like Kafka?

Well, yes, that is possible, but I don't personally have much experience with Kafka's binary protocol. Generally, though, it should be fairly trivial to translate existing implementations that use a similar IO approach to my proposed streams implementation.
May 15 2023
On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
> So I've been working on a small side project for the last few days, and I think it's gotten to the point where it's ready to be reviewed/critiqued. The project is available on GitHub here: https://github.com/andrewlalis/streams
>
> It introduces the concept of **Streams**, which is anything with either of the following function signatures:
>
> - `int read(T[] buffer)` - this is an **input stream**.
> - `int write(T[] buffer)` - this is an **output stream**.
>
> The README.md on the project's homepage describes the motivation in more detail, but in short, I'm not 100% satisfied with Phobos' ranges, and I think that streams could be introduced as a lower-level primitive that's also more familiar to programmers coming from a variety of other languages, while still trying to be as idiomatically D as possible.
>
> Just for the sake of demonstration, here's an example of using streams to transfer the contents of a file to some arbitrary output stream. Of course pretty much anything done with streams can also be done with ranges, but I think that the simpler interface will make some things more ergonomic.
>
> ```d
> import streams;
>
> void readFileTo(S)(string filename, S stream) if (isOutputStream!(S, ubyte)) {
>     import std.stdio;
>     auto fIn = FileInputStream(filename);
>     transferTo(fIn, stream);
> }
> ```
>
> So, I'd appreciate it if anyone could take a look at my project and tell me whether you think this is a good idea, whether I should introduce a DIP for this change if it were added to Phobos (I know the DIP process is closed at the moment), or if you have any other feedback.

How would this help me with, say, rendering video and enforcing that frames are synced, if `T[]` doesn't have any flexible logic?
May 16 2023
On Wednesday, 17 May 2023 at 00:50:19 UTC, Monkyyy wrote:
> How would this help me with, say, rendering video and enforcing that frames are synced, if `T[]` doesn't have any flexible logic?

I'm not really sure how you think that an IO stream library would help you particularly more than any other one would... that said, it would certainly be easier to enforce that frames are synced using this library than, say, Phobos ranges, because you can more gracefully handle stream errors without having to use exceptions/GC stuff.

Additionally, like Sergey suggested, I've added "buffered" streams, as decorators for any base stream, so that you could, for example, use a buffered input stream to read exactly as many bytes from a video stream as needed to fill a framebuffer (or something like that, I'm not familiar with video stuff).
May 17 2023
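The "read exactly as many bytes as needed to fill a framebuffer" idea from the post above can be sketched as a small loop over `read`. This is only an illustration, not the library's buffered stream: `TrickleStream` and `readFully` are hypothetical names, and the return-value conventions (0 for end of stream, negative for an error) are assumptions that the thread does not spell out.

```d
/// Hypothetical stream that hands out at most `chunk` items per call,
/// imitating a socket or pipe that returns short reads.
struct TrickleStream
{
    const(ubyte)[] data;
    size_t chunk;

    int read(ubyte[] buffer) @nogc nothrow
    {
        size_t n = buffer.length;
        if (n > chunk) n = chunk;
        if (n > data.length) n = data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return cast(int) n;
    }
}

/// Keeps calling `read` until `frame` is full.
/// Returns the number of items placed in `frame`, or the negative value
/// reported by the stream. No exceptions and no GC allocation involved.
int readFully(S)(ref S stream, ubyte[] frame)
{
    size_t filled = 0;
    while (filled < frame.length)
    {
        const n = stream.read(frame[filled .. $]);
        if (n < 0) return n;   // assumption: negative means error
        if (n == 0) break;     // assumption: zero means end of stream
        filled += n;
    }
    return cast(int) filled;
}

unittest
{
    ubyte[10] src = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    auto stream = TrickleStream(src[], 3);
    ubyte[8] frame;
    assert(readFully(stream, frame[]) == 8);   // a full frame despite short reads
    assert(frame[] == src[0 .. 8]);
    assert(readFully(stream, frame[]) == 2);   // partial frame: stream ran dry
}
```

A caller can compare the result against the frame size to detect an incomplete frame, which is the kind of error handling by return value that the post describes.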
On Wednesday, 17 May 2023 at 08:29:38 UTC, Andrew wrote:
> On Wednesday, 17 May 2023 at 00:50:19 UTC, Monkyyy wrote:
>> How would this help me with, say, rendering video and enforcing that frames are synced, if `T[]` doesn't have any flexible logic?
>
> I'm not really sure how you think that an IO stream library would help you particularly more than any other one would... that said, it would certainly be easier to enforce that frames are synced using this library than, say, Phobos ranges, because you can more gracefully handle stream errors without having to use exceptions/GC stuff. Additionally, like Sergey suggested, I've added "buffered" streams, as decorators for any base stream, so that you could, for example, use a buffered input stream to read exactly as many bytes from a video stream as needed to fill a framebuffer (or something like that, I'm not familiar with video stuff).

For ranges I could use `takeExactly` and store the range somewhere to have the frame-syncing enforcement; that sort of thing comes from just duck-typing templates so you can build up your concept.

By having your primitive be `[]` rather than a list of functions to match, aren't you reducing the expressiveness if it's actually adopted with a library of algorithms?
May 17 2023
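For readers unfamiliar with the range approach mentioned above, a minimal sketch of `takeExactly` on an in-memory buffer follows. The frame size and the sample data are made up for illustration; only `takeExactly`, `drop` and `equal` are real Phobos functions.

```d
import std.algorithm.comparison : equal;
import std.range : drop, takeExactly;

unittest
{
    enum frameSize = 4;                        // hypothetical frame size
    ubyte[] video = [1, 2, 3, 4, 5, 6, 7, 8];  // stand-in for decoded data

    // takeExactly yields a range that is known to hold frameSize items.
    auto frame = video.takeExactly(frameSize);
    assert(frame.length == frameSize);
    assert(frame.equal([1, 2, 3, 4]));

    // The remainder can be stored and consumed as the next frame later.
    auto rest = video.drop(frameSize);
    assert(rest.takeExactly(frameSize).equal([5, 6, 7, 8]));
}
```

Note that `takeExactly` expects the source to really contain that many items, so the caller still has to decide what happens when the data runs short.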
On Wednesday, 17 May 2023 at 13:47:15 UTC, Monkyyy wrote:
> For ranges I could use `takeExactly` and store the range somewhere to have the frame-syncing enforcement; that sort of thing comes from just duck-typing templates so you can build up your concept. By having your primitive be `[]` rather than a list of functions to match, aren't you reducing the expressiveness if it's actually adopted with a library of algorithms?

Yes, I am reducing the expressiveness, but I think it's a good idea. Ranges aren't nogc/betterC compatible, and they don't support giving extra context about how many items were written or read in a consistent way.

Streams are also defining their primitive as anything implementing `int readFromStream(T[] items)` or `int writeToStream(T[] items)`, or both. I know it's mostly a matter of personal preference, but I think there is value in having the standard library use a restrictive interface, instead of duck-typing, since it'll (hopefully) be used all over the place.
May 17 2023
On Wednesday, 17 May 2023 at 14:40:48 UTC, Andrew wrote:
> On Wednesday, 17 May 2023 at 13:47:15 UTC, Monkyyy wrote:
>> For ranges I could use `takeExactly` and store the range somewhere to have the frame-syncing enforcement; that sort of thing comes from just duck-typing templates so you can build up your concept. By having your primitive be `[]` rather than a list of functions to match, aren't you reducing the expressiveness if it's actually adopted with a library of algorithms?
>
> Yes, I am reducing the expressiveness, but I think it's a good idea. Ranges aren't nogc/betterC compatible, and they don't support giving extra context about how many items were written or read in a consistent way. Streams are also defining their primitive as anything implementing `int readFromStream(T[] items)` or `int writeToStream(T[] items)`, or both. I know it's mostly a matter of personal preference, but I think there is value in having the standard library use a restrictive interface, instead of duck-typing, since it'll (hopefully) be used all over the place.

What does duck typing have to do with nogc? If you said `writeToStream(T)(T items)` and assumed the user would provide a `T` that defined `opSlice`, `opIndex` and a `length`, why couldn't whatever systems have nogc somewhere in the pipeline that makes it work?
May 17 2023
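One possible reading of the suggestion above is a templated `writeToStream` that accepts anything indexable with a `length`, while the method itself stays `@nogc nothrow`. The sketch below is an interpretation, not code from either poster: `isIndexable`, `Repeated` and `CountingSink` are hypothetical names, and the check covers only `length` and indexing (not `opSlice`).

```d
/// Hypothetical check: R has a length and can be indexed with a size_t.
enum bool isIndexable(R) = is(typeof((R r) {
    size_t n = r.length;
    auto item = r[size_t(0)];
}));

/// A non-array "items" type: `length` copies of one value, no storage.
struct Repeated
{
    ubyte value;
    size_t length;
    ubyte opIndex(size_t) const @nogc nothrow { return value; }
}

/// Hypothetical output stream that only counts what it receives.
struct CountingSink
{
    size_t total;
    ubyte last;

    /// Accepts any indexable type; the explicit @nogc nothrow means the
    /// argument's length and opIndex must themselves be @nogc nothrow.
    int writeToStream(Items)(Items items) @nogc nothrow
        if (isIndexable!Items)
    {
        foreach (i; 0 .. items.length)
        {
            last = cast(ubyte) items[i];
            ++total;
        }
        return cast(int) items.length;
    }
}

unittest
{
    CountingSink sink;
    ubyte[] arr = [1, 2, 3];
    assert(sink.writeToStream(arr) == 3);             // plain slice
    assert(sink.writeToStream(Repeated(7, 5)) == 5);  // duck-typed source
    assert(sink.total == 8 && sink.last == 7);
    static assert(!isIndexable!int);
}
```

The trade-off discussed in the thread shows up here: the template is more flexible than a fixed `T[]` parameter, but the accepted shapes are defined by a constraint the reader has to look up rather than by the signature alone.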
On Wednesday, 17 May 2023 at 14:53:04 UTC, monkyyy wrote:
> What does duck typing have to do with nogc?

Nothing; duck typing just results in code that's harder to read, and harder to reason about than restrictive code, usually.

> If you said `writeToStream(T)(T items)` and assumed the user would provide a `T` that defined `opSlice`, `opIndex` and a `length`, why couldn't whatever systems have nogc somewhere in the pipeline that makes it work?

I don't understand what you're trying to say here. Yes, the intention is that my library is nogc compatible by default, and anyone can choose to make a stream that is or isn't nogc compatible, and it'll work with the library.
May 17 2023
On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
> So, I'd appreciate it if anyone could take a look at my project and tell me whether you think this is a good idea, whether I should introduce a DIP for this change if it were added to Phobos (I know the DIP process is closed at the moment), or if you have any other feedback.

First of all, thanks for investing your time into this! I've got some questions:

1. Have you looked at the [IOPipe](https://github.com/schveiguy/iopipe) library?

2. What are the main benefits over the existing Ranges concept? Say, given your example:

```d
import streams;

void readFileTo(S)(string filename, S stream) if (isOutputStream!(S, ubyte)) {
    import std.stdio;
    auto fIn = FileInputStream(filename);
    transferTo(fIn, stream);
}
```

With ranges, this would look something like:

```d
import std;

void readFileTo(S)(string filename, ref S stream) if (isOutputRange!(S, ubyte[])) {
    // byChunk requires a chunk size (or a buffer); 4096 is an arbitrary choice.
    File(filename).byChunk(4096).copy(stream);
}
```

Which is more or less the same.

3. In the README you write:

```
Phobos' concept of an Input Range relies on implicit buffering of results... This doesn't map as easily to many low-level resources
```

AFAIK, the read/write buffers are everywhere, except, probably, `sendfile()` and some combination of `mmap` and `write`. But I'm struggling to see how this streams concept maps onto `sendfile` as well.
May 17 2023
On Thursday, 18 May 2023 at 01:31:21 UTC, Jacob Shtokolov wrote:
> 1. Have you looked at the [IOPipe](https://github.com/schveiguy/iopipe) library?

Yeah, I talked with schveiguy on the Discord server for a bit; it honestly looks like a better concept than what I'm doing, lol. But I didn't notice it until yesterday. So maybe I should focus my efforts there? I don't know yet.

> 2. What are the main benefits over the existing Ranges concept? Say, given your example:
>
> ```d
> import streams;
>
> void readFileTo(S)(string filename, S stream) if (isOutputStream!(S, ubyte)) {
>     import std.stdio;
>     auto fIn = FileInputStream(filename);
>     transferTo(fIn, stream);
> }
> ```
>
> With ranges, this would look something like:
>
> ```d
> import std;
>
> void readFileTo(S)(string filename, ref S stream) if (isOutputRange!(S, ubyte[])) {
>     // byChunk requires a chunk size (or a buffer); 4096 is an arbitrary choice.
>     File(filename).byChunk(4096).copy(stream);
> }
> ```
>
> Which is more or less the same.

I would say that there isn't really a big benefit over using ranges in terms of how they're expressed, but more that I think (and I may be wrong) that streams are a simpler, easier concept for programmers to grasp, especially those that are migrating to D from some other language that uses a similar stream concept.

Another benefit is that, as pointed out by Guillaume Piolat in another thread, Phobos ranges (and most of Phobos for that matter) have no real convention for naming schemes, and they generally don't try to be betterC-compatible, which makes it difficult to use them in any low-level code.

Finally, my streams allow the code to handle errors more gracefully without needing exceptions, which isn't always convenient to do with ranges. But you're right; to the average D programmer, streams are just a different flavor of accomplishing the same thing.

> 3. In the README you write:
>
> ```
> Phobos' concept of an Input Range relies on implicit buffering of results... This doesn't map as easily to many low-level resources
> ```
>
> AFAIK, the read/write buffers are everywhere, except, probably, `sendfile()` and some combination of `mmap` and `write`. But I'm struggling to see how this streams concept maps onto `sendfile` as well.

Yeah, I suppose what I was trying to say is that this library puts the programmer in more control of if and when buffers are allocated and used with IO. But of course for `sendfile` there's no need for streams.
May 18 2023
On Thursday, 18 May 2023 at 16:10:44 UTC, Andrew wrote:
> So maybe I should focus my efforts there? I don't know yet.

There is definitely a need for a good set of functions that are interchangeable between different file types in Phobos: for instance, Socket seems to have no such primitive as File.byChunk(), etc.

So if we can have this kind of functionality somehow compatible with the built-in `std.algorithm` and specifically for IO, that would be really cool!

What's your nickname in Discord, BTW? I feel like there are multiple on-going efforts from different people targeting the same core concept: the easy-to-use IO operations. Would be nice to exchange some ideas!
May 22 2023
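As an illustration of the gap mentioned above, a `byChunk`-style input range can be layered over anything that offers a `receive(void[])` method with the same shape as `std.socket.Socket.receive`. This is a sketch under assumptions, not a Phobos or iopipe API: `ReceiveChunks`, `receiveChunks` and `FakeSocket` are hypothetical names, and the fake stands in for a real socket so the example runs without a network.

```d
/// Hypothetical input range of chunks read via `source.receive(buffer)`.
/// The returned slice is reused between iterations, like File.byChunk.
struct ReceiveChunks(Source)
{
    private Source source;
    private ubyte[] buffer;
    private ubyte[] chunk;
    private bool done;

    this(Source source, size_t chunkSize)
    {
        this.source = source;
        this.buffer = new ubyte[](chunkSize);
        popFront();   // prime the first chunk
    }

    bool empty() const { return done; }
    ubyte[] front() { return chunk; }

    void popFront()
    {
        const n = source.receive(buffer);
        // Socket.receive returns 0 when the peer closed, negative on error.
        if (n <= 0) { done = true; chunk = null; }
        else chunk = buffer[0 .. n];
    }
}

auto receiveChunks(Source)(Source source, size_t chunkSize)
{
    return ReceiveChunks!Source(source, chunkSize);
}

/// Stand-in for a socket: serves bytes from memory.
final class FakeSocket
{
    const(ubyte)[] pending;
    this(const(ubyte)[] data) { pending = data; }

    ptrdiff_t receive(void[] buf)
    {
        auto dst = cast(ubyte[]) buf;
        const n = dst.length < pending.length ? dst.length : pending.length;
        dst[0 .. n] = pending[0 .. n];
        pending = pending[n .. $];
        return cast(ptrdiff_t) n;
    }
}

unittest
{
    auto sock = new FakeSocket([1, 2, 3, 4, 5]);
    ubyte[][] seen;
    foreach (c; receiveChunks(sock, 2))
        seen ~= c.dup;   // copy, because the buffer is reused
    ubyte[][] expected = [[1, 2], [3, 4], [5]];
    assert(seen == expected);
}
```

Because the result is an ordinary input range, it can be passed to `std.algorithm` functions, which is the kind of compatibility the post asks for. Treating errors the same as end of data is a simplification here.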
On 5/22/23 6:49 AM, Jacob Shtokolov wrote:
> On Thursday, 18 May 2023 at 16:10:44 UTC, Andrew wrote:
>> So maybe I should focus my efforts there? I don't know yet.

More effort on iopipe is always welcome! The library needs a lot of polish.

> There is definitely a need for a good set of functions that are interchangeable between different file types in Phobos: for instance, Socket seems to have no such primitive as File.byChunk(), etc.

Yeah, this is the intention of iopipe. Once you get to a buffer, you can use whatever you want on it. The idea is that I don't have to care whether it's a socket, file, or memory buffer; I can run my e.g. parser on it.

`byChunk` would be trivial to put on top of this (though there's little reason to use it in this context). I already have `delimitedText` and the more specific `byLine` pipes, which extend to the next delimiter code point (https://schveiguy.github.io/iopipe/iopipe/textpipe/delimitedText.html).

What iopipe lacks quite a bit is polish and probably a bunch of shortcuts (setting up a buffered stream is a lot more verbose than I would like). I also need to really focus on a formatting API.

> So if we can have this kind of functionality somehow compatible with the built-in `std.algorithm` and specifically for IO, that would be really cool!

My next focus is async i/o. That is a precursor to what I really want to create -- a web server/framework. It's unfortunately slow going though, as this is a spare-time project for me.

> What's your nickname in Discord, BTW? I feel like there are multiple on-going efforts from different people targeting the same core concept: the easy-to-use IO operations. Would be nice to exchange some ideas!

I believe Andrew's Discord is pretty straightforward. Also feel free to ping me if you want to discuss (schveiguy).

-Steve
May 22 2023