digitalmars.D - Feedback on Streams concept, similar to Ranges

Andrew <andrewlalisofficial gmail.com> writes:

So I've been working on a small side project for the last few 
days, and I think it's gotten to the point where it's ready to 
be reviewed/critiqued.

The project is available on GitHub here: 
https://github.com/andrewlalis/streams

It introduces the concept of **Streams**, which is anything with 
either of the following function signatures:
- `int read(T[] buffer)` - this is an **input stream**.
- `int write(T[] buffer)` - this is an **output stream**.
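
To make the "anything with one of these signatures" idea concrete, here is a minimal sketch of how such a compile-time check could look. The names `isInputStreamSketch`, `isOutputStreamSketch`, `ArrayInput`, and `ArrayOutput` are made up for this illustration and are not taken from the library, whose real traits may well be defined differently.

```d
// Hypothetical traits (not the library's): a type counts as a stream
// if calling read/write with a T[] compiles and yields an int.
enum isInputStreamSketch(S, T) =
    is(typeof((ref S s, T[] buf) { int n = s.read(buf); }));
enum isOutputStreamSketch(S, T) =
    is(typeof((ref S s, T[] buf) { int n = s.write(buf); }));

// Hypothetical input stream over an in-memory array.
struct ArrayInput
{
    const(ubyte)[] data; // bytes not yet read

    // Copies up to buffer.length bytes; returns the count, 0 at end of data.
    int read(ubyte[] buffer) @nogc nothrow
    {
        const n = buffer.length < data.length ? buffer.length : data.length;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return cast(int) n;
    }
}

// Hypothetical output stream into a fixed-size array.
struct ArrayOutput
{
    ubyte[] storage; // destination memory
    size_t used;     // bytes written so far

    // Copies as much as fits; returns the count actually written.
    int write(ubyte[] buffer) @nogc nothrow
    {
        const space = storage.length - used;
        const n = buffer.length < space ? buffer.length : space;
        storage[used .. used + n] = buffer[0 .. n];
        used += n;
        return cast(int) n;
    }
}

static assert(isInputStreamSketch!(ArrayInput, ubyte));
static assert(isOutputStreamSketch!(ArrayOutput, ubyte));
static assert(!isInputStreamSketch!(ArrayOutput, ubyte));

void main()
{
    ubyte[4] source = [1, 2, 3, 4];
    ubyte[8] dest;
    auto input = ArrayInput(source[]);
    auto output = ArrayOutput(dest[]);

    // Pump data through a small intermediate buffer.
    ubyte[2] chunk;
    int n;
    while ((n = input.read(chunk[])) > 0)
        output.write(chunk[0 .. n]);

    assert(output.used == 4);
    assert(dest[0 .. 4] == [1, 2, 3, 4]);
}
```

Neither struct inherits from an interface; the `static assert` lines pass purely because the member functions have the right shape.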

The README.md on the project's homepage describes the motivation 
in more detail, but in short, I'm not 100% satisfied with Phobos' 
ranges, and I think that streams could be introduced as a 
lower-level primitive that's also more familiar to programmers 
coming from a variety of other languages, while still trying to 
be as idiomatically D as possible.

Just for the sake of demonstration, here's an example of using 
streams to transfer the contents of a file to some arbitrary 
output stream. Of course pretty much anything done with streams 
can also be done with ranges, but I think that the simpler 
interface will make some things more ergonomic.

```d
import streams;

// Copies the contents of the named file to any output stream of ubyte.
void readFileTo(S)(string filename, S stream)
        if (isOutputStream!(S, ubyte)) {
   import std.stdio;
   auto fIn = FileInputStream(filename);
   transferTo(fIn, stream);
}
```

So, I'd appreciate it if anyone could take a look at my project 
and tell me whether you think this is a good idea, whether I 
should write a DIP if this were to be added to Phobos (I know the 
DIP process is closed at the moment), or if you have any other 
feedback.

May 15 2023

Sergey <kornburn yandex.ru> writes:

On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
 that it's ready to be reviewed/critiqued.

 The project is available on GitHub here: 
 https://github.com/andrewlalis/streams

 feedback for this.

Thanks for sharing!

Probably will have a look later, just a couple of questions:
1) Does it support automatic buffering for performance purposes 
(something like BufReader/BufWriter in other languages, for example 
https://zig.news/kristoff/how-to-add-buffering-to-a-writer-reader-in-zig-7jd)?

2) While implementing it, did you consider looking at the undeaD repo? 
https://github.com/dlang/undeaD/blob/master/src/undead/stream.d

3) Will it be possible to connect it with things like Kafka?

May 15 2023

Andrew <andrewlalisofficial gmail.com> writes:

On Monday, 15 May 2023 at 10:26:24 UTC, Sergey wrote:
 1) Does it support auto buffer for performance purposes 
 (something like BufReader/BufWriter in other langs , for 
 example 
 https://zig.news/kristoff/how-to-add-buffering-to-a-writer-reader-in-zig-7jd)

I haven't added such a buffered wrapper type to this library yet, 
but now that I read that article, it seems entirely doable to add 
that to this implementation, so I'll do that shortly!

 2) while implementing have you consider to look into undead 
 repo? 
 https://github.com/dlang/undeaD/blob/master/src/undead/stream.d

undead/stream.d is, as far as I can see, a purely OOP-style 
approach to IO streams, which looks like it's loosely inspired by 

interface (like Phobos does for ranges), the main goal is to use 
compile-time checks to let anything be a stream if it behaves 
like one.

 3) will it be possible to connect it with things like Kafka?

Well, yes, that is possible, but I don't personally have much 
experience with Kafka's binary protocol. But generally, it should 
be rather trivial to translate existing implementations that use 
a similar IO approach to my proposed streams implementation.

May 15 2023

Monkyyy <crazymonkyyy gmail.com> writes:

On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
 So I've been working on a small side project for the last few 
 days, and I think that it's gotten to the point where I think 
 that it's ready to be reviewed/critiqued.

 The project is available on GitHub here: 
 https://github.com/andrewlalis/streams

 It introduces the concept of **Streams**, which is anything 
 with either of the following function signatures:
 - `int read(T[] buffer)` - this is an **input stream**.
 - `int write(T[] buffer)` - this is an **output stream**.

 The README.md on the project's homepage describes the 
 motivation in more detail, but in short, I'm not 100% satisfied 
 with Phobos' ranges, and I think that streams could be 
 introduced as a lower-level primitive that's also more familiar 
 to programmers coming from a variety of other languages, while 
 still trying to be as idiomatically D as possible.

 Just for the sake of demonstration, here's an example of using 
 streams to transfer the contents of a file to some arbitrary 
 output stream. Of course pretty much anything done with streams 
 can also be done with ranges, but I think that the simpler 
 interface will make some things more ergonomic.

 ```d
 import streams;

 void readFileTo(S)(string filename, S stream) if 
 (isOutputStream!(S, ubyte)) {
   import std.stdio;
   auto fIn = FileInputStream(filename);
   transferTo(fIn, stream);
 }
 ```

 So, I'd appreciate if anyone could take a look at my project, 
 tell me if you think this is a good idea or not, if I should 
 introduce a DIP for this change if added to Phobos (I know the 
 DIP process is closed at the moment), or if you have any other 
 feedback for this.

How would this help me with, say, rendering video and enforcing 
that frames are synced, if T[] doesn't have any flexible logic?

May 16 2023

Andrew <andrewlalisofficial gmail.com> writes:

On Wednesday, 17 May 2023 at 00:50:19 UTC, Monkyyy wrote:
 How would this help me with, say, rendering video and enforcing 
 that frames are synced, if T[] doesn't have any flexible logic?

I'm not really sure how you think that an IO stream library would 
help you particularly more than any other one would... that said, 
it would certainly be easier to enforce that frames are synced 
using this library than, say, Phobos ranges, because you can more 
gracefully handle stream errors without having to use 
exceptions/GC stuff.

Additionally, like Sergey suggested, I've added "buffered" 
streams, as decorators for any base stream, so that you could, 
for example, use a buffered input stream to read exactly as many 
bytes from a video stream as needed to fill a framebuffer (or 
something like that, I'm not familiar with video stuff).
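
As a rough sketch of the "read exactly as many bytes as needed to fill a framebuffer" idea, the loop below keeps calling `read` until the frame is full. It uses a plain helper function rather than the buffered decorator described above; `TrickleInput` and `readFully` are hypothetical names invented for this example and are not part of the library.

```d
// Hypothetical source that hands out at most 3 bytes per call,
// the way a socket or pipe might return short reads.
struct TrickleInput
{
    const(ubyte)[] data; // bytes not yet read

    int read(ubyte[] buffer) @nogc nothrow
    {
        size_t n = buffer.length < data.length ? buffer.length : data.length;
        if (n > 3) n = 3;
        buffer[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return cast(int) n;
    }
}

// Hypothetical helper: reads until `frame` is full or the stream ends.
// Returns the number of bytes placed in `frame`.
size_t readFully(S)(ref S stream, ubyte[] frame)
{
    size_t filled = 0;
    while (filled < frame.length)
    {
        const n = stream.read(frame[filled .. $]);
        if (n <= 0) break; // end of stream (or an error, in this sketch)
        filled += n;
    }
    return filled;
}

void main()
{
    ubyte[10] raw = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    auto input = TrickleInput(raw[]);

    ubyte[4] frame;
    assert(readFully(input, frame[]) == 4); // complete frame
    assert(frame == [0, 1, 2, 3]);
    assert(readFully(input, frame[]) == 4); // complete frame
    assert(frame == [4, 5, 6, 7]);
    assert(readFully(input, frame[]) == 2); // only 2 bytes were left
}
```

A caller can compare the returned count with `frame.length` to detect a truncated frame without any exception being thrown.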

May 17 2023

Monkyyy <crazymonkyyy gmail.com> writes:

On Wednesday, 17 May 2023 at 08:29:38 UTC, Andrew wrote:
 On Wednesday, 17 May 2023 at 00:50:19 UTC, Monkyyy wrote:
 How would this help me with, say, rendering video and enforcing 
 that frames are synced, if T[] doesn't have any flexible logic?

 I'm not really sure how you think that an IO stream library 
 would help you particularly more than any other one would... 
 that said, it would certainly be easier to enforce that frames 
 are synced using this library than, say, Phobos ranges, because 
 you can more gracefully handle stream errors without having to 
 use exceptions/GC stuff.

 Additionally, like Sergey suggested, I've added "buffered" 
 streams, as decorators for any base stream, so that you could, 
 for example, use a buffered input stream to read exactly as 
 many bytes from a video stream as needed to fill a framebuffer 
 (or something like that, I'm not familiar with video stuff).

For ranges I could use `takeExactly` and store the range 
somewhere to have the frame syncing enforcement; that sort of 
thing comes from just duck typing templates so you can build up 
your concept.
By having your primitive be [] rather than a list of functions to 
match, aren't you reducing the expressiveness if actually adopted 
with a library of algorithms?
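
For readers unfamiliar with `takeExactly`, a small sketch of the range-based approach mentioned above might look like this. The `iota` range is only a stand-in for real frame data, and `frameSize` is an arbitrary value chosen for the example.

```d
import std.algorithm.comparison : equal;
import std.range : iota, popFrontN, takeExactly;

void main()
{
    enum frameSize = 4;
    auto samples = iota(0, 10); // stand-in for a stream of decoded samples

    // takeExactly yields a range that is known to hold exactly frameSize items.
    auto frame = samples.takeExactly(frameSize);
    assert(frame.length == frameSize);
    assert(frame.equal([0, 1, 2, 3]));

    // iota is a value type, so the original range is advanced separately.
    samples.popFrontN(frameSize);
    assert(samples.front == 4);
}
```

Because `frame` is itself a range, it can be stored or passed to any algorithm that accepts ranges, which is the composability being argued for here.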

May 17 2023

Andrew <andrewlalisofficial gmail.com> writes:

On Wednesday, 17 May 2023 at 13:47:15 UTC, Monkyyy wrote:
 For ranges I could use `takeExactly` and store the range 
 somewhere to have the frame syncing enforcement; that sort of 
 thing comes from just duck typing templates so you can build up 
 your concept.
 By having your primitive be [] rather than a list of functions 
 to match, aren't you reducing the expressiveness if actually 
 adopted with a library of algorithms?

Yes, I am reducing the expressiveness, but I think it's a good 
idea. Ranges aren't @nogc/betterC compatible, and they don't 
support giving extra context about how many items were written or 
read in a consistent way. Streams also define their primitive as 
anything implementing `int readFromStream(T[] items)` or 
`int writeToStream(T[] items)`, or both.

I know it's mostly a matter of personal preference, but I think 
there is value in having the standard library use a restrictive 
interface, instead of duck-typing, since it'll (hopefully) be 
used all over the place.

May 17 2023

monkyyy <crazymonkyyy gmail.com> writes:

On Wednesday, 17 May 2023 at 14:40:48 UTC, Andrew wrote:
 On Wednesday, 17 May 2023 at 13:47:15 UTC, Monkyyy wrote:
 For ranges I could use `takeExactly` and store the range 
 somewhere to have the frame syncing enforcement; that sort of 
 thing comes from just duck typing templates so you can build 
 up your concept.
 By having your primitive be [] rather than a list of functions 
 to match, aren't you reducing the expressiveness if actually 
 adopted with a library of algorithms?

 Yes, I am reducing the expressiveness, but I think it's a good 
 idea. Ranges aren't @nogc/betterC compatible, and they don't 
 support giving extra context about how many items were written 
 or read in a consistent way. Streams also define their 
 primitive as anything implementing `int readFromStream(T[] items)` 
 or `int writeToStream(T[] items)`, or both.

 I know it's mostly a matter of personal preference, but I think 
 there is value in having the standard library use a restrictive 
 interface, instead of duck-typing, since it'll (hopefully) be 
 used all over the place.

What does duck typing have to do with @nogc?

If you said `writeToStream(T)(T items)` and assumed the user 
would provide a T that defined opSlice, opIndex and a length, why 
couldn't whatever system have @nogc somewhere in the pipeline 
that makes it work?
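
One way to read this suggestion is sketched below: a templated `writeToStream` that accepts anything with a `length` and integer indexing, while the whole call chain stays `@nogc`. `CountingOutput` and `Ramp` are hypothetical types invented for this example, and the constraint shown is only one of several ways such a check could be written.

```d
// Hypothetical output stream that just tallies what it is given.
struct CountingOutput
{
    ulong sum;
    size_t count;

    // Accepts any T with `length` and indexing, not only a built-in T[].
    int writeToStream(T)(T items)
        if (is(typeof(T.init.length) : size_t) && is(typeof(T.init[0]) : ulong))
    {
        foreach (i; 0 .. items.length)
        {
            sum += items[i];
            count++;
        }
        return cast(int) items.length;
    }
}

// Hypothetical indexable type that computes its elements on demand.
struct Ramp
{
    size_t length;
    ubyte opIndex(size_t i) const @nogc nothrow { return cast(ubyte) i; }
}

void main() @nogc nothrow
{
    CountingOutput output;
    ubyte[3] plain = [10, 20, 30];
    output.writeToStream(plain[]); // built-in slice
    output.writeToStream(Ramp(4)); // custom type yielding 0, 1, 2, 3
    assert(output.count == 7);
    assert(output.sum == 66);
}
```

Since `writeToStream` is a template, the compiler infers `@nogc nothrow` for each instantiation, so `main` can carry those attributes as long as the supplied type's own members allow it.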

May 17 2023

Andrew <andrewlalisofficial gmail.com> writes:

On Wednesday, 17 May 2023 at 14:53:04 UTC, monkyyy wrote:
 What does duck typing have to do with nogc?

Nothing; duck typing just results in code that's harder to read, 
and harder to reason about than restrictive code, usually.

 If you said `writeToStream(T)(T items)` and assumed the user 
 would provide a T that defined opSlice, opIndex and a length, 
 why couldn't whatever system have @nogc somewhere in the 
 pipeline that makes it work?

I don't understand what you're trying to say here. Yes, the 
intention is that my library is @nogc compatible by default, and 
anyone can choose to make a stream that is or isn't @nogc 
compatible, and it'll work with the library.

May 17 2023

Jacob Shtokolov <jacob.100205 gmail.com> writes:

On Monday, 15 May 2023 at 09:53:22 UTC, Andrew wrote:
 So, I'd appreciate if anyone could take a look at my project, 
 tell me if you think this is a good idea or not, if I should 
 introduce a DIP for this change if added to Phobos (I know the 
 DIP process is closed at the moment), or if you have any other 
 feedback for this.

First of all, thanks for investing your time into this!

I've got some questions:

1. Have you looked at the 
[IOPipe](https://github.com/schveiguy/iopipe) library?
2. What are the main benefits over the existing Ranges concept? 
Say, given your example:

```d
import streams;

void readFileTo(S)(string filename, S stream) if 
(isOutputStream!(S, ubyte)) {
     import std.stdio;
     auto fIn = FileInputStream(filename);
     transferTo(fIn, stream);
}
```

With ranges, this would look something like:

```d
import std;

void readFileTo(S)(string filename, ref S stream) if 
(isOutputRange!(S, ubyte[])) {
     // byChunk needs a chunk size (or a buffer); 4096 is an arbitrary choice.
     File(filename).byChunk(4096).copy(stream);
}
```

Which is more or less the same.

3. In the README you write:

```
Phobos' concept of an Input Range relies on implicit buffering of 
results... This doesn't map as easily to many low-level resources
```

AFAIK, read/write buffers are everywhere, except, probably, 
`sendfile()` and some combination of `mmap` and `write`. But I'm 
struggling to see how this streams concept maps onto `sendfile` 
as well.

May 17 2023

Andrew <andrewlalisofficial gmail.com> writes:

On Thursday, 18 May 2023 at 01:31:21 UTC, Jacob Shtokolov wrote:
 1. Have you looked at the 
 [IOPipe](https://github.com/schveiguy/iopipe) library?

Yeah, I talked with schveiguy on the discord server for a bit; it 
honestly looks like a better concept than what I'm doing, lol. 
But I didn't notice it until yesterday. So maybe I should focus 
my efforts there? I don't know yet.

 2. What are the main benefits over the existing Ranges concept? 
 Say, given your example:

 ```d
 import streams;

 void readFileTo(S)(string filename, S stream) if 
 (isOutputStream!(S, ubyte)) {
     import std.stdio;
     auto fIn = FileInputStream(filename);
     transferTo(fIn, stream);
 }
 ```

 With ranges, this would look something like:

 ```d
 import std;

 void readFileTo(S)(string filename, ref S stream) if 
 (isOutputRange!(S, ubyte[])) {
     File(filename).byChunk(4096).copy(stream);
 }
 ```

 Which is more or less the same.

I would say that there isn't really a big benefit over using 
ranges in terms of how they're expressed, but more that I think 
(and I may be wrong) that streams are a simpler, easier concept 
for programmers to grasp, especially those that are migrating to 
D from some other language that uses a similar stream concept.

Another benefit is that, as pointed out by Guillaume Piolat in 
another thread, Phobos ranges (and most of Phobos for that 
matter) have no real convention for naming schemes, and they 
generally don't try to be betterC-compatible, which makes it 
difficult to use them in any low-level code.

Finally, my streams allow the code to handle errors more 
gracefully without needing exceptions, which isn't always 
convenient to do with ranges.
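
A minimal sketch of what exception-free error handling can look like with an `int`-returning `read` follows. The convention that a negative return value signals an error is an assumption made for this example only, and `sketchError`, `FlakyInput`, and `drain` are hypothetical names rather than library API.

```d
// Assumed convention for this sketch only: a negative result means "error".
enum int sketchError = -1;

// Hypothetical stream whose reads start failing after a set number of calls.
struct FlakyInput
{
    int callsUntilFailure; // reads succeed until this reaches zero

    int read(ubyte[] buffer) @nogc nothrow
    {
        if (callsUntilFailure == 0) return sketchError;
        callsUntilFailure--;
        buffer[] = 0xFF;
        return cast(int) buffer.length;
    }
}

// Performs up to maxReads reads and adds up the bytes received.
// Failure is reported through the return value, so the whole call
// chain can be @nogc and nothrow.
bool drain(S)(ref S stream, int maxReads, out size_t total) @nogc nothrow
{
    ubyte[16] buffer;
    foreach (_; 0 .. maxReads)
    {
        const n = stream.read(buffer[]);
        if (n < 0) return false; // stop at the first error
        total += n;
    }
    return true;
}

void main()
{
    auto input = FlakyInput(2);
    size_t total;
    assert(!drain(input, 5, total)); // the third read fails
    assert(total == 32);             // two successful 16-byte reads
}
```

The caller still learns how much data arrived before the failure, which is the kind of extra context a plain return value can carry.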

But you're right; to the average D programmer, streams are just a 
different flavor of accomplishing the same thing.

 3. In the README you write:

 ```
 Phobos' concept of an Input Range relies on implicit buffering 
 of results... This doesn't map as easily to many low-level 
 resources
 ```

 AFAIK, read/write buffers are everywhere, except, probably, 
 `sendfile()` and some combination of `mmap` and `write`. But 
 I'm struggling to get how this streams concept maps onto 
 `sendfile` as well.

Yeah, I suppose what I was trying to say, is that this library 
puts the programmer in more control of if and when buffers are 
allocated and used with IO. But of course for `sendfile` there's 
no need for streams.

May 18 2023

Jacob Shtokolov <jacob.100205 gmail.com> writes:

On Thursday, 18 May 2023 at 16:10:44 UTC, Andrew wrote:
 So maybe I should focus my efforts there? I don't know yet.

There is definitely a need for a good set of functions that are 
interchangeable between different file types in Phobos: for 
instance, Socket seems to have no such primitive as 
File.byChunk(), etc.

So if we can have this kind of functionality somehow compatible 
with the built-in `std.algorithm` and specifically for IO, that 
would be really cool!

What's your nickname in Discord, BTW? I feel like there are 
multiple on-going efforts from different people targeting the 
same core concept: the easy-to-use IO operations. Would be nice 
to exchange some ideas!

May 22 2023

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/22/23 6:49 AM, Jacob Shtokolov wrote:
 On Thursday, 18 May 2023 at 16:10:44 UTC, Andrew wrote:
 So maybe I should focus my efforts there? I don't know yet.


More effort on iopipe is always welcome! The library needs a lot of polish.

 
 There is definitely a need for a good set of functions that are 
 interchangeable between different file types in Phobos: for instance, 
 Socket seems to have no such primitive as File.byChunk(), etc.

Yeah, this is the intention of iopipe. Once you get to a buffer, you can 
use whatever you want on it. The idea is that I don't have to care 
whether it's a socket, file, or memory buffer, I can run my e.g. parser 
on it.

`byChunk` would be trivial to put on top of this (though there's little 
reason to use it in this context). I already have `delimitedText` and 
the more specific `byLine` pipes, which extend to the next delimiter 
code point 
(https://schveiguy.github.io/iopipe/iopipe/textpipe/delimitedText.html)

What iopipe lacks quite a bit is polish and probably a bunch of 
shortcuts (setting up a buffered stream is a lot more verbose than I 
would like). I also need to really focus on a formatting API.

 So if we can have this kind of functionality somehow compatible with the 
 built-in `std.algorithm` and specifically for IO, that would be really 
 cool!

My next focus is async i/o. That is a precursor to what I really want to 
create -- a web server/framework. It's unfortunately slow going though, 
as this is a spare-time project for me.

 What's your nickname in Discord, BTW? I feel like there are multiple 
 on-going efforts from different people targeting the same core concept: 
 the easy-to-use IO operations. Would be nice to exchange some ideas!

I believe Andrew's Discord is pretty straightforward. Also feel free to 
ping me if you want to discuss (schveiguy).

-Steve

May 22 2023
