digitalmars.D - Another new io library

Steven Schveighoffer <schveiguy yahoo.com> writes:

It's no secret that I've been looking to create an updated io library 
for phobos. In fact, I've been working on one on and off since 2011 (ouch).

After about 5 iterations of API and design, and testing out ideas, I 
think I have come up with something pretty interesting. It started out 
as a plan to replace std.stdio (and that did not go over well: 
https://forum.dlang.org/post/j3u0l4$1atr$1@digitalmars.com), in addition 
to trying to find a better way to deal with i/o. However, I've scaled 
back my plan of world domination to just try for the latter, and save 
tackling the replacement of Phobos's i/o guts for a later battle, if at 
all. It's much easier to reason about something new than to muddle the 
discussion with how it will break code. It's also much easier to build 
something that doesn't have to be a drop-in replacement of something so 
insanely complex.

I also have been inspired over the last few years by various great 
presentations and libraries, two being Dmitry's proof-of-concept library 
to have buffers that automatically move/fill when more data is needed, 
and Andrei's std.allocator library. They have drastically changed the 
way I approach this challenge.

Therefore, I now have a new dub-based repository available for playing 
with: https://github.com/schveiguy/iopipe. First, the candy:

- This is a piping library. It allows one to hook buffered i/o through 
various processors/transformers much like unix pipes or range 
functions/algorithms. However, unlike unix pipes, this library attempts 
to make as few copies as possible of the data.

example:

foreach(line; (new IODevice(0)).bufferedInput
     .asText!(UTFType.UTF8)
     .byLine
     .asInputRange)
    // handle line

- It can handle 5 forms of UTF encoding - UTF8, UTF16, UTF16LE, UTF32, 
UTF32LE (Phobos only partially handles UTF8). Sorry, no grapheme support 
or other UTF-related things, but this of course can be added later.

- Arrays are first-class ioPipe types. This works:

foreach(line; "one\ntwo\nthree\nfour\n".byLine.asInputRange)

- Almost everything is resolved at compile time, using lots of 
introspection. The intent is to give the compiler the full gamut of 
optimization opportunities.

- I added rudimentary compression/decompression support using 
etc.c.zlib. Using compression is done like so:

foreach(line; (new IODevice(0)).bufferedInput
     .unzip
     .asText!(UTFType.UTF8)
     .byLine
     .asInputRange)

- The plan is for this to be a basis to make super-fast and modular 
parsing libraries. I plan to write a JSON one as a proof of concept. So 
all you have to do is add a parseJSON function to the end of any chain, 
as long as the input is some pipe of text data (including a string 
literal).


=================

I will stress some very very important things:

1. This library is FAR from finished. Even the concepts probably need 
some tweaking. But I'm very happy with the current API/usage.

2. Docs are very thin. Unit tests are sparse (but do pass).

3. The focus of this library is NOT replacement of std.stream, or even 
low-level i/o in general. In fact, I have copied over my stream class 
from previous attempts at this i/o rewrite ONLY as a mechanism to have 
something that can read/write from file descriptors with the right API 
(located in iopipe/stream.d). I admit to never having looked at 
std.stream really, so I have no idea how it would compare.

4. As the stream framework is only for playing with the other useful 
parts of the library, I only wrote it for my OS (OSX), so you won't be 
able to play out of the box on Windows (probably can be added without 
much effort, or use another stream library such as this one that was 
recently announced: 
https://forum.dlang.org/post/xtxiuxcmewxnhseubyik@forum.dlang.org), but 
it will likely work on other Unixen.

5. This is NOT thread-aware out of the box.

6. There is a concept in here I called "valves". It's very weird, but it 
allows unifying input and output into one seamless chain. In fact, I 
can't think of how I could have done output in this regime without them. 
See the convert example application for details on how it is used.

7. I expect to be changing the buffer API, as I think perhaps I have the 
wrong abstraction for buffers. However, I did attempt to have a 
std.allocator version of the buffer.

8. It's not on code.dlang.org yet. I'll work on this.

Destroy!

-Steve

Feb 16 2016

Rikki Cattermole <alphaglosined gmail.com> writes:

A few things: 
https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126 
why isn't that used more especially with e.g. window?
After all, window seems like a very well used word...

I don't like that a stream isn't inherently an input range.
This seems to me like a good place to use this abstraction by default.

Feb 16 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/17/16 1:58 AM, Rikki Cattermole wrote:

 A few things:
 https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126
 why isn't that used more especially with e.g. window?
 After all, window seems like a very well used word...

Not sure what you mean.

 I don't like that a stream isn't inherently an input range.
 This seems to me like a good place to use this abstraction by default.

What is front for an input stream? A byte? A character? A word? A line?

It's not there by default because that would assume too much IMO. You can 
create an input range out of a stream quite easily.

e.g. 
https://github.com/schveiguy/iopipe/blob/master/source/iopipe/bufpipe.d#L664

What would be the benefit of having it an input range by default?

-Steve

Feb 16 2016

yawniek <dlang srtnwz.com> writes:

On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven 
Schveighoffer wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:
 What would be the benefit of having it an input range by 
 default?

 -Steve

https://en.wikipedia.org/wiki/Principle_of_least_astonishment
something the D community is lacking a bit in general imho.

but awesome library, will definitely use, thanks!

Feb 17 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/17/16 3:54 AM, yawniek wrote:
 On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:
 What would be the benefit of having it an input range by default?

 https://en.wikipedia.org/wiki/Principle_of_least_astonishment
 something the D community is lacking a bit in general imho.

There are exceptions (e.g. byLine), but the likelihood that a default 
range interface is the one the user would expect is pretty low.

 but awesome library, will definitely use, thanks!

Thanks! Please let me know what you think if you end up using it.

-Steve

Feb 18 2016

John Colvin <john.loughran.colvin gmail.com> writes:

On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven 
Schveighoffer wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:

 A few things:
 https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126
 why isn't that used more especially with e.g. window?
 After all, window seems like a very well used word...

 Not sure what you mean.

 I don't like that a stream isn't inherently an input range.
 This seems to me like a good place to use this abstraction by 
 default.

 What is front for an input stream? A byte? A character? A word? 
 A line?

Why not just say it's a ubyte and then compose with ranges from 
there?

Feb 17 2016

Adam D. Ruppe <destructionator gmail.com> writes:

On Wednesday, 17 February 2016 at 10:54:56 UTC, John Colvin wrote:
 Why not just say it's a ubyte and then compose with ranges from 
 there?

You could put a range interface on it... but I think it would be 
of very limited value. For one, what about fseek? How does that 
interact with the range interface?


Or, what about reading a network interface where you get 
variable-sized packets?

A ubyte[] is probably the closest thing you can get to 
usefulness, but even then you'd need non-range buffering controls 
to make it efficient and usable. Consider the following:

Packet 1: 11\nHello
Packet 2:  World05\nD ro
Packet 3: x


You take the ubyte[] thing that gives each packet at a time as it 
comes off the hardware interface. Good, you can process as it 
comes and it fits the range interface.

But it isn't terribly useful. Are you going to copy the partial 
message into another buffer so the next range.popFront doesn't 
overwrite it? Or will you present the incomplete message from 
packet 1 to the consumer? The former is less than efficient (and 
still needs to wrap the range in some other interface to make the 
user code pretty) and the latter leads to ugly user code being 
directly exposed.

Copying it into a buffer is probably the most sane... but it is a 
wasteful copy if your existing buffer has enough space. But how 
do you say that to a range? popFront takes no arguments.

What about packet 2, which has part of the first message and part 
of the second message? Can you tell it that you already consumed 
the first six bytes and it can now append the next packet to the 
existing buffer, but please return that slice on the next call?
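To make the packet example concrete, here is a minimal sketch of the "copy into a buffer" approach, assuming the framing the three packets suggest: a decimal length, a newline, then that many bytes of payload. `MessageAssembler` and `feed` are hypothetical names invented for illustration; they are not part of iopipe or Phobos. Note that the control the consumer needs (append this packet, keep the unconsumed remainder) lives in `feed`, which is exactly what `popFront` has no way to express.

```d
import std.algorithm.searching : countUntil;
import std.string : representation;

/// Hypothetical helper (not part of iopipe or Phobos): collects packets
/// and hands back every complete "<decimal length>\n<payload>" message.
struct MessageAssembler
{
    private ubyte[] buffer; // bytes received but not yet consumed

    /// Append one packet; return the messages that are now complete.
    string[] feed(const(ubyte)[] packet)
    {
        buffer ~= packet;
        string[] messages;
        for (;;)
        {
            // The header ends at the first newline.
            immutable nl = buffer.countUntil(cast(ubyte) '\n');
            if (nl < 0)
                break; // header still incomplete

            // Parse the decimal length by hand (assumes well-formed digits).
            size_t len = 0;
            foreach (digit; buffer[0 .. nl])
                len = len * 10 + (digit - '0');

            immutable start = cast(size_t) nl + 1;
            if (buffer.length < start + len)
                break; // payload still incomplete

            messages ~= cast(string) buffer[start .. start + len].idup;
            buffer = buffer[start + len .. $]; // keep only the remainder
        }
        return messages;
    }
}

void main()
{
    MessageAssembler assembler;
    // Packet 1 carries a header and only part of the first payload.
    assert(assembler.feed("11\nHello".representation).length == 0);
    // Packet 2 completes message 1 and starts message 2.
    assert(assembler.feed(" World05\nD ro".representation) == ["Hello World"]);
    // Packet 3 completes message 2.
    assert(assembler.feed("x".representation) == ["D rox"]);
}
```

This version always copies each packet into its own buffer, which is the wasteful copy mentioned above; avoiding it requires the reader to write directly into the consumer's buffer.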



Ranges are great for a sequence of data that is the same type on 
each call. Files, however, tend to have variable length (which 
you might want to skip large sections of) and different types of 
data as you iterate through them.

I find std.stdio's byChunk and byLine to be almost completely 
useless in my cases.

Feb 17 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/17/16 9:52 AM, Adam D. Ruppe wrote:
 On Wednesday, 17 February 2016 at 10:54:56 UTC, John Colvin wrote:
 Why not just say it's a ubyte and then compose with ranges from there?

 You could put a range interface on it... but I think it would be of very
 limited value. For one, what about fseek? How does that interact with
 the range interface?

seeking a stream is not a focus of my library. I'm focusing on raw data 
throughput for an established pipeline that you expect not to move around.

A seek would require resetting the pipeline (something that is possible, 
but I haven't planned for it).

 Or, what about reading a network interface where you get variable-sized
 packets?

This I HAVE planned for, and it should work quite nicely. I agree that 
providing a by-default range interface may not be the most useful thing.

 Copying it into a buffer is probably the most sane... but it is a
 wasteful copy if your existing buffer has enough space. But how do you
 say that to a range? popFront takes no arguments.

The asInputRange adapter in iopipe/bufpipe.d provides the following 
crude interface:

1. front is the current window
2. empty returns true if the window is empty.
3. popFront discards the window, and extends in the next window.

With this, any ioPipe can be turned into a crude range. It should be 
good enough for things like std.algorithm.copy. And in the case of 
byLine, it allows one to create an iopipe that caters to creating a 
range, while also giving useful functionality as a pipe.
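The three rules above can be sketched roughly as follows. This is not the code from iopipe/bufpipe.d: `ChunkPipe` is a hypothetical stand-in for a real pipe, and the member names `extend` and `release` (and their signatures) are assumptions made for this illustration. Only the adapter logic in `WindowRange` mirrors the description.

```d
import std.algorithm.comparison : min;

/// Hypothetical stand-in for a pipe: a window onto `source` that grows
/// by up to `chunk` elements per extend call.
struct ChunkPipe
{
    const(ubyte)[] source;
    size_t chunk;
    size_t start, end; // the window is source[start .. end]

    @property const(ubyte)[] window() { return source[start .. end]; }

    /// Grow the window; returns the number of elements added (0 at the end).
    size_t extend()
    {
        immutable added = min(chunk, source.length - end);
        end += added;
        return added;
    }

    /// Drop n elements from the front of the window.
    void release(size_t n) { start += n; }
}

/// Crude range adapter following the three rules described above.
struct WindowRange(Pipe)
{
    Pipe pipe;

    this(Pipe p)
    {
        pipe = p;
        pipe.extend(); // load the first window
    }

    @property bool empty() { return pipe.window.length == 0; } // rule 2
    @property auto front() { return pipe.window; }             // rule 1

    void popFront()                                            // rule 3
    {
        pipe.release(pipe.window.length); // discard the current window
        pipe.extend();                    // extend in the next one
    }
}

void main()
{
    const(ubyte)[] data = [1, 2, 3, 4, 5];
    const(ubyte)[][] seen;
    foreach (w; WindowRange!ChunkPipe(ChunkPipe(data, 2)))
        seen ~= w;

    assert(seen.length == 3);
    assert(seen[0] == data[0 .. 2]);
    assert(seen[1] == data[2 .. 4]);
    assert(seen[2] == data[4 .. 5]);
}
```

Each `front` here is a whole window rather than a single element, which is why the result is only a crude range: what one window contains depends entirely on the pipe underneath.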

I'm on the fence as to whether all ioPipes should be ranges. Yes, it's 
easy to do (though a lot of boilerplate, you can't UFCS this), but I 
just can't see the use case being worth it.

 Ranges are great for a sequence of data that is the same type on each
 call. Files, however, tend to have variable length (which you might want
 to skip large sections of) and different types of data as you iterate
 through them.

Very much agree.

 I find std.stdio's byChunk and byLine to be almost completely useless in
 my cases.

byLine I find useful (think of grep), byChunk I've never found a reason 
to use.

-Steve

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/17/16 5:54 AM, John Colvin wrote:
 On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:

 A few things:
 https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126

 why isn't that used more especially with e.g. window?
 After all, window seems like a very well used word...

 Not sure what you mean.

 I don't like that a stream isn't inherently an input range.
 This seems to me like a good place to use this abstraction by default.

 What is front for an input stream? A byte? A character? A word? A line?

 Why not just say it's a ubyte and then compose with ranges from there?

If I provide a range by element (it may not be ubyte), then that's 
likely not the most useful range to have.

For example, the byLine iopipe gives you one more line of data each time 
you call extend. But the data in the window is not necessarily one line, 
and the element type is char, wchar, or dchar. None of those, I would 
think, is what someone would expect or want.

This is why I think it's better to have the user specifically tell me 
"this is how I want to range-ify this stream" rather than assume.

-Steve

Feb 18 2016

Wyatt <wyatt.epp gmail.com> writes:

On Thursday, 18 February 2016 at 15:44:00 UTC, Steven 
Schveighoffer wrote:
 On 2/17/16 5:54 AM, John Colvin wrote:
 On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven 
 Schveighoffer wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:

 A few things:
 https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126

 why isn't that used more especially with e.g. window?
 After all, window seems like a very well used word...

 Not sure what you mean.

 I don't like that a stream isn't inherently an input range.
 This seems to me like a good place to use this abstraction 
 by default.

 What is front for an input stream? A byte? A character? A 
 word? A line?

 Why not just say it's a ubyte and then compose with ranges 
 from there?

 If I provide a range by element (it may not be ubyte), then 
 that's likely not the most useful range to have.

I hadn't thought of this before, but if we accept that a stream 
is raw, untyped data, it may be best _not_ to provide a range 
interface directly.  It's easy enough to

alias source = sourceStream.as!ubyte;

anyway, right?

 This is why I think it's better to have the user specifically 
 tell me "this is how I want to range-ify this stream" rather 
 than assume.

I think this makes more sense with TLV encodings, too.  Thinking 
of things like:

switch(source.as!(BERType).popFront){
     case(UNIVERSAL|PRIMITIVE|UTF8STRING){
         int len;
         if(source.as!(BERLength).front & 0b10_00_00_00) {
             // X.690? Never heard of 'em!
         } else {
             len = source.as!(BERLength).popFront;
         }
         return source.buffered(len).as!(string).popFront;
     }
     ...etc.
}

Musing: I'd probably want a helper like popAs!() so I don't 
forget popFront()...

-Wyatt

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 12:08 PM, Wyatt wrote:
 On Thursday, 18 February 2016 at 15:44:00 UTC, Steven Schveighoffer wrote:
 On 2/17/16 5:54 AM, John Colvin wrote:
 On Wednesday, 17 February 2016 at 07:15:01 UTC, Steven Schveighoffer
 wrote:
 On 2/17/16 1:58 AM, Rikki Cattermole wrote:

 A few things:
 https://github.com/schveiguy/iopipe/blob/master/source/iopipe/traits.d#L126


 why isn't that used more especially with e.g. window?
 After all, window seems like a very well used word...

 Not sure what you mean.

 I don't like that a stream isn't inherently an input range.
 This seems to me like a good place to use this abstraction by default.

 What is front for an input stream? A byte? A character? A word? A line?

 Why not just say it's a ubyte and then compose with ranges from there?

 If I provide a range by element (it may not be ubyte), then that's
 likely not the most useful range to have.

 I hadn't thought of this before, but if we accept that a stream is raw,
 untyped data, it may be best _not_ to provide a range interface
 directly.  It's easy enough to

 alias source = sourceStream.as!ubyte;

 anyway, right?

An iopipe is typed however you want it to be.

bufferedInput by default uses an ArrayBuffer!ubyte. You can have it use 
any type of buffer you want, it doesn't discriminate. The only 
requirement is that the buffer's window is a random-access range 
(although I'm having thoughts that I should just require it to be an array).

But the concept of what constitutes an "item" in a stream may not be the 
"element type". That's what I'm getting at.

 This is why I think it's better to have the user specifically tell me
 "this is how I want to range-ify this stream" rather than assume.

 I think this makes more sense with TLV encodings, too.  Thinking of
 things like:

 switch(source.as!(BERType).popFront){
      case(UNIVERSAL|PRIMITIVE|UTF8STRING){
          int len;
          if(source.as!(BERLength).front & 0b10_00_00_00) {
              // X.690? Never heard of 'em!
          } else {
              len = source.as!(BERLength).popFront;
          }
          return source.buffered(len).as!(string).popFront;
      }
      ...etc.
 }

Very cool looking!

However, you have some issues there :) popFront doesn't return anything. 
And I think parsing/processing stream data works better by examining the 
buffer than shoehorning range functions in there.
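For ordinary Phobos ranges, the popFront point can be illustrated with a small helper that reads `front` and then advances. `frontThenPop` is a hypothetical name used only for this sketch; it is not the `popAs` mused about above, and it is not part of Phobos or iopipe.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

/// Hypothetical helper: return the current front, then advance the range.
/// popFront itself returns void, so the value must be saved first.
ElementType!R frontThenPop(R)(ref R r)
if (isInputRange!R)
{
    assert(!r.empty, "cannot pop from an empty range");
    auto value = r.front;
    r.popFront();
    return value;
}

void main()
{
    int[] r = [10, 20, 30];
    assert(r.frontThenPop == 10); // value returned...
    assert(r == [20, 30]);        // ...and the range advanced
}
```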

-Steve

Feb 18 2016

Wyatt <wyatt.epp gmail.com> writes:

On Thursday, 18 February 2016 at 18:35:40 UTC, Steven 
Schveighoffer wrote:
 On 2/18/16 12:08 PM, Wyatt wrote:
 I hadn't thought of this before, but if we accept that a 
 stream is raw,
 untyped data, it may be best _not_ to provide a range interface
 directly.  It's easy enough to

 alias source = sourceStream.as!ubyte;

 anyway, right?

 An iopipe is typed however you want it to be.

Sorry, sorry, just thinking (too much?) in terms of the 
conceptual underpinnings.

But I don't think we really disagree, either: if you don't give a 
stream a type it doesn't have one "naturally", so it's best to be 
explicit even if you're just asking for raw bytes.  That's all 
I'm really saying there.

 But the concept of what constitutes an "item" in a stream may 
 not be the "element type". That's what I'm getting at.

Hmm, I guess I'm not seeing it.  Like, what even is an "item" in 
a stream?  It sort of precludes that by definition, which is why 
we have to give it a type manually.  What benefit is there to 
giving the buffer type separately from the window that gives you 
a typed slice into it? (I like that, btw.)

 However, you have some issues there :) popFront doesn't return 
 anything.

Clearly, as!() returns the data! ;)

But criminy, I do actually forget that ALL the damn time!  (I 
blame Broadcom.)  The worst part is I think I've even read the 
rationale for why it's like that and agreed with it with much 
nodding of the head and all that. :(

 And I think parsing/processing stream data works better by 
 examining the buffer than shoehorning range functions in there.

I think it's debatable.  But part of stream semantics is being 
able to use it like a stream, and my BER toy was in that vein.  
Sorry again, this is probably not the place for it unless you try 
to replace the std.stream for real.

-Wyatt

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 2:53 PM, Wyatt wrote:
 On Thursday, 18 February 2016 at 18:35:40 UTC, Steven Schveighoffer wrote:

 But the concept of what constitutes an "item" in a stream may not be
 the "element type". That's what I'm getting at.

 Hmm, I guess I'm not seeing it.  Like, what even is an "item" in a
 stream?  It sort of precludes that by definition, which is why we have
 to give it a type manually.  What benefit is there to giving the buffer
 type separately from the window that gives you a typed slice into it? (I
 like that, btw.)

An "item" in a stream may be a line of text, it may be a packet of data, 
it may actually be a byte. But the compiler requires we type the buffer 
as something rigid that it can work with.

The elements of the stream are the basic fixed-sized units we use (the 
array element type). The items are less concrete.

 And I think parsing/processing stream data works better by examining
 the buffer than shoehorning range functions in there.

 I think it's debatable.  But part of stream semantics is being able to
 use it like a stream, and my BER toy was in that vein. Sorry again, this
 is probably not the place for it unless you try to replace the
 std.stream for real.

I think stream semantics are what you should use. I haven't used 
std.stream, so I don't know what the API looks like.

I assumed as! was something that returns a range of that type. Maybe I'm 
wrong?

-Steve

Feb 18 2016

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Thu, Feb 18, 2016 at 03:20:58PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/18/16 2:53 PM, Wyatt wrote:
On Thursday, 18 February 2016 at 18:35:40 UTC, Steven Schveighoffer wrote:

 
But the concept of what constitutes an "item" in a stream may not be
the "element type". That's what I'm getting at.

Hmm, I guess I'm not seeing it.  Like, what even is an "item" in a
stream?  It sort of precludes that by definition, which is why we
have to give it a type manually.  What benefit is there to giving the
buffer type separately from the window that gives you a typed slice
into it? (I like that, btw.)

 
 An "item" in a stream may be a line of text, it may be a packet of
 data, it may actually be a byte. But the compiler requires we type the
 buffer as something rigid that it can work with.
 
 The elements of the stream are the basic fixed-sized units we use (the
 array element type). The items are less concrete.

[...]

But array elements don't necessarily have to be fixed-sized, do they?
For example, an array of lines can be string[] (or const(char)[][]). Of
course, dealing with variable-sized items is messy, and probably rather
annoying to implement.  But it's *possible*, in theory.


T

-- 
People tell me that I'm paranoid, but they're just out to get me.

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 4:02 PM, H. S. Teoh via Digitalmars-d wrote:
 On Thu, Feb 18, 2016 at 03:20:58PM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
 On 2/18/16 2:53 PM, Wyatt wrote:
 On Thursday, 18 February 2016 at 18:35:40 UTC, Steven Schveighoffer wrote:

 But the concept of what constitutes an "item" in a stream may not be
 the "element type". That's what I'm getting at.

 Hmm, I guess I'm not seeing it.  Like, what even is an "item" in a
 stream?  It sort of precludes that by definition, which is why we
 have to give it a type manually.  What benefit is there to giving the
 buffer type separately from the window that gives you a typed slice
 into it? (I like that, btw.)

 An "item" in a stream may be a line of text, it may be a packet of
 data, it may actually be a byte. But the compiler requires we type the
 buffer as something rigid that it can work with.

 The elements of the stream are the basic fixed-sized units we use (the
 array element type). The items are less concrete.

 [...]

 But array elements don't necessarily have to be fixed-sized, do they?
 For example, an array of lines can be string[] (or const(char)[][]). Of
 course, dealing with variable-sized items is messy, and probably rather
 annoying to implement.  But it's *possible*, in theory.

But the point of a stream is that it's contiguous data. A string[] has 
contiguous data that are pointers and lengths of a fixed size 
(sizeof(string) is fixed).

This is not how you'd get data from a file or socket.
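A small check of the fixed-size claim, as a sketch: a D slice is a length plus a pointer, so every element of a string[] occupies the same number of bytes no matter how long the text it refers to is.

```d
void main()
{
    // A slice is a (length, pointer) pair; its size is independent of content.
    static assert(string.sizeof == size_t.sizeof + (void*).sizeof);

    string[] lines = ["a", "a much longer line of text"];

    // The elements sit contiguously at a fixed stride...
    assert(cast(size_t) &lines[1] - cast(size_t) &lines[0] == string.sizeof);

    // ...even though the text they point to differs in length.
    assert(lines[0].length != lines[1].length);
}
```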

Since this library doesn't discriminate what the data source provides 
(it will accept string[] as window type), it's possible. In this case, 
the element type might make sense as the range front type, but it's not 
a typical case. However, it might be interesting as, say, a message 
stream from one thread to another.

-Steve

Feb 18 2016

deadalnix <deadalnix gmail.com> writes:

First, I'm very happy to see that. Sounds like a good project. 
Some remarks:
  - You seem to be using classes. These are good to compose at 
runtime, but we can do better at compile time using value types. 
I suggest using value types and having a class wrapper that can be 
used to make things composable at runtime if desirable.
  - Being able to read/write from an io device in a generator-like 
manner is, I think, important if we are rolling out something new. 
Literally the only thing that can explain the success of Node.js is this 
(https://msdn.microsoft.com/fr-fr/library/hh191443.aspx) or Hack 
(https://docs.hhvm.com/hack/async/introduction).
  - I like the input range stuff. Input ranges need more love.
  - Please explain valves more.
  - ...
  - Profit ?

Feb 17 2016

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, 17 February 2016 at 22:47:27 UTC, deadalnix wrote:

 (https://msdn.microsoft.com/fr-fr/library/hh191443.aspx)

Or for those poor souls who can't read French... ;)

https://msdn.microsoft.com/en-us/library/hh191443.aspx

- Jonathan M Davis

Feb 17 2016

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 17 February 2016 at 23:15:51 UTC, Jonathan M Davis 
wrote:
 On Wednesday, 17 February 2016 at 22:47:27 UTC, deadalnix wrote:

 (https://msdn.microsoft.com/fr-fr/library/hh191443.aspx)

 Or for those poor souls who can't read French... ;)

 https://msdn.microsoft.com/en-us/library/hh191443.aspx

 - Jonathan M Davis

Thank you for the fixup :)

Feb 17 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/17/16 5:47 PM, deadalnix wrote:
 First, I'm very happy to see that. Sounds like a good project. Some
 remarks:
   - You seem to be using classes. These are good to compose at runtime,

I have one class, the IODevice. As I said in the announcement, this 
isn't a focus of the library, just a way to play with the other pieces 
:) Its utility isn't very important. One thing it does do (a relic from 
when I was thinking of trying to replace stdio.File innards) is take 
over a FILE *, and close the FILE * on destruction.

But I'm steadfastly against using classes for the meat of the library 
(i.e. the range-like pipeline types). I do happen to think classes work 
well for raw i/o, since the OS treats i/o items that way (e.g. a network 
socket is a file descriptor, not some other type), but it would be nice 
if you could have class features for non-GC lifetimes. Classes are bad 
for correct deallocation of i/o resources.
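
To illustrate the lifetime point, here is a minimal sketch of a value type that owns a FILE * and closes it deterministically. `OwnedFile` is a hypothetical name and is not how IODevice is implemented; it only shows what a struct destructor gives you that a GC-managed class does not.

```d
import core.stdc.stdio : FILE, fclose, tmpfile;

// Hypothetical value-type owner for a C FILE*; not part of the library.
struct OwnedFile
{
    private FILE* fp;

    this(FILE* fp) { this.fp = fp; }

    // A copy would lead to a double fclose, so copying is forbidden.
    @disable this(this);

    ~this()
    {
        if (fp !is null)
        {
            fclose(fp);
            fp = null;
        }
    }

    bool isOpen() const { return fp !is null; }
}

unittest
{
    {
        auto f = OwnedFile(tmpfile());
        // ... use the file here ...
    } // fclose runs at this closing brace, not whenever the GC collects
}
```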

   - Being able to read/write from an io device in a generator-like
 manner is I think important if we are rolling out something new.

I'm not quite sure what this means.

 Literally the only thing that can explain the success of Node.js is this


async I/O I was hoping could be handled like vibe does (i.e. under the 
hood with fibers).

   - Please explain valves more.

Valves allow all the types that process buffered input to process 
buffered output without changing pretty much anything. It allows me to 
have a "push" mechanism by pulling from the other end automatically.

In essence, the problem of buffered input is very different from the 
problem of buffered output. One is pulling data chunks at a time, and 
processing in finer detail, the other is processing data in finer detail 
and then pushing out chunks that are ready.

The big difference is the end of the pipe that needs user intervention. 
For input, the user is the consumer of data. With output, the user is 
the provider of data.

The problem is, how do you construct such a pipeline? The iopipe 
convention is to wrap the upstream data. For output, the upstream data 
is what you need access to. A std.algorithm.map doesn't give you access 
to the underlying range, right? So if you need access to the earlier 
part of the pipeline, how do you get to it? And how do you know how FAR 
to get to it (i.e. pipeline.subpipe.subpipe.subpipe....)?

This is what the valve is for. The valve has 3 parts, the inlet, the 
processed data, and the outlet. The inlet works like a normal iopipe, 
but instead of releasing data upstream, it pushes the data to the 
processed data area. The outlet can only pull data from the processed 
data. So this really provides a way for the user to control the flow of 
data. (note, a lot of this is documented in the concepts.txt document)
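
The three parts can be modelled with a toy struct. This is only a sketch of the concept as described above; `ToyValve` and its member names are made up for illustration and are not the library's API. The inlet releases data into a processed region, and the outlet can only see and consume that region.

```d
// Toy model of a valve over one buffer:
//   buf[0 .. processed]  released by the inlet, visible to the outlet
//   buf[processed .. $]  still being worked on by the inlet
struct ToyValve(T)
{
    private T[] buf;
    private size_t processed;

    // Inlet side: behaves like a normal pipe end.
    T[] inletWindow() { return buf[processed .. $]; }
    void inletExtend(T[] data) { buf ~= data; }

    // Releasing on the inlet does not discard data, it hands it over.
    void inletRelease(size_t n)
    {
        assert(n <= buf.length - processed);
        processed += n;
    }

    // Outlet side: can only pull from the processed region.
    T[] outletWindow() { return buf[0 .. processed]; }

    void outletRelease(size_t n)
    {
        assert(n <= processed);
        buf = buf[n .. $];
        processed -= n;
    }
}

unittest
{
    ToyValve!int v;
    v.inletExtend([1, 2, 3, 4]);
    assert(v.outletWindow.length == 0); // nothing handed over yet
    v.inletRelease(3);
    assert(v.outletWindow == [1, 2, 3]);
    assert(v.inletWindow == [4]);
}
```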

The reason it's special is because every iopipe is required to provide 
access to an upstream valve inlet if it exists. This makes the API of 
accessing the upstream data MUCH easier to deal with. (i.e. pipeline.valve)

Then I have this wrapper called autoValve, which automatically flushes 
the downstream data when more space is needed, and makes it look like 
you are just dealing with the upstream end. This is exactly the model we 
need for buffered output.
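
As a rough analogy for the auto-flushing behaviour (not the autoValve implementation, which I am not reproducing here), consider a plain buffered writer that only flushes downstream when it runs out of space. `ToyBufferedOutput` and the `sink` delegate are hypothetical names.

```d
// Toy buffered output: the user only writes at the upstream end; the
// buffer is flushed to the sink automatically when more space is needed.
struct ToyBufferedOutput
{
    private ubyte[] buf;
    private size_t used;
    private void delegate(const(ubyte)[]) sink;

    this(size_t capacity, void delegate(const(ubyte)[]) sink)
    {
        assert(capacity > 0);
        buf = new ubyte[capacity];
        this.sink = sink;
    }

    void put(const(ubyte)[] data)
    {
        while (data.length > 0)
        {
            if (used == buf.length)
                flush(); // out of space: push the ready chunk downstream
            immutable room = buf.length - used;
            immutable n = data.length < room ? data.length : room;
            buf[used .. used + n] = data[0 .. n];
            used += n;
            data = data[n .. $];
        }
    }

    // Push whatever is buffered to the sink.
    void flush()
    {
        if (used > 0)
        {
            sink(buf[0 .. used]);
            used = 0;
        }
    }
}

unittest
{
    ubyte[] received;
    auto o = ToyBufferedOutput(4, (const(ubyte)[] chunk) { received ~= chunk; });
    o.put([1, 2, 3, 4, 5, 6]); // the first four bytes are flushed automatically
    assert(received.length == 4);
    o.flush();
    assert(received.length == 6);
}
```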

This way, I can have a push mechanism for output, and all the processing 
pieces (for instance, byte swapping, converting to a different array 
type, etc.) don't even need to care about providing a push mechanism.

   - Profit ?

Yes, absolutely :)

-Steve

Feb 18 2016

Wyatt <wyatt.epp gmail.com> writes:

On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven 
Schveighoffer wrote:
 foreach(line; (new IODevice(0)).bufferedInput
     .asText!(UTFType.UTF8)
     .byLine
     .asInputRange)
    // handle line

This looks pretty all-right so far.  Would something like this 
work?

foreach(pollItem; zmqSocket.bufferedInput
     .as!(zmqPollItem)
     .asInputRange)

 3. The focus of this library is NOT replacement of std.stream, 
 or even low-level i/o in general.

Oh.  Well maybe that's not the case, but it may have potential 
anyway.  If nothing else, for testing API concepts.

 6. There is a concept in here I called "valves". It's very 
 weird, but it allows unifying input and output into one 
 seamless chain. In fact, I can't think of how I could have done 
 output in this regime without them. See the convert example 
 application for details on how it is used.

This... might be cool?  It bears some similarity to my own ideas. 
  I'd like to see more examples, though.

-Wyatt

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 11:07 AM, Wyatt wrote:
 On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven Schveighoffer wrote:
 foreach(line; (new IODevice(0)).bufferedInput
     .asText!(UTFType.UTF8)
     .byLine
     .asInputRange)
    // handle line

 This looks pretty all-right so far.  Would something like this work?

 foreach(pollItem; zmqSocket.bufferedInput
      .as!(zmqPollItem)
      .asInputRange)

Yes, that is the intent. All without copying.

Note, asInputRange may not do what you want here. If multiple 
zmqPollItems come in at once (I'm not sure how your socket works), the 
input range's front will provide the entire window of data, and flush it 
on popFront.

I'll also point at arrayCastPipe 
(https://github.com/schveiguy/iopipe/blob/master/source/iopipe/bufpipe.d#L399), 
which simply casts the input array window to a new type of array window 
(if the items are coming in binary form).
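
For reference, the language feature involved is the array cast, which reinterprets the same memory as a different element type. The sketch below shows only that cast, under the assumption that the byte length is a multiple of the target element size; `castWindow` is a hypothetical helper, not the arrayCastPipe code.

```d
// Reinterpret a byte window as a window of wider elements without copying.
// The byte length must be a multiple of T.sizeof, otherwise the cast is
// rejected at run time.
T[] castWindow(T)(ubyte[] window)
{
    return cast(T[]) window;
}

unittest
{
    auto raw = new ubyte[8];
    auto words = castWindow!uint(raw);
    assert(words.length == 2);
    // Both slices refer to the same memory.
    assert(cast(void*) words.ptr is cast(void*) raw.ptr);
    words[0] = 0xFFFF_FFFF;
    assert(raw[0] == 0xFF);
}
```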

I'm thinking I'll change the name byInputRange to byWindow, and add a 
byElement for an element-wise input range.

 6. There is a concept in here I called "valves". It's very weird, but
 it allows unifying input and output into one seamless chain. In fact,
 I can't think of how I could have done output in this regime without
 them. See the convert example application for details on how it is used.

 This... might be cool?  It bears some similarity to my own ideas.  I'd
 like to see more examples, though.

I'm hoping people can come up with ideas for other uses for them. I 
really like the concept, but the only use case I have right now is 
output streams.

It would be cool to see if there's a use case for multiple valves.

-Steve

Feb 18 2016

Wyatt <wyatt.epp gmail.com> writes:

On Thursday, 18 February 2016 at 16:36:37 UTC, Steven 
Schveighoffer wrote:
 On 2/18/16 11:07 AM, Wyatt wrote:
 This looks pretty all-right so far.  Would something like this 
 work?

 foreach(pollItem; zmqSocket.bufferedInput
      .as!(zmqPollItem)
      .asInputRange)

 Yes, that is the intent. All without copying.

Great!

 Note, asInputRange may not do what you want here. If multiple 
 zmqPollItems come in at once (I'm not sure how your socket 
 works), the input range's front will provide the entire window 
 of data, and flush it on popFront.

Not so great!  That's really not what I'd expect at all. :(  
(This isn't to say it doesn't make sense semantically, but I 
don't like how it feels.)

 I'm thinking I'll change the name byInputRange to byWindow, and 
 add a byElement for an element-wise input range.

Oh, I see.  Naming.  Naming is hard.

-Wyatt

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 12:16 PM, Wyatt wrote:
 On Thursday, 18 February 2016 at 16:36:37 UTC, Steven Schveighoffer wrote:
 Note, asInputRange may not do what you want here. If multiple
 zmqPollItems come in at once (I'm not sure how your socket works), the
 input range's front will provide the entire window of data, and flush
 it on popFront.

 Not so great!  That's really not what I'd expect at all. :( (This isn't
 to say it doesn't make sense semantically, but I don't like how it feels.)

The philosophy that I settled on is to create an iopipe that extends one 
"item" at a time, even if more are available. Then, apply the range 
interface on that.

When I first started to write byLine, I made it a range. Then I thought, 
"what if you wanted to iterate by 2 lines at a time, or iterate by one 
line at a time, but see the last 2 for context?", well, then that would 
be another type, and I'd have to abstract out the functionality of line 
searching.

So I decided to just make an abstract "asInputRange" and just wrap the 
functionality of extending data one line at a time. The idea is to make 
building blocks as simple and useful as possible.

So what I think may be a good fit for your application (without knowing 
all the details) is to create an iopipe that delineates each message and 
extends exactly one message per call to extend. Then, you can wrap that 
in asInputRange, or create your own range which translates the actual 
binary data to a nicer object for each call to front.

So something like:

foreach(pollItem; zmqSocket.bufferedInput
     .byZmqPacket
     .asInputRange)

I'm still not 100% sure that this is the right way to do it...

Hm... if asInputRange took a template parameter of what type it should 
return, then asInputRange!zmqPacket could return zmqPacket(pipe.window) 
for front. That's kind of nice.
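
The "one item per extend" idea can be sketched without the library at all. Everything here is hypothetical: `LengthPrefixedPipe` stands in for something like byZmqPacket and assumes a made-up framing of one length byte followed by the payload, and `WindowRange` mimics the front-is-the-window, release-on-popFront behaviour described above.

```d
// Toy pipe over an in-memory buffer; each extend adds exactly one message.
struct LengthPrefixedPipe
{
    const(ubyte)[] source; // stands in for buffered upstream data
    size_t lo, hi;         // the window is source[lo .. hi]

    const(ubyte)[] window() { return source[lo .. hi]; }

    // Grow the window by one message; returns elements added (0 at the end).
    size_t extend()
    {
        if (hi >= source.length)
            return 0;
        immutable added = 1 + cast(size_t) source[hi]; // prefix + payload
        assert(hi + added <= source.length);
        hi += added;
        return added;
    }

    void release(size_t n)
    {
        assert(n <= hi - lo);
        lo += n;
    }
}

// Range adapter: front is the whole window, popFront releases it.
struct WindowRange(Pipe)
{
    Pipe pipe;

    this(Pipe p)
    {
        pipe = p;
        pipe.extend(); // prime the first item
    }

    @property bool empty() { return pipe.window.length == 0; }
    @property auto front() { return pipe.window; }

    void popFront()
    {
        pipe.release(pipe.window.length);
        pipe.extend();
    }
}

unittest
{
    const(ubyte)[] src = [2, 10, 20, 1, 30];
    size_t count;
    foreach (msg; WindowRange!LengthPrefixedPipe(LengthPrefixedPipe(src)))
    {
        assert(msg.length == 1 + msg[0]); // exactly one message per front
        ++count;
    }
    assert(count == 2);
}
```

Because the pipe only ever extends by one message, the generic range wrapper does not need to know anything about the framing.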

 I'm thinking I'll change the name byInputRange to byWindow, and add a
 byElement for an element-wise input range.

 Oh, I see.  Naming.  Naming is hard.

Yes. It's especially hard when you haven't seen how others react to it :)

-Steve

Feb 18 2016

Kagamin <spam here.lot> writes:

On Thursday, 18 February 2016 at 18:27:28 UTC, Steven 
Schveighoffer wrote:
 The philosophy that I settled on is to create an iopipe that 
 extends one "item" at a time, even if more are available. Then, 
 apply the range interface on that.

 When I first started to write byLine, I made it a range. Then I 
 thought, "what if you wanted to iterate by 2 lines at a time, 
 or iterate by one line at a time, but see the last 2 for 
 context?", well, then that would be another type, and I'd have 
 to abstract out the functionality of line searching.

You mean window has current element and context - lookahead and 
lookbehind? I stumbled across this article 
http://blog.jooq.org/2016/01/06/2016-will-be-the-year-remembered-as-when-java-finally-had-window-functions/ 
it suggests that such window abstraction is generally useful for data 
analysis.

Feb 19 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/19/16 5:22 AM, Kagamin wrote:
 On Thursday, 18 February 2016 at 18:27:28 UTC, Steven Schveighoffer wrote:
 The philosophy that I settled on is to create an iopipe that extends
 one "item" at a time, even if more are available. Then, apply the
 range interface on that.

 When I first started to write byLine, I made it a range. Then I
 thought, "what if you wanted to iterate by 2 lines at a time, or
 iterate by one line at a time, but see the last 2 for context?", well,
 then that would be another type, and I'd have to abstract out the
 functionality of line searching.

 You mean window has current element and context - lookahead and
 lookbehind? I stumbled across this article
 http://blog.jooq.org/2016/01/06/2016-will-be-the-year-remembered-as-when-java-finally-had-window-functions/
 it suggests that such window abstraction is generally useful for data
 analysis.

window doesn't have any "current" pointer. The window itself is the 
current data. But with byLine, you could potentially remember where the 
last N lines were delineated. Hm...

auto byLineWithContext(size_t extraLines = 1, Chain)(Chain c)
{
    auto input = byLine(c);
    input.extend(0); // prime the first line so front is valid immediately
    static struct ByLineWithContext
    {
       typeof(input) chain;
       // prevLines[i] is the window offset where context line i ends;
       // the current line starts at prevLines[$-1]
       size_t[extraLines] prevLines;
       auto front() { return chain.window[prevLines[$-1] .. $]; }
       void popFront()
       {
           // drop the oldest context line and shift the offsets down
           auto offset = prevLines[0];
           foreach(i; 0 .. prevLines.length-1)
           {
               prevLines[i] = prevLines[i+1] - offset;
           }
           // the line that was current becomes the newest context line
           prevLines[$-1] = chain.window.length - offset;
           chain.release(offset);
           chain.extend(0); // extend in the next line
       }
       bool empty()
       {
           // no data past the context lines means there is no current line
           return chain.window.length == prevLines[$-1];
       }
       // previous line of context (i = 0 is the oldest context line)
       auto contextLine(size_t i)
       {
           assert(i < prevLines.length);
           return chain.window[i == 0 ? 0 : prevLines[i-1] .. prevLines[i]];
       }
    }
    return ByLineWithContext(input);
}

It's an interesting transition to think about looking at an entire 
buffer of data instead of some pointer to a single point in a stream as 
the primitive that you have.

-Steve

Feb 19 2016

Chad Joan <chadjoan gmail.com> writes:

On Wednesday, 17 February 2016 at 06:45:41 UTC, Steven 
Schveighoffer wrote:
 It's no secret that I've been looking to create an updated io 
 library for phobos. In fact, I've been working on one on and 
 off since 2011 (ouch).

 ...

Hi everyone, it's been a while.

I wanted to chime in on the streams-as-ranges thing, since I've 
thought about this quite a bit in the past and discussed it with 
Wyatt outside of the forum.

Steve: My apologies in advance if I misunderstood any of the 
functionality of your IO library.  I haven't read any of the 
documentation, just this thread, and my time is over-committed 
as usual.

Anyhow...

I believe that when I am dealing with streams, >90% of the time I 
am dealing with data that is *structured* and *heterogeneous*.  
Here are some use-cases:
1. Parsing/writing configuration files (ex: XML, TOML, etc)
2. Parsing/writing messages from some protocol, possibly over a 
network socket (or sockets).  Example: I am writing a PostgreSQL 
client and need to deserialize messages: 
http://www.postgresql.org/docs/9.2/static/protocol-message-formats.html
3. Serializing/deserializing some data structures to/from disk.  
Example: I am writing a game and I need to implement save/load 
functionality.
4. Serializing/deserializing tabular data to/from disk (ex: .CSV 
files).
5. Reading/writing binary data, such as images or video, from/to 
disk.  This will probably involve doing a bunch of (3), which is 
kind of like (2), but followed by large homogenous arrays of some 
data (ex: pixels).
6. Receiving unstructured user input.  This is my <10%.

Note that (6) is likely to happen eventually but also likely to 
be minuscule: why are we receiving user input?  Maybe it's just 
to store it for retrieval later.  BUT, maybe we actually want it 
to DO something.  If we want it to do something, then we need to 
structure it before code will be able to operate on it.

(5) is a mix of structured heterogeneous data and structured 
homogenous data.  In aggregate, this is structured heterogeneous 
data, because you need to do parsing to figure out where the 
arrays of homogeneous data start and end (and what they *mean*).

This is why I think it will be much more important to have at 
least these two interfaces take front-and-center:
A.  The presence of a .popAs!(...) operation (mentioned by Wyatt 
in this thread, IIRC) for simple deserialization, and maybe for 
other miscellaneous things like structured user interaction.
B.  The ability to attach parsers to streams easily.  This might 
be as easy as coercing the input stream into the basic encoding 
that the parser expects (ex: char/wchar/dchar Ranges for 
compilers, or maybe ubyte Ranges for our PostgreSQL client's 
network layer), though it might need (A) to help a bit first if 
the encoding isn't known in advance (text files can be 
represented in sooo many ways!  isn't it fabulous!).

I understand that most unsuspecting programmers will arrive at a 
stream library expecting to immediately see an InputRange 
interface.  This /probably/ is not what they really want at the 
end of the day.  So, I think it will be very important for any 
such library to concisely and convincingly explain the design 
methodology and rationale early and aggressively.  Neglect to do 
this, and the library and its documentation will become a 
frustration and a violation of expectations (an "astonishment").  
Do it right, and the library's documentation will become a 
teaching tool that leaves visitors feeling enlightened and 
empowered.

Of course, I have to wonder if someone else has contrasting 
experiences with stream use-cases.  Maybe they really would be 
frustrated with a range-agnostic design.  I don't want to 
alienate this hypothetical individual either, so if this is you, 
then please share your experiences.

I hope this helps and is worth making a bunch of you read a wall 
of text ;)

- Chad

Feb 18 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/18/16 6:52 PM, Chad Joan wrote:
 Steve: My apologies in advance if I misunderstood any of the
 functionality of your IO library.  I haven't read any of the
 documentation, just this thread, and my time is over-committed as usual.

Understandable.

 Anyhow...

 I believe that when I am dealing with streams, >90% of the time I am
 dealing with data that is *structured* and *heterogeneous*. Here are
 some use-cases:
 1. Parsing/writing configuration files (ex: XML, TOML, etc)
 2. Parsing/writing messages from some protocol, possibly over a network
 socket (or sockets).  Example: I am writing a PostgreSQL client and need
 to deserialize messages:
 http://www.postgresql.org/docs/9.2/static/protocol-message-formats.html
 3. Serializing/deserializing some data structures to/from disk. Example:
 I am writing a game and I need to implement save/load functionality.
 4. Serializing/deserializing tabular data to/from disk (ex: .CSV files).
 5. Reading/writing binary data, such as images or video, from/to disk.
 This will probably involve doing a bunch of (3), which is kind of like
 (2), but followed by large homogenous arrays of some data (ex: pixels).
 6. Receiving unstructured user input.  This is my <10%.

 Note that (6) is likely to happen eventually but also likely to be
 minuscule: why are we receiving user input?  Maybe it's just to store it
 for retrieval later.  BUT, maybe we actually want it to DO something.
 If we want it to do something, then we need to structure it before code
 will be able to operate on it.

 (5) is a mix of structured heterogeneous data and structured homogenous
 data.  In aggregate, this is structured heterogeneous data, because you
 need to do parsing to figure out where the arrays of homogeneous data
 start and end (and what they *mean*).

 This is why I think it will be much more important to have at least
 these two interfaces take front-and-center:
 A.  The presence of a .popAs!(...) operation (mentioned by Wyatt in this
 thread, IIRC) for simple deserialization, and maybe for other
 miscellaneous things like structured user interaction.

To me, this is a higher-level function. popAs cannot assume to know how 
to read what it is reading. If you mean something like reading an entire 
struct in binary form, that's not difficult to do.
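
For the binary-struct case, a minimal sketch might look like the following. It assumes the bytes are in the machine's native layout (same endianness and padding as the struct), and `readStruct` and `Header` are hypothetical names, not part of the library.

```d
import std.traits : hasIndirections;

// Hypothetical message header, used only for illustration.
struct Header
{
    ushort kind;
    ushort flags;
    uint length;
}

// Copy one T out of the front of the window and advance the window.
// Restricted to structs without pointers or slices, since copying raw
// bytes into those would be meaningless.
T readStruct(T)(ref const(ubyte)[] window)
    if (is(T == struct) && !hasIndirections!T)
{
    assert(window.length >= T.sizeof);
    T result;
    (cast(ubyte*) &result)[0 .. T.sizeof] = window[0 .. T.sizeof];
    window = window[T.sizeof .. $];
    return result;
}

unittest
{
    auto h = Header(1, 2, 300);
    // Round trip through the raw bytes of h.
    const(ubyte)[] window = (cast(const(ubyte)*) &h)[0 .. Header.sizeof].dup;
    assert(readStruct!Header(window) == h);
    assert(window.length == 0);
}
```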

 B.  The ability to attach parsers to streams easily.  This might be as
 easy as coercing the input stream into the basic encoding that the
 parser expects (ex: char/wchar/dchar Ranges for compilers, or maybe
 ubyte Ranges for our PostgreSQL client's network layer), though it might
 need (A) to help a bit first if the encoding isn't known in advance
 (text files can be represented in sooo many ways!  isn't it fabulous!).

This is the fundamental goal for my library -- enabling parsers to read 
data from a "stream" efficiently no matter how that data is sourced. I 
know your time is limited, but I would invite you to take a look at the 
convert program example that I created in my library. In it, I handle 
converting any UTF format to any other UTF format.

https://github.com/schveiguy/iopipe/blob/master/examples/convert/convert.d

 I understand that most unsuspecting programmers will arrive at a stream
 library expecting to immediately see an InputRange interface.  This
 /probably/ is not what they really want at the end of the day.  So, I
 think it will be very important for any such library to concisely and
 convincingly explain the design methodology and rationale early and
 aggressively.  Neglect to do this, and the library and its
 documentation will become a frustration and a violation of expectations
 (an "astonishment"). Do it right, and the library's documentation will
 become a teaching tool that leaves visitors feeling enlightened and
 empowered.

Good points! I will definitely spend some time explaining this.

 Of course, I have to wonder if someone else has contrasting experiences
 with stream use-cases.  Maybe they really would be frustrated with a
 range-agnostic design.  I don't want to alienate this hypothetical
 individual either, so if this is you, then please share your experiences.

 I hope this helps and is worth making a bunch of you read a wall of text ;)

Thanks for taking the time.

-Steve

Feb 18 2016

Chad Joan <chadjoan gmail.com> writes:

On Friday, 19 February 2016 at 01:29:15 UTC, Steven Schveighoffer 
wrote:
 On 2/18/16 6:52 PM, Chad Joan wrote:
 ...

 This is why I think it will be much more important to have at 
 least
 these two interfaces take front-and-center:
 A.  The presence of a .popAs!(...) operation (mentioned by 
 Wyatt in this
 thread, IIRC) for simple deserialization, and maybe for other
 miscellaneous things like structured user interaction.

 To me, this is a higher-level function. popAs cannot assume to 
 know how to read what it is reading. If you mean something like 
 reading an entire struct in binary form, that's not difficult 
 to do.

I think I understand what you mean.  We are entering the problem 
domain of serializing and deserializing arbitrary types.

I think what I'd expect is to have the basic language types 
(ubyte, int, char, string, etc) all covered, and to provide some 
way (or ways) to integrate with serialization code provided by 
other types.  So you can do ".popAs!int" out of the box, but 
".popAs!MyType" will require MyType to provide a .deserialize 
member function.  Understandably, this may require some thought 
(ex: what if MyType is already under constraints from some other 
API that expects serialization? what does this look like if there 
are multiple serialization frameworks? etc etc).  I don't have 
the answer right now and I don't expect it to be solved quickly ;)
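
To make the idea concrete, here is one way the dispatch could look. This is only a sketch of what I described: scalar types are read as raw native-endian bytes, and any other type has to supply a static deserialize hook. The hook's name and signature, and the `Point` type, are assumptions for illustration.

```d
import std.traits : isScalarType;

// Read a T from the front of data and advance data past it.
T popAs(T)(ref const(ubyte)[] data)
{
    static if (isScalarType!T)
    {
        // Built-in types: copy the raw bytes (native byte order assumed).
        assert(data.length >= T.sizeof);
        T value;
        (cast(ubyte*) &value)[0 .. T.sizeof] = data[0 .. T.sizeof];
        data = data[T.sizeof .. $];
        return value;
    }
    else static if (is(typeof(T.deserialize(data)) == T))
    {
        // User types: delegate to the hook they provide.
        return T.deserialize(data);
    }
    else
    {
        static assert(0, T.stringof ~ " needs a static deserialize member");
    }
}

// Hypothetical user type that opts in by providing the hook.
struct Point
{
    int x, y;

    static Point deserialize(ref const(ubyte)[] data)
    {
        Point p;
        p.x = popAs!int(data);
        p.y = popAs!int(data);
        return p;
    }
}

unittest
{
    int a = 7, b = -3;
    const(ubyte)[] data = (cast(const(ubyte)*) &a)[0 .. int.sizeof]
                        ~ (cast(const(ubyte)*) &b)[0 .. int.sizeof];
    auto p = popAs!Point(data);
    assert(p.x == 7 && p.y == -3);
    assert(data.length == 0);
}
```

This says nothing about the harder questions (multiple serialization frameworks, types already constrained by another API); it only shows that the out-of-the-box part and the extension point can share one entry point.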

 B.  The ability to attach parsers to streams easily.  This 
 might be as
 easy as coercing the input stream into the basic encoding that 
 the
 parser expects (ex: char/wchar/dchar Ranges for compilers, or 
 maybe
 ubyte Ranges for our PostgreSQL client's network layer), 
 though it might
 need (A) to help a bit first if the encoding isn't known in 
 advance
 (text files can be represented in sooo many ways!  isn't it 
 fabulous!).

 This is the fundamental goal for my library -- enabling parsers 
 to read data from a "stream" efficiently no matter how that 
 data is sourced. I know your time is limited, but I would 
 invite you to take a look at the convert program example that I 
 created in my library. In it, I handle converting any UTF 
 format to any other UTF format.

 https://github.com/schveiguy/iopipe/blob/master/examples/convert/convert.d

Awesome!

 I understand that most unsuspecting programmers will arrive at 
 a stream
 library expecting to immediately see an InputRange interface.  
 This
 /probably/ is not what they really want at the end of the day.
  So, I
 think it will be very important for any such library to 
 concisely and
 convincingly explain the design methodology and rationale 
 early and
 aggressively.  Neglect to do this, and the library and its
 documentation will become a frustration and a violation of 
 expectations
 (an "astonishment"). Do it right, and the library's 
 documentation will
 become a teaching tool that leaves visitors feeling 
 enlightened and
 empowered.

 Good points! I will definitely spend some time explaining this.

Best of luck :)

 Of course, I have to wonder if someone else has contrasting 
 experiences
 with stream use-cases.  Maybe they really would be frustrated 
 with a
 range-agnostic design.  I don't want to alienate this 
 hypothetical
 individual either, so if this is you, then please share your 
 experiences.

 I hope this helps and is worth making a bunch of you read a 
 wall of text ;)

 Thanks for taking the time.

 -Steve

Thank you for making progress on this problem!

- Chad

Feb 19 2016

Dejan Lekic <dejan.lekic gmail.com> writes:

Steven, this is superb!

Some 10+ years ago, I talked to Tango guys when they worked on 
I/O part of the Tango library and told them that in my head ideal 
abstraction for any I/O work is pipe and that I would actually 
build an I/O library around this abstraction instead of the 
Channel in Java or Conduit in Tango (well, we all know Tango 
borrowed ideas from Java API).

Your work is precisely what I was talking about. Well-done!

Feb 19 2016

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 2/19/16 6:27 AM, Dejan Lekic wrote:
 Steven, this is superb!

 Some 10+ years ago, I talked to Tango guys when they worked on I/O part
 of the Tango library and told them that in my head ideal abstraction for
 any I/O work is pipe and that I would actually build an I/O library
 around this abstraction instead of the Channel in Java or Conduit in
 Tango (well, we all know Tango borrowed ideas from Java API).

 Your work is precisely what I was talking about. Well-done!

Thanks! It is definitely true that my time with Tango opened up my eyes 
to how I/O could be better. I actually wrote the ThreadPipe conduit: 
https://github.com/SiegeLord/Tango-D2/blob/d2port/tango/io/device/ThreadPipe.d

This is one of those libraries where the source code is almost writing 
itself. I feel like I got it right :) Took 5 tries though...

-Steve

Feb 19 2016
