digitalmars.D - Output range primitives

Joel Nilsson <nilserikjoel96 hotmail.com> writes:

For something to be an output range it has to define the put 
primitive. I am wondering why there is only a single type of 
output range. Would it not make sense to have, say, a fillable 
output range which defines @property bool full()?

This could be used in functions like copy that currently just 
spam the output range until there's nothing left to throw at it. 
I can imagine situations where you would want to continuously 
copy from the input range until the output range was "satisfied".

Let's say we have a range that acts as a message queue between 
two threads. The consumer would have no issue checking if there 
are any available messages by calling empty, but the producer 
currently has no standard way of checking if the buffer is full.

Is there any reason that this isn't in the range spec, or has the 
need for it not arisen yet? Maybe it's just a poor idea?
Enlighten me.
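
A minimal sketch of the idea, assuming a trait and a copy variant that do not exist in Phobos: isFillableOutputRange, copyUntilFull and BoundedQueue are hypothetical names made up for illustration. Only isOutputRange, put and the input range primitives come from std.range.primitives.

```d
import std.range.primitives : ElementType, empty, front, isInputRange,
    isOutputRange, popFront, put;

// Hypothetical trait (not in Phobos): an output range of E that also
// exposes a boolean `full`.
enum isFillableOutputRange(R, E) =
    isOutputRange!(R, E) && is(typeof((R r) { bool b = r.full; }));

// Hypothetical bounded sink, standing in for a fixed-size message queue.
struct BoundedQueue
{
    private int[] buffer;
    private size_t count;

    this(size_t capacity) { buffer = new int[capacity]; }

    void put(int value)
    {
        assert(!full, "put on a full BoundedQueue");
        buffer[count++] = value;
    }

    @property bool full() const { return count == buffer.length; }

    @property const(int)[] data() const { return buffer[0 .. count]; }
}

// Copies until the source is exhausted or the sink reports full,
// and returns whatever is left of the source.
Source copyUntilFull(Source, Sink)(Source source, ref Sink sink)
if (isInputRange!Source && isFillableOutputRange!(Sink, ElementType!Source))
{
    while (!source.empty && !sink.full)
    {
        put(sink, source.front);
        source.popFront();
    }
    return source;
}

void main()
{
    auto queue = BoundedQueue(3);
    auto rest = copyUntilFull([1, 2, 3, 4, 5], queue);

    assert(queue.full);
    assert(queue.data == [1, 2, 3]);
    assert(rest == [4, 5]);   // the producer keeps what did not fit
}
```

Returning the unconsumed part of the source is one possible design; it lets the producer retry later instead of losing elements.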

Jul 05 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, July 05, 2017 21:13:57 Joel Nilsson via Digitalmars-d wrote:
 For something to be an output range it has to define the put
 primitive. I am wondering why there is only a single type of
 output range. Would it not make sense to have, say, a fillable
 output range which defines @property bool full()?

 This could be used in functions like copy that currently just
 spam the output range until there's nothing left to throw at it.
 I can imagine situations where you would want to continuously
 copy from the input range until the output range was "satisfied".

 Let's say we have a range that acts as a message queue between
 two threads. The consumer would have no issue checking if there
 are any available messages by calling empty, but the producer
 currently has no standard way of checking if the buffer is full.

 Is there any reason that this isn't in the range spec, or has the
 need for it not arisen yet? Maybe it's just a poor idea?
 Enlighten me.

The reality of the matter is that output ranges have never been well fleshed
out. They seem to have been mainly created in order to have the opposite of
input ranges, but they were never designed anywhere near as thoroughly as
input ranges.

For some stuff to work well, we really do need some sort of property
analogous to length which indicates how many more elements could be put into
the output range, and having something like full that was analogous to empty
would make sense as well. As it stands, you basically get to put elements
into a range until it throws, because it's full, and while that makes sense
under some circumstances, in others, it is most definitely not ideal.
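
For illustration, one case where the remaining capacity is already visible today is a plain slice used as an output range: put writes to the front element and then pops it, so the length of the slice is the number of elements that can still be written. This is only a sketch of that one case, not a general primitive; the variable names are made up.

```d
import std.range.primitives : put;

void main()
{
    int[] storage = new int[3];
    int[] sink = storage;   // the slice itself is the output range

    put(sink, 10);
    put(sink, 20);

    // Each put consumed one slot from the front of the slice.
    assert(sink.length == 1);
    assert(storage == [10, 20, 0]);

    put(sink, 30);

    // Nothing left to write into: the caller has to check this
    // before the next put, since put itself offers no such query.
    assert(sink.length == 0);
    assert(storage == [10, 20, 30]);
}
```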

Also, std.digest introduced a primitive called finish (if I remember the
name correctly) which basically "finishes" an output range so that it's
possible for an algorithm to do something to it after it's been filled (e.g.
if you had to do a checksum on the data and add it to the end or something).
But while I believe that std.digest uses it consistently, it's never been
formalized beyond that.

There may be other things that should really be done to output ranges as
well, but really, what it comes down to is that they were designed just
enough to get by and then were never properly revisited. So, they definitely
work, but they're a bit hampered in comparison to what they could or should
be.

- Jonathan M Davis

Jul 05 2017

Jack Stouffer <jack jackstouffer.com> writes:

On Wednesday, 5 July 2017 at 21:28:31 UTC, Jonathan M Davis wrote:
 So, they definitely work, but they're a bit hampered in 
 comparison to what they could or should be.

Yes, and it will take someone writing a DIP for this to be fixed 
(not aiming this at you, just notifying people who don't know).

Jul 05 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Wed, Jul 05, 2017 at 03:28:31PM -0600, Jonathan M Davis via Digitalmars-d
wrote:
[...]
 The reality of the matter is that output ranges have never been well
 fleshed out. They seem to have been mainly created in order to have
 the opposite of input ranges, but they were never designed anywhere
 near as thoroughly as input ranges.
 
 For some stuff to work well, we really do need some sort of property
 analogous to length which indicates how many more elements could be
 put into the output range, and having something like full that was
 analogous to empty would make sense as well. As it stands, you
 basically get to put elements into a range until it throws, because
 it's full, and while that makes sense under some circumstances, in
 others, it is most definitely not ideal.

The way I see it, isOutputRange defines a bare minimum, no-frills output
range.  Just like plain input ranges, it doesn't give us much, but does
give us enough to get by, i.e., a .put method for writing stuff into it.
This provides the "base concept" that additional functionality can be
added onto, akin to forward ranges, bidirectional ranges, etc.

It seems clear that there's at least a subconcept of output range that
has limited capacity.  I'm not sure what's a good name for it, but
perhaps a FiniteOutputRange could be a tentative name. So you'd have:

	Output Range:
		Defines a .put method

	Finite Output Range (inherits from Output Range):
		Defines a .full method that indicates when the output
		range is full.

I wouldn't commit to a .length method, because there may be some output
ranges that cannot commit to a specific length, but do know when they're
full.  Instead, we could have a hasLength template that checks whether a
specific output range has a .length method.  An output range with a
.length method would be expected to also be a finite output range (i.e.,
has .full), and .length should return 0 when .full is true.


 Also, std.digest introduced a primitive called finish (if I remember
 the name correctly) which basically "finishes" an output range so that
 it's possible for an algorithm to do something to it after it's been
 filled (e.g.  if you had to do a checksum on the data and add it to
 the end or something).  But while I believe that std.digest uses it
 consistently, it's never been formalized beyond that.

[...]

IMO .finish is too specific of an idea to be formalized as part of a
generic output range concept.  A better idea might be a flushable output
range with a .flush method that performs some action on the data that's
accumulated in the output range so far, and resets it back to a state
where it's not full. The model I have in mind here is a buffer that can
be flushed to disk once it's full, thus emptying it out again for reuse,
or a digest algorithm that needs to perform some action, like write a
checksum to the end of the data, and afterwards resets its state so that
it can compute the checksum of a new segment of data.

It's unclear, however, whether .flush (or .finish) should be part of a
distinct derivation tree from a plain output range, or whether it should
be somehow related to finite output ranges. I.e., whether we should have
a linear progression of output ranges:

	Output range -> finite output range ->(?) flushable output range

or a non-linear hierarchy:

	               Output range
		      /            \
	Finite output range      Flushable output range

Can there also be an output range that's both finite and has a .flush
method?  Seems possible.  Or perhaps .flush is an orthogonal aspect that
should be treated separately, e.g., with a hasFlush template akin to
hasLength for testing whether an output range has a .flush method.
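
As a rough sketch of the orthogonal-trait option, assuming a hasFlush trait that does not exist in Phobos: ChunkedSink and its flushed member are hypothetical, with an in-memory array of chunks standing in for "written to disk".

```d
import std.range.primitives : isOutputRange, put;

// Hypothetical trait (not in Phobos): R has a callable .flush().
enum hasFlush(R) = is(typeof((R r) { r.flush(); }));

// A fixed-size buffer that hands its contents off when flushed and
// is then ready for reuse.
struct ChunkedSink
{
    private int[] buffer;
    private size_t used;
    int[][] flushed;   // stand-in for data written out elsewhere

    this(size_t capacity)
    {
        assert(capacity > 0, "capacity must be positive");
        buffer = new int[capacity];
    }

    void put(int value)
    {
        if (full) flush();   // make room instead of failing
        buffer[used++] = value;
    }

    @property bool full() const { return used == buffer.length; }

    void flush()
    {
        if (used == 0) return;
        flushed ~= buffer[0 .. used].dup;
        used = 0;   // back to a state where it is not full
    }
}

void main()
{
    static assert(isOutputRange!(ChunkedSink, int));
    static assert(hasFlush!ChunkedSink);
    static assert(!hasFlush!(int[]));

    auto sink = ChunkedSink(2);
    foreach (value; [1, 2, 3, 4, 5])
        put(sink, value);
    sink.flush();   // hand off the final partial chunk

    assert(sink.flushed == [[1, 2], [3, 4], [5]]);
    assert(!sink.full);
}
```

Here the type is both finite and flushable, which suggests that testing for .flush separately from .full is at least workable.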

More thought needs to be put into formalizing all of this.


T

-- 
Making non-nullable pointers is just plugging one hole in a cheese grater. --
Walter Bright

Jul 05 2017

Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:

On Wednesday, July 05, 2017 15:07:46 H. S. Teoh via Digitalmars-d wrote:
 More thought needs to be put into formalizing all of this.

That's really what it comes down to, but as you suggest, I expect that
whatever changes we make would be additive with additional traits, similar
to what we have with input ranges, even if it's something basic like a
property called full, simply because requiring anything new would break
existing code - and the reality of the matter is that what we have does work
as-is on some level without additional primitives. It's just more limited
than it could or should be. But regardless, one or more people are going to
have to take the time to work out a proposal for anything to happen, and it
doesn't seem to have been a big enough issue for that to happen yet.

- Jonathan M Davis

Jul 05 2017
