digitalmars.D - Truly lazy ranges, transient .front, and std.range.Generator
- Joseph Rushton Wakeling via Digitalmars-d (43/43) Aug 15 2015 Hello all,
- Alex Parrill (15/16) Aug 16 2015 I had this issue recently when reading from a command-line-style
- Joseph Rushton Wakeling via Digitalmars-d (7/20) Aug 18 2015 Interesting, thanks for sharing that. I'm not at all surprised that an ...
- HaraldZealot (27/38) Aug 18 2015 Do you mean something like that:
- Joseph Rushton Wakeling via Digitalmars-d (72/86) Aug 18 2015 Yes, broadly like your example, although note that your version won't ha...
- HaraldZealot (13/16) Aug 18 2015 Good point! And you show acceptable solution.
- Joseph Rushton Wakeling via Digitalmars-d (3/12) Aug 18 2015 What kind of issues did you have in mind?
Hello all, One of the design considerations I've been mulling over recently, in terms of random number generation functionality, is that while we talk of ranges being lazily evaluated, in fact this isn't strictly true. Most ranges are in fact a mix of lazy and eager: lazy in their use of popFront, but eager in that the current value of .front is always pre-calculated, usually starting with the first value being calculated in the constructor. This is fine for many ranges, whose outcome is in any case deterministic, but it's not appropriate for many other cases, where the appropriate moment of evaluation of 'front' is arguably _the moment where front is called_. This kind of requirement clearly holds anywhere where the values of the range depend on some kind of external input, and that arguably should include things like sources of randomness (even if they are, under the hood, only pseudo-random). One possible support for this is given by std.range.Generator, which implements a very simple wrapper of a callable entity via the following range primitives: enum empty = false; auto ref front() property { return fun(); } void popFront() { } Now, first, I should say that I think this is a good example of how the classification and concept of ranges is incomplete -- there is a case for an even simpler range than InputRange, one which _just_ has .empty and .front defined, and which we might call a TransientRange -- but second, it's problematic, inasmuch as it's an incomplete solution to the problem of truly lazy ranges. In some cases we're going to want true laziness of evaluation, but _not_ transience of .front. In these cases, the _first_ time .front is called, its value should be freshly evaluated; but thereafter, the value is cached, until .popFront() is called, at which point .front will be re-evaluated lazily the next time it's called. Something like std.algorithm.cache is inappropriate here, precisely because it's eager in its calculation of the cached values. As far as I can see, we don't have a valid solution for this case. We have some functionality in std.random -- e.g. randomCover and randomSample -- which implements rather shaky workarounds for that lack. I can't imagine this isn't a known case/problem, but as far as I can see any discussion of it has been more on GitHub than here. So before rushing off with Yet Another Custom Solution, I thought I'd raise the issue here, to ask what the current state of thought is on these kinds of challenge, and what may be upcoming in people's roadmaps. Thanks & best wishes, -- Joe
Aug 15 2015
On Saturday, 15 August 2015 at 10:06:13 UTC, Joseph Rushton Wakeling wrote:...I had this issue recently when reading from a command-line-style TCP connection; I needed to read the line up to the \n separator, but consuming the separator meant waiting for the next byte that would never arrive unless a new command was sent. So I made a wrapper range that evaluates the wrapped range's popFront only when front/empty is first called ("just in time"). Source code here: https://gist.github.com/ColonelThirtyTwo/0dfe76520efcda02d848 You can throw it in a UFCS chain anywhere except (for some reason) after something that takes a delegate template parameter like map. For example: auto reader = SocketReader(socket).joiner.jitRange.map!(byt => cast(char) byt);
Aug 16 2015
On 17/08/15 00:33, Alex Parrill via Digitalmars-d wrote:On Saturday, 15 August 2015 at 10:06:13 UTC, Joseph Rushton Wakeling wrote:Interesting, thanks for sharing that. I'm not at all surprised that an IO-based range should have similar issues to the one I describe; a comparable use-case I was thinking of was reading bytes from /dev/urandom. My own concern was whether there was any sort of generic functionality for constructing lazy-front ranges like this; I'll share more on my own approach replying to Harald Zealot below....I had this issue recently when reading from a command-line-style TCP connection; I needed to read the line up to the \n separator, but consuming the separator meant waiting for the next byte that would never arrive unless a new command was sent. So I made a wrapper range that evaluates the wrapped range's popFront only when front/empty is first called ("just in time"). Source code here: https://gist.github.com/ColonelThirtyTwo/0dfe76520efcda02d848 You can throw it in a UFCS chain anywhere except (for some reason) after something that takes a delegate template parameter like map. For example: auto reader = SocketReader(socket).joiner.jitRange.map!(byt => cast(char) byt);
Aug 18 2015
On Saturday, 15 August 2015 at 10:06:13 UTC, Joseph Rushton Wakeling wrote:... In some cases we're going to want true laziness of evaluation, but _not_ transience of .front. In these cases, the _first_ time .front is called, its value should be freshly evaluated; but thereafter, the value is cached, until .popFront() is called, at which point .front will be re-evaluated lazily the next time it's called. Something like std.algorithm.cache is inappropriate here, precisely because it's eager in its calculation of the cached values. ... -- JoeDo you mean something like that: ```d struct Range { public: enum empty = false; auto ref front() property { if(mustBeEvaluated) { cache = fun(); mustBeEvaluated = false; } return cache; } void popFront() { mustBeEvaluated = true; } private: ReturnType!fun cache; bool mustBeEvaluated = true; } ``` ?
Aug 18 2015
On 18/08/15 10:59, HaraldZealot via Digitalmars-d wrote:On Saturday, 15 August 2015 at 10:06:13 UTC, Joseph Rushton Wakeling wrote:Yes, broadly like your example, although note that your version won't handle multiple popFront() calls in succession without any .front call in-between. My own approach, which I knocked up earlier this month, was to create a generic mixin template that could be used to construct a lazy input range: // --------------------------------------------------------- mixin template LazyGen(T) { T front__; bool evaluated__ = false; public: enum bool empty = false; T front() property { if (!this.evaluated__) { this.front__ = this.evaluateFront__(); this.evaluated__ = true; } return this.front__; } void popFront() { if (this.evaluated__) { this.evaluated__ = false; } else { this.front__ = this.evaluateFront__(); } assert(!this.evaluated__); } } // --------------------------------------------------------- ... which could be mixin'd to any struct or class defining an evaluateFront__ method; although I think in retrospect I might rewrite that a bit to be more generic, not making assumptions about persistent values or the .empty condition: // --------------------------------------------------------- mixin template LazyPopFront() { private bool evaluated__ = false; public T front() property { if (!this.evaluated__) { this.popFront__(); // the 'true', private popFront this.evaluated__ = true; } return this.front__(); // private property method } public void popFront() { if (this.evaluated__) { this.evaluated__ = false; } else { this.popFront__(); } assert(!this.evaluated__); } } // --------------------------------------------------------- There are probably some other subtleties that need to be taken into account here, but broadly I think the above covers the fundamental requirements for a range whose front is truly lazily evaluated. Anyway, I'll try and follow up in the not-too-distant future with some more concrete example of how this can be used for some interesting stuff with RNGs. It would be interesting to know if the above solves any use-cases for anyone else, too.... In some cases we're going to want true laziness of evaluation, but _not_ transience of .front. In these cases, the _first_ time .front is called, its value should be freshly evaluated; but thereafter, the value is cached, until .popFront() is called, at which point .front will be re-evaluated lazily the next time it's called. Something like std.algorithm.cache is inappropriate here, precisely because it's eager in its calculation of the cached values. ... -- JoeDo you mean something like that:
Aug 18 2015
On Tuesday, 18 August 2015 at 19:37:47 UTC, Joseph Rushton Wakeling wrote:Yes, broadly like your example, although note that your version won't handle multiple popFront() calls in succession without any .front call in-between.Good point! And you show acceptable solution. The last your example I prefer to parametrize with alias to functions: ```d mixin template LazyPopFront(alias frontImplementation, alias popFrontImplementation) ... ``` It gives more flexibility. One that worries me, if someone start to make odd things like create shared instance of this range.
Aug 18 2015
On 18/08/15 22:47, HaraldZealot via Digitalmars-d wrote:The last your example I prefer to parametrize with alias to functions: ```d mixin template LazyPopFront(alias frontImplementation, alias popFrontImplementation) ... ``` It gives more flexibility.Yea, good call.One that worries me, if someone start to make odd things like create shared instance of this range.What kind of issues did you have in mind?
Aug 18 2015