digitalmars.D.learn - Trying to understand RandomSample struct in std.random
- Joseph Rushton Wakeling (17/17) Apr 12 2012 Hello all,
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (12/26) Apr 12 2012 That's misleading. RandomSample is a lazy range. The output seems to be
- Joseph Rushton Wakeling (15/20) Apr 12 2012 Ahhh, it's lazy evaluation. That would explain why you can set ret.gen ...
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (8/14) Apr 12 2012 It is to keep front() trivial. Unless popFront() is called, front()
Hello all, I'm trying to understand the internal operations of the RandomSample struct in std.random. What gets output at the end is clearly an array containing a subset of the original input. But I can't understand how this is constructed. The constructor sets various initial values and then calls a function prime() [looks like an implementation of Knuth's Algorithm S] which returns when the current _index value is selected. Otherwise, it moves along to the next index value. What I don't understand is where the selected values are stored or how they come to be available. I also don't understand the corresponding randomSample functions where a specified random number generator is passed; how it can work to set ret.gen = gen _after_ the RandomSample struct has been initialized. Can anyone help me to understand how this struct works? Thanks & best wishes, -- Joe
Apr 12 2012
On 04/12/2012 11:30 AM, Joseph Rushton Wakeling wrote:What gets output at the end is clearly an array containing a subset of the original input.That's misleading. RandomSample is a lazy range. The output seems to be elements of an array only as you pull data out of this range.The constructor sets various initial values and then calls a function prime() [looks like an implementation of Knuth's Algorithm S] which returns when the current _index value is selected. Otherwise, it moves along to the next index value. What I don't understand is where the selected values are stored or how they come to be available.The actual values are produced by the range that is passed to RandomSample. RandomSample merely skips over some of those values.I also don't understand the corresponding randomSample functions where a specified random number generator is passed; how it can work to set ret.gen = gen _after_ the RandomSample struct has been initialized.It is ok because the returned RandomSample range is not used yet. It will be used only after having been returned from the randomSample() convenience function.Can anyone help me to understand how this struct works?I shamelessly provide a link to a chapter that I have written, which I think covers all of these topics: http://ddili.org/ders/d.en/ranges.htmlThanks & best wishes, -- JoeAli
Apr 12 2012
On 12/04/12 20:39, Ali Çehreli wrote:That's misleading. RandomSample is a lazy range. The output seems to be elements of an array only as you pull data out of this range.Ahhh, it's lazy evaluation. That would explain why you can set ret.gen = gen in the randomSample functions and have it work. (... realized that before reading your explicit statement to that effect:-)I shamelessly provide a link to a chapter that I have written, which I think covers all of these topics: http://ddili.org/ders/d.en/ranges.htmlNo shame required, your textbook is very nice :-) I think I understand better now -- that the initial value is calculated by the call to prime() in the constructor, which leaves _input.front as being the first selected value; and then popFront() calls prime() again to set the next value, and so on. What I _still_ don't understand is the statement in the constructor: // we should skip some elements initially so we don't always // start with the first ... as it seems to me that doing so would bugger up the selection algorithm. The whole point of Algorithm S is that it considers one by one each of the possible values in sequence.
Apr 12 2012
On 04/12/2012 01:27 PM, Joseph Rushton Wakeling wrote:What I _still_ don't understand is the statement in the constructor: // we should skip some elements initially so we don't always // start with the first ... as it seems to me that doing so would bugger up the selection algorithm. The whole point of Algorithm S is that it considers one by one each of the possible values in sequence.It is to keep front() trivial. Unless popFront() is called, front() should always return the same element. That's why the constructor goes to the first selection up front so that the first call to front() need not calculate anything. As a result it reduces the laziness of the whole thing though. Even if popFront() is never called, the original range is consumed. Ali
Apr 12 2012