digitalmars.D - A simple solution for randomness copy problem
- Ilya Yaroshenko (65/75) Nov 29 2016 Hello,
- Joseph Rushton Wakeling (25/34) Nov 30 2016 Oh, nice. This feels like a bit of a "facepalm" moment as in,
- Ilya Yaroshenko (9/16) Nov 30 2016 Of course they can work with random devices. It looks strange to
-
Joseph Rushton Wakeling
(7/12)
Nov 30 2016
The terminology here (from the C++11
standard) is: - Dicebot (4/4) Nov 30 2016 One minor nitpick - please avoid calling postblit constructor a
Hello, The problem is that a structure for a random algorithm or a random variable may holds its own precomputed random state, which is not correct to copy. From [1] by Joseph Rushton Wakeling:Essentially the sampling algorithm carries out its own little internal pseudo-random process which falls back to using the RNG when necessary. I've had a lot of conversations with different people around this issue, and I've tried out a bunch of different ideas, including functor RNGs which would be wrapped with a range API accessing them via pointer (a suggestion Dicebot among others made to me). In every case, I've run into difficulty when moving from RNGs and distributions to random algorithms like RandomCover and RandomSample.The solution is to add a `hot` flag that indicates that precomputed random values can be used and reset this flag in copy constructor. It works without performance issues for the Vitter's algorithm and Normal sampling (of course if you don't copy the struct for each call). Both Vitter's random sample algorithm and normal distribution needs this flag or analog anyway (at least in my implementations). Below are 2 reduced code samples. The first one is for Vitter's strides for random sample and the second one is for Normal distribution. The actual code for random sample (`sample`) can be found at [2], and for `NormalVairable` at 3. [1] http://forum.dlang.org/post/gpsilefdillwtleuwohl forum.dlang.org [2] https://github.com/libmir/mir-random/blob/master/source/mir/random/algorithm.d [3] https://github.com/libmir/mir-random/blob/master/source/mir/random/variable.d ----------- Vitter's Algorithm D --------- struct VitterStrides { private double vprime; // A future/pregenerated random value private bool hot; // A flag that indicates that vprime can be used this(this) { hot = false; } sizediff_t opCall(G)(ref G gen) { ... } } ---------- Normal Distribution ---------- RandomVariable struct NormalVariable(T) if (isFloatingPoint!T) { private T y = 0; private bool hot; this(this) { hot = false; } /// T opCall(G)(ref G gen) if (isSaturatedRandomEngine!G) { T x = void; if (hot) { hot = false; x = y; } else { // compute independent x, y at once } ... } }
Nov 29 2016
On Tuesday, 29 November 2016 at 08:50:52 UTC, Ilya Yaroshenko wrote:The solution is to add a `hot` flag that indicates that precomputed random values can be used and reset this flag in copy constructor. It works without performance issues for the Vitter's algorithm and Normal sampling (of course if you don't copy the struct for each call).Oh, nice. This feels like a bit of a "facepalm" moment as in, "How did no one think of that before?" :-P It could be a heavy performance hit for larger state variables (e.g. a Ziggurat), but on that note ...Both Vitter's random sample algorithm and normal distribution needs this flag or analog anyway (at least in my implementations).I'm not sure that it's so important for distributions. Unlike the general case of random algorithms (which likely need to be created multiple times in inner program loops), distributions are much more likely to be created at a relatively high level and then _used_ multiple times in an inner loop. That means that in principle, distributions could be treated in the same way as RNGs, i.e. just ban copying by value and expect them to be passed around by ref or by pointer. Of course, including the `hot` flag could offer best of both worlds: if you pass around by ref then you avoid needing to recreate the internal state, if you pass around by value then the internal state is regenerated for statistical safety. I'll try this out in detail with the Phobos std.random stuff and see how it plays. Thank you _very_ much for focusing on this problem.if (isSaturatedRandomEngine!G)Question on your terminology here: while saturated makes sense, is it really your intention to restrict things to random _engines_ (i.e. pseudo-random algorithms)? Surely it should also be possible for this functionality to work with random devices?
Nov 30 2016
On Wednesday, 30 November 2016 at 11:29:25 UTC, Joseph Rushton Wakeling wrote:On Tuesday, 29 November 2016 at 08:50:52 UTC, Ilya YaroshenkoOf course they can work with random devices. It looks strange to me to have explicit API difference between engines and devices. A random devices can be marked as random engines or we can add a simple generic adaptor (Device->Engine) for them. --Ilya I will be happy to se your PR for random devices to mir-random. Thanks, Ilyaif (isSaturatedRandomEngine!G)Question on your terminology here: while saturated makes sense, is it really your intention to restrict things to random _engines_ (i.e. pseudo-random algorithms)? Surely it should also be possible for this functionality to work with random devices?
Nov 30 2016
On Wednesday, 30 November 2016 at 12:28:28 UTC, Ilya Yaroshenko wrote:Of course they can work with random devices. It looks strange to me to have explicit API difference between engines and devices. A random devices can be marked as random engines or we can add a simple generic adaptor (Device->Engine) for them.The terminology here (from the C++11 <random> standard) is: * generator == source of uniformly-distributed random bits * engine == pseudo-random generator (i.e. seedable sequences) * device == 'truly random' generatorI will be happy to se your PR for random devices to mir-random.I'll see what I can do.
Nov 30 2016
One minor nitpick - please avoid calling postblit constructor a "copy constructor", it tends to mislead developers with C++ origins into expecting copy constructor they are used to - and disappointment when it proves to not be the case.
Nov 30 2016