digitalmars.D.bugs - [Issue 8247] New: Inconsistent behaviour of randomSample depending on whether a random number generator is specified
- d-bugmail puremagic.com (52/52) Jun 14 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
- d-bugmail puremagic.com (7/7) Jun 14 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
- d-bugmail puremagic.com (11/11) Jun 14 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
- d-bugmail puremagic.com (17/17) Jun 14 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
- d-bugmail puremagic.com (20/20) Jun 15 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
- d-bugmail puremagic.com (24/25) Jun 15 2012 by reference where needed.
- d-bugmail puremagic.com (12/38) Jun 15 2012 http://d.puremagic.com/issues/show_bug.cgi?id=8247
http://d.puremagic.com/issues/show_bug.cgi?id=8247

Summary: Inconsistent behaviour of randomSample depending on whether a random number generator is specified
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Phobos
AssignedTo: nobody puremagic.com
ReportedBy: joseph.wakeling webdrake.net

2012-06-14 12:27:53 PDT ---
Created an attachment (id=1116)
Working minimal example illustrating the inconsistencies described.

The randomSample function in std.random can be called with or without specifying a random number generator to use.

If no RNG is specified, then each lazy evaluation of the sample evaluates differently, i.e. if you do

    auto sample1 = randomSample(iota(0, 100), 5);
    writeln(sample1);
    writeln(sample1);
    writeln(sample1);

you will get 3 different samples.

Conversely, if a random number generator is specified, you will get the same result 3 times:

    auto sample2 = randomSample(iota(0, 100), 5, Random(unpredictableSeed));
    writeln(sample2);
    writeln(sample2);
    writeln(sample2);

Note that the seeding of the RNG is important, because if an already-existing RNG is provided to create multiple different samples, they will evaluate identically, e.g.

    auto sample3 = randomSample(iota(0, 100), 5, rndGen);
    writeln(sample3);
    auto sample4 = randomSample(iota(0, 100), 5, rndGen);
    writeln(sample4);
    auto sample5 = randomSample(iota(0, 100), 5, rndGen);
    writeln(sample5);

... will produce the same output 3 times. This happens because the RNG passed to randomSample is copied rather than used by reference.

These inconsistencies are a likely source of confusion and bugs.

So, first of all, we need a firm decision on how the lazy evaluation of RandomSample should behave -- should it

(1) always evaluate to the same sample, or
(2) always evaluate to a different sample?

... and depending on the answer, we then need to address how to specify and seed an RNG for RandomSample.
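Editorial sketch: the copying effect described in the report can be reproduced without randomSample at all, assuming only that std.random's Random is a struct (a value type), as stated later in this thread. The helper name copyReplaysOriginal and the seed 42 are hypothetical, chosen for illustration.

```d
import std.random : Random;

// Hypothetical helper: returns true if a plain struct copy of the generator
// produces the same first `n` values as the original it was copied from.
bool copyReplaysOriginal(uint seed, size_t n)
{
    auto original = Random(seed);
    auto copy = original;      // value copy: duplicates the internal state

    foreach (i; 0 .. n)
    {
        if (original.front != copy.front)
            return false;
        original.popFront();   // advancing one generator...
        copy.popFront();       // ...does not advance the other
    }
    return true;
}

void main()
{
    // Both generators walk through the same sequence independently.
    assert(copyReplaysOriginal(42, 10));
}
```

Any function that stores its own copy of a generator passed by value is in the position of `copy` here, which would be consistent with the identical samples reported above.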
-- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jun 14 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

2012-06-14 12:35:44 PDT ---
Online discussion on this: http://forum.dlang.org/thread/4FD735EB.70404 webdrake.net
Jun 14 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

jens.k.mueller gmx.de changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |jens.k.mueller gmx.de

I opt for returning the same sample (option 1). I want the sample to stay the same.
Jun 14 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

Jonathan M Davis <jmdavisProg gmx.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |jmdavisProg gmx.com

PDT ---
If you want randomSample to be consistent as to which you get, it needs to be made to handle both reference type and value type random number generating ranges identically, since they could be either. At present, all of those in std.random are value types, which is actually a problem in general. They really should be reference types.

But regardless of which they are, there's nothing stopping someone from implementing either a value type or a reference type range which is a random number generator, in which case you'll get inconsistent behavior if randomSample doesn't code for both by using save where appropriate.
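Editorial sketch: one way to read "using save where appropriate" is that generic code asks for an explicit copy through the forward-range primitive save whenever it wants a copy, instead of relying on whatever assignment happens to do for a given generator type. The function peekNext below is hypothetical and is not part of std.random; it assumes the generator is a forward range, which holds for Random.

```d
import std.random : Random;
import std.range.primitives : isForwardRange;

// Hypothetical helper: reports the value the generator will produce after
// its next popFront, without advancing the caller's generator.
auto peekNext(R)(ref R rng)
if (isForwardRange!R)
{
    auto snapshot = rng.save;  // explicit, independent copy of the state
    snapshot.popFront();
    return snapshot.front;
}

void main()
{
    auto rng = Random(42);
    immutable before = rng.front;
    immutable peeked = peekNext(rng);

    assert(rng.front == before);  // the caller's generator did not move
    rng.popFront();
    assert(rng.front == peeked);  // and its next value is the one we peeked
}
```

Because the copy is requested through save, the intent is the same whether R is a struct or a class-based generator that implements save as a true copy.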
Jun 14 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

Now I see why you want to pass the RNG by reference: you may want two functions to share the same generator. But then I would pass them all by reference, for consistency, and give every function the default argument rndGen(), which could be renamed to defaultRNG(). randomShuffle is already doing it this way, though I don't see why it sets the template argument RandomGen to Random by default. This should be inferred automatically from the default argument rndGen() anyway.

So randomCover and randomSample should follow the same approach. I do not see why one needs to pass an RNG by value then. Admittedly I have never used std.random, so I may have the wrong use cases in mind. But having a thread-local RNG that is used by default should be okay.

Jonathan, why should an RNG type have reference semantics? I think it's fine to pass them by reference where needed.
Jun 15 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

PDT ---
> Why should a RNG type have reference semantics? I think it's fine to pass them
> by reference where needed.

Because it makes no sense for it to have value semantics. Take this for example:

    auto func(R)(R r)
    {
        r.popFront();
        auto var1 = r.front;
        //...
    }

    func(generator);
    generator.popFront();
    auto var2 = generator.front;

Both var1 and var2 will have the exact same value. This is an easy mistake to make, and since random number generators are supposed to be returning random numbers, having them return the _same_ number after popFront has been called is definitely problematic. By making them reference types, the only time that you get the same number multiple times in a row is when you do it on purpose (e.g. by storing the value of front or by calling save on the range).
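Editorial sketch: a runnable version of the example above, assuming `generator` is a std.random Random seeded with a fixed, arbitrary value. The names nextByValue and nextByRef are hypothetical; the second shows what passing by ref changes, as a contrast rather than a recommendation from the thread.

```d
import std.random : Random;

// Takes the generator by value: advances a private copy only.
auto nextByValue(R)(R r)
{
    r.popFront();
    return r.front;
}

// Takes the generator by ref: advances the caller's generator.
auto nextByRef(R)(ref R r)
{
    r.popFront();
    return r.front;
}

void main()
{
    auto generator = Random(42);

    // By value: the caller's generator has not moved, so advancing it
    // once afterwards reproduces the value the function already saw.
    immutable var1 = nextByValue(generator);
    generator.popFront();
    immutable var2 = generator.front;
    assert(var1 == var2);

    // By ref: the caller's generator now sits on the value the function
    // returned, so the next popFront moves past it.
    immutable var3 = nextByRef(generator);
    assert(generator.front == var3);
}
```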
Jun 15 2012
http://d.puremagic.com/issues/show_bug.cgi?id=8247

I see. Thanks.

Since RNGs should be passed by reference by default, RNGs should be reference types. Otherwise everybody writing their own functions that accept RNGs has to use ref, which is error-prone. Using ref when passing RNGs in std.random won't solve this general design issue.
Jun 15 2012