www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Pseudo-random numbers in [0, n), covering all numbers in n steps?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
So we have https://dlang.org/phobos/std_random.html#.randomCover which 
needs to awkwardly allocate memory to keep track of the portions of the 
array already covered.

This could be fixed by devising a PRNG that takes a given period n and 
generates all numbers in [0, n) in exactly n steps.

However, I've had difficulty finding such PRNGs. Most want the maximum 
period possible so they're not concerned with a given period. Any insights?

BTW I found this statement in the documentation rather odd: "These 
issues will be resolved in a second-generation std.random that 
re-implements random number generators as reference types." The 
documentation is not a place for making vague promises and speculations 
about future developments. I think it should be removed.


Thanks,

Andrei
Feb 25 2016
next sibling parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.

 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n steps.

 However, I've had difficulty finding such PRNGs. Most want the 
 maximum period possible so they're not concerned with a given 
 period. Any insights?
I don't think that's a good idea. A prng is closed path through a state space and it doesn't matter where you start on said path, you're going to follow the same closed path through the state space. I don't know of an algorithm for generating random permutations that isn't in-place (or O(N) storage), but I'm not an expert on the topic so maybe one does exist.
Feb 25 2016
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/25/2016 01:19 PM, John Colvin wrote:
 I don't think that's a good idea. A prng is closed path through a state
 space and it doesn't matter where you start on said path, you're going
 to follow the same closed path through the state space.
That's totally fine for some applications - those that simply want to iterate the input at most once in a random order. Far as I can tell this is the most frequent. I don't care for a second iteration to guarantee a different iteration schedule. And if I did, I'd have no trouble reseeding the generator. -- Andrei
Feb 25 2016
prev sibling parent tn <no email.com> writes:
On Thursday, 25 February 2016 at 18:19:56 UTC, John Colvin wrote:
 I don't know of an algorithm for generating random permutations 
 that isn't in-place (or O(N) storage), but I'm not an expert on 
 the topic so maybe one does exist.
These might be relevant: http://stackoverflow.com/questions/10054732/create-a-random-permutation-of-1-n-in-constant-space http://stackoverflow.com/questions/32182120/to-generate-random-permutation-of-a-array-in-on-time-and-o1-space
Feb 25 2016
prev sibling next sibling parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.

 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n steps.

 However, I've had difficulty finding such PRNGs. Most want the 
 maximum period possible so they're not concerned with a given 
 period. Any insights?

 BTW I found this statement in the documentation rather odd: 
 "These issues will be resolved in a second-generation 
 std.random that re-implements random number generators as 
 reference types." The documentation is not a place for making 
 vague promises and speculations about future developments. I 
 think it should be removed.


 Thanks,

 Andrei
The technical name for the property of distribution you describe is k-Dimensional Equidistribution (in this case k=1). I would suggest taking a look at http://www.pcg-random.org. They claim to have both arbitrary period and k-Dimensional Equidistribution Nic
Feb 25 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Feb 26 2016
next sibling parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Friday, 26 February 2016 at 14:59:43 UTC, Andrei Alexandrescu 
wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you 
 describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional 
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Dstep could be used to port the C++ version if needed.
Feb 26 2016
parent reply Craig Dillabaugh <craig.dillabaugh gmail.com> writes:
On Friday, 26 February 2016 at 15:15:11 UTC, Nicholas Wilson 
wrote:
 On Friday, 26 February 2016 at 14:59:43 UTC, Andrei 
 Alexandrescu wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you 
 describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional 
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Dstep could be used to port the C++ version if needed.
I don't think Dstep can handle C++.
Feb 26 2016
parent reply Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Friday, 26 February 2016 at 15:17:16 UTC, Craig Dillabaugh 
wrote:
 On Friday, 26 February 2016 at 15:15:11 UTC, Nicholas Wilson 
 wrote:
 On Friday, 26 February 2016 at 14:59:43 UTC, Andrei 
 Alexandrescu wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you 
 describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional 
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Dstep could be used to port the C++ version if needed.
I don't think Dstep can handle C++.
Hmm. I thought that was what it did. Maybe I was thinking of another thing or perhaps I should make sure that I am not about to fall asleep before posting.
Feb 26 2016
parent Craig Dillabaugh <craig.dillabaugh gmail.com> writes:
On Friday, 26 February 2016 at 15:21:08 UTC, Nicholas Wilson 
wrote:
 On Friday, 26 February 2016 at 15:17:16 UTC, Craig Dillabaugh 
 wrote:
 On Friday, 26 February 2016 at 15:15:11 UTC, Nicholas Wilson 
 wrote:
 On Friday, 26 February 2016 at 14:59:43 UTC, Andrei 
 Alexandrescu wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 [...]
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Dstep could be used to port the C++ version if needed.
I don't think Dstep can handle C++.
Hmm. I thought that was what it did. Maybe I was thinking of another thing or perhaps I should make sure that I am not about to fall asleep before posting.
Dstep can convert C headers to D, but can't handle C++.
Feb 26 2016
prev sibling parent reply Alex Parrill <initrd.gz gmail.com> writes:
On Friday, 26 February 2016 at 14:59:43 UTC, Andrei Alexandrescu 
wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you 
 describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional 
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Beat you to it: http://code.dlang.org/packages/d-pcg It only has the basic generators at the moment. I'll look into the more advanced stuff. (Also 64 bit outputs aren't implemented yet because they need a 128 bit uint for state. I noticed D reserves the names cent and ucent but hasn't implemented them)
Feb 26 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/26/2016 10:19 AM, Alex Parrill wrote:
 On Friday, 26 February 2016 at 14:59:43 UTC, Andrei Alexandrescu wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Beat you to it: http://code.dlang.org/packages/d-pcg It only has the basic generators at the moment. I'll look into the more advanced stuff. (Also 64 bit outputs aren't implemented yet because they need a 128 bit uint for state. I noticed D reserves the names cent and ucent but hasn't implemented them)
That's pretty darn cool! I don't see a way to create a generator given a range expressed as a power of two. Say e.g. a client wants to say "give me a generator with a cycle of 32768". Is this easily doable? Also: when the generator starts running, does it generate a full cycle, or it starts with a shorter cycle and then settle into a full cycle? Thanks, Andrei
Feb 26 2016
parent Alex Parrill <initrd.gz gmail.com> writes:
On Friday, 26 February 2016 at 16:45:53 UTC, Andrei Alexandrescu 
wrote:
 On 02/26/2016 10:19 AM, Alex Parrill wrote:
 On Friday, 26 February 2016 at 14:59:43 UTC, Andrei 
 Alexandrescu wrote:
 On 02/25/2016 06:46 PM, Nicholas Wilson wrote:
 The technical name for the property of distribution you 
 describe is
   k-Dimensional Equidistribution (in this case k=1).
 I would suggest taking a look at http://www.pcg-random.org.
 They claim to have both arbitrary period and k-Dimensional
 Equidistribution
Thanks, that's indeed closest! A hefty read. Anyone inclined to work on a PCG random implementation? -- Andrei
Beat you to it: http://code.dlang.org/packages/d-pcg It only has the basic generators at the moment. I'll look into the more advanced stuff. (Also 64 bit outputs aren't implemented yet because they need a 128 bit uint for state. I noticed D reserves the names cent and ucent but hasn't implemented them)
That's pretty darn cool! I don't see a way to create a generator given a range expressed as a power of two. Say e.g. a client wants to say "give me a generator with a cycle of 32768". Is this easily doable? Also: when the generator starts running, does it generate a full cycle, or it starts with a shorter cycle and then settle into a full cycle? Thanks, Andrei
My port at the moment only provides the basic pgm32 generators; their behavior should match the pgm32_* classes from the C++ library. I'll look into which of the generators support the equidistributed results, though I suspect that they are distributed across the entire domain of the result type.
Feb 26 2016
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.
Yes, this is definitely a standout in terms of being an unpleasant solution. It means that you require o(N) memory even when you're dealing with a lazily-evaluated range -- it would probably be more efficient in practice to just write the input into an array and do an in-place shuffle. :-(
 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n steps.

 However, I've had difficulty finding such PRNGs. Most want the 
 maximum period possible so they're not concerned with a given 
 period. Any insights?
I'll try to see what I can find. This must be something that other lazy/functional communities (Haskell, Clojure, ...) have had to contend with. BTW I would caution that with pseudo-RNGs per se, the problem is not so much the size of the period per se, as the fact that it's not very random (in the sense that a PRNG is aiming for) to visit every possible value within the range of the RNG exactly once before repeating ... ;-)
 BTW I found this statement in the documentation rather odd: 
 "These issues will be resolved in a second-generation 
 std.random that re-implements random number generators as 
 reference types." The documentation is not a place for making 
 vague promises and speculations about future developments. I 
 think it should be removed.
Yes, I agree. That was written at a time when those of us focused on std.random had great hope that a new design was immediately on the cards. I can probably find the PRs if you want to see the context.
Feb 26 2016
next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 08:05:09 UTC, Joseph Rushton 
Wakeling wrote:
 On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
 Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.
Yes, this is definitely a standout in terms of being an unpleasant solution. It means that you require o(N) memory even when you're dealing with a lazily-evaluated range -- it would probably be more efficient in practice to just write the input into an array and do an in-place shuffle. :-(
Some interesting discussion/ideas here: https://cs.stackexchange.com/questions/29822/lazily-computing-a-random-permutation-of-the-positive-integers
Feb 26 2016
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 08:12:18 UTC, Joseph Rushton 
Wakeling wrote:
 On Friday, 26 February 2016 at 08:05:09 UTC, Joseph Rushton 
 Wakeling wrote:
 On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
 Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.
Yes, this is definitely a standout in terms of being an unpleasant solution. It means that you require o(N) memory even when you're dealing with a lazily-evaluated range -- it would probably be more efficient in practice to just write the input into an array and do an in-place shuffle. :-(
Also: I don't have the access to determine if this is really on the money, but looks like it could be promising: http://www.sciencedirect.com/science/article/pii/S0020019098001276
Feb 26 2016
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 08:05:09 UTC, Joseph Rushton 
Wakeling wrote:
 On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
 Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.
Yes, this is definitely a standout in terms of being an unpleasant solution. It means that you require o(N) memory even when you're dealing with a lazily-evaluated range -- it would probably be more efficient in practice to just write the input into an array and do an in-place shuffle. :-(
 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n 
 steps.

 However, I've had difficulty finding such PRNGs. Most want the 
 maximum period possible so they're not concerned with a given 
 period. Any insights?
I'll try to see what I can find. This must be something that other lazy/functional communities (Haskell, Clojure, ...) have had to contend with.
A few potential lines of exploration here: http://apfelmus.nfshost.com/articles/random-permutations.html http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.9347&rep=rep1&type=pdf
Feb 26 2016
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/26/2016 03:34 AM, Joseph Rushton Wakeling wrote:
 http://apfelmus.nfshost.com/articles/random-permutations.html
This touches the input, we just want to cover it.
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.9347&rep=rep1&type=pdf
This seems like a nice article but I don't find a part relevant to this. Andrei
Feb 26 2016
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 12:23:24 UTC, Andrei Alexandrescu 
wrote:
 On 02/26/2016 03:34 AM, Joseph Rushton Wakeling wrote:
 http://apfelmus.nfshost.com/articles/random-permutations.html
This touches the input, we just want to cover it.
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.9347&rep=rep1&type=pdf
This seems like a nice article but I don't find a part relevant to this.
I may have misread, because I was in a rush this morning, but I thought that in section 2.3 it defined a method for lazily generating permutations of a list. This was what I thought was relevant.
Feb 26 2016
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 12:23:24 UTC, Andrei Alexandrescu 
wrote:
 On 02/26/2016 03:34 AM, Joseph Rushton Wakeling wrote:
 http://apfelmus.nfshost.com/articles/random-permutations.html
This touches the input, we just want to cover it.
I thought the whole point of that article was that it described a method where there _wasn't_ reliance on mutable input that would be shuffled in-place?
Feb 26 2016
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 08:05:09 UTC, Joseph Rushton 
Wakeling wrote:
 On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
 Alexandrescu wrote:
 So we have 
 https://dlang.org/phobos/std_random.html#.randomCover which 
 needs to awkwardly allocate memory to keep track of the 
 portions of the array already covered.
Yes, this is definitely a standout in terms of being an unpleasant solution. It means that you require o(N) memory even when you're dealing with a lazily-evaluated range -- it would probably be more efficient in practice to just write the input into an array and do an in-place shuffle. :-(
 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n 
 steps.

 However, I've had difficulty finding such PRNGs. Most want the 
 maximum period possible so they're not concerned with a given 
 period. Any insights?
I'll try to see what I can find. This must be something that other lazy/functional communities (Haskell, Clojure, ...) have had to contend with.
Last few links: https://www.quora.com/What-is-the-most-efficient-way-to-randomize-shuffle-a-linked-list?share=1 https://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model https://stackoverflow.com/questions/12167630/algorithm-for-shuffling-a-linked-list-in-n-log-n-time/27624106#27624106
Feb 26 2016
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/26/2016 03:05 AM, Joseph Rushton Wakeling wrote:
 I can probably find the PRs if you want to see the context.
I understand the motivation behind that statement, and am not worried about pointing fingers etc. Would be great if a new PR removed the text. -- Andrei
Feb 26 2016
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 12:21:11 UTC, Andrei Alexandrescu 
wrote:
 On 02/26/2016 03:05 AM, Joseph Rushton Wakeling wrote:
 I can probably find the PRs if you want to see the context.
I understand the motivation behind that statement, and am not worried about pointing fingers etc. Would be great if a new PR removed the text. -- Andrei
Done. :-)
Feb 26 2016
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Fri, Feb 26, 2016 at 09:26:54PM +0000, Joseph Rushton Wakeling via
Digitalmars-d wrote:
 On Friday, 26 February 2016 at 12:21:11 UTC, Andrei Alexandrescu wrote:
On 02/26/2016 03:05 AM, Joseph Rushton Wakeling wrote:
I can probably find the PRs if you want to see the context.
I understand the motivation behind that statement, and am not worried about pointing fingers etc. Would be great if a new PR removed the text. -- Andrei
Done. :-)
Merged. :-) T -- Let's call it an accidental feature. -- Larry Wall
Feb 26 2016
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 02/26/2016 04:57 PM, H. S. Teoh via Digitalmars-d wrote:
 On Fri, Feb 26, 2016 at 09:26:54PM +0000, Joseph Rushton Wakeling via
Digitalmars-d wrote:
 On Friday, 26 February 2016 at 12:21:11 UTC, Andrei Alexandrescu wrote:
 On 02/26/2016 03:05 AM, Joseph Rushton Wakeling wrote:
 I can probably find the PRs if you want to see the context.
I understand the motivation behind that statement, and am not worried about pointing fingers etc. Would be great if a new PR removed the text. -- Andrei
Done. :-)
Merged. :-)
Thank you folks!! -- Andrei
Feb 26 2016
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
Alexandrescu wrote:
 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n steps.
On reflection, I have a nasty feeling there's a fundamental problem with this proposed approach. It's this: if you're relying on the PRNG having a period of n in which it covers exactly once each of the numbers from [0, n), then you're essentially outsourcing the random aspect of the permutation to the _seeding_ of this generator. Now, what would be an appropriate seed-generation-mechanism to guarantee that this PRNG can select from all possible permutations with uniform probability?
Feb 26 2016
parent reply John Colvin <john.loughran.colvin gmail.com> writes:
On Friday, 26 February 2016 at 19:35:38 UTC, Joseph Rushton 
Wakeling wrote:
 On Thursday, 25 February 2016 at 17:27:25 UTC, Andrei 
 Alexandrescu wrote:
 This could be fixed by devising a PRNG that takes a given 
 period n and generates all numbers in [0, n) in exactly n 
 steps.
On reflection, I have a nasty feeling there's a fundamental problem with this proposed approach. It's this: if you're relying on the PRNG having a period of n in which it covers exactly once each of the numbers from [0, n), then you're essentially outsourcing the random aspect of the permutation to the _seeding_ of this generator. Now, what would be an appropriate seed-generation-mechanism to guarantee that this PRNG can select from all possible permutations with uniform probability?
This was what I was trying to get at in my initial post, but I failed to get the idea across properly.
Feb 26 2016
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Friday, 26 February 2016 at 23:05:00 UTC, John Colvin wrote:
 This was what I was trying to get at in my initial post, but I 
 failed to get the idea across properly.
Yea. It didn't even click with me at first, because when I first read Andrei's email I jumped straight in my head to the thought, "OK, so what _is_ an appropriate algorithm for lazily calculating a random permutation using O(1) storage space?" The good news is that, AFAICS, that is a readily solvable problem. It's inevitably more computationally complex than an in-place random shuffle -- O(n log n) instead of O(n) -- but algorithms do exist that could be adapted for Phobos.
Feb 26 2016