www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - unpredictableSeed

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hello all,

Can anyone advise on the theoretical basis for the unpredictableSeed method in 
std.random?  I've tried googling around for the theory of good thread-safe seed 
generation methods but haven't really found anything. :-(

Thanks & best wishes,

     -- Joe
Mar 02 2013
next sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
 Can anyone advise on the theoretical basis for the 
 unpredictableSeed method in std.random?  I've tried googling 
 around for the theory of good thread-safe seed generation 
 methods but haven't really found anything. :-(
I have to ask: what would be a good unpredictableSeed by definition? With the current implementation, three downsides come to my mind: 1. Process ID, thread ID and system tick are insecure sources of randomness and can provide just a few bits of randomness in certain situations. I don't know how to address this in a portable way. 2. Once we know the first seed, it is easy to predict all subsequent seeds. A solution would be to use a secure RNG instead, not just the one which gives away its state. 3. It would be a particularly bad idea to initialize MinstdRand0 instances with consecutive unpredictableSeeds and then consider them independent. This is just a consequence of a particular choice of RNG on the previous step. So, which of these do you consider the real problems, and what more do you need from unpredictableSeed? ----- Ivan Kazmenko.
Mar 03 2013
next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Sun, 03 Mar 2013 09:58:41 +0100
schrieb "Ivan Kazmenko" <gassa mail.ru>:

 Can anyone advise on the theoretical basis for the 
 unpredictableSeed method in std.random?  I've tried googling 
 around for the theory of good thread-safe seed generation 
 methods but haven't really found anything. :-(
I have to ask: what would be a good unpredictableSeed by definition? With the current implementation, three downsides come to my mind: 1. Process ID, thread ID and system tick are insecure sources of randomness and can provide just a few bits of randomness in certain situations. I don't know how to address this in a portable way. 2. Once we know the first seed, it is easy to predict all subsequent seeds. A solution would be to use a secure RNG instead, not just the one which gives away its state. 3. It would be a particularly bad idea to initialize MinstdRand0 instances with consecutive unpredictableSeeds and then consider them independent. This is just a consequence of a particular choice of RNG on the previous step. So, which of these do you consider the real problems, and what more do you need from unpredictableSeed? ----- Ivan Kazmenko.
Maybe it would make sense to use /dev/random where available? (The problem is that /dev/random can block. On small embedded systems without monitor/mice/keyboard this can happen easily)
Mar 03 2013
next sibling parent Jerome BENOIT <g6299304p rezozer.net> writes:
On 03/03/13 10:06, Johannes Pfau wrote:
 Am Sun, 03 Mar 2013 09:58:41 +0100
 schrieb "Ivan Kazmenko"<gassa mail.ru>:

 Can anyone advise on the theoretical basis for the
 unpredictableSeed method in std.random?  I've tried googling
 around for the theory of good thread-safe seed generation
 methods but haven't really found anything. :-(
I have to ask: what would be a good unpredictableSeed by definition? With the current implementation, three downsides come to my mind: 1. Process ID, thread ID and system tick are insecure sources of randomness and can provide just a few bits of randomness in certain situations. I don't know how to address this in a portable way. 2. Once we know the first seed, it is easy to predict all subsequent seeds. A solution would be to use a secure RNG instead, not just the one which gives away its state. 3. It would be a particularly bad idea to initialize MinstdRand0 instances with consecutive unpredictableSeeds and then consider them independent. This is just a consequence of a particular choice of RNG on the previous step. So, which of these do you consider the real problems, and what more do you need from unpredictableSeed? ----- Ivan Kazmenko.
Maybe it would make sense to use /dev/random where available? (The problem is that /dev/random can block. On small embedded systems without monitor/mice/keyboard this can happen easily)
/dev/urandom can be used if /dev/random is block: the available entropy can be used as criterion: /proc/sys/kernel/random/entropy_avail Since a very long while I have written a piece of C code to do so and to read from an environment dedicated environment variable in view to reproduce the generated sequences if necessary (mainly debugging): I use it intensively for numerical experiences and it works very well. Jerome
Mar 03 2013
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 03/03/2013 10:06 AM, Johannes Pfau wrote:
 Maybe it would make sense to use /dev/random where available? (The
 problem is that /dev/random can block. On small embedded systems
 without monitor/mice/keyboard this can happen easily)
I suppose theoretically you shouldn't be calling /dev/random many times -- only each time you spawn a thread within which pseudo-random numbers are being generated -- and in an embedded context, this ought to happen rarely, no? But from googling around I see that /dev/random can block fairly quickly, after only a handful of numbers :-( Is something equivalent available on Windows?
Mar 03 2013
parent reply "jerro" <a a.com> writes:
 But from googling around I see that /dev/random can block 
 fairly quickly, after only a handful of numbers :-(
You can solve this by using /dev/urandom instead, as Jerome have said already.
 Is something equivalent available on Windows?
There's CryptGenRandom.
Mar 03 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 03/03/2013 12:41 PM, jerro wrote:
 You can solve this by using /dev/urandom instead, as Jerome have said already.
Yes, but this is less trustworthy from a randomness point of view. I might use it personally for something, but I wouldn't want to use it as the basis of a supposedly trustworthy random seed. A bit more looking around pointed me at the Fortuna algorithm, which might be worth looking into: https://en.wikipedia.org/wiki/Fortuna_%28PRNG%29
 There's CryptGenRandom.
Thanks. :-) Not that I use Windows for simulation, but any addition or change to Phobos needs to support it.
Mar 03 2013
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
03-Mar-2013 12:58, Ivan Kazmenko пишет:
 Can anyone advise on the theoretical basis for the unpredictableSeed
 method in std.random?  I've tried googling around for the theory of
 good thread-safe seed generation methods but haven't really found
 anything. :-(
I have to ask: what would be a good unpredictableSeed by definition? With the current implementation, three downsides come to my mind: 1. Process ID, thread ID and system tick are insecure sources of randomness and can provide just a few bits of randomness in certain situations. I don't know how to address this in a portable way.
Do some cheap syscalls and measure effective latency, look at nanoseconds and such. It would give you a bit of good enough noise due to unpredictable mess of context switches in the OS.
 2. Once we know the first seed, it is easy to predict all subsequent
 seeds.  A solution would be to use a secure RNG instead, not just the
 one which gives away its state.
Indeed would be nice to obtain each seed separately (and preferably by different means). That being said hashing and PRNG-ing of some initial vector is fine for basic unpredictable seed. (just don't include init-vector in the seed itself)
 3. It would be a particularly bad idea to initialize MinstdRand0
 instances with consecutive unpredictableSeeds and then consider them
 independent.  This is just a consequence of a particular choice of RNG
 on the previous step.
 So, which of these do you consider the real problems, and what more do
 you need from unpredictableSeed?
AFAIK there are OS APIs that give you proper secure seeds. Somewhere in Windows Crypto API: http://msdn.microsoft.com/en-us/library/windows/desktop/aa379942(v=vs.85).aspx Must be something equivalent for POSIX. Also upcoming hardware like Intel's Ivy chips, and a lot of ARMs do have hardware random generators. Plus the devices that do generate true entropy. This would be a nice enhancement for std.random to include support for these and secureSeed (as opposed to "unpredictable"). There is a difference between seriously unpredictable (good enough for monte-carlo, games etc.) and cryptographically good - an overkill for monte-carlo and such, but a MUST for e.g. private key generation. -- Dmitry Olshansky
Mar 03 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 03/03/2013 09:58 AM, Ivan Kazmenko wrote:
 I have to ask: what would be a good unpredictableSeed by definition?  With the
 current implementation, three downsides come to my mind:

 1. Process ID, thread ID and system tick are insecure sources of randomness and
 can provide just a few bits of randomness in certain situations.  I don't know
 how to address this in a portable way.

 2. Once we know the first seed, it is easy to predict all subsequent seeds.  A
 solution would be to use a secure RNG instead, not just the one which gives
away
 its state.
I have to say that in my case, I'm not interested in security, but merely in having a statistically good enough seeding for parallel simulations. That is, I want multiple parallel streams of random numbers that are independent to the highest possible degree. But, you have a point, and if unpredictableSeed is good enough for my application, it may not be good enough for others. (Though I'm not sure e.g. Mersenne Twister is adequate for crypto in the first place.) That said, is it as un-random and predictable as you say? I'd have anticipated that the combination of process ID, thread ID and system tick would generate quite a good seed for Mersenne Twister, but I don't have the theoretical argument to back it up. :-(
 3. It would be a particularly bad idea to initialize MinstdRand0 instances with
 consecutive unpredictableSeeds and then consider them independent.  This is
just
 a consequence of a particular choice of RNG on the previous step.
Is MinstdRand0 really something you should consider using at all, if you care about quality statistics?
 So, which of these do you consider the real problems, and what more do you need
 from unpredictableSeed?
I need _statistically_ thread-safe pseudo-random number generation for the purposes of scientific simulation -- so, I don't care about the theoretical predictability, but about the quality of approximation of randomness. I'm using rndGen (so, Mersenne Twister on my x86-64 system) with anything from 2-20 different threads. My broader interest here is more theoretical, though, because I'd like to see std.random equipped with the best possible tools for the job -- and there are a number of existing niggles with it that spark my paranoia about other aspects :-)
Mar 03 2013
prev sibling parent reply "Rob T" <alanb ucora.com> writes:
On Saturday, 2 March 2013 at 17:40:58 UTC, Joseph Rushton 
Wakeling wrote:
 Hello all,

 Can anyone advise on the theoretical basis for the 
 unpredictableSeed method in std.random?  I've tried googling 
 around for the theory of good thread-safe seed generation 
 methods but haven't really found anything. :-(

 Thanks & best wishes,

     -- Joe
You can use the real time clock, which should have nanosecond precision. It should be very hard to predict because the clock will fluctuate based on environmental factors. I don't know if all architectures have an adequate real time clock however if portability is needed. --rt
Mar 03 2013
parent reply "Andrea Fontana" <nospam example.com> writes:
On Monday, 4 March 2013 at 04:18:10 UTC, Rob T wrote:
 On Saturday, 2 March 2013 at 17:40:58 UTC, Joseph Rushton 
 Wakeling wrote:
 Hello all,

 Can anyone advise on the theoretical basis for the 
 unpredictableSeed method in std.random?  I've tried googling 
 around for the theory of good thread-safe seed generation 
 methods but haven't really found anything. :-(

 Thanks & best wishes,

    -- Joe
You can use the real time clock, which should have nanosecond precision. It should be very hard to predict because the clock will fluctuate based on environmental factors. I don't know if all architectures have an adequate real time clock however if portability is needed. --rt
Maybe you can try to connect an external hardware device (e.g. arduino) and read some params from real world... :)
Mar 04 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 03/04/2013 09:58 AM, Andrea Fontana wrote:
 Maybe you can try to connect an external hardware device (e.g. arduino) and
read
 some params from real world... :)
Yes, there are nice options here ... :-) However, to re-focus the discussion -- I'm not so much asking "How do I ensure my own code is statistically safe?", as there are lots of ways I can go about that. I'm concerned with the theoretical and practical justification for Phobos' existing unpredictableSeed, and possible superior alternatives that could reasonably be implemented _for Phobos_.
Mar 04 2013
parent "Rob T" <alanb ucora.com> writes:
On Monday, 4 March 2013 at 11:04:46 UTC, Joseph Rushton Wakeling 
wrote:
 On 03/04/2013 09:58 AM, Andrea Fontana wrote:
 Maybe you can try to connect an external hardware device (e.g. 
 arduino) and read
 some params from real world... :)
Yes, there are nice options here ... :-) However, to re-focus the discussion -- I'm not so much asking "How do I ensure my own code is statistically safe?", as there are lots of ways I can go about that. I'm concerned with the theoretical and practical justification for Phobos' existing unpredictableSeed, and possible superior alternatives that could reasonably be implemented _for Phobos_.
I found this which seems to be what Phobos duplicates http://www.cryptosys.net/rng_algorithms_old.html The theory appears to be no more than an ad-hoc attempt to find something unique and hard to predict across threads, processes and machines. The superseded and improved version uses a hash of more potentially unique values http://www.cryptosys.net/rng_algorithms.html Clearly we're lacking a real solution, and IMO the solution should be hardware devices that come with standardized random generators. --rt
Mar 04 2013