digitalmars.D - A Tausworthe random number generator
- terchestor (46/46) Jan 21 2014 I'm a newcomer to D language (after years of programming with ASM
- John Colvin (6/53) Jan 21 2014 Github, bitbucket or equivalent.
- terchestor (5/63) Jan 21 2014 Well, thanks for the advices.
- bearophile (26/52) Jan 21 2014 What's the purpose of those §? They add noise. If they aren't
- H. S. Teoh (7/28) Jan 21 2014 [...]
- bearophile (4/7) Jan 21 2014 Yes, I remember. A final class is probably better.
- terchestor (18/68) Jan 21 2014 That's an old usage of mine.
- bearophile (5/6) Jan 21 2014 I can't be sure, but I think the trailing _t is not very used in
- monarch_dodra (21/25) Jan 21 2014 The code feels a bit "C-ish", but I'd argue those are just style
- terchestor (9/28) Jan 21 2014 That's and excellent idea because your final example is what I
- Rikki Cattermole (7/8) Jan 21 2014 Something along the lines of github/gist/pastebin.
- terchestor (8/8) Jan 22 2014 Thanks everybody for reviewing my code.
- monarch_dodra (30/33) Jan 22 2014 I'd say it's your "Tconst_tau0" that is getting in your way: You
- terchestor (9/39) Jan 22 2014 Actually, for the purpose I've in mind, a DSP library, which is
- terchestor (9/9) Jan 22 2014 I finally opted for both the simplest and fastest solution: a
- Chris Cain (3/4) Jan 22 2014 I think you mean "std.random.unpredictableSeed" but 100% agree.
I'm a newcomer to D language (after years of programming with ASM Motorola 68030-40 long time ago, as well as C, C++, Javascript, PHP, etc.), but not anymore a professional coder. After reading some articles around there and that nice and cute TDPL (just after Coding standards, another world!), I began to write some to exercise my fresh new "skill". I choose as a first experience, to write a simple class implementing a very interesting random number generator known as Tausworthe, according to the recipe found in this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf. The algorithm in C is quite simple and set in Lecuyer's paper as: unsigned long s1, s2, s3, b; double taus88 () { /* Generates numbers between 0 and 1. */ b = (((s1 << 13) ^ s1) >> 19); s1 = (((s1 & 4294967294) << 12) ^ b); b = (((s2 << 2) ^ s2) >> 25); s2 = (((s2 & 4294967288) << 4) ^ b); b = (((s3 << 3) ^ s3) >> 11); s3 = (((s3 & 4294967280) << 17) ^ b); return ((s1 ^ s2 ^ s3) * 2.3283064365e-10); } I made some "adjustments" to this algorithm, to yield either a (long) integer or (double) float result, in any range. Benchmarks ---------- Tausworthe module (rdmd -main -unittest -debug=Benchmark_Tausworthe tausworthe.d) ============================================================ Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | Average : 3e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_double( ): 42 ms | Average : 4.2e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average : 7e-05 ms ============================================================ Comparing ratio tausworthe / std.random: 0.609229 I'd greatly appreciate any reviewing and testing of my code. Any improvements you feel necessary are most welcome (as said before, I'm a newbie). What's the best place to publish the code?
Jan 21 2014
On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:I'm a newcomer to D language (after years of programming with ASM Motorola 68030-40 long time ago, as well as C, C++, Javascript, PHP, etc.), but not anymore a professional coder. After reading some articles around there and that nice and cute TDPL (just after Coding standards, another world!), I began to write some to exercise my fresh new "skill". I choose as a first experience, to write a simple class implementing a very interesting random number generator known as Tausworthe, according to the recipe found in this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf. The algorithm in C is quite simple and set in Lecuyer's paper as: unsigned long s1, s2, s3, b; double taus88 () { /* Generates numbers between 0 and 1. */ b = (((s1 << 13) ^ s1) >> 19); s1 = (((s1 & 4294967294) << 12) ^ b); b = (((s2 << 2) ^ s2) >> 25); s2 = (((s2 & 4294967288) << 4) ^ b); b = (((s3 << 3) ^ s3) >> 11); s3 = (((s3 & 4294967280) << 17) ^ b); return ((s1 ^ s2 ^ s3) * 2.3283064365e-10); } I made some "adjustments" to this algorithm, to yield either a (long) integer or (double) float result, in any range. Benchmarks ---------- Tausworthe module (rdmd -main -unittest -debug=Benchmark_Tausworthe tausworthe.d) ============================================================ Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | Average : 3e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_double( ): 42 ms | Average : 4.2e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average : 7e-05 ms ============================================================ Comparing ratio tausworthe / std.random: 0.609229 I'd greatly appreciate any reviewing and testing of my code. Any improvements you feel necessary are most welcome (as said before, I'm a newbie). What's the best place to publish the code?Github, bitbucket or equivalent. If you think it's ready for general use then write a dub.json and register it to code.dlang.org Take care with modifying random number generators, it's quite easy to ruin them by changing something that seems innocent.
Jan 21 2014
On Tuesday, 21 January 2014 at 10:10:32 UTC, John Colvin wrote:On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:Well, thanks for the advices. Here it is: https://github.com/Terchestor/dlang/tree/Tausworthe-class Please be indulgent ,-)I'm a newcomer to D language (after years of programming with ASM Motorola 68030-40 long time ago, as well as C, C++, Javascript, PHP, etc.), but not anymore a professional coder. After reading some articles around there and that nice and cute TDPL (just after Coding standards, another world!), I began to write some to exercise my fresh new "skill". I choose as a first experience, to write a simple class implementing a very interesting random number generator known as Tausworthe, according to the recipe found in this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf. The algorithm in C is quite simple and set in Lecuyer's paper as: unsigned long s1, s2, s3, b; double taus88 () { /* Generates numbers between 0 and 1. */ b = (((s1 << 13) ^ s1) >> 19); s1 = (((s1 & 4294967294) << 12) ^ b); b = (((s2 << 2) ^ s2) >> 25); s2 = (((s2 & 4294967288) << 4) ^ b); b = (((s3 << 3) ^ s3) >> 11); s3 = (((s3 & 4294967280) << 17) ^ b); return ((s1 ^ s2 ^ s3) * 2.3283064365e-10); } I made some "adjustments" to this algorithm, to yield either a (long) integer or (double) float result, in any range. Benchmarks ---------- Tausworthe module (rdmd -main -unittest -debug=Benchmark_Tausworthe tausworthe.d) ============================================================ Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | Average : 3e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_double( ): 42 ms | Average : 4.2e-05 ms ============================================================ ============================================================ Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average : 7e-05 ms ============================================================ Comparing ratio tausworthe / std.random: 0.609229 I'd greatly appreciate any reviewing and testing of my code. Any improvements you feel necessary are most welcome (as said before, I'm a newbie). What's the best place to publish the code?Github, bitbucket or equivalent. If you think it's ready for general use then write a dub.json and register it to code.dlang.org Take care with modifying random number generators, it's quite easy to ruin them by changing something that seems innocent.
Jan 21 2014
terchestor:https://github.com/Terchestor/dlang/tree/Tausworthe-class Please be indulgent ,-)/++ §§§ TauswortheWhat's the purpose of those §? They add noise. If they aren't necessary I suggest to remove them.class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) && __traits(isFloating, F ) )I suggest to not put spaces around ( ). And if you want also remove the space after the class/functions name. Perhaps you want to use a struct? The decision is important. Are your methods all virtual?{ alias I integral_t; alias F floating_t;Today the syntax "alias Name = Foo;" is preferred. In D code the use of "_t" as suffix for types is less common. And perhaps it's better to use a starting upper case letter for their name.enum tauswortheConstFloat : floating_t { tau0 = 4294967296.0fClass/struct/Enum/Union names should start with an upper case letter.tau1 = 4294967294,Perhaps that literal can end with a U.~this( ) { }No need for this.void seed( integral_t seed_=1 ) ... auto s = seed_;In general annotate with enum/const/immutable all variables that don't need to mutate.integral_t fseed( const integral_t s_ ) { return ( tauswortheConst.lgc * s_ ) & tauswortheConst.mask; }That's probably better annotated as pure nothrow.}//§§==========END_CLASS::Tausworthe==========I suggest to reduce the amount of === in this code.auto ?tausworthe0 = new Tausworthe!(ulong, double);While D supports Unicode identifiers, they are a often a pain in the ass (this is a personal opinion), so perhaps you want to kill them all.//§§ __COPYRIGHT_NOTICE__I think there's a ddoc field syntax for the copyright in the module ddoc. Bye, bearophile
Jan 21 2014
On Tue, Jan 21, 2014 at 12:32:41PM +0000, bearophile wrote:terchestor:[...] There are a lot of issues with RNG's being passed by value; it's better to use a reference type (i.e. class) and just make all methods final. T -- It said to install Windows 2000 or better, so I installed Linux instead.https://github.com/Terchestor/dlang/tree/Tausworthe-class Please be indulgent ,-)/++ §§§ TauswortheWhat's the purpose of those §? They add noise. If they aren't necessary I suggest to remove them.class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) && __traits(isFloating, F ) )I suggest to not put spaces around ( ). And if you want also remove the space after the class/functions name. Perhaps you want to use a struct? The decision is important. Are your methods all virtual?
Jan 21 2014
H. S. Teoh:There are a lot of issues with RNG's being passed by value; it's better to use a reference type (i.e. class) and just make all methods final.Yes, I remember. A final class is probably better. Bye, bearophile
Jan 21 2014
On Tuesday, 21 January 2014 at 12:32:43 UTC, bearophile wrote:terchestor:Well, a fantasy, but I wont keep it!https://github.com/Terchestor/dlang/tree/Tausworthe-class/++ §§§ TauswortheWhat's the purpose of those §? They add noise. If they aren't necessary I suggest to remove them.That's an old usage of mine.class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) && __traits(isFloating, F ) )I suggest to not put spaces around ( ). And if you want also remove the space after the class/functions name.Perhaps you want to use a struct? The decision is important. Are your methods all virtual?Not a struc, I need a class.I knew not. Do ypu mean: alias integral_t = I;{ alias I integral_t; alias F floating_t;Today the syntax "alias Name = Foo;" is preferred.In D code the use of "_t" as suffix for types is less common. And perhaps it's better to use a starting upper case letter for their name.I forgot the Capital letter, but I like the trailing _t, like size_t.Of course!enum tauswortheConstFloat : floating_t { tau0 = 4294967296.0fClass/struct/Enum/Union names should start with an upper case letter.tau1 = 4294967294,Perhaps that literal can end with a U.~this( ) { }No need for this.I'd better put an assertion (or enforce?) to be sure seed won't be null, and give constantness to seed.void seed( integral_t seed_=1 ) ... auto s = seed_;In general annotate with enum/const/immutable all variables that don't need to mutate.OK.integral_t fseed( const integral_t s_ ) { return ( tauswortheConst.lgc * s_ ) & tauswortheConst.mask; }That's probably better annotated as pure nothrow.OK.}//§§==========END_CLASS::Tausworthe==========I suggest to reduce the amount of === in this code.Not sure. I'm experimenting a kind of tagging to easily distinguish nested functions (with greek phi) or temporaries. Perhaps not a so good idea, but i'll see if enhances the readability later.auto ?tausworthe0 = new Tausworthe!(ulong, double);While D supports Unicode identifiers, they are a often a pain in the ass (this is a personal opinion), so perhaps you want to kill them all.Yes, but it's a work on progress...//§§ __COPYRIGHT_NOTICE__I think there's a ddoc field syntax for the copyright in the module ddoc.Thank you so much for your invaluable help.
Jan 21 2014
terchestor:but I like the trailing _t, like size_t.I can't be sure, but I think the trailing _t is not very used in D. size_t is an exception, like hash_t and little more. Bye, bearophile
Jan 21 2014
On Tuesday, 21 January 2014 at 11:04:13 UTC, terchestor wrote:Well, thanks for the advices. Here it is: https://github.com/Terchestor/dlang/tree/Tausworthe-class Please be indulgent ,-)The code feels a bit "C-ish", but I'd argue those are just style issues. My major points would be: 1) Make it a "final class": By default, classes have all their methods virtual. I don't think you want that. 2) I'd adhere to the "Range" interface: Give your function the "front"/"popFront()" functions, as well as the public "enum empty = false". 3) I think the *instance* itself should carry the bounds on initialization, and provide a single "uniform" function. 4) Uniform initialization: provide a "tausworthe(I = int, R = double)(int seed, double low = 0.0, double high = 1.0 )" function. This function should simply return a new instance of a Tausworthe. The advantage of this are 2-fold: 4.1) Similar syntax for struct/class initialization 4.2) Default Type parameters for the type. The combination of 2/3/4 will allow you to write nifty things like: int[] myRandomNumbers = tausworthe(randomSeed).take(10).array(); And voila! An array of 10 random numbers! Easy-peasy.
Jan 21 2014
On Tuesday, 21 January 2014 at 13:44:09 UTC, monarch_dodra wrote:On Tuesday, 21 January 2014 at 11:04:13 UTC, terchestor wrote:My major points would be: 1) Make it a "final class": By default, classes have all their methods virtual. I don't think you want that.You're right.2) I'd adhere to the "Range" interface: Give your function the "front"/"popFront()" functions, as well as the public "enum empty = false".That's and excellent idea because your final example is what I had in mind.3) I think the *instance* itself should carry the bounds on initialization, and provide a single "uniform" function.There is a later version with only one uniform method. But I'd like to be able to change bounds with a single instance.4) Uniform initialization: provide a "tausworthe(I = int, R = double)(int seed, double low = 0.0, double high = 1.0 )" function. This function should simply return a new instance of a Tausworthe. The advantage of this are 2-fold: 4.1) Similar syntax for struct/class initialization 4.2) Default Type parameters for the type.Excellent, as I said.The combination of 2/3/4 will allow you to write nifty things like: int[] myRandomNumbers = tausworthe(randomSeed).take(10).array(); And voila! An array of 10 random numbers! Easy-peasy.I adhere. Thanks for taking time to review first D try.
Jan 21 2014
On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:What's the best place to publish the code?Something along the lines of github/gist/pastebin. Also this is more of a learn newsgroup post. In case you're not aware, in D we can utilise static variables inside functions. Meaning those global variables can be pushed into the function. If thats an approach you would be interested in.
Jan 21 2014
Thanks everybody for reviewing my code. I rewrote almost everything to condense the code and conform to stylistic D usage. My problem is the cast to integral type (line 80), downgrading performances of circa 50%. Is there any possibility to avoid it? Next step will be to implement the range interface as suggested by monarch_dodra.
Jan 22 2014
On Wednesday, 22 January 2014 at 09:33:39 UTC, terchestor wrote:My problem is the cast to integral type (line 80), downgrading performances of circa 50%. Is there any possibility to avoid it?I'd say it's your "Tconst_tau0" that is getting in your way: You have a uint sized integral, which you divide by a floating point, just to multiply it again and cast back to integral type (!). I'd suggest you get rid of "high/low" completly. There are prng *adaptors* that can create a uniform distribution *from* a natural prng (std.random.uniform). If you do this, you code becomes (starting line 73): auto uni = (s1 ^ s2 ^ s3); static if (isFloatingPoint!result_t) { return uni / Tconst_tau0; } else { return uni; } Heck, you could get rid of floating point support entirely, and generate only integrals. Then, you just let "uniform" do the tough work: auto rnd = tausworthe!(ulong, uint)(randomSeed); auto rndF = rnd.uniform!"[)"(0.0, 1.0); //0 - 1 double range auto rnd1024 = rnd.uniform!"[)"(0, 1024); //[0 .. 1024) integral range. Nitpick: /// get seed from current time auto se = Clock.currTime().stdTime; This is known to be a bad seed. If you need a "good enough" seed, use "std.random.randomSeed". Although arguably (IMO), it is better place the burden of seeding on the caller.
Jan 22 2014
On Wednesday, 22 January 2014 at 11:02:10 UTC, monarch_dodra wrote:On Wednesday, 22 January 2014 at 09:33:39 UTC, terchestor wrote:Actually, for the purpose I've in mind, a DSP library, which is time critical (and that's why I need the fastest possible PRNG), I'd better decoupling the range processing and create a class implementing the range interface to store the numbers generated by uniform to serve them on demand: a composite pattern is better than inheritance.My problem is the cast to integral type (line 80), downgrading performances of circa 50%. Is there any possibility to avoid it?I'd suggest you get rid of "high/low" completly. There are prng *adaptors* that can create a uniform distribution *from* a natural prng (std.random.uniform). If you do this, you code becomes (starting line 73): auto uni = (s1 ^ s2 ^ s3); static if (isFloatingPoint!result_t) { return uni / Tconst_tau0; } else { return uni; } Heck, you could get rid of floating point support entirely, and generate only integrals. Then, you just let "uniform" do the tough work: auto rnd = tausworthe!(ulong, uint)(randomSeed); auto rndF = rnd.uniform!"[)"(0.0, 1.0); //0 - 1 double range auto rnd1024 = rnd.uniform!"[)"(0, 1024); //[0 .. 1024) integral range.Nitpick: /// get seed from current time auto se = Clock.currTime().stdTime; This is known to be a bad seed. If you need a "good enough" seed, use "std.random.randomSeed". Although arguably (IMO), it is better place the burden of seeding on the caller.Fine remark worth adopting.
Jan 22 2014
I finally opted for both the simplest and fastest solution: a templated class (integral type) and a templated uniform method (integral or floating-point) generating variates either in the whole integral range or in the 0.0..1.0 range. I let the burden of generating the whole floating-point range on the user side, choosing any divider of the integral generated variate. Hence a similar speed for both types, roughly 40% faster than std.random.uniform.
Jan 22 2014
On Wednesday, 22 January 2014 at 11:02:10 UTC, monarch_dodra wrote:use "std.random.randomSeed".I think you mean "std.random.unpredictableSeed" but 100% agree.
Jan 22 2014