digitalmars.D - A Tausworthe random number generator

terchestor (46/46) Jan 21 2014 I'm a newcomer to D language (after years of programming with ASM

John Colvin (6/53) Jan 21 2014 Github, bitbucket or equivalent.

terchestor (5/63) Jan 21 2014 Well, thanks for the advices.

bearophile (26/52) Jan 21 2014 What's the purpose of those §? They add noise. If they aren't

H. S. Teoh (7/28) Jan 21 2014 [...]

bearophile (4/7) Jan 21 2014 Yes, I remember. A final class is probably better.

terchestor (18/68) Jan 21 2014 That's an old usage of mine.

bearophile (5/6) Jan 21 2014 I can't be sure, but I think the trailing _t is not very used in

monarch_dodra (21/25) Jan 21 2014 The code feels a bit "C-ish", but I'd argue those are just style

terchestor (9/28) Jan 21 2014 That's and excellent idea because your final example is what I

Rikki Cattermole (7/8) Jan 21 2014 Something along the lines of github/gist/pastebin.
terchestor (8/8) Jan 22 2014 Thanks everybody for reviewing my code.

monarch_dodra (30/33) Jan 22 2014 I'd say it's your "Tconst_tau0" that is getting in your way: You

terchestor (9/39) Jan 22 2014 Actually, for the purpose I've in mind, a DSP library, which is

terchestor (9/9) Jan 22 2014 I finally opted for both the simplest and fastest solution: a

Chris Cain (3/4) Jan 22 2014 I think you mean "std.random.unpredictableSeed" but 100% agree.

"terchestor" <terchestor gmail.com> writes:

I'm a newcomer to D language (after years of programming with ASM 
Motorola 68030-40 long time ago, as well as C, C++, Javascript, 
PHP, etc.), but not anymore a professional coder.
After reading some articles around there and that nice and cute 
TDPL (just after Coding standards, another world!), I began to 
write some to exercise my fresh new "skill".
I choose as a first experience, to write a simple class 
implementing a very interesting random number generator known as 
Tausworthe, according to the recipe found in this paper: 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf.
The algorithm in C is quite simple and set in Lecuyer's paper as:

unsigned long s1, s2, s3, b;
double taus88 ()
{
  /* Generates numbers between 0 and 1. */
b = (((s1 << 13) ^ s1) >> 19);
s1 = (((s1 & 4294967294) << 12) ^ b);
b = (((s2 << 2) ^ s2) >> 25);
s2 = (((s2 & 4294967288) << 4) ^ b);
b = (((s3 << 3) ^ s3) >> 11);
s3 = (((s3 & 4294967280) << 17) ^ b);
return ((s1 ^ s2 ^ s3) * 2.3283064365e-10);
}

I made some "adjustments" to this algorithm, to yield either a 
(long) integer or (double) float result, in any range.

Benchmarks
----------
Tausworthe module (rdmd -main -unittest 
-debug=Benchmark_Tausworthe tausworthe.d)
============================================================
Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | Average 
: 3e-05 ms
============================================================
============================================================
Call 1_000_000 times benchmark_uniform_double( ): 42 ms | Average 
: 4.2e-05 ms
============================================================
============================================================
Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average : 
7e-05 ms
============================================================
Comparing ratio tausworthe / std.random:  0.609229

I'd greatly appreciate any reviewing and testing of my code. Any 
improvements you feel necessary are most welcome (as said before, 
I'm a newbie).

What's the best place to publish the code?

Jan 21 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:
 I'm a newcomer to D language (after years of programming with 
 ASM Motorola 68030-40 long time ago, as well as C, C++, 
 Javascript, PHP, etc.), but not anymore a professional coder.
 After reading some articles around there and that nice and cute 
 TDPL (just after Coding standards, another world!), I began to 
 write some to exercise my fresh new "skill".
 I choose as a first experience, to write a simple class 
 implementing a very interesting random number generator known 
 as Tausworthe, according to the recipe found in this paper: 
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf.
 The algorithm in C is quite simple and set in Lecuyer's paper 
 as:

 unsigned long s1, s2, s3, b;
 double taus88 ()
 {
  /* Generates numbers between 0 and 1. */
 b = (((s1 << 13) ^ s1) >> 19);
 s1 = (((s1 & 4294967294) << 12) ^ b);
 b = (((s2 << 2) ^ s2) >> 25);
 s2 = (((s2 & 4294967288) << 4) ^ b);
 b = (((s3 << 3) ^ s3) >> 11);
 s3 = (((s3 & 4294967280) << 17) ^ b);
 return ((s1 ^ s2 ^ s3) * 2.3283064365e-10);
 }

 I made some "adjustments" to this algorithm, to yield either a 
 (long) integer or (double) float result, in any range.

 Benchmarks
 ----------
 Tausworthe module (rdmd -main -unittest 
 -debug=Benchmark_Tausworthe tausworthe.d)
 ============================================================
 Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | 
 Average : 3e-05 ms
 ============================================================
 ============================================================
 Call 1_000_000 times benchmark_uniform_double( ): 42 ms | 
 Average : 4.2e-05 ms
 ============================================================
 ============================================================
 Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average 
 : 7e-05 ms
 ============================================================
 Comparing ratio tausworthe / std.random:  0.609229

 I'd greatly appreciate any reviewing and testing of my code. 
 Any improvements you feel necessary are most welcome (as said 
 before, I'm a newbie).

 What's the best place to publish the code?

Github, bitbucket or equivalent.

If you think it's ready for general use then write a dub.json and 
register it to code.dlang.org


Take care with modifying random number generators, it's quite 
easy to ruin them by changing something that seems innocent.

Jan 21 2014

"terchestor" <terchestor gmail.com> writes:

On Tuesday, 21 January 2014 at 10:10:32 UTC, John Colvin wrote:
 On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:
 I'm a newcomer to D language (after years of programming with 
 ASM Motorola 68030-40 long time ago, as well as C, C++, 
 Javascript, PHP, etc.), but not anymore a professional coder.
 After reading some articles around there and that nice and 
 cute TDPL (just after Coding standards, another world!), I 
 began to write some to exercise my fresh new "skill".
 I choose as a first experience, to write a simple class 
 implementing a very interesting random number generator known 
 as Tausworthe, according to the recipe found in this paper: 
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.9723&rep=rep1&type=pdf.
 The algorithm in C is quite simple and set in Lecuyer's paper 
 as:

 unsigned long s1, s2, s3, b;
 double taus88 ()
 {
 /* Generates numbers between 0 and 1. */
 b = (((s1 << 13) ^ s1) >> 19);
 s1 = (((s1 & 4294967294) << 12) ^ b);
 b = (((s2 << 2) ^ s2) >> 25);
 s2 = (((s2 & 4294967288) << 4) ^ b);
 b = (((s3 << 3) ^ s3) >> 11);
 s3 = (((s3 & 4294967280) << 17) ^ b);
 return ((s1 ^ s2 ^ s3) * 2.3283064365e-10);
 }

 I made some "adjustments" to this algorithm, to yield either a 
 (long) integer or (double) float result, in any range.

 Benchmarks
 ----------
 Tausworthe module (rdmd -main -unittest 
 -debug=Benchmark_Tausworthe tausworthe.d)
 ============================================================
 Call 1_000_000 times benchmark_uniform_ulong( ): 30 ms | 
 Average : 3e-05 ms
 ============================================================
 ============================================================
 Call 1_000_000 times benchmark_uniform_double( ): 42 ms | 
 Average : 4.2e-05 ms
 ============================================================
 ============================================================
 Call 1_000_000 times benchmark_uniform_std( ): 70 ms | Average 
 : 7e-05 ms
 ============================================================
 Comparing ratio tausworthe / std.random:  0.609229

 I'd greatly appreciate any reviewing and testing of my code. 
 Any improvements you feel necessary are most welcome (as said 
 before, I'm a newbie).

 What's the best place to publish the code?

 Github, bitbucket or equivalent.

 If you think it's ready for general use then write a dub.json 
 and register it to code.dlang.org


 Take care with modifying random number generators, it's quite 
 easy to ruin them by changing something that seems innocent.

Well, thanks for the advices.
Here it is: 
https://github.com/Terchestor/dlang/tree/Tausworthe-class

Please be indulgent ,-)

Jan 21 2014

"bearophile" <bearophileHUGS lycos.com> writes:

terchestor:

 https://github.com/Terchestor/dlang/tree/Tausworthe-class

 Please be indulgent ,-)

 /++
 §§§ Tausworthe

What's the purpose of those §? They add noise. If they aren't 
necessary I suggest to remove them.


 class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) &&
                                __traits(isFloating, F ) )

I suggest to not put spaces around ( ). And if you want also 
remove the space after the class/functions name.

Perhaps you want to use a struct? The decision is important. Are 
your methods all virtual?


 {
     alias I integral_t;
     alias F floating_t;

Today the syntax "alias Name = Foo;" is preferred.

In D code the use of "_t" as suffix for types is less common. And 
perhaps it's better to use a starting upper case letter for their 
name.


     enum tauswortheConstFloat : floating_t
     {
         tau0 = 4294967296.0f

Class/struct/Enum/Union names should start with an upper case 
letter.


          tau1 = 4294967294,

Perhaps that literal can end with a U.


    ~this( )
     { }

No need for this.


     void seed( integral_t seed_=1 )
 ...
         auto s = seed_;

In general annotate with enum/const/immutable all variables that 
don't need to mutate.


         integral_t fseed( const integral_t s_ )
         {
             return ( tauswortheConst.lgc * s_ ) & 
 tauswortheConst.mask;
         }

That's probably better annotated as pure nothrow.


 }//§§==========END_CLASS::Tausworthe==========

I suggest to reduce the amount of === in this code.


 auto ?tausworthe0 = new Tausworthe!(ulong, double);

While D supports Unicode identifiers, they are a often a pain in 
the ass (this is a personal opinion), so perhaps you want to kill 
them all.


 //§§ __COPYRIGHT_NOTICE__

I think there's a ddoc field syntax for the copyright in the 
module ddoc.

Bye,
bearophile

Jan 21 2014

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Tue, Jan 21, 2014 at 12:32:41PM +0000, bearophile wrote:
 terchestor:
 
https://github.com/Terchestor/dlang/tree/Tausworthe-class

Please be indulgent ,-)

 
/++
��� Tausworthe

 
 What's the purpose of those �? They add noise. If they aren't
 necessary I suggest to remove them.
 
 
class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) &&
                               __traits(isFloating, F ) )

 
 I suggest to not put spaces around ( ). And if you want also remove
 the space after the class/functions name.
 
 Perhaps you want to use a struct? The decision is important. Are
 your methods all virtual?

[...]

There are a lot of issues with RNG's being passed by value; it's better
to use a reference type (i.e. class) and just make all methods final.


T

-- 
It said to install Windows 2000 or better, so I installed Linux instead.

Jan 21 2014

"bearophile" <bearophileHUGS lycos.com> writes:

H. S. Teoh:

 There are a lot of issues with RNG's being passed by value; 
 it's better to use a reference type (i.e. class) and just
 make all methods final.

Yes, I remember. A final class is probably better.

Bye,
bearophile

Jan 21 2014

"terchestor" <terchestor gmail.com> writes:

On Tuesday, 21 January 2014 at 12:32:43 UTC, bearophile wrote:
 terchestor:

 https://github.com/Terchestor/dlang/tree/Tausworthe-class

 /++
 §§§ Tausworthe

 What's the purpose of those §? They add noise. If they aren't 
 necessary I suggest to remove them.

Well, a fantasy, but I wont keep it!
 class Tausworthe ( I, F ) if ( __traits(isIntegral, I ) &&
                               __traits(isFloating, F ) )

 I suggest to not put spaces around ( ). And if you want also 
 remove the space after the class/functions name.

That's an old usage of mine.

 Perhaps you want to use a struct? The decision is important. 
 Are your methods all virtual?

Not a struc, I need a class.

 {
    alias I integral_t;
    alias F floating_t;

 Today the syntax "alias Name = Foo;" is preferred.

I knew not. Do ypu mean: alias integral_t = I;

 In D code the use of "_t" as suffix for types is less common. 
 And perhaps it's better to use a starting upper case letter for 
 their name.

I forgot the Capital letter, but I like the trailing _t, like 
size_t.
    enum tauswortheConstFloat : floating_t
    {
        tau0 = 4294967296.0f

 Class/struct/Enum/Union names should start with an upper case 
 letter.


         tau1 = 4294967294,

 Perhaps that literal can end with a U.


   ~this( )
    { }

 No need for this.

Of course!

    void seed( integral_t seed_=1 )
 ...
        auto s = seed_;

 In general annotate with enum/const/immutable all variables 
 that don't need to mutate.

I'd better put an assertion (or enforce?) to be sure seed won't 
be null, and give constantness to seed.

        integral_t fseed( const integral_t s_ )
        {
            return ( tauswortheConst.lgc * s_ ) & 
 tauswortheConst.mask;
        }

 That's probably better annotated as pure nothrow.

OK.

 }//§§==========END_CLASS::Tausworthe==========

 I suggest to reduce the amount of === in this code.

OK.

 auto ?tausworthe0 = new Tausworthe!(ulong, double);

 While D supports Unicode identifiers, they are a often a pain 
 in the ass (this is a personal opinion), so perhaps you want to 
 kill them all.

Not sure. I'm experimenting a kind of tagging to easily 
distinguish nested functions (with greek phi) or temporaries. 
Perhaps not a so good idea, but i'll see if enhances the 
readability later.
 //§§ __COPYRIGHT_NOTICE__

 I think there's a ddoc field syntax for the copyright in the 
 module ddoc.

Yes, but it's a work on progress...

Thank you so much for your invaluable help.

Jan 21 2014

"bearophile" <bearophileHUGS lycos.com> writes:

terchestor:

but I like the trailing _t, like size_t.

I can't be sure, but I think the trailing _t is not very used in 
D. size_t is an exception, like hash_t and little more.

Bye,
bearophile

Jan 21 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 21 January 2014 at 11:04:13 UTC, terchestor wrote:
 Well, thanks for the advices.
 Here it is: 
 https://github.com/Terchestor/dlang/tree/Tausworthe-class

 Please be indulgent ,-)

The code feels a bit "C-ish", but I'd argue those are just style 
issues.

My major points would be:
1) Make it a "final class": By default, classes have all their 
methods virtual. I don't think you want that.

2) I'd adhere to the "Range" interface: Give your function the 
"front"/"popFront()" functions, as well as the public "enum empty 
= false".

3) I think the *instance* itself should carry the bounds on 
initialization, and provide a single "uniform" function.

4) Uniform initialization: provide a "tausworthe(I = int, R = 
double)(int seed, double low = 0.0, double high = 1.0 )" 
function. This function should simply return a new instance of a 
Tausworthe. The advantage of this are 2-fold:
4.1) Similar syntax for struct/class initialization
4.2) Default Type parameters for the type.

The combination of 2/3/4 will allow you to write nifty things 
like:
int[] myRandomNumbers = tausworthe(randomSeed).take(10).array();

And voila! An array of 10 random numbers! Easy-peasy.

Jan 21 2014

"terchestor" <terchestor gmail.com> writes:

On Tuesday, 21 January 2014 at 13:44:09 UTC, monarch_dodra wrote:
 On Tuesday, 21 January 2014 at 11:04:13 UTC, terchestor wrote:

 My major points would be:
 1) Make it a "final class": By default, classes have all their 
 methods virtual. I don't think you want that.

You're right.

 2) I'd adhere to the "Range" interface: Give your function the 
 "front"/"popFront()" functions, as well as the public "enum 
 empty = false".

That's and excellent idea because your final example is what I 
had in mind.

 3) I think the *instance* itself should carry the bounds on 
 initialization, and provide a single "uniform" function.

There is a later version with only one uniform method. But I'd 
like to be able to change bounds with a single instance.

 4) Uniform initialization: provide a "tausworthe(I = int, R = 
 double)(int seed, double low = 0.0, double high = 1.0 )" 
 function. This function should simply return a new instance of 
 a Tausworthe. The advantage of this are 2-fold:
 4.1) Similar syntax for struct/class initialization
 4.2) Default Type parameters for the type.

Excellent, as I said.

 The combination of 2/3/4 will allow you to write nifty things 
 like:
 int[] myRandomNumbers = tausworthe(randomSeed).take(10).array();

 And voila! An array of 10 random numbers! Easy-peasy.

I adhere.
Thanks for taking time to review first D try.

Jan 21 2014

"Rikki Cattermole" <alphaglosined gmail.com> writes:

On Tuesday, 21 January 2014 at 10:05:42 UTC, terchestor wrote:
 What's the best place to publish the code?

Something along the lines of github/gist/pastebin.
Also this is more of a learn newsgroup post.

In case you're not aware, in D we can utilise static variables 
inside functions.
Meaning those global variables can be pushed into the function.
If thats an approach you would be interested in.

Jan 21 2014

"terchestor" <terchestor gmail.com> writes:

Thanks everybody for reviewing my code.

I rewrote almost everything to condense the code and conform to 
stylistic D usage.
My problem is the cast to integral type (line 80), downgrading 
performances of circa 50%.
Is there any possibility to avoid it?

Next step will be to implement the range interface as suggested 
by monarch_dodra.

Jan 22 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 22 January 2014 at 09:33:39 UTC, terchestor wrote:
 My problem is the cast to integral type (line 80), downgrading 
 performances of circa 50%.
 Is there any possibility to avoid it?

I'd say it's your "Tconst_tau0" that is getting in your way: You 
have a uint sized integral, which you divide by a floating point, 
just to multiply it again and cast back to integral type (!).

I'd suggest you get rid of "high/low" completly. There are prng 
*adaptors* that can create a uniform distribution *from* a 
natural prng (std.random.uniform). If you do this, you code 
becomes (starting line 73):

         auto uni = (s1 ^ s2 ^ s3);
         static if (isFloatingPoint!result_t)
         {
             return uni / Tconst_tau0;
         }
         else
         {
             return uni;
         }

Heck, you could get rid of floating point support entirely, and 
generate only integrals. Then, you just let "uniform" do the 
tough work:

auto rnd = tausworthe!(ulong, uint)(randomSeed);

auto rndF    = rnd.uniform!"[)"(0.0,  1.0); //0 - 1 double range
auto rnd1024 = rnd.uniform!"[)"(0,   1024); //[0 .. 1024) 
integral range.

Nitpick:
         /// get seed from current time
         auto se = Clock.currTime().stdTime;

This is known to be a bad seed. If you need a "good enough" seed, 
use "std.random.randomSeed". Although arguably (IMO), it is 
better place the burden of seeding on the caller.

Jan 22 2014

"terchestor" <terchestor gmail.com> writes:

On Wednesday, 22 January 2014 at 11:02:10 UTC, monarch_dodra 
wrote:
 On Wednesday, 22 January 2014 at 09:33:39 UTC, terchestor wrote:
 My problem is the cast to integral type (line 80), downgrading 
 performances of circa 50%.
 Is there any possibility to avoid it?

 I'd suggest you get rid of "high/low" completly. There are prng 
 *adaptors* that can create a uniform distribution *from* a 
 natural prng (std.random.uniform). If you do this, you code 
 becomes (starting line 73):

         auto uni = (s1 ^ s2 ^ s3);
         static if (isFloatingPoint!result_t)
         {
             return uni / Tconst_tau0;
         }
         else
         {
             return uni;
         }

 Heck, you could get rid of floating point support entirely, and 
 generate only integrals. Then, you just let "uniform" do the 
 tough work:

 auto rnd = tausworthe!(ulong, uint)(randomSeed);

 auto rndF    = rnd.uniform!"[)"(0.0,  1.0); //0 - 1 double range
 auto rnd1024 = rnd.uniform!"[)"(0,   1024); //[0 .. 1024) 
 integral range.

Actually, for the purpose I've in mind, a DSP library, which is 
time critical (and that's why I need the fastest possible PRNG), 
I'd better decoupling the range processing and create a class 
implementing the range interface to store the numbers generated 
by uniform to serve them on demand: a composite pattern is better 
than inheritance.

 Nitpick:
         /// get seed from current time
         auto se = Clock.currTime().stdTime;

 This is known to be a bad seed. If you need a "good enough" 
 seed, use "std.random.randomSeed". Although arguably (IMO), it 
 is better place the burden of seeding on the caller.

Fine remark worth adopting.

Jan 22 2014

"terchestor" <terchestor gmail.com> writes:

I finally opted for both the simplest and fastest solution: a 
templated class (integral type) and a templated uniform method 
(integral or floating-point) generating variates either in the 
whole integral range or in the 0.0..1.0 range.
I let the burden of generating the whole floating-point range on 
the user side, choosing any divider of the integral generated 
variate.

Hence a similar speed for both types, roughly 40% faster than 
std.random.uniform.

Jan 22 2014

"Chris Cain" <clcain uncg.edu> writes:

On Wednesday, 22 January 2014 at 11:02:10 UTC, monarch_dodra 
wrote:
 use "std.random.randomSeed".

I think you mean "std.random.unpredictableSeed" but 100% agree.

Jan 22 2014

D Programming

C/C++ Programming

Other

digitalmars.D - A Tausworthe random number generator