digitalmars.D - mir.random - my GSoC project

Seb <seb wilzba.ch> writes:
Hi all,

I am very proud to have been selected for a GSoC stipend with the D 
Foundation.

Most of you already know me from GitHub (@wilzbach) and IRC 
(greenify).

In my GSoC project I will contribute to Dlang’s upcoming 
numerical library mir [1]. You have probably heard about mir from the 
new ndslice [2] package in Phobos. It's also the development & 
testing spot for future math additions to Phobos. Ilya is working 
very hard to get more functionality into mir and he will also be my 
mentor for the mir.random package.

[1] https://github.com/DlangScience/mir
[2] http://dlang.org/phobos/std_experimental_ndslice.html

mir.random
----------

This project is about adding non-uniform random generators to mir 
and hopefully eventually to Phobos.

While it is intended to be similar in terms of functionality to 
C++’s <random> [3] and NumPy’s random [4], our main focus is its 
performance. Hence I will do a lot of literature research. A 
simple example of achieving better performance is the normal 
distribution. In most implementations I looked at (<random> [5], 
NumPy [6]) the Box-Muller transform [7] is used; however, there 
exists a newer, faster approach: the Ziggurat method [8, 9], which 
is reported to be about three to four times faster [9].
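
For reference, the basic Box-Muller transform fits in a few lines of D. This is only a minimal sketch of the textbook form, not the code from [5] or [6] and not what mir.random will ship; the name `boxMuller` is made up for illustration.

```d
import std.math : cos, sin, log, sqrt, PI;
import std.random : Random, uniform;

/// Hypothetical helper: basic Box-Muller transform. Turns two independent
/// uniform variates into two independent standard normal variates.
double[2] boxMuller(RNG)(ref RNG rng)
{
    // "(]" excludes 0 so that log(u1) stays finite.
    immutable u1 = uniform!"(]"(0.0, 1.0, rng);
    immutable u2 = uniform!"[)"(0.0, 1.0, rng);
    immutable r = sqrt(-2.0 * log(u1));   // radius
    immutable theta = 2.0 * PI * u2;      // angle
    double[2] result;
    result[0] = r * cos(theta);
    result[1] = r * sin(theta);
    return result;
}

void main()
{
    import std.stdio : writeln;

    auto rng = Random(42);
    enum n = 100_000;
    double sum = 0, sumSq = 0;
    foreach (i; 0 .. n / 2)
    {
        immutable z = boxMuller(rng);
        sum += z[0] + z[1];
        sumSq += z[0] * z[0] + z[1] * z[1];
    }
    immutable mean = sum / n;
    // Sample mean should be close to 0 and sample variance close to 1.
    writeln("mean = ", mean, ", variance = ", sumSq / n - mean * mean);
}
```

Every pair of samples costs one log, one sqrt, one sin and one cos, which is the overhead that table-based methods such as the Ziggurat try to avoid.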

Moreover I plan to add a universal random generator to allow easy 
creation of arbitrary random distributions. It could be something 
like the Tinflex algorithm [10], but I still have to do more 
literature research on this topic.

[3] <random> http://en.cppreference.com/w/cpp/numeric/random
[4] numpy.random 
http://docs.scipy.org/doc/numpy/reference/routines.random.html
[5] normal in <random> 
https://github.com/llvm-mirror/libcxx/blob/master/include/random#L4312
[6] normal in NumPy 
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/distributions.c#L106 
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/randomkit.c#L610
[7] Box-Muller transform 
https://en.m.wikipedia.org/wiki/Box%E2%80%93Muller_transform
[8] The Ziggurat Method for Generating Random Variables 
http://www.jstatsoft.org/v05/i08/paper
[9] An Improved Ziggurat Method to Generate Normal Random Samples 
http://www.doornik.com/research/ziggurat.pdf
[10] Tinflex 
https://cran.r-project.org/web/packages/Tinflex/Tinflex.pdf

Stay in touch
------------------

We will occasionally post updates to this newsgroup, but you can 
also follow us on Twitter (@libmir) [11] for more updates, and for 
more general news @DlangScience [12] is tweeting too! Of course 
you can also watch us directly on GitHub [13]. For discussions 
and questions, you are cordially invited to our Gitter chat room 
[14].

During the GSoC I will also regularly post articles to my blog 
[15] - it offers email, RSS and Atom subscriptions.
Shortly before the GSoC starts, I will post the final time 
schedule here for tracking.

As mentioned, mir is quite young, so contributions are very 
welcome.

Cheers,

Seb

PS: I will also be at DConf in Berlin, so maybe we can have a 
chat there :)

[11] https://twitter.com/libmir
[12] https://twitter.com/dlangscience
[13] https://github.com/libmir/mir
[14] https://gitter.im/libmir/public
[15] https://seb.wilzba.ch
Apr 23 2016
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Saturday, 23 April 2016 at 14:17:19 UTC, Seb wrote:
 I am very proud to be selected as a GSoC stipend for the D 
 foundation.
Congratulations, Seb!
 This project is about adding non-uniform random generators to 
 mir and hopefully eventually to Phobos.
This is a very welcome contribution. Thank you for stepping up to provide it. You obviously already have a very good handle on the literature related to RNG algorithms.

What I'd advise, though, is that you also familiarize yourself with the problematic issues related to how random number generation relates to D's range framework. The broader scope of the problem is that both random number generators and random algorithms (i.e. algorithms whose popFront includes a call to an RNG; e.g. randomSample or randomCover) face a number of problems:

* if they are not accessed via reference, there are lots of ways in which unintended correlations can result
  - one suggestion has been to simply disable copy-by-value and force them to be passed by ref or pointer, but in my experience that places some nasty limits on how readily they can be integrated into e.g. UFCS chains of range functionality
  - another option is to implement them as reference types; however, this creates some challenges w.r.t. memory allocation and the cost of creating multiple instances of e.g. a random algorithm in the inner loops of a program

* if pseudo-RNGs are implemented as forward ranges, then again, many unintended correlations can be generated, in this case because library functionality will freely use the .save method to copy range state
  - this is simpler to address; just make all RNGs and random algorithms input ranges, and implement (say) a 'dup' method for pseudo-RNGs that the programmer can call when they're really sure they want to duplicate RNG state

* more an aesthetic issue than a practical one, but note that typical range design (where the initial state of the .front property is determined upon construction) maybe sits a little oddly with random ranges, where the values ought ideally to be _truly_ lazy in their generation

Some of this is touched on in my DConf talk from last year: https://www.youtube.com/watch?v=QdMdH7WX2ew (... which says something about the relative busyness of my time since then, that I haven't been able to make much progress on it ...)

Note that I'm not suggesting you need to find a solution to the above issues (although it would be cool if you did :-), but just to be aware of them in order to understand how to offer good guidance on the usage of the functionality you develop.
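
To make the copy-by-value and 'dup' points concrete, here is a rough sketch of a generator that is only an input range, cannot be copied implicitly, and offers an explicit duplicate. `ToyRng` is a hypothetical toy type (a plain 64-bit LCG) invented for this illustration; it is not a proposal for std.random or mir, and whether such a non-copyable type works with a given Phobos range algorithm is exactly the integration problem described above.

```d
/// Hypothetical toy generator (64-bit linear congruential); illustration only.
struct ToyRng
{
    private ulong state;

    this(ulong seed)
    {
        state = seed;
        popFront();   // determine the first value
    }

    @disable this(this);   // no accidental copy-by-value

    // Input range primitives only; deliberately no .save.
    enum bool empty = false;
    @property ulong front() const { return state; }
    void popFront()
    {
        state = state * 6364136223846793005UL + 1442695040888963407UL;
    }

    /// Explicit, opt-in duplication of the generator state.
    ToyRng dup() const
    {
        ToyRng copy;
        copy.state = state;
        return copy;   // returned by move, so no postblit is needed
    }
}

void main()
{
    auto a = ToyRng(42);
    // auto b = a;       // would not compile: copying is disabled
    auto b = a.dup;      // the programmer asks for a duplicate explicitly

    assert(a.front == b.front);   // same state right after dup
    a.popFront();
    assert(a.front != b.front);   // advancing a does not touch b
    b.popFront();
    assert(a.front == b.front);   // both reproduce the same sequence

    // No .save, so generic code cannot silently fork the state.
    static assert(!__traits(hasMember, ToyRng, "save"));
}
```

Note that this constructor still computes the first value eagerly, so it does nothing about the laziness concern in the last bullet.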
 As mentioned mir is quite young, so contributions are very 
 welcome.
Minor aside: I think that was an odd choice of project name, given it was already the name of a very well known free software project addressing completely different interests ;-)
 PS: I will also be at dconf in Berlin, so maybe we can have a 
 chat there :)
Great, looking forward to it. :-)

Good luck & best wishes,

    -- Joe
Apr 23 2016
Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/23/2016 04:17 PM, Seb wrote:
 This project is about adding non-uniform random generators to mir and
 hopefully eventually to Phobos.
I just happen to need a gaussian random number generator right now. Is there already some WIP code, or would you have an intermediate recommendation?
Jun 02 2016
Edwin van Leeuwen <edder tkwsping.nl> writes:
On Thursday, 2 June 2016 at 10:56:36 UTC, Martin Nowak wrote:
 On 04/23/2016 04:17 PM, Seb wrote:
 This project is about adding non-uniform random generators to 
 mir and hopefully eventually to Phobos.
 I just happen to need a gaussian random number generator right now. Is there already some WIP code, or would you have an intermediate recommendation?
I tend to use rNorm from dstats: https://github.com/dsimcha/dstats/blob/master/source/dstats/random.d#L266
Jun 02 2016
Seb <seb wilzba.ch> writes:
On Thursday, 2 June 2016 at 10:56:36 UTC, Martin Nowak wrote:
 On 04/23/2016 04:17 PM, Seb wrote:
 This project is about adding non-uniform random generators to 
 mir and hopefully eventually to Phobos.
 I just happen to need a gaussian random number generator right now. Is there already some WIP code
My focus for the first six weeks is the transformed density rejection with inflection points algorithm (being able to generate any distribution based on its CDF). Unfortunately that won't help you, but yeah, there is WIP: https://github.com/libmir/mir/pull/222
 or would you have an intermediate recommendation?
Yep, I can also recommend the NumPy port dstats: https://github.com/DlangScience/dstats
Jun 02 2016
HaraldZealot <harald_zealot tut.by> writes:
On Thursday, 2 June 2016 at 10:56:36 UTC, Martin Nowak wrote:
 On 04/23/2016 04:17 PM, Seb wrote:
 This project is about adding non-uniform random generators to 
 mir and hopefully eventually to Phobos.
 I just happen to need a gaussian random number generator right now. Is there already some WIP code, or would you have an intermediate recommendation?
As a good workaround you can use:

```d
real normalrnd(real mu, real sigma)
{
    import std.random : uniform;
    import std.mathspecial : normalDistributionInverse;

    // Inverse-CDF sampling. The open interval "()" keeps 0 out of the
    // input, since normalDistributionInverse(0) is -infinity.
    return mu + sigma * normalDistributionInverse(uniform!"()"(0.0L, 1.0L));
}
```
Jun 03 2016