digitalmars.D.learn - Fixed-size arrays and randomShuffle()
- Vidar Wahlberg (19/19) May 03 2012 May be that this works as intended, but it fooled me:
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (5/24) May 03 2012 Fixed-length arrays are value types. 'a' is copied to randomShuffle() so...
- Vidar Wahlberg (11/14) May 03 2012 True, it is however still not exceptionally newbie (or perhaps even
- =?UTF-8?B?QWxpIMOHZWhyZWxp?= (10/25) May 03 2012 Random shuffle can work on a fixed-length array and there is a way for
- Dmitry Olshansky (8/14) May 03 2012 It's all about RNG used behind the scenes. Default one is Mersane
- Vidar Wahlberg (3/8) May 03 2012 I tried those two as well. Still significantly slower than what I can
- Chris Cain (30/32) May 03 2012 You might want to post your code... I wrote this code in D:
- Vidar Wahlberg (34/35) May 03 2012 Sure!
- Chris Cain (32/32) May 03 2012 OK, I took a look at your example, and I saw the kind of
- Chris Cain (24/24) May 03 2012 On a related note, how did you get the other random generators
- ixid (2/2) May 03 2012 Why would you not want it to shuffle a fixed array? That's a very
- bearophile (5/7) May 03 2012 It's a badly written function (with insufficient unit tests):
May be that this works as intended, but it fooled me: --- import std.random; import std.stdio; void main() { int[5] a = 0; a[0] = 1; int[] b = [1, 0, 0, 0, 0]; randomShuffle(a); writeln(a); randomShuffle(b); writeln(b); } --- In DMD 2.0.59 the fixed-size array "a" won't be shuffled (the dynamic array "b" will), and you won't get any warning about it. I'm not sure whether this counts as something that should be reported as a bug/improvement, nor if only randomShuffle() displays this behaviour, perhaps you could enlighten me.
May 03 2012
On 05/03/2012 06:30 AM, Vidar Wahlberg wrote:May be that this works as intended, but it fooled me: --- import std.random; import std.stdio; void main() { int[5] a = 0; a[0] = 1; int[] b = [1, 0, 0, 0, 0]; randomShuffle(a);Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]);writeln(a); randomShuffle(b); writeln(b); } --- In DMD 2.0.59 the fixed-size array "a" won't be shuffled (the dynamic array "b" will), and you won't get any warning about it. I'm not sure whether this counts as something that should be reported as a bug/improvement, nor if only randomShuffle() displays this behaviour, perhaps you could enlighten me.Ali
May 03 2012
On 2012-05-03 15:34, Ali Çehreli wrote:Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]);True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of "does it have to be this way?" rather than "how do you do this?", even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense? A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's "Collection.shuffle()" method. Perhaps someone can shed some light on this?
May 03 2012
On 05/03/2012 06:55 AM, Vidar Wahlberg wrote:On 2012-05-03 15:34, Ali Çehreli wrote:Random shuffle can work on a fixed-length array and there is a way for the implementation to know: import std.traits; // ... __traits(isStaticArray, a) That can be used in a template constraint.Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]);True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of "does it have to be this way?" rather than "how do you do this?", even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense?A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's "Collection.shuffle()" method. Perhaps someone can shed some light onthis? I have no idea with that one. Ali
May 03 2012
On 03.05.2012 18:02, Ali Çehreli wrote:> A quick follow-up: > I've tried some various random number engines, but neither come even > close to the performance of whatever is used for Java's > "Collection.shuffle()" method. Perhaps someone can shed some light on this? I have no idea with that one.It's all about RNG used behind the scenes. Default one is Mersane Twister which (AFAIK) is not particularly fast. But has a period of 2^19937 elements. You should probably use XorShift or MinstdRand generator and a version of shuffle with 2nd parameter. -- Dmitry Olshansky
May 03 2012
On 2012-05-03 16:26, Dmitry Olshansky wrote:It's all about RNG used behind the scenes. Default one is Mersane Twister which (AFAIK) is not particularly fast. But has a period of 2^19937 elements. You should probably use XorShift or MinstdRand generator and a version of shuffle with 2nd parameter.I tried those two as well. Still significantly slower than what I can achieve in Java.
May 03 2012
On Thursday, 3 May 2012 at 14:41:20 UTC, Vidar Wahlberg wrote:I tried those two as well. Still significantly slower than what I can achieve in Java.You might want to post your code... I wrote this code in D: -=-=-=- import std.random, std.stdio, std.datetime; void main() { int[] arr = new int[5_000_000]; foreach(i, ref e; arr) e = i; StopWatch sw = AutoStart.yes; arr.randomShuffle(); sw.stop(); writeln("Took ", sw.peek().to!("msecs", double)(), "ms"); } -=-=-=- And it performed _identically_ to this in Java: -=-=-=- import java.util.ArrayList; import java.util.Collections; public class Main { public static void main(String[] args) { ArrayList<Integer> ints = new ArrayList<>(5000); for(int i = 0; i < 5_000_000; ++i) ints.add(i); long startTime = System.currentTimeMillis(); Collections.shuffle(ints); long endTime = System.currentTimeMillis(); System.out.println("Took " + (endTime - startTime) + "ms"); } } -=-=-=-
May 03 2012
On 2012-05-03 17:31, Chris Cain wrote:You might want to post your code...Sure! D: --- import std.random; import std.stdio; void main() { auto iterations = 10000000; int[] a; for (int i = 0; i < 42; ++i) a ~= i; for (int i = 0; i < iterations; ++i) randomShuffle(a); } naushika:~/projects> dmd random.d && time ./random ./random 38,35s user 0,05s system 99% cpu 38,420 total --- Java (7): --- import java.util.ArrayList; import java.util.Collections; public class Rnd { public static void main(String... args) { int iterations = 10000000; ArrayList<Integer> a = new ArrayList<Integer>(); for (int i = 0; i < 42; ++i) a.add(i); for (int i = 0; i < iterations; ++i) Collections.shuffle(a); } } naushika:~/projects> javac Rnd.java && time java Rnd java Rnd 9,92s user 0,03s system 100% cpu 9,922 total ---
May 03 2012
OK, I took a look at your example, and I saw the kind of performance you were seeing. I performed an investigation on what could cause such a disparity and came up with a conclusion. The answer is two things: First of all, as noted above, D's default generator is a mersenne twister RNG. You can emulate Java's RNG like so: auto rng = LinearCongruentialEngine!(ulong, 25214903917uL, 11uL, 2uL^^48uL)(); Once I equalized that, I looked into the various methods that are called and settled in on uniform. https://github.com/D-Programming-Language/phobos/blob/master/std/random.d#L1154 As you can see, there's a division at line 1154 and another at line 1158. This means there's a minimum of two division operations every time uniform is called. Now, normally this isn't a big deal, but if we really want maximum performance, we need to eliminate at least one. If you replace lines 1154-1161 (auto bucketSize ... to return...) with: CountType rnum, result; do { rnum = cast(CountType) uniform!CountType(urng); result = rnum % count; } while (rnum > count && (rnum - result + (count - 1)) < (rnum - result - 1)); return cast(typeof(return)) (min + result); Then the time taken shrinks down to roughly the same (within a tenth of a second) as Java. I'll probably clean this up (and write some comments on how this works) and see about submitting it as a patch unless anyone sees anything wrong with this approach.
May 03 2012
On a related note, how did you get the other random generators working? I tried to compile this and it gives me an error: -=-=-=- import std.random, std.stdio, std.datetime; void main() { int[] arr = new int[5_000_000]; foreach(i, ref e; arr) e = i; auto rng = MinstdRand0(1); rng.seed(unpredictableSeed); StopWatch sw = AutoStart.yes; randomShuffle(arr, rng); sw.stop(); writeln("Took ", sw.peek().to!("msecs", double)(), "ms"); } -=-=-=- The error: -=-=-=- C:\D\dmd2\windows\bin\..\..\src\phobos\std\random.d(1263): Error: cannot implicitly convert expression (rndGen()) of type MersenneTwisterEngine!(uint,32,624,397,31,-1727483681u,11,7,-1658038656 ,15,-272236544u,18) to LinearCongruentialEngine!(uint,16807,0,2147483647) -=-=-=- Is this a bug in 2.059? Or am I doing something wrong?
May 03 2012
Why would you not want it to shuffle a fixed array? That's a very frustrating silent failure.
May 03 2012
Vidar Wahlberg:I'm not sure whether this counts as something that should be reported as a bug/improvement,It's a badly written function (with insufficient unit tests): http://d.puremagic.com/issues/show_bug.cgi?id=8026 Bye, bearophile
May 03 2012