digitalmars.D - default random object?

Andrei Alexandrescu (33/33) Feb 13 2009 Leonardo suggested that some functions in std.random should not require

Denis Koroskin (10/43) Feb 13 2009 Yes.
bearophile (10/28) Feb 13 2009 That's awful and ugly. My suggestions are simple (copied from my dlibs a...

Andrei Alexandrescu (8/40) Feb 13 2009 Leaving normal() aside (does your implementation implement the ziggurat

bearophile (9/14) Feb 13 2009 You never need this, so it's over-generalized, and it's also ugly to see...

Daniel Keep (7/19) Feb 13 2009 My pages of code implementing it in Python would like to disagree with

bearophile (8/13) Feb 13 2009 Python devs (and the same is true for other people, like Mathematica dev...

Andrei Alexandrescu (3/6) Feb 13 2009 I know. This is an exciting time to be around D!

Yigal Chripun (9/53) Feb 14 2009 your example counters your claim.

Andrei Alexandrescu (5/31) Feb 14 2009 My point was that realloc() is wrong, not free(). The sorely missing

Yigal Chripun (9/40) Feb 14 2009 I see.

Derek Parnell (7/10) Feb 14 2009 Unless you need to contract the allocated memory ;-) I believe that

Andrei Alexandrescu (6/16) Feb 14 2009 expand() is a poorly-chosen name. People who are criticizing C's
grauzone (5/14) Feb 14 2009 expand() could take a negative value!
Yigal Chripun (6/12) Feb 14 2009 use free() to free the part the is not needed anymore?

Lionello Lunesu (5/10) Feb 15 2009 What would expand's signature be? It needs the current pointer, it needs...

Andrei Alexandrescu (5/15) Feb 15 2009 realloc moves memory, expand wouldn't. That delivered a fatal blow to

dsimcha (27/58) Feb 13 2009 Yes. I've found D2's random number generation scheme somewhat annoying ...
Derek Parnell (13/21) Feb 13 2009 Yes
Steve Schveighoffer (12/19) Feb 13 2009 Good idea. Random number generators should not require locking in the

Andrei Alexandrescu (16/23) Feb 13 2009 I tried both versions, and it turns out my code is almost never simpler

grauzone (2/5) Feb 13 2009 Because C++ is the pinnacle of language and library design.

Andrei Alexandrescu (3/9) Feb 13 2009 No, just because their random numbers library is very good.
Bill Baxter (3/7) Feb 13 2009 (Score: -1, flamebait)

KennyTM~ (5/38) Feb 13 2009 FYI, GSL (GNU Scientific Library) uses [0,1) for their uniform() and
Walter Bright (2/5) Feb 14 2009 auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Denis Koroskin (2/6) Feb 14 2009 Yeah, but now about uniform(uint.min, uint.max)?

Walter Bright (3/14) Feb 14 2009 auto big = cast(uint)uniform(rng, cast(ulong)uint.min,
Steve Schveighoffer (5/16) Feb 14 2009 I think we're working under the assumption that the uniform function is

Jason House (2/34) Feb 14 2009 D uses a..b notation as part of the language to mean "[)". Because of th...

Walter Bright (2/5) Feb 14 2009 I think that's exactly right. Consistency is very important.

Don (5/11) Feb 15 2009 Agreed. Note that

Benji Smith (11/44) Feb 15 2009 Maybe a NumericInterval struct would be a good idea. It could be

Benji Smith (14/27) Feb 15 2009 Incidentally, the NumericInterval has lots of other interesting

Andrei Alexandrescu (4/33) Feb 15 2009 I've never been in a situation in my life where I thought, hey, a random...

Don (4/39) Feb 16 2009 It's worse than that. Since the range of double includes infinity, a

Nick Sabalausky (7/46) Feb 16 2009 I would think that, with the possible exception of mathematicians, most ...

Don (12/57) Feb 17 2009 You're asking for a random *number* and NaN is not a number :).

Benji Smith (8/49) Feb 18 2009 Way to miss the forest for the trees.

Don (12/65) Feb 19 2009 On the contrary, I've been giving NumericInterval considerable thought.

Denis Koroskin (2/64) Feb 19 2009 Same as NumericInterval's bounds ([],(),[),(]), strictness might be an i...
Benji Smith (18/85) Feb 19 2009 I don't see any idealogical reason for requiring x2 >= x1.

Don (8/100) Feb 19 2009 That's a really good question, and probably the killer argument against

Walter Bright (6/10) Feb 14 2009 That could have speed problems with a tight loop accessing thread local

dsimcha (8/15) Feb 14 2009 I don't think this matters, since for apps that really need it, a RNG ca...

grauzone (9/25) Feb 14 2009 I don't understand this. If implemented right (D2.0 or gcc thread

Walter Bright (5/8) Feb 14 2009 That's the notorious double-checked-locking bug, an incredibly seductive...

grauzone (6/16) Feb 14 2009 No, I meant something like this:

Andrei Alexandrescu (25/53) Feb 14 2009 Well I can only welcome the return of persistent flamebaiting to this

grauzone (20/82) Feb 14 2009 Why is it "flamebaiting" when I say my opinion? Others are doing it too.

Andrei Alexandrescu (54/119) Feb 14 2009 Yah, you're right. I shouldn't have answered, and my answer ended up

dsimcha (13/37) Feb 14 2009 Just to set the record straight, I actually agree with Andrei here. Ele...

grauzone (16/54) Feb 14 2009 Well, I'm not necessarily talking about "elegant but costly

Denis Koroskin (2/54) Feb 14 2009 Did you try passing a comparator (C++ way to do things)? It should be pr...

Andrei Alexandrescu (21/44) Feb 14 2009 Is this what you meant?

Sergey Gromov (4/13) Feb 14 2009 I'm used to float generators [0,1] and integer generators [0,max).
Don (19/37) Feb 14 2009 This is a general issue applying to any numeric range. I've been giving

Andrei Alexandrescu (4/35) Feb 14 2009 So what would you recommend? [a, b) for floats and [a, b] for ints, or

Don Clugston (3/41) Feb 15 2009 I'm leaning towards [a,b) for everything (consistency with arrays), but

Bill Baxter (6/50) Feb 15 2009 How do you create a random uint that can take on any of uint's values

Don (16/63) Feb 15 2009 I think that is the primary argument. BUT:

Bill Baxter (9/86) Feb 15 2009 I guess I'd find it more useful to have a function that gave a random

Andrei Alexandrescu (9/29) Feb 15 2009 You can, just pass short values into uniform() or specify type arguments...

Bill Baxter (8/39) Feb 15 2009 If that's what it took I'd probably try this instead:

Andrei Alexandrescu (5/17) Feb 15 2009 This will indeed work:

Bill Baxter (5/23) Feb 15 2009 No, but if you are then I'll just make my own 1-liner library

Andrei Alexandrescu (3/25) Feb 15 2009 What signatures would the functions have?

Bill Baxter (8/38) Feb 15 2009 What I'd like best is if all the things you listed would work, but in ad...

bearophile (6/8) Feb 15 2009 I mostly agree. For example a basic randUint is useful (I have it in dli...

dsimcha (6/57) Feb 15 2009 Keep in mind that we're talking only about defaults here. At least in t...

Bill Baxter (4/61) Feb 15 2009 I don't agree that it's such an edge case. I often want a random

Joel C. Salomon (7/13) Feb 15 2009 In that vein: for the floating-point case, do you want to emulate a

Andrei Alexandrescu (7/20) Feb 15 2009 The former! Otherwise the distribution will be biased toward numbers

Don (5/27) Feb 15 2009 You mean [1, 2). The code for complete uniform distribution is in

Christopher Wright (8/14) Feb 14 2009 I always have to look up whether "[]" is exclusive or inclusive. It's

bearophile (9/11) Feb 14 2009 A cute thing for you: where I live in Mathematics we use a different syn...
Ary Borenszweig (6/17) Feb 14 2009 You can remember "]" is more in touch with the number of the right (the

Christopher Wright (3/22) Feb 14 2009 Right, but these helpful mnemonics will flee well before I see any use

Jason House (3/15) Feb 14 2009 Lately, I wonder why TLS is treated as special and globals are normal. I...
bearophile (5/8) Feb 15 2009 For that there's choice():
Lionello Lunesu (13/37) Feb 15 2009 Yes.

Andrei Alexandrescu (8/22) Feb 15 2009 I couldn't find good names. uniformOpenClosed, uniformOC,

Steve Schveighoffer (8/12) Feb 15 2009 Do you mean rand()%max specifically in the case of the C function rand

Bill Baxter (11/22) Feb 15 2009 It's bad in general. Say your RNG generates perfectly uniform numbers

Steve Schveighoffer (8/36) Feb 15 2009 Hm... good point, but probably not significant unless you are generating...

Andrei Alexandrescu (3/38) Feb 15 2009 See the implementation of uniform in std.random.

Steve Schveighoffer (18/58) Feb 15 2009 Ugh, so you have to repeatedly generate random numbers until it gives yo...

Lionello Lunesu (4/15) Feb 15 2009 Actually, I forgot to add that I always use divisors of RAND_MAX ;-)

Lionello Lunesu (7/11) Feb 15 2009 Yes yes, I know I know, but, honestly, when I'm generating noise for som...

Andrei Alexandrescu (4/15) Feb 16 2009 Ahhh, so that's why all my enemies in Far Cry 1 and 2 behave so

Don (8/34) Feb 16 2009 I don't like that approach. The "[)", "[]" issue comes up every time

Leandro Lucarella (16/45) Feb 16 2009 Yes
Jarrett Billingsley (9/26) Feb 16 2009 On Fri, Feb 13, 2009 at 8:10 PM, Andrei Alexandrescu
Daniel Keep (8/10) Feb 19 2009 Slightly off-topic, but I just found this and thought it might be of

BCS (3/20) Feb 19 2009 someone should make a "double DieHard(uint function())" so we can test t...
Don (8/25) Feb 20 2009 Yeah. It's worth knowing that reason that linear congruential RNGs have

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Leonardo suggested that some functions in std.random should not require 
their user to be bothered with creating a random object, i.e.:

auto r = Random(unpredictableSeed);
auto n = uniform(r, 0, 100);

Instead the library should simply support:

auto n = uniform(0, 100);

and do this by presumably using a global RNG under the hood. So I wanted 
to ask all y'all:

1. Are you cool with making the rng the last parameter and give it a 
default value?

2. The global random generator will be allocated per thread. Are you 
cool with this too?

3. How should the global rng be initialized? To always generate the same 
sequence, or not?

4. While we're at it, should uniform(a, b) generate by default something 
in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
for floating point numbers is the source of all evils and that Hitler, 
Stalin and Kim Il Sung (should he still be alive) must be using that 
kind of generator. Conversely, generating [a, b) is guaranteed to bring 
in the long term everlasting peace to Earth. My problem however is that 
in the integer realm I always want to generate [a, b]. Furthermore, I 
wouldn't be happy if the shape of the interval was different for 
integers and floating point numbers. How to break this conundrum? Don't 
forget that we're only worrying about defaults, explicit generation is 
always possible with self-explanatory code:

auto rng = Random(unpredictableSeed);
auto a = 0.0, b = 1.0;
auto x1 = uniform!("[]")(rng, a, b);
auto x2 = uniform!("[)")(rng, a, b);
auto x3 = uniform!("(]")(rng, a, b);
auto x4 = uniform!("()")(rng, a, b);


Thanks,

Andrei

Feb 13 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 14 Feb 2009 04:10:29 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 Leonardo suggested that some functions in std.random should not require  
 their user to be bothered with creating a random object, i.e.:

 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);

 Instead the library should simply support:

 auto n = uniform(0, 100);

I think that both should be available.

 and do this by presumably using a global RNG under the hood. So I wanted  
 to ask all y'all:

 1. Are you cool with making the rng the last parameter and give it a  
 default value?

Yes.

 2. The global random generator will be allocated per thread. Are you  
 cool with this too?

Yes.

 3. How should the global rng be initialized? To always generate the same  
 sequence, or not?

Yes, it should be deterministic.

 4. While we're at it, should uniform(a, b) generate by default something  
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b]  
 for floating point numbers is the source of all evils and that Hitler,  
 Stalin and Kim Il Sung (should he still be alive) must be using that  
 kind of generator. Conversely, generating [a, b) is guaranteed to bring  
 in the long term everlasting peace to Earth. My problem however is that  
 in the integer realm I always want to generate [a, b]. Furthermore, I  
 wouldn't be happy if the shape of the interval was different for  
 integers and floating point numbers. How to break this conundrum? Don't  
 forget that we're only worrying about defaults, explicit generation is  
 always possible with self-explanatory code:

 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

It should definitely include a but not b for floating point types.
It'd be satisfied if the same would be for integer types, too.

My typical use-case is "auto x = rand() % t;" which yields numbers in [0 to t).

For example, to get random range element, I'd like to write the following:
auto elem = range[uniform(0, range.length)];

 Thanks,

 Andrei

Feb 13 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 3. How should the global rng be initialized?

Automatically seeded with the time at the beginning of the program. Of course
the seed can be set again in any moment.


 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

That's awful and ugly. My suggestions are simple (copied from my dlibs and the
random std lib of Python):

random() => floating point [0, 1)
randInt(a=0, b) => integral [a, b]
randRange(a=0, b) => integral [a, b)
uniform(a, b) => floating point [a, b)
normal(a, b) => good quality normally-distributed number with given std dev and
avg.

Bye,
bearophile

Feb 13 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 Andrei Alexandrescu:
 
 3. How should the global rng be initialized?

 
 Automatically seeded with the time at the beginning of the program. Of course
the seed can be set again in any moment.
 
 
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 
 That's awful and ugly. My suggestions are simple (copied from my dlibs and the
random std lib of Python):
 
 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given std dev
and avg.

Leaving normal() aside (does your implementation implement the ziggurat 
algorithm? If so, would you agree to contribute it to phobos?), I don't 
see how having to memorize four names instead of one is necessarily 
awesome and beautiful. Besides uniform() renders "random" redundant. 
Hardly a pinnacle of good API design, see realloc() which essentially 
makes malloc() and free() redundant.

Andrei

Feb 13 2009

bearophile <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:
 Leaving normal() aside (does your implementation implement the ziggurat 
 algorithm?

You can use two well known algorithms, one is a simple one, and the other is
more precise. The simple one is just to extract 10 or 12 uniform numbers and
sum them.


 I don't see how having to memorize four names instead of one is necessarily 
 awesome and beautiful. Besides uniform() renders "random" redundant. 
 Hardly a pinnacle of good API design

You never need this, so it's over-generalized, and it's also ugly to see and
noisy too:
uniform!("(]")(rng, a, b);

Those names are almost self-explaining, you don't need much time to remember
them. And each one of them represent a very common operation you too want to
do. Of course your tastes can be different from mine (and Python ones too).

When the phobos of D2 will be "finished" it will require one year or more of
"tuning" (discussing names here on the newsgroup) to find the best names, API,
etc.

While we are discussing Phobos, I'd also like to have a lazy prime generator,
prime tests a binomial, etc. Such things are useful in many programs (you can
find them all in my dlibs, but that hyperfast lazy prime generator is GNU,
unfortunately).

Bye,
bearophile

Feb 13 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

bearophile wrote:
 Andrei Alexandrescu:
 I don't see how having to memorize four names instead of one is necessarily 
 awesome and beautiful. Besides uniform() renders "random" redundant. 
 Hardly a pinnacle of good API design

 
 You never need this, so it's over-generalized, and it's also ugly to see and
noisy too:
 uniform!("(]")(rng, a, b);

My pages of code implementing it in Python would like to disagree with
you on that.  So would the safety science guys I write simulation and
analysis code for.

Then again "I never need this" isn't as convincing an argument.  :)

 ...
 
 While we are discussing Phobos, I'd also like to have a lazy prime generator,
prime tests a binomial, etc. Such things are useful in many programs (you can
find them all in my dlibs, but that hyperfast lazy prime generator is GNU,
unfortunately).

Prime test a binomial?  I never need this, so it's not necessary.

 Bye,
 bearophile

  -- Daniel

Feb 13 2009

bearophile <bearophileHUGS lycos.com> writes:

Daniel Keep:
 My pages of code implementing it in Python would like to disagree with
 you on that.  So would the safety science guys I write simulation and
 analysis code for.

Python devs (and the same is true for other people, like Mathematica devs) have
seen such thing isn't a common need, I have followed when they voted for
functions to put and functions to keep out of the random module. I think a std
rnd module can't contain everything, but only the things useful for most
people, to keep low the number of simple functions, so you can remember them
better.
For more specialized usages you can import a third-part module for stats, etc.


 Then again "I never need this" isn't as convincing an argument.  :)

"Never" is of course meant in statistical terms, a std lib can't contain
everything, so you put there only the things that most people need.


 Prime test a binomial?  I never need this, so it's not necessary.

Sorry: Prime test, binomial, are two separated things.
And lot of people need that, all basic statistics modules have such things.

Bye,
bearophile

Feb 13 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

bearophile wrote:
 When the phobos of D2 will be "finished" it will require one year or
 more of "tuning" (discussing names here on the newsgroup) to find the
 best names, API, etc.

I know. This is an exciting time to be around D!

Andrei

Feb 13 2009

Yigal Chripun <yigal100 gmail.com> writes:

Andrei Alexandrescu wrote:
 bearophile wrote:
 Andrei Alexandrescu:

 3. How should the global rng be initialized?

 Automatically seeded with the time at the beginning of the program. Of
 course the seed can be set again in any moment.


 4. While we're at it, should uniform(a, b) generate by default
 something in [a, b] or [a, b)? Someone once explained to me that
 generating [a, b] for floating point numbers is the source of all
 evils and that Hitler, Stalin and Kim Il Sung (should he still be
 alive) must be using that kind of generator. Conversely, generating
 [a, b) is guaranteed to bring in the long term everlasting peace to
 Earth. My problem however is that in the integer realm I always want
 to generate [a, b]. Furthermore, I wouldn't be happy if the shape of
 the interval was different for integers and floating point numbers.
 How to break this conundrum? Don't forget that we're only worrying
 about defaults, explicit generation is always possible with
 self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 That's awful and ugly. My suggestions are simple (copied from my dlibs
 and the random std lib of Python):

 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given
 std dev and avg.

 Leaving normal() aside (does your implementation implement the ziggurat
 algorithm? If so, would you agree to contribute it to phobos?), I don't
 see how having to memorize four names instead of one is necessarily
 awesome and beautiful. Besides uniform() renders "random" redundant.
 Hardly a pinnacle of good API design, see realloc() which essentially
 makes malloc() and free() redundant.

 Andrei

your example counters your claim.
The same reason we prefer to use D with so many concepts and keywords 
over something like brainf**k that is turing-complete with just 8 or so 
commands can be apllied here.
sure, the *implementation* of realloc() makes the *implementations* of 
free() and malloc() redundunt but it's easier to remember to use free() 
when you actually want to free something. if free() wasn't available, 
the first thing I'd do was to use a macro to define it.

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Yigal Chripun wrote:
 Andrei Alexandrescu wrote:
 bearophile wrote:
 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given
 std dev and avg.

 Leaving normal() aside (does your implementation implement the ziggurat
 algorithm? If so, would you agree to contribute it to phobos?), I don't
 see how having to memorize four names instead of one is necessarily
 awesome and beautiful. Besides uniform() renders "random" redundant.
 Hardly a pinnacle of good API design, see realloc() which essentially
 makes malloc() and free() redundant.

 Andrei

 
 your example counters your claim.
 The same reason we prefer to use D with so many concepts and keywords 
 over something like brainf**k that is turing-complete with just 8 or so 
 commands can be apllied here.
 sure, the *implementation* of realloc() makes the *implementations* of 
 free() and malloc() redundunt but it's easier to remember to use free() 
 when you actually want to free something. if free() wasn't available, 
 the first thing I'd do was to use a macro to define it.

My point was that realloc() is wrong, not free(). The sorely missing 
allocation primitive is expand(), and we've been paying for it through 
the nose for decades.

Andrei

Feb 14 2009

Yigal Chripun <yigal100 gmail.com> writes:

Andrei Alexandrescu wrote:
 Yigal Chripun wrote:
 Andrei Alexandrescu wrote:
 bearophile wrote:
 random() => floating point [0, 1)
 randInt(a=0, b) => integral [a, b]
 randRange(a=0, b) => integral [a, b)
 uniform(a, b) => floating point [a, b)
 normal(a, b) => good quality normally-distributed number with given
 std dev and avg.

 Leaving normal() aside (does your implementation implement the ziggurat
 algorithm? If so, would you agree to contribute it to phobos?), I don't
 see how having to memorize four names instead of one is necessarily
 awesome and beautiful. Besides uniform() renders "random" redundant.
 Hardly a pinnacle of good API design, see realloc() which essentially
 makes malloc() and free() redundant.

 Andrei

 your example counters your claim.
 The same reason we prefer to use D with so many concepts and keywords
 over something like brainf**k that is turing-complete with just 8 or
 so commands can be apllied here.
 sure, the *implementation* of realloc() makes the *implementations* of
 free() and malloc() redundunt but it's easier to remember to use
 free() when you actually want to free something. if free() wasn't
 available, the first thing I'd do was to use a macro to define it.

 My point was that realloc() is wrong, not free(). The sorely missing
 allocation primitive is expand(), and we've been paying for it through
 the nose for decades.

 Andrei

I see.
Having 3 functions: malloc(), expand() and free() is indeed a very good 
design.

regarding the original topic, I still generally prefer bearophile's API,
random may be omitted and other small changes are also possible but it 
looks better to me than using something like:
uniform!("[]")(rng, a, b);
maybe some aliases are in order?

Feb 14 2009

Derek Parnell <derek psych.ward> writes:

On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:


 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

Unless you need to contract the allocated memory ;-) I believe that
realloc() does expansions and contractions, no?

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:
 
 
 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

 
 Unless you need to contract the allocated memory ;-) I believe that
 realloc() does expansions and contractions, no?
 

expand() is a poorly-chosen name. People who are criticizing C's 
allocation API really mean reallocInPlace. Then realloc() can be 
implemented in terms of malloc, free, and reallocInPlace as a 
convenience function.

Andrei

Feb 14 2009

grauzone <none example.net> writes:

Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:
 
 
 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good 
 design.

 
 Unless you need to contract the allocated memory ;-) I believe that
 realloc() does expansions and contractions, no?

expand() could take a negative value!

I'm not sure though, what you guys are up to. The really bad thing about 
realloc() is error handling. How do you check for errors? You need some 
additional lines of code just to find out if (re-)allocation has failed.

Feb 14 2009

Yigal Chripun <yigal100 gmail.com> writes:

Derek Parnell wrote:
 On Sat, 14 Feb 2009 19:07:52 +0200, Yigal Chripun wrote:


 I see.
 Having 3 functions: malloc(), expand() and free() is indeed a very good
 design.

 Unless you need to contract the allocated memory ;-) I believe that
 realloc() does expansions and contractions, no?

use free() to free the part the is not needed anymore?

T* arr = malloc(sizeof(T) * n);
...
tail = arr[x];
free(tail);

Feb 14 2009

Lionello Lunesu <lio lunesu.remove.com> writes:

 My point was that realloc() is wrong, not free(). The sorely missing 
 allocation primitive is expand(), and we've been paying for it through 
 the nose for decades.
 
 Andrei

What would expand's signature be? It needs the current pointer, it needs 
the new size. Seems that it's nothing else than a renamed realloc.

The problem is just that realloc allows null-pointer (=malloc) and 
0-size (=free), thereby making malloc/free obsolete.

L.

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lionello Lunesu wrote:
 My point was that realloc() is wrong, not free(). The sorely missing 
 allocation primitive is expand(), and we've been paying for it through 
 the nose for decades.

 Andrei

 
 What would expand's signature be? It needs the current pointer, it needs 
 the new size. Seems that it's nothing else than a renamed realloc.

realloc moves memory, expand wouldn't. That delivered a fatal blow to 
C++ making its allocation primitives inferior to C's.

 The problem is just that realloc allows null-pointer (=malloc) and 
 0-size (=free), thereby making malloc/free obsolete.

That's the least of its problems.

Andrei

Feb 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Leonardo suggested that some functions in std.random should not require
 their user to be bothered with creating a random object, i.e.:
 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);
 Instead the library should simply support:
 auto n = uniform(0, 100);
 and do this by presumably using a global RNG under the hood. So I wanted
 to ask all y'all:
 1. Are you cool with making the rng the last parameter and give it a
 default value?

Yes.

 2. The global random generator will be allocated per thread. Are you
 cool with this too?

Yes.  I've found D2's random number generation scheme somewhat annoying up until
now because it requires the caller to explicitly do something with the
underlying
RNG object between random number generations, in a sense lessening the
encapsulation of the higher level random number generation functions such as
uniform() and stuff that's written by the user. Even if you make a Uniform
object
or a Normal object or something, the implementation detail that pseudorandom
number generation requires state to be maintained in between calls still leaks
out
into the interface.  IMHO, the whole concept of RNGs requiring state to be
maintained between calls is an implementation detail and therefore should
ideally
be encapsulated.

For small, single-threaded programs, most of the time I just declare a global
RNG
and seed it in a module initializer anyhow, for convenience.  This thread-local
RNG is a perfect way to provide a simple default behavior that "just works" and
is
well-encapsulated, while still allowing more flexibility and control in the
relatively rare cases when it's really needed.

 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

It should be properly seeded, though I don't know how this would work for RNGs
for
anything but the main thread.  This keeps with the principle that the default
should "just work" reasonably well without the user having to think about
implementation details.  Maybe core.thread.Thread needs to have the ability to
execute a callback function on the creation of a new thread.  This is probably
not
a bad feature anyhow.

 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b]
 for floating point numbers is the source of all evils and that Hitler,
 Stalin and Kim Il Sung (should he still be alive) must be using that
 kind of generator. Conversely, generating [a, b) is guaranteed to bring
 in the long term everlasting peace to Earth. My problem however is that
 in the integer realm I always want to generate [a, b]. Furthermore, I
 wouldn't be happy if the shape of the interval was different for
 integers and floating point numbers. How to break this conundrum? Don't
 forget that we're only worrying about defaults, explicit generation is
 always possible with self-explanatory code:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I like uniform the way it is, [a, b), even for integers, because it jives well
with array slice notation, and a major use for uniform integers is randomly
selecting an element from an array.  This is not that important, though, because
no matter what you do, it's going to seem stupid and unintuitive to someone.

Feb 13 2009

Derek Parnell <derek psych.ward> writes:

On Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:


 1. Are you cool with making the rng the last parameter and give it a 
 default value?

Yes
 
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

Yes
 
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

New sequence each time. A predetermined sequence should be under the
application's control and be explicit, IMHO.
 
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? 

I prefer [a,b] as the default, and I'm thinking integers too. 

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell

Feb 13 2009

Steve Schveighoffer <schveiguy yahoo.com> writes:

On Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:


 2. The global random generator will be allocated per thread. Are you
 cool with this too?

Good idea.  Random number generators should not require locking in the 
most common case.

 
 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

Not.  And even between threads not.  So some combination of time and 
thread id for seed is probably in order.  If I want to "just use" random 
numbers in a simple program, I don't want to see the same values every 
time.  If I want to repeat the values every time, I'll take the time to 
seed it so that happens.

 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

[a,b)

Every other piece of range-like code is zero based, and excludes the 
upper bound.  This should be no different.  It makes the code simpler too.

-Steve

Feb 13 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

 
 [a,b)
 
 Every other piece of range-like code is zero based, and excludes the 
 upper bound.  This should be no different.  It makes the code simpler too.

I tried both versions, and it turns out my code is almost never simpler 
with open integral intervals. Most of the time I need something like:

auto x = uniform(rng, -100, 100);
auto y = uniform(rng, 0, 100);

and I need to remember to actually ask for 101 instead of 100. True, 
when you want a random index in an array, open intervals are more 
convenient.

One purity-based argument is that in a random number you may actually 
ask for the total range:

auto big = uniform(rng, uint.max / 2, uint.max);

If the interval is open I can't generate uint.max.

Anyway, I checked the C++ API and it turns out they use closed intervals 
for integers and open intervals for reals. I know there's been a lot of 
expert scrutiny there, so I suppose I better copy their design.


Andrei

Feb 13 2009

grauzone <none example.net> writes:

 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.

Because C++ is the pinnacle of language and library design.

SCNR

Feb 13 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Anyway, I checked the C++ API and it turns out they use closed 
 intervals for integers and open intervals for reals. I know there's 
 been a lot of expert scrutiny there, so I suppose I better copy their 
 design.

 
 Because C++ is the pinnacle of language and library design.

No, just because their random numbers library is very good.

Andrei

Feb 13 2009

Bill Baxter <wbaxter gmail.com> writes:

On Sat, Feb 14, 2009 at 3:04 PM, grauzone <none example.net> wrote:
 Anyway, I checked the C++ API and it turns out they use closed intervals
 for integers and open intervals for reals. I know there's been a lot of
 expert scrutiny there, so I suppose I better copy their design.

 Because C++ is the pinnacle of language and library design.

(Score: -1, flamebait)

--bb

Feb 13 2009

KennyTM~ <kennytm gmail.com> writes:

Andrei Alexandrescu wrote:
 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

 [a,b)

 Every other piece of range-like code is zero based, and excludes the 
 upper bound.  This should be no different.  It makes the code simpler 
 too.

 
 I tried both versions, and it turns out my code is almost never simpler 
 with open integral intervals. Most of the time I need something like:
 
 auto x = uniform(rng, -100, 100);
 auto y = uniform(rng, 0, 100);
 
 and I need to remember to actually ask for 101 instead of 100. True, 
 when you want a random index in an array, open intervals are more 
 convenient.
 
 One purity-based argument is that in a random number you may actually 
 ask for the total range:
 
 auto big = uniform(rng, uint.max / 2, uint.max);
 
 If the interval is open I can't generate uint.max.
 

Use uniform!("[]")(rng, uint.max/2, uint.max) for it then.

 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.
 
 
 Andrei

FYI, GSL (GNU Scientific Library) uses [0,1) for their uniform() and 
[0,n) for uniform_int().

-- Kenny.

Feb 13 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
 
 If the interval is open I can't generate uint.max.


auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Feb 14 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright <newshound1 digitalmars.com>
wrote:

 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.


 auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

Yeah, but now about uniform(uint.min, uint.max)?

Feb 14 2009

Walter Bright <newshound1 digitalmars.com> writes:

Denis Koroskin wrote:
 On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.


 auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

 
 Yeah, but now about uniform(uint.min, uint.max)?


auto big = cast(uint)uniform(rng, cast(ulong)uint.min, 
cast(ulong)uint.max + 1);

Feb 14 2009

Steve Schveighoffer <schveiguy yahoo.com> writes:

On Sat, 14 Feb 2009 14:11:10 +0300, Denis Koroskin wrote:

 On Sat, 14 Feb 2009 14:04:36 +0300, Walter Bright
 <newshound1 digitalmars.com> wrote:
 
 Andrei Alexandrescu wrote:
 auto big = uniform(rng, uint.max / 2, uint.max);
  If the interval is open I can't generate uint.max.


 auto big = uniform(rng, uint.max / 2 - 1, uint.max) + 1;

 
 Yeah, but now about uniform(uint.min, uint.max)?

I think we're working under the assumption that the uniform function is 
based on some function that returns a random int or uint, no?  So just 
call the base function.

-Steve

Feb 14 2009

Jason House <jason.james.house gmail.com> writes:

Andrei Alexandrescu Wrote:

 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

 
 [a,b)
 
 Every other piece of range-like code is zero based, and excludes the 
 upper bound.  This should be no different.  It makes the code simpler too.

 
 I tried both versions, and it turns out my code is almost never simpler 
 with open integral intervals. Most of the time I need something like:
 
 auto x = uniform(rng, -100, 100);
 auto y = uniform(rng, 0, 100);
 
 and I need to remember to actually ask for 101 instead of 100. True, 
 when you want a random index in an array, open intervals are more 
 convenient.
 
 One purity-based argument is that in a random number you may actually 
 ask for the total range:
 
 auto big = uniform(rng, uint.max / 2, uint.max);
 
 If the interval is open I can't generate uint.max.
 
 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.
 
 
 Andrei


D uses a..b notation as part of the language to mean "[)". Because of that, it
makes more sense to make that the default in D vs. other languages that don't
have such constructs.

Feb 14 2009

Walter Bright <newshound1 digitalmars.com> writes:

Jason House wrote:
 D uses a..b notation as part of the language to mean "[)". Because of
 that, it makes more sense to make that the default in D vs. other
 languages that don't have such constructs.

I think that's exactly right. Consistency is very important.

Feb 14 2009

Don <nospam nospam.com> writes:

Walter Bright wrote:
 Jason House wrote:
 D uses a..b notation as part of the language to mean "[)". Because of
 that, it makes more sense to make that the default in D vs. other
 languages that don't have such constructs.

 
 I think that's exactly right. Consistency is very important.

Agreed. Note that

arr[random(0..arr.length)]

is likely to be an _extremely_ common usage pattern. That argues very 
strongly for "[)".

Feb 15 2009

Benji Smith <dlanguage benjismith.net> writes:

Andrei Alexandrescu wrote:
 Steve Schveighoffer wrote:
 4. While we're at it, should uniform(a, b) generate by default something
 in [a, b] or [a, b)?

 [a,b)

 Every other piece of range-like code is zero based, and excludes the 
 upper bound.  This should be no different.  It makes the code simpler 
 too.

 
 I tried both versions, and it turns out my code is almost never simpler 
 with open integral intervals. Most of the time I need something like:
 
 auto x = uniform(rng, -100, 100);
 auto y = uniform(rng, 0, 100);
 
 and I need to remember to actually ask for 101 instead of 100. True, 
 when you want a random index in an array, open intervals are more 
 convenient.
 
 One purity-based argument is that in a random number you may actually 
 ask for the total range:
 
 auto big = uniform(rng, uint.max / 2, uint.max);
 
 If the interval is open I can't generate uint.max.
 
 Anyway, I checked the C++ API and it turns out they use closed intervals 
 for integers and open intervals for reals. I know there's been a lot of 
 expert scrutiny there, so I suppose I better copy their design.
 
 
 Andrei

Maybe a NumericInterval struct would be a good idea. It could be 
specialized to any numeric type (float, double, int, etc), it would know 
its own boundaries, and it'd keep track of whether those boundaries were 
open or closed.

The random functions would take an RND and an interval (with some 
reasonable default intervals for common tasks like choosing elements 
from arrays and random-access ranges).

I have a Java implementation around here somewhere that I could port to 
D if anyone is interested.

--benji

Feb 15 2009

Benji Smith <dlanguage benjismith.net> writes:

Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would know 
 its own boundaries, and it'd keep track of whether those boundaries were 
 open or closed.
 
 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).
 
 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.
 
 --benji

Incidentally, the NumericInterval has lots of other interesting 
applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

Using a numeric interval for generating random numbers would be, in my 
opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

    auto i = NumericInterval.parse("[ 123, 456.789 )");
    double random = uniform!(double)(i, rng);

--benji

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

 
 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example
 
    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);
 
    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);
 
 Using a numeric interval for generating random numbers would be, in my 
 opinion, totally ideal.
 
    double d = uniform(NumericInterval.DOUBLE); // Any double value

I've never been in a situation in my life where I thought, hey, a random 
double is exactly what I'd need right now. It's a ginormous interval!

Andrei

Feb 15 2009

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in my 
 opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 
 I've never been in a situation in my life where I thought, hey, a random 
 double is exactly what I'd need right now. It's a ginormous interval!
 
 Andrei

It's worse than that. Since the range of double includes infinity, a 
uniform distribution must return +-infinity with probability 1. It's 
nonsense.

Feb 16 2009

"Nick Sabalausky" <a a.a> writes:

"Don" <nospam nospam.com> wrote in message 
news:gnb7mt$2osv$1 digitalmars.com...
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in my 
 opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a random 
 double is exactly what I'd need right now. It's a ginormous interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

I would think that, with the possible exception of mathematicians, most of 
the time anyone would want a random floating-point number they would 
consider +/-infinity a special case. Most non-math-oriented code isn't 
really designed to handle infinity anyway. Besides, technically, double also 
includes NaN ;)

Feb 16 2009

Don <nospam nospam.com> writes:

Nick Sabalausky wrote:
 "Don" <nospam nospam.com> wrote in message 
 news:gnb7mt$2osv$1 digitalmars.com...
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port to 
 D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in my 
 opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a random 
 double is exactly what I'd need right now. It's a ginormous interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

 
 I would think that, with the possible exception of mathematicians, most of 
 the time anyone would want a random floating-point number they would 
 consider +/-infinity a special case. Most non-math-oriented code isn't 
 really designed to handle infinity anyway. Besides, technically, double also 
 includes NaN ;) 

You're asking for a random *number* and NaN is not a number :).
If you have a double x, which could be double.max, -double.max, 0.0, 
-double.min*double.epsilon, or +double.min*double.epsilon, there's 
hardly any operations you can do on it which won't generate an infinity 
or NaN.
But cutting out infinity doesn't help much -- anything from the range of 
double will be really close to double.max. If the code is correct, it's 
still going to have to deal with infinity.


double d = uniform(NumericInterval.DOUBLE);
d *=2;
// BUG: There's a 50% chance that d is now +- infinity.

Feb 17 2009

Benji Smith <dlanguage benjismith.net> writes:

Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing elements 
 from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could port 
 to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in 
 my opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a 
 random double is exactly what I'd need right now. It's a ginormous 
 interval!

 Andrei

 
 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

Way to miss the forest for the trees.

You guys telling me can't see any legitimate use for a NumericInterval 
type? And that it wouldn't be convenient to use for random number 
generation within that interval?

So the full double range was a dumb example. But that wasn't really the 
point, was it?

--benji

Feb 18 2009

Don <nospam nospam.com> writes:

Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it would 
 know its own boundaries, and it'd keep track of whether those 
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in 
 my opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a 
 random double is exactly what I'd need right now. It's a ginormous 
 interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

 
 Way to miss the forest for the trees.
 
 You guys telling me can't see any legitimate use for a NumericInterval 
 type? And that it wouldn't be convenient to use for random number 
 generation within that interval?
 
 So the full double range was a dumb example. But that wasn't really the 
 point, was it?
 
 --benji

On the contrary, I've been giving NumericInterval considerable thought.
One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 
(the _strict_ definition), or whether it is also permissible to have 
x2<=x1 (ie, you can specify the two endpoints in reverse order; the 
interval is then between min(x1,x2) and max(x1, x2)).

This is an issue because I've noticed is that when I want to use it, I 
often have related pairs of values.
eg.
Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}.
Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an 
interval in the _strict_ sense. But it satisfies the _relaxed_ definition.

Feb 19 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Thu, 19 Feb 2009 11:35:04 +0300, Don <nospam nospam.com> wrote:

 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be  
 specialized to any numeric type (float, double, int, etc), it would  
 know its own boundaries, and it'd keep track of whether those  
 boundaries were open or closed.

 The random functions would take an RND and an interval (with some  
 reasonable default intervals for common tasks like choosing  
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could  
 port to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting  
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in  
 my opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a  
 random double is exactly what I'd need right now. It's a ginormous  
 interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a  
 uniform distribution must return +-infinity with probability 1. It's  
 nonsense.

  Way to miss the forest for the trees.
  You guys telling me can't see any legitimate use for a NumericInterval  
 type? And that it wouldn't be convenient to use for random number  
 generation within that interval?
  So the full double range was a dumb example. But that wasn't really  
 the point, was it?
  --benji

 On the contrary, I've been giving NumericInterval considerable thought.
 One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2  
 (the _strict_ definition), or whether it is also permissible to have  
 x2<=x1 (ie, you can specify the two endpoints in reverse order; the  
 interval is then between min(x1,x2) and max(x1, x2)).

 This is an issue because I've noticed is that when I want to use it, I  
 often have related pairs of values.
 eg.
 Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}.
 Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an  
 interval in the _strict_ sense. But it satisfies the _relaxed_  
 definition.

Same as NumericInterval's bounds ([],(),[),(]), strictness might be an
interval's template parameter.

Feb 19 2009

Benji Smith <dlanguage benjismith.net> writes:

Don wrote:
 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it 
 would know its own boundaries, and it'd keep track of whether 
 those boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, in 
 my opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a 
 random double is exactly what I'd need right now. It's a ginormous 
 interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

 Way to miss the forest for the trees.

 You guys telling me can't see any legitimate use for a NumericInterval 
 type? And that it wouldn't be convenient to use for random number 
 generation within that interval?

 So the full double range was a dumb example. But that wasn't really 
 the point, was it?

 --benji

 
 On the contrary, I've been giving NumericInterval considerable thought.
 One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= x2 
 (the _strict_ definition), or whether it is also permissible to have 
 x2<=x1 (ie, you can specify the two endpoints in reverse order; the 
 interval is then between min(x1,x2) and max(x1, x2)).
 
 This is an issue because I've noticed is that when I want to use it, I 
 often have related pairs of values.
 eg.
 Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}.
 Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not an 
 interval in the _strict_ sense. But it satisfies the _relaxed_ definition.

I don't see any idealogical reason for requiring x2 >= x1.

But the public API of the interval will probably have functions or 
properties returning the "lowerBound" and "upperBound". And the 
implementations of the "containsValue", "intersect", and "overlap" 
functions are all more straightforward to write if you know in advance 
which value is which, potentially switching them in the constructor.

Of course, if you switch the values, do you also switch the boundary 
open/closed boundaries? What about this case:

    auto i = Interval!("[)")(1000, -1000);

Which side of the range is open, and which is closed? Does the "[)" 
argument apply to the natural order of the range (closed on its lower 
bound) or does it apply to the order of the arguments in the function 
(closed on its leftmost argument)?

As long as the behavior is well documented, I think it'd be fine either 
way. But I also think it'd be reasonable to throw an exception if the 
arguments are in the wrong order.

--benji

Feb 19 2009

Don <nospam nospam.com> writes:

Benji Smith wrote:
 Don wrote:
 Benji Smith wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Benji Smith wrote:
 Benji Smith wrote:
 Maybe a NumericInterval struct would be a good idea. It could be 
 specialized to any numeric type (float, double, int, etc), it 
 would know its own boundaries, and it'd keep track of whether 
 those boundaries were open or closed.

 The random functions would take an RND and an interval (with some 
 reasonable default intervals for common tasks like choosing 
 elements from arrays and random-access ranges).

 I have a Java implementation around here somewhere that I could 
 port to D if anyone is interested.

 --benji

 Incidentally, the NumericInterval has lots of other interesting 
 applications. For example

    auto i = NumericInterval.UBYTE.intersect(NumericInterval.SBYTE);
    bool safelyPolysemious = i.contains(someByteValue);

    auto array = new double[123];
    auto i = NumericInterval.indexInterval(array);
    bool indexIsLegal = i.contains(someIndex);

 Using a numeric interval for generating random numbers would be, 
 in my opinion, totally ideal.

    double d = uniform(NumericInterval.DOUBLE); // Any double value

 I've never been in a situation in my life where I thought, hey, a 
 random double is exactly what I'd need right now. It's a ginormous 
 interval!

 Andrei

 It's worse than that. Since the range of double includes infinity, a 
 uniform distribution must return +-infinity with probability 1. It's 
 nonsense.

 Way to miss the forest for the trees.

 You guys telling me can't see any legitimate use for a 
 NumericInterval type? And that it wouldn't be convenient to use for 
 random number generation within that interval?

 So the full double range was a dumb example. But that wasn't really 
 the point, was it?

 --benji

 On the contrary, I've been giving NumericInterval considerable thought.
 One key issue is whether a NumericInterval(x1, x2) must satisfy x1 <= 
 x2 (the _strict_ definition), or whether it is also permissible to 
 have x2<=x1 (ie, you can specify the two endpoints in reverse order; 
 the interval is then between min(x1,x2) and max(x1, x2)).

 This is an issue because I've noticed is that when I want to use it, I 
 often have related pairs of values.
 eg.
 Suppose u is the interval {x1, x2}. There's a related v = {f(x1), f(x2)}.
 Unfortunately although x1<=x2, f(x1) may not be <= f(x2). So v is not 
 an interval in the _strict_ sense. But it satisfies the _relaxed_ 
 definition.

 
 I don't see any idealogical reason for requiring x2 >= x1.
 
 But the public API of the interval will probably have functions or 
 properties returning the "lowerBound" and "upperBound". And the 
 implementations of the "containsValue", "intersect", and "overlap" 
 functions are all more straightforward to write if you know in advance 
 which value is which, potentially switching them in the constructor.
 
 Of course, if you switch the values, do you also switch the boundary 
 open/closed boundaries? What about this case:
 
    auto i = Interval!("[)")(1000, -1000);
 
 Which side of the range is open, and which is closed? Does the "[)" 
 argument apply to the natural order of the range (closed on its lower 
 bound) or does it apply to the order of the arguments in the function 
 (closed on its leftmost argument)?

That's a really good question, and probably the killer argument against 
relaxed ranges.

A NumericInterval would always be stored internally as "[]", so the "[)" 
is only required at construction/assignment (it's not part of the type).
So instead, it seems as though a different type is required (a pair of 
values), which provides conversion to a strict range. Not as simple as I 
hoped.

 As long as the behavior is well documented, I think it'd be fine either 
 way. But I also think it'd be reasonable to throw an exception if the 
 arguments are in the wrong order.
 
 --benji

Feb 19 2009

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

That could have speed problems with a tight loop accessing thread local 
storage. If it's a shared global with no locking, the thread randomness 
will increase its randomness <g>.


 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

Always different, because that's what I want if I just want a freakin' 
random number!

Feb 14 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread local
 storage.

I don't think this matters, since for apps that really need it, a RNG can be
explicitly passed in.  This is purely a convenience feature for when you want
random number generation to "just work" even if it's not the most efficient
thing
in the world.  Also, checking thread_needLock() is dirt cheap, in my experience
faster than accessing TLS.  You could also have a global, non-thread local RNG
that is used if a check to thread_needLock() returns false.

 If it's a shared global with no locking, the thread randomness
 will increase its randomness <g>.

Yay, D's getting hardware random number generators!!!

Feb 14 2009

grauzone <none example.net> writes:

dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread local
 storage.

 
 I don't think this matters, since for apps that really need it, a RNG can be
 explicitly passed in.  This is purely a convenience feature for when you want
 random number generation to "just work" even if it's not the most efficient
thing
 in the world.  Also, checking thread_needLock() is dirt cheap, in my experience
 faster than accessing TLS.  You could also have a global, non-thread local RNG
 that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (D2.0 or gcc thread 
variables), TLS can be as fast as a normal global variable. You only 
need an additional check (a simple if()) to lazily initialize the RNG.

Regarding "just work": it seems D's philosophy, to prefer to do the 
simple and "right" thing instead of raw efficiency, seems to be 
abandoned. Like std.algorithm uses weird string mixins, and the 
predicate is a template argument instead of a normal one (that's IMHO).

 If it's a shared global with no locking, the thread randomness
 will increase its randomness <g>.


Huh, non-blocking random generators?

 Yay, D's getting hardware random number generators!!!

Feb 14 2009

Walter Bright <newshound1 digitalmars.com> writes:

grauzone wrote:
 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

That's the notorious double-checked-locking bug, an incredibly seductive 
pattern that is wrong and very illustrative about how hard it is to do 
multithreaded programming correctly.

http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

Feb 14 2009

grauzone <none example.net> writes:

Walter Bright wrote:
 grauzone wrote:
 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

 
 That's the notorious double-checked-locking bug, an incredibly seductive 
 pattern that is wrong and very illustrative about how hard it is to do 
 multithreaded programming correctly.
 
 http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

No, I meant something like this:

thread RandomGenerator rng;

if (!rng) {
	rng = new RandomGenerator();
}

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread local
 storage.

 I don't think this matters, since for apps that really need it, a RNG 
 can be
 explicitly passed in.  This is purely a convenience feature for when 
 you want
 random number generation to "just work" even if it's not the most 
 efficient thing
 in the world.  Also, checking thread_needLock() is dirt cheap, in my 
 experience
 faster than accessing TLS.  You could also have a global, non-thread 
 local RNG
 that is used if a check to thread_needLock() returns false.

 
 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.
 
 Regarding "just work": it seems D's philosophy, to prefer to do the 
 simple and "right" thing instead of raw efficiency, seems to be 
 abandoned. Like std.algorithm uses weird string mixins, and the 
 predicate is a template argument instead of a normal one (that's IMHO).

Well I can only welcome the return of persistent flamebaiting to this 
group. D's philosophy is to prefer to do the right thing, even when that 
doesn't align with what you believe that is. I don't understand why the 
pattern of some of your posts is to throw some random (sic) semantic 
grenade in hope to pick a fight.

Speaking of "normal" passing of the predicate, I'd been curious this 
morning to see how a hand-written max fares compared to reduce!(max), 
with reduce!("b > a ? b : a"), and with an indirect reduce using a 
delegate passed, ahem, "normally". This is because a nearest-neighbor 
algorithm I work on needs to compute max over an array in its inner 
loop. I wanted to see whether I need to roll my own max implementation 
versus using std.algorithm.

For 20000 evaluations over a 100000-integers array, reduce!(min), 
reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 
seconds on my machine. The "normal" version took 11.6 seconds to 
complete. Trials with various sizes reveal a similar pattern throughout: 
pass-by-delegate is way behind the others, which work about as fast.

This means that, had std.algorithm taken the "normal" route, it would 
have been plenty useless for any serious needs. Not taking the "normal" 
route means that its abstractions rival hand-tuned code so I can use 
them without worrying they will come unglued when shown a real challenge.

So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If 
you want to shoot, shoot, don't troll.


Andrei

Feb 14 2009

grauzone <none example.net> writes:

Andrei Alexandrescu wrote:
 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread local
 storage.

 I don't think this matters, since for apps that really need it, a RNG 
 can be
 explicitly passed in.  This is purely a convenience feature for when 
 you want
 random number generation to "just work" even if it's not the most 
 efficient thing
 in the world.  Also, checking thread_needLock() is dirt cheap, in my 
 experience
 faster than accessing TLS.  You could also have a global, non-thread 
 local RNG
 that is used if a check to thread_needLock() returns false.

 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

 Regarding "just work": it seems D's philosophy, to prefer to do the 
 simple and "right" thing instead of raw efficiency, seems to be 
 abandoned. Like std.algorithm uses weird string mixins, and the 
 predicate is a template argument instead of a normal one (that's IMHO).

 
 Well I can only welcome the return of persistent flamebaiting to this 
 group. D's philosophy is to prefer to do the right thing, even when that 

Why is it "flamebaiting" when I say my opinion? Others are doing it too.

Here, I even used "IMHO" to mark that this is my opinion, and not 
necessarily some kind of objective truth.

And if you think I'm trolling, then you obviously shouldn't take the 
bait. Else you'd add quite some fuel for an actual flamewar. I really 
don't understand this.

 doesn't align with what you believe that is. I don't understand why the 
 pattern of some of your posts is to throw some random (sic) semantic 
 grenade in hope to pick a fight.

Well, sorry. I didn't mean to offend anyone. Actually, I'm a bit shocked 
by your reaction.

 Speaking of "normal" passing of the predicate, I'd been curious this 
 morning to see how a hand-written max fares compared to reduce!(max), 
 with reduce!("b > a ? b : a"), and with an indirect reduce using a 
 delegate passed, ahem, "normally". This is because a nearest-neighbor 
 algorithm I work on needs to compute max over an array in its inner 
 loop. I wanted to see whether I need to roll my own max implementation 
 versus using std.algorithm.
 
 For 20000 evaluations over a 100000-integers array, reduce!(min), 
 reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 
 seconds on my machine. The "normal" version took 11.6 seconds to 
 complete. Trials with various sizes reveal a similar pattern throughout: 
 pass-by-delegate is way behind the others, which work about as fast.
 
 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the "normal" 
 route means that its abstractions rival hand-tuned code so I can use 
 them without worrying they will come unglued when shown a real challenge.

That's what I said. You preferred this design because it's faster. One 
can agree or disagree if this approach introduces "syntax artifacts" or 
if it violates KISS or whatever. Everything has its pros and cons. I 
just said I disagree.

Also, dsimcha said:
 This is purely a convenience feature for when you want
 random number generation to "just work" even if it's not the most
 efficient thing
 in the world.

I agree to him. If you need efficiency, you can write your own code or 
use a highly specialized library fit for the job. Most users probably 
wouldn't rely on the standard library if they're optimizing their code. 
As long as the user _can_ write his own code, the standard library 
should prefer to be simple and easy to use. It also prevents bitrot, 
bloat, and incomprehensible library code.

 So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If 
 you want to shoot, shoot, don't troll.
 
 
 Andrei

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

grauzone wrote:
 Andrei Alexandrescu wrote:
 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread 
 local
 storage.

 I don't think this matters, since for apps that really need it, a 
 RNG can be
 explicitly passed in.  This is purely a convenience feature for when 
 you want
 random number generation to "just work" even if it's not the most 
 efficient thing
 in the world.  Also, checking thread_needLock() is dirt cheap, in my 
 experience
 faster than accessing TLS.  You could also have a global, non-thread 
 local RNG
 that is used if a check to thread_needLock() returns false.

 I don't understand this. If implemented right (D2.0 or gcc thread 
 variables), TLS can be as fast as a normal global variable. You only 
 need an additional check (a simple if()) to lazily initialize the RNG.

 Regarding "just work": it seems D's philosophy, to prefer to do the 
 simple and "right" thing instead of raw efficiency, seems to be 
 abandoned. Like std.algorithm uses weird string mixins, and the 
 predicate is a template argument instead of a normal one (that's IMHO).

 Well I can only welcome the return of persistent flamebaiting to this 
 group. D's philosophy is to prefer to do the right thing, even when that 

 Why is it "flamebaiting" when I say my opinion? Others are doing it too.

 Here, I even used "IMHO" to mark that this is my opinion, and not 
 necessarily some kind of objective truth.

 And if you think I'm trolling, then you obviously shouldn't take the 
 bait. Else you'd add quite some fuel for an actual flamewar. I really 
 don't understand this.

Yah, you're right. I shouldn't have answered, and my answer ended up 
going firmly further out of line. I am sorry about that. If you care for 
my reasons, please read on.

Now I do understand in the end you're doing nothing but stating an 
opinion, which is of course what we do most of the time. All I'm saying 
is that it's more beneficial to discuss an opinion of the form:

"I prefer froobenglobs to std.brooblenglabs. I think std.brooblenglabs 
are unnecessarily complicated with too few redeeming qualities. In 
contrast, froobenglobs have the simplicity given by using the universal 
trilegate feature, and their inherent loss of quantic traction is 
negligible. So I suggest froobenglobs to be considered as a supplement 
or replacement for a future std.brooblenglabs."

This is an opinion that can be discussed because it brings supporting 
arguments that can in turn be debated. Many good things transpire that 
way. Should we reach the conclusion that our views on the matter differ 
in some irreducible manner, we'll still be gained from the discussion. 
Contrast this with not an opinion, but an umbrella statement that 
implicitly assumes a number of opinions are true facts:

"Everything about C++, language and libraries, reeks of bad design" 
(this is how I read your ironic statement to the contrary).

or

"It looks like D's philosophy to do the simple and right thing instead 
of raw efficiency has been abandoned."

This (and the subsequent example) after I've spent time on several 
occasions on explaining you how and why exactly pass-by-delegate is not 
defensible versus pass-by-alias, which includes pass-by-delegate in its 
jeans pocket. The "IMHO" license you use does little to help such 
statements. I agree the odd rules of pass-by-string may seem unsavory to 
some, but please agree with me that there is little in the way of a 
salient answer that could be given to the umbrella statement above.

 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the 
 "normal" route means that its abstractions rival hand-tuned code so I 
 can use them without worrying they will come unglued when shown a real 
 challenge.

 That's what I said. You preferred this design because it's faster. One 
 can agree or disagree if this approach introduces "syntax artifacts" or 
 if it violates KISS or whatever. Everything has its pros and cons. I 
 just said I disagree.

I preferred that design not only because it's faster, but also because 
it _includes_ *all* /benefits/ of the "slower" rule of pass-by-delegate, 
i.e. it can be slower and more flexible if that's what's needed. The 
only "cost" is a minor change in usage syntax. Essentially I think D has 
hit mother lode with the local instantiation feature because it has all 
the benefits and next to no problem. I sincerely believe it will be 
instrumental to the success of D.

Given its advantages, it would be an extraordinary uphill battle to 
argue that pass-by-alias with local instantiation is a lesser design. 
What pushed my button is not that you undertook said battle, but that 
you made statements that imply you've already won it.

 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.

 I agree to him. If you need efficiency, you can write your own code or 
 use a highly specialized library fit for the job. Most users probably 
 wouldn't rely on the standard library if they're optimizing their code. 

Well, you see, here's where we can "agree to disagree" (as much as I 
dislike the PCness of the phrase). IMHO, if most users in search for 
efficiency need to roll their own counterparts of the abstractions 
provided by Phobos, I've majorly wasted my time. Costly abstractions are 
a dime a dozen. I have nothing but contempt for them. Every dog language 
has them in spades. The real challenge is not to write sort in three 
lines. It's to write it in three lines and have it beat the 2000 lines 
sort. (Alas, we're not there yet.)

 As long as the user _can_ write his own code, the standard library
 should prefer to be simple and easy to use. It also prevents bitrot,
 bloat, and incomprehensible library code.

This approach seems to be pushing bitrot, bloat, and incomprehensibility 
into the user code, while keeping the library a collection of useless 
toys. Hardly an approach I'd go for. I am sorry. I just disagree.

Andrei

Feb 14 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.

 I agree to him. If you need efficiency, you can write your own code or
 use a highly specialized library fit for the job. Most users probably
 wouldn't rely on the standard library if they're optimizing their code.

 Well, you see, here's where we can "agree to disagree" (as much as I
 dislike the PCness of the phrase). IMHO, if most users in search for
 efficiency need to roll their own counterparts of the abstractions
 provided by Phobos, I've majorly wasted my time. Costly abstractions are
 a dime a dozen. I have nothing but contempt for them. Every dog language
 has them in spades. The real challenge is not to write sort in three
 lines. It's to write it in three lines and have it beat the 2000 lines
 sort. (Alas, we're not there yet.)
  > As long as the user _can_ write his own code, the standard library
  > should prefer to be simple and easy to use. It also prevents bitrot,
  > bloat, and incomprehensible library code.
 This approach seems to be pushing bitrot, bloat, and incomprehensibility
 into the user code, while keeping the library a collection of useless
 toys. Hardly an approach I'd go for. I am sorry. I just disagree.
 Andrei

Just to set the record straight, I actually agree with Andrei here.  Elegant but
costly abstractions are simply not D's niche, and are relatively easy to come
by.
 If you want them, use a good dynamic interpreted language instead.  The quote
above was taken out of context.  I believe that a library should provide
efficient
and flexible abstractions that eliminate the need for people to roll their own
and
that the goal of Phobos2 in this regard is a good one.  I probably wouldn't use
much of what was in std.algorithm, at least in the performance critical parts of
my code, if it were pass-by-delegate instead of pass-by-alias.  My only point
was
that, when complexity in the API (or language) is unavoidable if efficiency and
flexibility are to be achieved, I think it's ok to slap a few not-so-efficient
convenience features on top of it as long as the core remains efficient,
flexible
and reasonably usable.

Feb 14 2009

grauzone <none example.net> writes:

dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Also, dsimcha said:
  > This is purely a convenience feature for when you want
  > random number generation to "just work" even if it's not the most
  > efficient thing
  > in the world.

 I agree to him. If you need efficiency, you can write your own code or
 use a highly specialized library fit for the job. Most users probably
 wouldn't rely on the standard library if they're optimizing their code.

 Well, you see, here's where we can "agree to disagree" (as much as I
 dislike the PCness of the phrase). IMHO, if most users in search for
 efficiency need to roll their own counterparts of the abstractions
 provided by Phobos, I've majorly wasted my time. Costly abstractions are
 a dime a dozen. I have nothing but contempt for them. Every dog language
 has them in spades. The real challenge is not to write sort in three
 lines. It's to write it in three lines and have it beat the 2000 lines
 sort. (Alas, we're not there yet.)
  > As long as the user _can_ write his own code, the standard library
  > should prefer to be simple and easy to use. It also prevents bitrot,
  > bloat, and incomprehensible library code.
 This approach seems to be pushing bitrot, bloat, and incomprehensibility
 into the user code, while keeping the library a collection of useless
 toys. Hardly an approach I'd go for. I am sorry. I just disagree.
 Andrei

 Just to set the record straight, I actually agree with Andrei here.  Elegant
but
 costly abstractions are simply not D's niche, and are relatively easy to come
by.
  If you want them, use a good dynamic interpreted language instead.  The quote

Well, I'm not necessarily talking about "elegant but costly 
abstractions". Rather, D is full of design decisions, where convenience 
and simplicity was preferred to maximal achievable performance:

- relies on GC, rather bad support for fully manual MM
   (also consider the new closures in D2.0)
- expensive checks in non-release mode (array bounds checking,
   object invariants)
- methods are virtual by default
- foreach used to call the enclosed code as a delegate
- lazy keyword
- associative arrays, also .sort
- many runtime core and std-lib functions are designed that way

Maybe there's even more. D always has been a nice blend of performance 
centered low level languages like C/C++, and higher level ones liek 
dynamic or functional languages.

 above was taken out of context.  I believe that a library should provide
efficient
 and flexible abstractions that eliminate the need for people to roll their own
and
 that the goal of Phobos2 in this regard is a good one.  I probably wouldn't use
 much of what was in std.algorithm, at least in the performance critical parts
of
 my code, if it were pass-by-delegate instead of pass-by-alias.  My only point
was
 that, when complexity in the API (or language) is unavoidable if efficiency and
 flexibility are to be achieved, I think it's ok to slap a few not-so-efficient
 convenience features on top of it as long as the core remains efficient,
flexible
 and reasonably usable.

Feb 14 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Sat, 14 Feb 2009 21:38:19 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 grauzone wrote:
 dsimcha wrote:
 == Quote from Walter Bright (newshound1 digitalmars.com)'s article
 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you
 cool with this too?

 That could have speed problems with a tight loop accessing thread  
 local
 storage.

 I don't think this matters, since for apps that really need it, a RNG  
 can be
 explicitly passed in.  This is purely a convenience feature for when  
 you want
 random number generation to "just work" even if it's not the most  
 efficient thing
 in the world.  Also, checking thread_needLock() is dirt cheap, in my  
 experience
 faster than accessing TLS.  You could also have a global, non-thread  
 local RNG
 that is used if a check to thread_needLock() returns false.

  I don't understand this. If implemented right (D2.0 or gcc thread  
 variables), TLS can be as fast as a normal global variable. You only  
 need an additional check (a simple if()) to lazily initialize the RNG.
  Regarding "just work": it seems D's philosophy, to prefer to do the  
 simple and "right" thing instead of raw efficiency, seems to be  
 abandoned. Like std.algorithm uses weird string mixins, and the  
 predicate is a template argument instead of a normal one (that's IMHO).

 Well I can only welcome the return of persistent flamebaiting to this  
 group. D's philosophy is to prefer to do the right thing, even when that  
 doesn't align with what you believe that is. I don't understand why the  
 pattern of some of your posts is to throw some random (sic) semantic  
 grenade in hope to pick a fight.

 Speaking of "normal" passing of the predicate, I'd been curious this  
 morning to see how a hand-written max fares compared to reduce!(max),  
 with reduce!("b > a ? b : a"), and with an indirect reduce using a  
 delegate passed, ahem, "normally". This is because a nearest-neighbor  
 algorithm I work on needs to compute max over an array in its inner  
 loop. I wanted to see whether I need to roll my own max implementation  
 versus using std.algorithm.

 For 20000 evaluations over a 100000-integers array, reduce!(min),  
 reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5  
 seconds on my machine. The "normal" version took 11.6 seconds to  
 complete. Trials with various sizes reveal a similar pattern throughout:  
 pass-by-delegate is way behind the others, which work about as fast.

 This means that, had std.algorithm taken the "normal" route, it would  
 have been plenty useless for any serious needs. Not taking the "normal"  
 route means that its abstractions rival hand-tuned code so I can use  
 them without worrying they will come unglued when shown a real challenge.

 So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If  
 you want to shoot, shoot, don't troll.


 Andrei

Did you try passing a comparator (C++ way to do things)? It should be pretty
fast, to.

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Denis Koroskin wrote:
 On Sat, 14 Feb 2009 21:38:19 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 For 20000 evaluations over a 100000-integers array, reduce!(min), 
 reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 
 seconds on my machine. The "normal" version took 11.6 seconds to 
 complete. Trials with various sizes reveal a similar pattern 
 throughout: pass-by-delegate is way behind the others, which work 
 about as fast.

 This means that, had std.algorithm taken the "normal" route, it would 
 have been plenty useless for any serious needs. Not taking the 
 "normal" route means that its abstractions rival hand-tuned code so I 
 can use them without worrying they will come unglued when shown a real 
 challenge.

 So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": 
 If you want to shoot, shoot, don't troll.


 Andrei

 
 Did you try passing a comparator (C++ way to do things)? It should be 
 pretty fast, to.

Is this what you meant?

struct StructMax
{
     static uint opCall(uint a, uint b)
     {
         return b > a ? b : a;
     }
}

uint reduceStruct(uint[] x, StructMax s)
{
     invariant n = x.length;
     uint m = x[0];
     for (uint i = 1; i < n; ++i) {
         m = s(m, x[i]);
     }
     return m;
}

I did, and as expected the running time is similar to the other fast 
approaches. Removing "static" does not affect the timings sensibly.


Andrei

Feb 14 2009

Sergey Gromov <snake.scaly gmail.com> writes:

Fri, 13 Feb 2009 17:10:29 -0800, Andrei Alexandrescu wrote:

 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum?

I'm used to float generators [0,1] and integer generators [0,max).
OTOH, if uniform(0.0,256.0) is uniform [0.0,256.0) then
cast(int)(uniform(0.0,256.0)) is also uniform [0,256) which is good.

Feb 14 2009

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

This is a general issue applying to any numeric range. I've been giving 
the issue of numeric ranges some thought, and I have begun an 
implementation of a general abstraction.
Any open range can be converted into a closed range, but the converse 
does not apply. So any implementation will be using "[]" internally.

-range("[)", a, b) == range("(]", -b, -a)
range("[)", a, b) == range("[]", a, predecessor(b))
range("()", a, b) == range("[]", successor(a), predecessor(b))


There's a couple of difficult situations involving floating-point numbers.
*  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but 
not -1, whereas (1, 2, rng) includes 1 but not 2.

* any floating point range which includes 0 is difficult, because there 
are so many numbers which are almost zero. The probability of getting a 
zero for an 80-bit real is so small that you probably wouldn't encounter 
it in your lifetime. I think this weakens arguments based on analogy 
with the integer case.

However, it is much easier to make an unbiased rng for [1,2) than for 
[1,2] or (1,2) (since the number of members in the range is even).

Feb 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 
 This is a general issue applying to any numeric range. I've been giving 
 the issue of numeric ranges some thought, and I have begun an 
 implementation of a general abstraction.
 Any open range can be converted into a closed range, but the converse 
 does not apply. So any implementation will be using "[]" internally.
 
 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))
 
 
 There's a couple of difficult situations involving floating-point numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but 
 not -1, whereas (1, 2, rng) includes 1 but not 2.
 
 * any floating point range which includes 0 is difficult, because there 
 are so many numbers which are almost zero. The probability of getting a 
 zero for an 80-bit real is so small that you probably wouldn't encounter 
 it in your lifetime. I think this weakens arguments based on analogy 
 with the integer case.
 
 However, it is much easier to make an unbiased rng for [1,2) than for 
 [1,2] or (1,2) (since the number of members in the range is even).

So what would you recommend? [a, b) for floats and [a, b] for ints, or 
[a, b) for everything?

Andrei

Feb 14 2009

Don Clugston <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been 
 giving the issue of numeric ranges some thought, and I have begun an 
 implementation of a general abstraction.
 Any open range can be converted into a closed range, but the converse 
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point 
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 
 but not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because 
 there are so many numbers which are almost zero. The probability of 
 getting a zero for an 80-bit real is so small that you probably 
 wouldn't encounter it in your lifetime. I think this weakens arguments 
 based on analogy with the integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for 
 [1,2] or (1,2) (since the number of members in the range is even).

 
 So what would you recommend? [a, b) for floats and [a, b] for ints, or 
 [a, b) for everything?
 
 Andrei

I'm leaning towards [a,b) for everything (consistency with arrays), but 
I'd want to know what the reasoning of the boost/c++0x guys was.

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been giving
 the issue of numeric ranges some thought, and I have begun an implementation
 of a general abstraction.
 Any open range can be converted into a closed range, but the converse
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but
 not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a zero
 for an 80-bit real is so small that you probably wouldn't encounter it in
 your lifetime. I think this weakens arguments based on analogy with the
 integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for
 [1,2] or (1,2) (since the number of members in the range is even).

 So what would you recommend? [a, b) for floats and [a, b] for ints, or [a,
 b) for everything?

 Andrei

 I'm leaning towards [a,b) for everything (consistency with arrays), but I'd
 want to know what the reasoning of the boost/c++0x guys was.

How do you create a random uint that can take on any of uint's values
with [a,b)?  That's the main reason I can think of to go with [a,b]
for integral types.  With floats it's never useful to use the entire
value range.

--bb

Feb 15 2009

Don <nospam nospam.com> writes:

Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been giving
 the issue of numeric ranges some thought, and I have begun an implementation
 of a general abstraction.
 Any open range can be converted into a closed range, but the converse
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but
 not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a zero
 for an 80-bit real is so small that you probably wouldn't encounter it in
 your lifetime. I think this weakens arguments based on analogy with the
 integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for
 [1,2] or (1,2) (since the number of members in the range is even).

 So what would you recommend? [a, b) for floats and [a, b] for ints, or [a,
 b) for everything?

 Andrei

 I'm leaning towards [a,b) for everything (consistency with arrays), but I'd
 want to know what the reasoning of the boost/c++0x guys was.

 
 How do you create a random uint that can take on any of uint's values
 with [a,b)?  That's the main reason I can think of to go with [a,b]
 for integral types.  With floats it's never useful to use the entire
 value range.
 
 --bb

I think that is the primary argument. BUT:
* You _can_ still use "[]" in that case.
* I also think it'd be worth having a "create a random n-byte number", 
as well, which would be the main use for a full uint random number.
* If you're creating a number in the range 0..uint.max+1, you're going 
to have to be careful in lots of places. You can't get that number from 
an array length, for example.
* I think that hard-core scientific/mathematical users are the main 
users of the more esoteric cases, and can be expected to get it right 
(and have no problem with the "[]","()","(]"... notation. I think that 
what's important for the default is that be correct and obvious for as 
many cases as possible.

The strength of "[)" is that if we can say "ALL ranges in D are [) 
unless otherwise stated by the user", it's hard to ever justify breaking 
that convention.

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 3:33 AM, Don <nospam nospam.com> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been giving
 the issue of numeric ranges some thought, and I have begun an
 implementation
 of a general abstraction.
 Any open range can be converted into a closed range, but the converse
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2
 but
 not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero
 for an 80-bit real is so small that you probably wouldn't encounter it
 in
 your lifetime. I think this weakens arguments based on analogy with the
 integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for
 [1,2] or (1,2) (since the number of members in the range is even).

 So what would you recommend? [a, b) for floats and [a, b] for ints, or
 [a,
 b) for everything?

 Andrei

 I'm leaning towards [a,b) for everything (consistency with arrays), but
 I'd
 want to know what the reasoning of the boost/c++0x guys was.

 How do you create a random uint that can take on any of uint's values
 with [a,b)?  That's the main reason I can think of to go with [a,b]
 for integral types.  With floats it's never useful to use the entire
 value range.

 --bb

 I think that is the primary argument. BUT:
 * You _can_ still use "[]" in that case.
 * I also think it'd be worth having a "create a random n-byte number", as
 well, which would be the main use for a full uint random number.

I guess I'd find it more useful to have a function that gave a random
value of a given type, like random_value!(short).  That way you get
the signed/unsigned specification automatically.  But both could be
useful.

 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

That sounds reasonable to me.  As long as there's an easy way to also
create a random number that covers an entire range of a given type
(perhaps limited to integral types).

--bb

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 I guess I'd find it more useful to have a function that gave a random
 value of a given type, like random_value!(short).  That way you get
 the signed/unsigned specification automatically.  But both could be
 useful.

You can, just pass short values into uniform() or specify type arguments 
for it. Actually uniform() used to ask compulsively for its parameter 
type, but I changed that because it was too verbose for no good reason.

 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

 
 That sounds reasonable to me.  As long as there's an easy way to also
 create a random number that covers an entire range of a given type
 (perhaps limited to integral types).

It's a good litmus test. With the API suggested by Don, in order to 
generate a random byte you'd have to say:

auto b = uniform!("[]")(rng, byte.min, byte.max);

Is this acceptable?


Andrei

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 I guess I'd find it more useful to have a function that gave a random
 value of a given type, like random_value!(short).  That way you get
 the signed/unsigned specification automatically.  But both could be
 useful.

 You can, just pass short values into uniform() or specify type arguments for
 it. Actually uniform() used to ask compulsively for its parameter type, but
 I changed that because it was too verbose for no good reason.

 * If you're creating a number in the range 0..uint.max+1, you're going to
 have to be careful in lots of places. You can't get that number from an
 array length, for example.
 * I think that hard-core scientific/mathematical users are the main users
 of
 the more esoteric cases, and can be expected to get it right (and have no
 problem with the "[]","()","(]"... notation. I think that what's
 important
 for the default is that be correct and obvious for as many cases as
 possible.

 The strength of "[)" is that if we can say "ALL ranges in D are [) unless
 otherwise stated by the user", it's hard to ever justify breaking that
 convention.

 That sounds reasonable to me.  As long as there's an easy way to also
 create a random number that covers an entire range of a given type
 (perhaps limited to integral types).

 It's a good litmus test. With the API suggested by Don, in order to generate
 a random byte you'd have to say:

 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

If that's what it took I'd probably try this instead:
ubyte b = uniform(rng, 0,256);

And then add an explicit cast to ubyte if the compiler didn't like that.
So no, I don't really think that's good enough.  Others may disagree
about how important a use case this is, though.

--bb

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

 
 If that's what it took I'd probably try this instead:
 ubyte b = uniform(rng, 0,256);
 
 And then add an explicit cast to ubyte if the compiler didn't like that.
 So no, I don't really think that's good enough.  Others may disagree
 about how important a use case this is, though.

This will indeed work:

ubyte b = cast(ubyte) uniform(rng, 0,256);

I just showed the solution involving no cast. So are you ok with that?


Andrei

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

 If that's what it took I'd probably try this instead:
 ubyte b = uniform(rng, 0,256);

 And then add an explicit cast to ubyte if the compiler didn't like that.
 So no, I don't really think that's good enough.  Others may disagree
 about how important a use case this is, though.

 This will indeed work:

 ubyte b = cast(ubyte) uniform(rng, 0,256);

 I just showed the solution involving no cast. So are you ok with that?

No, but if you are then I'll just make my own 1-liner library
functions, so it's not a big deal.

--bb

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

 If that's what it took I'd probably try this instead:
 ubyte b = uniform(rng, 0,256);

 And then add an explicit cast to ubyte if the compiler didn't like that.
 So no, I don't really think that's good enough.  Others may disagree
 about how important a use case this is, though.

 This will indeed work:

 ubyte b = cast(ubyte) uniform(rng, 0,256);

 I just showed the solution involving no cast. So are you ok with that?

 
 No, but if you are then I'll just make my own 1-liner library
 functions, so it's not a big deal.

What signatures would the functions have?

Andrei

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 4:36 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 4:07 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Bill Baxter wrote:
 On Mon, Feb 16, 2009 at 3:51 AM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 auto b = uniform!("[]")(rng, byte.min, byte.max);

 Is this acceptable?

 If that's what it took I'd probably try this instead:
 ubyte b = uniform(rng, 0,256);

 And then add an explicit cast to ubyte if the compiler didn't like that.
 So no, I don't really think that's good enough.  Others may disagree
 about how important a use case this is, though.

 This will indeed work:

 ubyte b = cast(ubyte) uniform(rng, 0,256);

 I just showed the solution involving no cast. So are you ok with that?

 No, but if you are then I'll just make my own 1-liner library
 functions, so it's not a big deal.

 What signatures would the functions have?

What I'd like best is if all the things you listed would work, but in addition
    T x = uniform!(T)();
would generate a uniform value over T's entire range, using a default
rng.  And rng can be specified if you care what gets used:
    T x = uniform!(T)(rng);

--bb

Feb 15 2009

bearophile <bearophileHUGS lycos.com> writes:

Don:
 * I also think it'd be worth having a "create a random n-byte number", 
 as well, which would be the main use for a full uint random number.

I mostly agree. For example a basic randUint is useful (I have it in dlibs and
I use it once in a while).

I think a randomUint and randUlong may be enough (for the other situations you
can use something like randInt with a limited range).

I am not sure, probably others have to state what they would like the random
functions to be.

Bye,
bearophile

Feb 15 2009

dsimcha <dsimcha yahoo.com> writes:

== Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been giving
 the issue of numeric ranges some thought, and I have begun an implementation
 of a general abstraction.
 Any open range can be converted into a closed range, but the converse
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but
 not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a zero
 for an 80-bit real is so small that you probably wouldn't encounter it in
 your lifetime. I think this weakens arguments based on analogy with the
 integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for
 [1,2] or (1,2) (since the number of members in the range is even).

 So what would you recommend? [a, b) for floats and [a, b] for ints, or [a,
 b) for everything?

 Andrei

 I'm leaning towards [a,b) for everything (consistency with arrays), but I'd
 want to know what the reasoning of the boost/c++0x guys was.

 How do you create a random uint that can take on any of uint's values
 with [a,b)?  That's the main reason I can think of to go with [a,b]
 for integral types.  With floats it's never useful to use the entire
 value range.
 --bb

Keep in mind that we're talking only about defaults here.  At least in the
current
implementation, "[)" vs. "[]" can be specified by template parameters.  Wanting
the entire range of an integer type is an edge case, so it's not important that
it
be well-supported by the defaults, as long as it can be done without resorting
to
serious kludges.

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 3:41 AM, dsimcha <dsimcha yahoo.com> wrote:
 == Quote from Bill Baxter (wbaxter gmail.com)'s article
 On Mon, Feb 16, 2009 at 2:33 AM, Don Clugston <nospam nospam.com> wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 This is a general issue applying to any numeric range. I've been giving
 the issue of numeric ranges some thought, and I have begun an implementation
 of a general abstraction.
 Any open range can be converted into a closed range, but the converse
 does not apply. So any implementation will be using "[]" internally.

 -range("[)", a, b) == range("(]", -b, -a)
 range("[)", a, b) == range("[]", a, predecessor(b))
 range("()", a, b) == range("[]", successor(a), predecessor(b))


 There's a couple of difficult situations involving floating-point
 numbers.
 *  "[)" has the uncomfortable property that (-2,-1, rng) includes -2 but
 not -1, whereas (1, 2, rng) includes 1 but not 2.

 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a zero
 for an 80-bit real is so small that you probably wouldn't encounter it in
 your lifetime. I think this weakens arguments based on analogy with the
 integer case.

 However, it is much easier to make an unbiased rng for [1,2) than for
 [1,2] or (1,2) (since the number of members in the range is even).

 So what would you recommend? [a, b) for floats and [a, b] for ints, or [a,
 b) for everything?

 Andrei

 I'm leaning towards [a,b) for everything (consistency with arrays), but I'd
 want to know what the reasoning of the boost/c++0x guys was.

 How do you create a random uint that can take on any of uint's values
 with [a,b)?  That's the main reason I can think of to go with [a,b]
 for integral types.  With floats it's never useful to use the entire
 value range.
 --bb

 Keep in mind that we're talking only about defaults here.  At least in the
current
 implementation, "[)" vs. "[]" can be specified by template parameters.  Wanting
 the entire range of an integer type is an edge case, so it's not important
that it
 be well-supported by the defaults, as long as it can be done without resorting
to
 serious kludges.

I don't agree that it's such an edge case.  I often want a random
ubyte that covers the entire range.

--bb

Feb 15 2009

"Joel C. Salomon" <joelcsalomon gmail.com> writes:

Don wrote:
 There's a couple of difficult situations involving floating-point numbers.

<snip>
 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

In that vein: for the floating-point case, do you want to emulate a
“true” uniform random distribution in the range [x, y), or make every
expressible IEEE 754 value in that range come out with equal likelihood?
One is the logarithm of the other, in some sense.

—Joel Salomon

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Joel C. Salomon wrote:
 Don wrote:
 There's a couple of difficult situations involving floating-point numbers.

 <snip>
 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

 
 In that vein: for the floating-point case, do you want to emulate a
 �true� uniform random distribution in the range [x, y), or make every
 expressible IEEE 754 value in that range come out with equal likelihood?
 One is the logarithm of the other, in some sense.

The former! Otherwise the distribution will be biased toward numbers 
closer to 0 (which are denser than larger numbers).

That being said, for the range [0, 1) the exponent is constant so it's 
nice to generate all possible mantissa values, something that simple 
division by uint.max does not achieve.


Andrei

Feb 15 2009

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Don wrote:
 There's a couple of difficult situations involving floating-point 
 numbers.

 <snip>
 * any floating point range which includes 0 is difficult, because there
 are so many numbers which are almost zero. The probability of getting a
 zero for an 80-bit real is so small that you probably wouldn't encounter
 it in your lifetime. I think this weakens arguments based on analogy
 with the integer case.

 In that vein: for the floating-point case, do you want to emulate a
 �true� uniform random distribution in the range [x, y), or make every
 expressible IEEE 754 value in that range come out with equal likelihood?
 One is the logarithm of the other, in some sense.

 
 The former! Otherwise the distribution will be biased toward numbers 
 closer to 0 (which are denser than larger numbers).
 
 That being said, for the range [0, 1) the exponent is constant so it's 
 nice to generate all possible mantissa values, something that simple 
 division by uint.max does not achieve.

You mean [1, 2). The code for complete uniform distribution is in 
tango.math.random, thanks to Fawzi.
Looking closely at his code, I find he's using (a,b) for the 
floating-point case, not the [a,b) which I had expected.

Feb 15 2009

Christopher Wright <dhasenan gmail.com> writes:

Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I always have to look up whether "[]" is exclusive or inclusive. It's 
succinct, but you only have to write code once, and you have to read it 
often.

Granted, people who are more math-intensive than I can read that without 
thinking about it, but the preponderance of programmers have no more 
math skills than I.

So, might I suggest using an enum rather than a string mixin?

Feb 14 2009

bearophile <bearophileHUGS lycos.com> writes:

Christopher Wright:
 I always have to look up whether "[]" is exclusive or inclusive.

A cute thing for you: where I live in Mathematics we use a different syntax:
[1, 10] <= closed interval
]1, 10[ <= open interval
[1, 10[ <= interval open on the right
]1, 10] <= interval open on the left


 So, might I suggest using an enum rather than a string mixin?

Argh! :-)

Bye,
bearophile

Feb 14 2009

Ary Borenszweig <ary esperanto.org.ar> writes:

Christopher Wright escribi�:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 
 I always have to look up whether "[]" is exclusive or inclusive. It's 
 succinct, but you only have to write code once, and you have to read it 
 often.

You can remember "]" is more in touch with the number of the right (the 
stick is in full touch with the number) while the ")" only touches the 
number in that tiny right bit of the parenthesis. So the "]" includes 
it, but the ")" merely touches it but doesn't include it.

At least that's what comes to my mind with those symbols. :-P

Feb 14 2009

Christopher Wright <dhasenan gmail.com> writes:

Ary Borenszweig wrote:
 Christopher Wright escribi�:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 I always have to look up whether "[]" is exclusive or inclusive. It's 
 succinct, but you only have to write code once, and you have to read 
 it often.

 
 You can remember "]" is more in touch with the number of the right (the 
 stick is in full touch with the number) while the ")" only touches the 
 number in that tiny right bit of the parenthesis. So the "]" includes 
 it, but the ")" merely touches it but doesn't include it.
 
 At least that's what comes to my mind with those symbols. :-P

Right, but these helpful mnemonics will flee well before I see any use 
of uniform.

Feb 14 2009

Jason House <jason.james.house gmail.com> writes:

Walter Bright Wrote:

 Andrei Alexandrescu wrote:
 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

 
 That could have speed problems with a tight loop accessing thread local 
 storage. If it's a shared global with no locking, the thread randomness 
 will increase its randomness <g>.


Lately, I wonder why TLS is treated as special and globals are normal. I don't
know enough about threading models on various platforms to fully understand the
implications. I've already been tempted to move some of my multi-threaded D
code to multi-process for the benefits it offers (mostly efficient garbage
collection when scaling to 8+ threads. My use of shared memory is minimal and
easily eliminated. 


 
 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

 
 Always different, because that's what I want if I just want a freakin' 
 random number!

Feb 14 2009

bearophile <bearophileHUGS lycos.com> writes:

Don:
 arr[random(0..arr.length)]
 is likely to be an _extremely_ common usage pattern. That argues very 
 strongly for "[)".

For that there's choice():

choice(arr)

Bye,
bearophile

Feb 15 2009

Lionello Lunesu <lio lunesu.remove.com> writes:

Andrei Alexandrescu wrote:
 1. Are you cool with making the rng the last parameter and give it a 
 default value?

Yes.

 2. The global random generator will be allocated per thread. Are you 
 cool with this too?

Yes.

 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

If you go with a rng per thread, then it's probably safest NOT to 
generate the same sequence by default. Somebody how knows what he's 
doing can always seed with a constant (for tests and such.)

 4. While we're at it, should uniform(a, b) generate by default something 
 in [a, b] or [a, b)? Someone once explained to me that generating [a, b] 
 for floating point numbers is the source of all evils and that Hitler, 
 Stalin and Kim Il Sung (should he still be alive) must be using that 
 kind of generator. Conversely, generating [a, b) is guaranteed to bring 
 in the long term everlasting peace to Earth. My problem however is that 
 in the integer realm I always want to generate [a, b]. Furthermore, I 
 wouldn't be happy if the shape of the interval was different for 
 integers and floating point numbers. How to break this conundrum? Don't 
 forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:
 
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

I love your templates as much as the next guy, but let's not forget the 
possibility of using different function names instead of templating 
everything!?

And, as other have said, I'm ok with [) since that's what rand()%max 
boiled down to as well. Futhermore, since arrays in D are also [), users 
of D get used to it so it's not odd at all.

L.

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 
 I love your templates as much as the next guy, but let's not forget the 
 possibility of using different function names instead of templating 
 everything!?

I couldn't find good names. uniformOpenClosed, uniformOC, 
uniformOpenLeft... what would be some good names?

 And, as other have said, I'm ok with [) since that's what rand()%max 
 boiled down to as well. Futhermore, since arrays in D are also [), users 
 of D get used to it so it's not odd at all.

Ok. Let me just note that rand()%max is a lousy method of generating 
random numbers between 0 and max-1 and everybody should put that in the 
bin with Popular Examples That Should Never Be Used, together with 
exponential Fibonacci, linear-space factorial, and bubblesort.


Andrei

Feb 15 2009

Steve Schveighoffer <schveiguy yahoo.com> writes:

On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

Do you mean rand()%max specifically in the case of the C function rand
()?  Or using %max on any random number generator in general.  If the 
first case, I totally agree, and I found out that from experience first, 
then googling second :)  But if the latter, I wasn't aware that all RNGs 
were bad at randomizing the lower bits, so could you explain why?

It's also possible that Lionello meant the latter case as well.

-Steve

Feb 15 2009

Bill Baxter <wbaxter gmail.com> writes:

On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
<schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

 Do you mean rand()%max specifically in the case of the C function rand
 ()?  Or using %max on any random number generator in general.  If the
 first case, I totally agree, and I found out that from experience first,
 then googling second :)  But if the latter, I wasn't aware that all RNGs
 were bad at randomizing the lower bits, so could you explain why?

 It's also possible that Lionello meant the latter case as well.

It's bad in general.  Say your RNG generates perfectly uniform numbers
between 0 and 10.
Now you say  RNG()%7 to get values 0..6   Not all inputs map to
outputs evenly now.  More inputs map to 0 than to 6, so the
distribution is no longer uniform.
Of course if you % by something that divides evenly into the RNG's
range then it shouldn't be a problem, other than the lack of
randomness in lower bits of some rand() implementations.

--bb

Feb 15 2009

Steve Schveighoffer <schveiguy yahoo.com> writes:

On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:

 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

 Do you mean rand()%max specifically in the case of the C function rand
 ()?  Or using %max on any random number generator in general.  If the
 first case, I totally agree, and I found out that from experience
 first, then googling second :)  But if the latter, I wasn't aware that
 all RNGs were bad at randomizing the lower bits, so could you explain
 why?

 It's also possible that Lionello meant the latter case as well.

 
 It's bad in general.  Say your RNG generates perfectly uniform numbers
 between 0 and 10.
 Now you say  RNG()%7 to get values 0..6   Not all inputs map to outputs
 evenly now.  More inputs map to 0 than to 6, so the distribution is no
 longer uniform.
 Of course if you % by something that divides evenly into the RNG's range
 then it shouldn't be a problem, other than the lack of randomness in
 lower bits of some rand() implementations.
 
 --bb

Hm... good point, but probably not significant unless you are generating 
numbers in a range close to the maximum range.  For example, the 0..6 
range isn't going to divide evenly into int.max, but the bias towards 
certain values is going to be very slight.  I would guess then that the 
correct method is to add n to the nth random number generated?  Or is 
there a better way?

-Steve

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Steve Schveighoffer wrote:
 On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:
 
 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

 Do you mean rand()%max specifically in the case of the C function rand
 ()?  Or using %max on any random number generator in general.  If the
 first case, I totally agree, and I found out that from experience
 first, then googling second :)  But if the latter, I wasn't aware that
 all RNGs were bad at randomizing the lower bits, so could you explain
 why?

 It's also possible that Lionello meant the latter case as well.

 It's bad in general.  Say your RNG generates perfectly uniform numbers
 between 0 and 10.
 Now you say  RNG()%7 to get values 0..6   Not all inputs map to outputs
 evenly now.  More inputs map to 0 than to 6, so the distribution is no
 longer uniform.
 Of course if you % by something that divides evenly into the RNG's range
 then it shouldn't be a problem, other than the lack of randomness in
 lower bits of some rand() implementations.

 --bb

 
 Hm... good point, but probably not significant unless you are generating 
 numbers in a range close to the maximum range.  For example, the 0..6 
 range isn't going to divide evenly into int.max, but the bias towards 
 certain values is going to be very slight.  I would guess then that the 
 correct method is to add n to the nth random number generated?  Or is 
 there a better way?

See the implementation of uniform in std.random.

Andrei

Feb 15 2009

Steve Schveighoffer <schveiguy yahoo.com> writes:

On Sun, 15 Feb 2009 20:30:06 -0800, Andrei Alexandrescu wrote:

 Steve Schveighoffer wrote:
 On Mon, 16 Feb 2009 12:20:34 +0900, Bill Baxter wrote:
 
 On Mon, Feb 16, 2009 at 12:07 PM, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in
 the bin with Popular Examples That Should Never Be Used, together
 with exponential Fibonacci, linear-space factorial, and bubblesort.

 Do you mean rand()%max specifically in the case of the C function
 rand ()?  Or using %max on any random number generator in general. 
 If the first case, I totally agree, and I found out that from
 experience first, then googling second :)  But if the latter, I
 wasn't aware that all RNGs were bad at randomizing the lower bits, so
 could you explain why?

 It's also possible that Lionello meant the latter case as well.

 It's bad in general.  Say your RNG generates perfectly uniform numbers
 between 0 and 10.
 Now you say  RNG()%7 to get values 0..6   Not all inputs map to
 outputs evenly now.  More inputs map to 0 than to 6, so the
 distribution is no longer uniform.
 Of course if you % by something that divides evenly into the RNG's
 range then it shouldn't be a problem, other than the lack of
 randomness in lower bits of some rand() implementations.

 --bb

 
 Hm... good point, but probably not significant unless you are
 generating numbers in a range close to the maximum range.  For example,
 the 0..6 range isn't going to divide evenly into int.max, but the bias
 towards certain values is going to be very slight.  I would guess then
 that the correct method is to add n to the nth random number generated?
  Or is there a better way?

 
 See the implementation of uniform in std.random.
 
 Andrei

Ugh, so you have to repeatedly generate random numbers until it gives you 
a value inside your range?  Using your calculation for bucketsize, if I 
want to generate a random number between 0 and (urng.max - urng.min) / 2 
+ 1 will enter a loop for which it throws away half the random numbers 
generated?  It seems like there should be a better method, but I can't 
think of one right now.

You also can add an optimization to your code, change your final code to:

auto maxvalue = urng.min + myRange * bucketSize + (bucketSize - 1);
do
{
   r = urng.next; // avoid calculation in loop
}
while (r > maxvalue);

return _a + (r - urng.min) / bucketSize;

Not sure if it makes a huge difference, or has a huge negative impact 
when it's not important.

-Steve

Feb 15 2009

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Steve Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:gnal9n$19a9$3 digitalmars.com...
 On Sun, 15 Feb 2009 17:27:38 -0800, Andrei Alexandrescu wrote:

 Ok. Let me just note that rand()%max is a lousy method of generating
 random numbers between 0 and max-1 and everybody should put that in the
 bin with Popular Examples That Should Never Be Used, together with
 exponential Fibonacci, linear-space factorial, and bubblesort.

 Do you mean rand()%max specifically in the case of the C function rand
 ()?  Or using %max on any random number generator in general.  If the
 first case, I totally agree, and I found out that from experience first,
 then googling second :)  But if the latter, I wasn't aware that all RNGs
 were bad at randomizing the lower bits, so could you explain why?

 It's also possible that Lionello meant the latter case as well.

Actually, I forgot to add that I always use divisors of RAND_MAX ;-)

L.

Feb 15 2009

"Lionello Lunesu" <lionello lunesu.remove.com> writes:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gnafec$1cko$1 digitalmars.com...
 Ok. Let me just note that rand()%max is a lousy method of generating 
 random numbers between 0 and max-1 and everybody should put that in the 
 bin with Popular Examples That Should Never Be Used, together with 
 exponential Fibonacci, linear-space factorial, and bubblesort.

Yes yes, I know I know, but, honestly, when I'm generating noise for some 
terrain generation or a random color, or some fuzzy AI behaviour, I don't 
care about random periods that much and %max does the job. Nobody is using 
%max for private key generation, AFAIK :)

L.

Feb 15 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Lionello Lunesu wrote:
 
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:gnafec$1cko$1 digitalmars.com...
 Ok. Let me just note that rand()%max is a lousy method of generating 
 random numbers between 0 and max-1 and everybody should put that in 
 the bin with Popular Examples That Should Never Be Used, together with 
 exponential Fibonacci, linear-space factorial, and bubblesort.

 
 Yes yes, I know I know, but, honestly, when I'm generating noise for 
 some terrain generation or a random color, or some fuzzy AI behaviour, I 
 don't care about random periods that much and %max does the job.

Ahhh, so that's why all my enemies in Far Cry 1 and 2 behave so 
predictably! :o)

Andrei

Feb 16 2009

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 Lionello Lunesu wrote:
 Andrei Alexandrescu wrote:
 auto rng = Random(unpredictableSeed);
 auto a = 0.0, b = 1.0;
 auto x1 = uniform!("[]")(rng, a, b);
 auto x2 = uniform!("[)")(rng, a, b);
 auto x3 = uniform!("(]")(rng, a, b);
 auto x4 = uniform!("()")(rng, a, b);

 I love your templates as much as the next guy, but let's not forget 
 the possibility of using different function names instead of 
 templating everything!?

 
 I couldn't find good names. uniformOpenClosed, uniformOC, 
 uniformOpenLeft... what would be some good names?

I don't like that approach. The "[)", "[]" issue comes up every time 
that you have a numeric range. It's not just random number generation.

Unfortunately something like the following is too ugly.

uniform(rng, Interval!("[)")(a, b));

A more radical possibility would be to make .nextUp, .nextDown built-in 
primitives for floating-point types. Then you could make all the ranges 
"[]".

 
 And, as other have said, I'm ok with [) since that's what rand()%max 
 boiled down to as well. Futhermore, since arrays in D are also [), 
 users of D get used to it so it's not odd at all.

 
 Ok. Let me just note that rand()%max is a lousy method of generating 
 random numbers between 0 and max-1 and everybody should put that in the 
 bin with Popular Examples That Should Never Be Used, together with 
 exponential Fibonacci, linear-space factorial, and bubblesort.

 
 Andrei

Feb 16 2009

Leandro Lucarella <llucax gmail.com> writes:

Andrei Alexandrescu, el 13 de febrero a las 17:10 me escribiste:
 Leonardo suggested that some functions in std.random should not require their 
 user to be bothered with creating a random object, i.e.:
 
 auto r = Random(unpredictableSeed);
 auto n = uniform(r, 0, 100);
 
 Instead the library should simply support:
 
 auto n = uniform(0, 100);
 
 and do this by presumably using a global RNG under the hood. So I wanted to
ask 
 all y'all:
 
 1. Are you cool with making the rng the last parameter and give it a default 
 value?

Yes

 2. The global random generator will be allocated per thread. Are you cool with 
 this too?

Yes

 3. How should the global rng be initialized? To always generate the same 
 sequence, or not?

NO! I think this feature is for small script-like programs, where the most
common case is you want a random number to be random =)

If you want to do some more complicated stuff, just set the seed yourself.

 4. While we're at it, should uniform(a, b) generate by default something in
[a, 
 b] or [a, b)? Someone once explained to me that generating [a, b] for floating 
 point numbers is the source of all evils and that Hitler, Stalin and Kim Il
Sung 
 (should he still be alive) must be using that kind of generator. Conversely, 
 generating [a, b) is guaranteed to bring in the long term everlasting peace to 
 Earth. My problem however is that in the integer realm I always want to
generate 
 [a, b]. Furthermore, I wouldn't be happy if the shape of the interval was 
 different for integers and floating point numbers. How to break this
conundrum? 
 Don't forget that we're only worrying about defaults, explicit generation is 
 always possible with self-explanatory code:

Being a C and Python programmer mostly, I find more intuitive [a, b) (The
'for' statement is usually used that way, same for the Python's range()
function)

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Un paracaidista, que no deja de caer.
Lo que me lleva hacia arriba, es lo que me tira hacia abajo.

Feb 16 2009

Jarrett Billingsley <jarrett.billingsley gmail.com> writes:

On Fri, Feb 13, 2009 at 8:10 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

I feel like I'm mimicking everyone else by now, but:

 1. Are you cool with making the rng the last parameter and give it a default
 value?

Yes.

 2. The global random generator will be allocated per thread. Are you cool
 with this too?

Yes.

 3. How should the global rng be initialized? To always generate the same
 sequence, or not?

Hell no.

 4. While we're at it, should uniform(a, b) generate by default something in
 [a, b] or [a, b)? Someone once explained to me that generating [a, b] for
 floating point numbers is the source of all evils and that Hitler, Stalin
 and Kim Il Sung (should he still be alive) must be using that kind of
 generator. Conversely, generating [a, b) is guaranteed to bring in the long
 term everlasting peace to Earth. My problem however is that in the integer
 realm I always want to generate [a, b]. Furthermore, I wouldn't be happy if
 the shape of the interval was different for integers and floating point
 numbers. How to break this conundrum? Don't forget that we're only worrying
 about defaults, explicit generation is always possible with self-explanatory
 code:

Almost all of my uses of random numbers involve indexing arrays, so
[a, b) would be most useful for me.  But again, I could certainly
"alias uniform!("[)") uniformIdx;".

Feb 16 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Slightly off-topic, but I just found this and thought it might be of
interest.

http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea935a0-10.php

For reference, here is the ASM source for the mentioned Flipcode random
number generator:

 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

Finally, here's a link to the DieHard test on Wikipedia:

http://en.wikipedia.org/wiki/Diehard_tests


 -- Daniel

Feb 19 2009

BCS <none anon.com> writes:

Hello Daniel,

 Slightly off-topic, but I just found this and thought it might be of
 interest.
 
 http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea
 935a0-10.php
 
 For reference, here is the ASM source for the mentioned Flipcode
 random number generator:
 
 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

 Finally, here's a link to the DieHard test on Wikipedia:
 
 http://en.wikipedia.org/wiki/Diehard_tests
 
 -- Daniel
 

someone should make a "double DieHard(uint function())" so we can test the 
quality of rand's.

Feb 19 2009

Don <nospam nospam.com> writes:

Daniel Keep wrote:
 Slightly off-topic, but I just found this and thought it might be of
 interest.
 
 http://mjolnirstudios.com/IanBullard/files/79ffbca75a75720f066d491e9ea935a0-10.php
 
 For reference, here is the ASM source for the mentioned Flipcode random
 number generator:
 
 pshufw mm1, mm0, 0x1E
 paddd mm0, mm1

 
 Finally, here's a link to the DieHard test on Wikipedia:
 
 http://en.wikipedia.org/wiki/Diehard_tests
 
 
  -- Daniel

Yeah. It's worth knowing that reason that linear congruential RNGs have 
historically been used so much is because _they are the easiest to 
analyze mathematically_, NOT because they are good in either speed or 
randomness. Since the early days, the techniques for evaluating RNGs 
have improved dramatically (and the DieHard test suite incorporates much 
of the analysis), so their original selling point is gone.
There is NO REASON to use a linear congruential RNG as the default any more.

Feb 20 2009

D Programming

C/C++ Programming

Other

digitalmars.D - default random object?