digitalmars.D.learn - Generating Strings with Random Contents
- Nordlöw (2/2) Jul 14 2014 Is there a natural way of generating/filling a
- bearophile (15/18) Jul 14 2014 Do you mean something like this?
- Brad Anderson (3/21) Jul 14 2014 Alternative:
- Brad Anderson (3/5) Jul 14 2014 std.ascii should really be using std.encoding.AsciiString. Then
- bearophile (4/9) Jul 14 2014 Bye,
- Brad Anderson (2/12) Jul 14 2014 Hmm, good catch. Not the behavior I expected.
- Joseph Rushton Wakeling via Digitalmars-d-learn (6/8) Jul 16 2014 No, I don't think that's appropriate, because it will pick 10 individual...
- Nordlöw (4/22) Jul 14 2014 I was specifically interested in something that exercises (random
- Nordlöw (4/4) Jul 14 2014 On Monday, 14 July 2014 at 22:32:51 UTC, Nordlöw wrote:
- Nordlöw (5/9) Jul 14 2014 isValidCodePoint()
- Nordlöw (6/7) Jul 14 2014 Is it really this simple?
- bearophile (6/9) Jul 14 2014 Several combinations of unicode chars are not meaningful/valid
- Nordlöw (3/6) Jul 14 2014 So I guess we need something more than just isValidCodePoint
- Nordlöw (3/5) Jul 14 2014 Here's a first try:
- bearophile (5/6) Jul 14 2014 Isn't @trusted mostly for small parts of Phobos code? I suggest
- Nordlöw (2/3) Jul 15 2014 Could someone elaborate shortly which cases this means?
- bearophile (5/6) Jul 15 2014 All cases where you really can't live without it :-) It's like a
- Nordlöw (3/4) Jul 15 2014 Hmm. I guess I'm gonna have to remove some @trusted tagging then
- bearophile (5/8) Jul 14 2014 That's harder. Generating all uints and then testing if it's a
- Joseph Rushton Wakeling via Digitalmars-d-learn (6/8) Jul 16 2014 I think you need to be more specific about what kind of random contents ...
- Nordlöw (43/47) Jul 17 2014 Just a random dchar (code point) sample like this:
Is there a natural way of generating/filling a string/wstring/dstring of a specific length with random contents?
Jul 14 2014
Nordlöw:
> Is there a natural way of generating/filling a string/wstring/dstring of a specific length with random contents?

Do you mean something like this?

    import std.stdio, std.random, std.ascii, std.range, std.conv, std.algorithm;

    // Builds a string of len lowercase ASCII letters, each drawn independently.
    string genRandomString(in size_t len) {
        return len
               .iota
               .map!(_ => lowercase[uniform(0, $)])
               .text;
    }

    void main() {
        10.genRandomString.writeln;
    }

Bye,
bearophile
Jul 14 2014
On Monday, 14 July 2014 at 22:21:36 UTC, bearophile wrote:
> Nordlöw:
>> Is there a natural way of generating/filling a string/wstring/dstring of a specific length with random contents?
> Do you mean something like this? [...]

Alternative:

    randomSample(lowercase, 10, lowercase.length).writeln;
Jul 14 2014
On Monday, 14 July 2014 at 22:27:57 UTC, Brad Anderson wrote:
> Alternative: randomSample(lowercase, 10, lowercase.length).writeln;

std.ascii should really be using std.encoding.AsciiString. Then that length wouldn't be necessary.
Jul 14 2014
Brad Anderson:
> Alternative: randomSample(lowercase, 10, lowercase.length).writeln;

From randomSample docs:

> Selects a random subsample out of r, containing exactly n elements. The order of elements is the same as in the original range.

Bye,
bearophile
Jul 14 2014
On Monday, 14 July 2014 at 22:32:25 UTC, bearophile wrote:
> Brad Anderson:
>> Alternative: randomSample(lowercase, 10, lowercase.length).writeln;
> From randomSample docs:
>> Selects a random subsample out of r, containing exactly n elements. The order of elements is the same as in the original range.

Hmm, good catch. Not the behavior I expected.
Jul 14 2014
> Alternative: randomSample(lowercase, 10, lowercase.length).writeln;

No, I don't think that's appropriate, because it will pick 10 individual characters from a, b, c, ... , z (i.e. no character will appear more than once), and the characters picked will appear in alphabetical order.

Incidentally, if lowercase has the .length property, there's no need to pass the length separately to randomSample. Just passing lowercase itself and the number of sample points desired is sufficient.
Jul 16 2014
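To make the difference concrete, here is a minimal sketch contrasting the two behaviours discussed above. The helper names distinctLetters and independentLetters are hypothetical, chosen only for this illustration. The total is passed to randomSample explicitly, as in Brad Anderson's call, because a string does not report a range-level length.

```d
import std.algorithm : isSorted, uniq;
import std.array : array;
import std.ascii : lowercase;
import std.random : randomSample, uniform;
import std.range : walkLength;
import std.stdio : writeln;

/// Without replacement: n distinct letters, kept in alphabetical order.
/// (Hypothetical helper; n must not exceed 26.)
dchar[] distinctLetters(size_t n)
{
    return randomSample(lowercase, n, lowercase.length).array;
}

/// With replacement: every letter is drawn independently, so repeats can occur.
/// (Hypothetical helper.)
string independentLetters(size_t n)
{
    auto buf = new char[](n);
    foreach (ref c; buf)
        c = lowercase[uniform(0, lowercase.length)];
    return buf.idup;
}

void main()
{
    auto a = distinctLetters(10);
    // Sorted and free of duplicates, as the randomSample docs describe.
    assert(a.length == 10 && a.isSorted && a.uniq.walkLength == 10);
    writeln(a);

    auto b = independentLetters(10);
    assert(b.length == 10);
    writeln(b);
}
```

For a "random string" in the everyday sense, the second function is probably what is wanted; the first is only useful when a random subset is the goal.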
On Monday, 14 July 2014 at 22:21:36 UTC, bearophile wrote:
> Nordlöw:
>> Is there a natural way of generating/filling a string/wstring/dstring of a specific length with random contents?
> Do you mean something like this? [...]

I was specifically interested in something that exercises (random samples) potentially _all_ code points for string, wstring and dstring (all code units that is).
Jul 14 2014
On Monday, 14 July 2014 at 22:32:51 UTC, Nordlöw wrote:

I believe defining a complete random sampling of all code units in dchar is a good start, right? This can then be reused to lazily convert while filling in a string and wstring.
Jul 14 2014
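As a rough illustration of why starting from dchar is convenient: one code point occupies exactly one dchar, while its UTF-8 and UTF-16 encodings vary in length. The sketch below uses the eager std.conv.to rather than a lazy conversion, and the helper names utf8Units and utf16Units are hypothetical.

```d
import std.conv : to;
import std.stdio : writeln;

/// Number of UTF-8 code units needed to encode d. (Hypothetical helper.)
size_t utf8Units(dstring d)
{
    return d.to!string.length;
}

/// Number of UTF-16 code units needed to encode d. (Hypothetical helper.)
size_t utf16Units(dstring d)
{
    return d.to!wstring.length;
}

void main()
{
    // Four code points: U+0061, U+00F6, U+20AC and U+1F600.
    dstring d = "a\u00F6\u20AC\U0001F600"d;

    assert(d.length == 4);        // one code unit per code point
    assert(utf16Units(d) == 5);   // U+1F600 needs a surrogate pair
    assert(utf8Units(d) == 10);   // 1 + 2 + 3 + 4 bytes
    writeln(d.to!string);
}
```

A consequence is that "a string of a specific length" is ambiguous for string and wstring: a fixed number of code points does not give a fixed number of code units.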
On Monday, 14 July 2014 at 22:35:59 UTC, Nordlöw wrote:
> On Monday, 14 July 2014 at 22:32:51 UTC, Nordlöw wrote:
> I believe defining a complete random sampling of all code units in dchar is a good start right? This can then be reused to lazily convert while filling in a string and wstring.

isValidCodePoint() at http://dlang.org/phobos/std_encoding.html might be where to start.
Jul 14 2014
On Monday, 14 July 2014 at 22:39:08 UTC, Nordlöw wrote:
> might be where to start.

Is it really this simple?

    bool isValidCodePoint(dchar c) {
        return c < 0xD800 || (c >= 0xE000 && c < 0x110000);
    }
Jul 14 2014
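One way to gain confidence in that predicate is to compare it with std.utf.isValidDchar over the whole range. The sketch below redefines isValidCodePoint locally, copied from the post above, rather than importing it from std.encoding.

```d
import std.utf : isValidDchar;

/// Local copy of the predicate quoted in the post above:
/// everything below 0x110000 except the surrogate block.
bool isValidCodePoint(dchar c) pure nothrow @safe @nogc
{
    return c < 0xD800 || (c >= 0xE000 && c < 0x110000);
}

void main()
{
    // Compare with Phobos over the full range, plus a few values past the end.
    foreach (uint u; 0 .. 0x110010)
        assert(isValidCodePoint(cast(dchar) u) == isValidDchar(cast(dchar) u));

    // Surrogates are rejected.
    assert(!isValidCodePoint(0xD800) && !isValidCodePoint(0xDFFF));
    // U+FFFE is a noncharacter, but this test still accepts it.
    assert(isValidCodePoint(0xFFFE));
}
```

So the test really is that simple as far as encodability goes; it says nothing about whether a code point is assigned or whether a sequence of them is meaningful.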
Nordlöw:
> I believe defining a complete random sampling of all code units in dchar is a good start right? This can then be reused to lazily convert while filling in a string and wstring.

Several combinations of Unicode chars are not meaningful/valid (like pairs of ligatures). Anything that has to work correctly with Unicode is complex.

Bye,
bearophile
Jul 14 2014
On Monday, 14 July 2014 at 22:39:15 UTC, bearophile wrote:
> Several combinations of Unicode chars are not meaningful/valid (like pairs of ligatures). Anything that has to work correctly with Unicode is complex.

So I guess we need something more than just isValidCodePoint, right?
Jul 14 2014
On Monday, 14 July 2014 at 22:45:29 UTC, Nordlöw wrote:
> So I guess we need something more than just isValidCodePoint right?

Here's a first try: https://github.com/nordlow/justd/blob/master/random_ex.d#L53
Jul 14 2014
Nordlöw:
> https://github.com/nordlow/justd/blob/master/random_ex.d#L53

Isn't @trusted mostly for small parts of Phobos code? I suggest to avoid using @trusted in most cases.

Bye,
bearophile
Jul 14 2014
On Tuesday, 15 July 2014 at 00:03:04 UTC, bearophile wrote:
> to avoid using @trusted in most cases.

Could someone briefly elaborate on which cases this means?
Jul 15 2014
Nordlöw:
> Could someone briefly elaborate on which cases this means?

All cases where you really can't live without it :-) It's like a cast().

Bye,
bearophile
Jul 15 2014
On Tuesday, 15 July 2014 at 18:50:06 UTC, bearophile wrote:
> All cases where you really can't live without it :-) It's like

Hmm. I guess I'm gonna have to remove some @trusted tagging then ;)
Jul 15 2014
Nordlöw:
> I was specifically interested in something that exercises (random samples) potentially _all_ code points for string, wstring and dstring (all code units that is).

That's harder. Generating all uints and then testing if it's a Unicode dchar seems possible.

Bye,
bearophile
Jul 14 2014
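The generate-and-test idea can be sketched as follows. Two things here are assumptions of this sketch rather than part of the suggestion: candidates are drawn from [0, 0x110000) instead of from all uints, since nearly every uint above that bound would be rejected, and the names randomCodePoint and randomDString are hypothetical.

```d
import std.conv : to;
import std.random : uniform;
import std.utf : isValidDchar, validate;

/// Draws candidates from [0, 0x110000) until one is not a surrogate.
/// (Hypothetical helper.)
dchar randomCodePoint()
{
    for (;;)
    {
        auto c = cast(dchar) uniform(0u, 0x110000u);
        if (isValidDchar(c))
            return c;
    }
}

/// A dstring of len independently drawn code points. (Hypothetical helper.)
dstring randomDString(size_t len)
{
    auto buf = new dchar[](len);
    foreach (ref c; buf)
        c = randomCodePoint();
    return buf.idup;
}

void main()
{
    auto d = randomDString(20);
    assert(d.length == 20);
    validate(d);            // throws UTFException on invalid content
    validate(d.to!string);  // the UTF-8 encoding is well-formed too
    validate(d.to!wstring); // and so is the UTF-16 encoding
}
```

With that restricted range only the 2048 surrogates out of 1114112 candidates are rejected, so the loop almost always finishes on the first draw.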
On 15/07/14 00:16, "Nordlöw" via Digitalmars-d-learn wrote:
> Is there a natural way of generating/filling a string/wstring/dstring of a specific length with random contents?

I think you need to be more specific about what kind of random contents you are interested in having. Are you interested in having each character in the sequence randomly chosen independently of all the others, or do you want a random subset of all available characters (i.e. no character appears more than once), or something else again?
Jul 16 2014
On Wednesday, 16 July 2014 at 23:24:24 UTC, Joseph Rushton Wakeling via Digitalmars-d-learn wrote:
> Are you interested in having each character in the sequence randomly chosen independently of all the others, or do you want a random subset of all available characters (i.e. no character appears more than once), or something else again?

Just a random dchar (code point) sample like this:

    import std.random : uniform;

    /** Generate Random Contents of $(D x).
        See also: http://forum.dlang.org/thread/emlgflxpgecxsqweauhc forum.dlang.org
     */
    auto ref randInPlace(ref dchar x) @trusted
    {
        auto ui = uniform(0,
                          0xD800 +
                          (0x110000 - 0xE000) - 2 // minus two for U+FFFE and U+FFFF
            );
        if (ui < 0xD800)
        {
            return x = cast(dchar) ui;
        }
        else
        {
            // shift past the surrogate block
            ui -= 0xD800;
            ui += 0xE000;

            // skip undefined
            if (ui < 0xFFFE)
                return x = cast(dchar) ui;
            else
                ui += 2;

            assert(ui < 0x110000);
            return x = cast(dchar) ui;
        }
    }

I don't know how well this plays with

    unittest
    {
        import dbg;
        dln(randomized!dchar);
        dstring d = "alphaalphaalphaalphaalphaalphaalphaalphaalphaalpha";
        dln(d.randomize);
    }

though.

See complete logic at https://github.com/nordlow/justd/blob/master/random_ex.d
Jul 17 2014
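The index arithmetic in randInPlace is easier to check when the mapping is separated from the random draw. The following is a sketch of that refactoring, not the code in the linked repository; codePointCount, codePointFromIndex and randomCodePoint are hypothetical names.

```d
import std.random : uniform;
import std.utf : isValidDchar;

/// Code points left after removing the surrogates and U+FFFE/U+FFFF.
enum uint codePointCount = 0xD800 + (0x110000 - 0xE000) - 2;

/// Maps an index in [0, codePointCount) to a code point,
/// skipping the excluded values. (Hypothetical helper.)
dchar codePointFromIndex(uint i) pure nothrow @safe @nogc
{
    assert(i < codePointCount);
    if (i < 0xD800)
        return cast(dchar) i;
    i += 0xE000 - 0xD800;   // jump over the surrogate block
    if (i >= 0xFFFE)
        i += 2;             // jump over U+FFFE and U+FFFF
    return cast(dchar) i;
}

/// One uniformly chosen code point, with no rejection loop. (Hypothetical helper.)
dchar randomCodePoint()
{
    return codePointFromIndex(uniform(0u, codePointCount));
}

void main()
{
    // Boundary cases of the mapping.
    assert(codePointFromIndex(0xD7FF) == 0xD7FF);
    assert(codePointFromIndex(0xD800) == 0xE000);
    assert(codePointFromIndex(0xF7FD) == 0xFFFD);
    assert(codePointFromIndex(0xF7FE) == 0x10000);
    assert(codePointFromIndex(codePointCount - 1) == 0x10FFFF);

    assert(isValidDchar(randomCodePoint()));
}
```

Because the mapping is a pure function, its boundaries can be asserted deterministically, and none of it needs @trusted.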