digitalmars.D - Random string samples & unicode
- bearophile (31/31) Sep 10 2010 The need to take a random sample without replacement is very common. For...
- bearophile (20/22) Sep 11 2010 The problems are more widespread, this is a simple generator of terms of...
- Andrej Mitrovic (21/43) Sep 11 2010 I think this might be a compiler bug:
- bearophile (4/5) Sep 11 2010 I'll add it to Bugzilla later. But even if you remove that bug, forcing ...
- Andrei Alexandrescu (3/8) Sep 11 2010 This goes into "bearophile's odd posts coming now and then".
- bearophile (4/5) Sep 11 2010 You aren't helping solve those problems.
- Andrei Alexandrescu (3/15) Sep 11 2010 You can't concatenate two integrals.
- bearophile (6/7) Sep 11 2010 The compiler has full type information, so what's wrong in concatenating...
- bearophile (23/27) Sep 11 2010 So that's invalid, I have closed it.
- bearophile (4/9) Sep 11 2010 Shorter:
- Steven Schveighoffer (11/15) Sep 13 2010 It's ambiguous also:
- bearophile (16/23) Sep 13 2010 If you want to discuss about Python2/Python3 I think that the Python new...
- Jonathan M Davis (6/13) Sep 13 2010 I wasn't really trying discuss python 2 vs 3 so much as point out that w...
The need to take a random sample without replacement is very common. For example, this is how in Python 2.x I create a fixed-size random string, sampled without replacement from an input string of chars:

    from random import sample
    d = "0123456789"
    print "".join(sample(d, 2))

This seems to be similar D2 code:

    import std.stdio, std.random, std.array, std.range;
    void main() {
        dchar[] d = "0123456789"d.dup;
        dchar[] res = array(take(randomCover(d, rndGen), 2));
        writeln(res);
    }

There randomCover() doesn't work with a string, a dstring or with a char[]. If later you need to process that res dchar[] with std.string you will have trouble.

But randomShuffle() is able to shuffle a char[] in place:

    import std.stdio, std.random;
    void main() {
        char[] d = "0123456789".dup;
        randomShuffle(d);
        writeln(d);
    }

If randomCover() receives a char[] I think in theory it has to yield its shuffled chars. And if it receives a string it has to yield its shuffled dchars (converted from the chars).

A string may contain UTF-8 chars that are longer than 1 byte, but a char[] is not a string, and if you want its items in random order, it has to act like randomShuffle().

My head hurts, and I don't know what the right thing to do is.

Maybe I have to work with ubyte[] instead of char[], and add casts:

    import std.stdio, std.random, std.array, std.range;
    void main() {
        char[] d = "0123456789".dup;
        char[] res = cast(char[])array(take(randomCover(cast(ubyte[])d, rndGen), 2));
        writeln(res);
    }

Ideas welcome.

Bye,
bearophile
Sep 10 2010
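A minimal sketch of one way to keep the result usable with string-based functions, assuming a current D compiler and Phobos (the 2010 library discussed in this thread may behave differently). The helper name sampleChars is hypothetical and not part of Phobos. It decodes to dchar[] for the sampling step, as in the D2 code above, and re-encodes to a UTF-8 string at the end.

```d
import std.array : array;
import std.conv : to;
import std.random : randomCover, rndGen;
import std.range : take;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): pick n distinct positions of s
// at random and return the code points found there as a UTF-8 string.
string sampleChars(string s, size_t n)
{
    // Decode to UTF-32 so that every element is one whole code point and
    // the array is a random-access range, which randomCover requires.
    dchar[] points = s.to!dstring.dup;
    // randomCover visits each element once, in random order;
    // take stops after at most n elements.
    dchar[] picked = points.randomCover(rndGen).take(n).array;
    // Re-encode as UTF-8 so the result works with string-based functions.
    return picked.to!string;
}

void main()
{
    writeln(sampleChars("0123456789", 2));
}
```

The cost is one decode and one encode per call, which is probably acceptable for short inputs such as the digit string above.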
> There randomCover() doesn't work with a string, a dstring or with a char[]. If later you need to process that res dchar[] with std.string you will have trouble.

The problems are more widespread. This is a simple generator of terms of the "look and say" sequence (to generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit: http://en.wikipedia.org/wiki/Look_and_say_sequence ):

    import std.stdio, std.conv, std.algorithm;
    string lookAndSay(string input) {
        string result;
        foreach (g; group(input))
            result ~= to!string(g._1) ~ (cast(char)g._0);
        return result;
    }
    void main() {
        string last = "1";
        writeln(last);
        foreach (i; 0 .. 10) {
            last = lookAndSay(last);
            writeln(last);
        }
    }

I was not able to remove that cast(char), even if I replace all strings in that program with dstrings.

Is someone else using D2?

Bye,
bearophile
Sep 11 2010
I think this might be a compiler bug:

    import std.conv : to;
    void main() {
        string mystring;
        dchar mydchar;

        // ok, appending dchar to string
        mystring ~= mydchar;

        // error: incompatible types for
        // ((cast(uint)mydchar) ~ (cast(uint)mydchar)): 'uint' and 'uint'
        mystring ~= mydchar ~ mydchar;
    }

On Sat, Sep 11, 2010 at 3:42 PM, bearophile <bearophileHUGS lycos.com> wrote:
>> There randomCover() doesn't work with a string, a dstring or with a char[]. If later you need to process that res dchar[] with std.string you will have trouble.
>
> The problems are more widespread. This is a simple generator of terms of the "look and say" sequence (to generate a member of the sequence from the previous member, read off the digits of the previous member, counting the number of digits in groups of the same digit: http://en.wikipedia.org/wiki/Look_and_say_sequence ):
>
>     import std.stdio, std.conv, std.algorithm;
>     string lookAndSay(string input) {
>         string result;
>         foreach (g; group(input))
>             result ~= to!string(g._1) ~ (cast(char)g._0);
>         return result;
>     }
>     void main() {
>         string last = "1";
>         writeln(last);
>         foreach (i; 0 .. 10) {
>             last = lookAndSay(last);
>             writeln(last);
>         }
>     }
>
> I was not able to remove that cast(char), even if I replace all strings in that program with dstrings.
>
> Is someone else using D2?
>
> Bye,
> bearophile
Sep 11 2010
Andrej Mitrovic:
> I think this might be a compiler bug:

I'll add it to Bugzilla later. But even if you remove that bug, forcing me to use dstrings in the whole program is strange. Or maybe it's a good thing, and the natural state for D programs is to just use dstrings everywhere.

Andrei may offer his opinion on the situation.

Bye,
bearophile
Sep 11 2010
On 9/11/10 10:24 CDT, bearophile wrote:
> Andrej Mitrovic:
>> I think this might be a compiler bug:
>
> I'll add it to Bugzilla later. But even if you remove that bug, forcing me to use dstrings in the whole program is strange. Or maybe it's a good thing, and the natural state for D programs is to just use dstrings everywhere.
>
> Andrei may offer his opinion on the situation.
>
> Bye,
> bearophile

This goes into "bearophile's odd posts coming now and then".

Andrei
Sep 11 2010
Andrei Alexandrescu:
> This goes into "bearophile's odd posts coming now and then".

You aren't helping solve those problems.

Bye,
bearophile
Sep 11 2010
On 9/11/10 9:48 CDT, Andrej Mitrovic wrote:
> I think this might be a compiler bug:
>
>     import std.conv : to;
>     void main() {
>         string mystring;
>         dchar mydchar;
>
>         // ok, appending dchar to string
>         mystring ~= mydchar;
>
>         // error: incompatible types for
>         // ((cast(uint)mydchar) ~ (cast(uint)mydchar)): 'uint' and 'uint'
>         mystring ~= mydchar ~ mydchar;
>     }

You can't concatenate two integrals.

Andrei
Sep 11 2010
Andrei Alexandrescu:
> You can't concatenate two integrals.

The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring?

And I think there are other problems:
http://d.puremagic.com/issues/show_bug.cgi?id=4853

Bye,
bearophile
Sep 11 2010
> The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring?

But in C the ~ among two chars has a different meaning, so in D you may at best disallow it.

> And I think there are other problems:
> http://d.puremagic.com/issues/show_bug.cgi?id=4853

So that's invalid, I have closed it.

With a few contortions it's possible to write lookAndSay() with no casts, but the code is still not good:

    import std.stdio, std.conv, std.algorithm;
    string lookAndSay(string input) {
        string result;
        foreach (g; group(input)) {
            string s = to!string(g._1);
            s ~= g._0; // string ~ dchar wrong, string ~= dchar good
            result ~= s;
        }
        return result;
    }
    void main() {
        string last = "1";
        writeln(last);
        foreach (i; 0 .. 10) {
            last = lookAndSay(last);
            writeln(last);
        }
    }

Bye,
bearophile
Sep 11 2010
>     foreach (g; group(input)) {
>         string s = to!string(g._1);
>         s ~= g._0; // string ~ dchar wrong, string ~= dchar good
>         result ~= s;
>     }

Shorter:

    foreach (g; group(input))
        result ~= text(g._1, g._0);

bearophile
Sep 11 2010
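For reference, a sketch of the complete cast-free program built around the text() version above. It assumes a current Phobos, where the tuples yielded by group are indexed as g[0] and g[1] rather than through the g._0 and g._1 fields used in this thread; the loop count of 5 is arbitrary.

```d
import std.algorithm.iteration : group;
import std.conv : text;
import std.stdio : writeln;

string lookAndSay(string input)
{
    string result;
    // group yields (element, run length) tuples; for a string the element
    // is a decoded dchar and the run length is an integer count.
    foreach (g; group(input))
        result ~= text(g[1], g[0]); // count first, then the digit
    return result;
}

void main()
{
    string last = "1";
    writeln(last);
    foreach (i; 0 .. 5)
    {
        last = lookAndSay(last);
        writeln(last);
    }
}
```

text() converts each argument to a string and joins the pieces, so the dchar never has to be concatenated directly and no cast is needed.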
On Sat, 11 Sep 2010 13:20:25 -0400, bearophile <bearophileHUGS lycos.com> wrote:
> Andrei Alexandrescu:
>> You can't concatenate two integrals.
>
> The compiler has full type information, so what's wrong in concatenating two char or two dchar into a string or dstring?

It's ambiguous also:

    string s1 = "abc", s2 = "def";
    auto x = s1 ~ s2;

would you expect x to be "abcdef" or ["abc", "def"]?

Essentially, one of the arguments to concatenation must be an array type in order to avoid ambiguity. Fortunately, you can get the results you wish with the bracket notation:

    auto x = [s1, s2];

-Steve
Sep 13 2010
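A small sketch of the distinction discussed above, assuming a current D compiler. The variable names c1, c2, pair, joined and appended are illustrative only. It checks that array ~ array joins, that the bracket notation nests, and that two scalars are rejected by ~, then shows three alternatives that compile.

```d
import std.conv : text;
import std.stdio : writeln;

void main()
{
    string s1 = "abc", s2 = "def";
    dchar c1 = 'x', c2 = 'y';

    // Array ~ array joins the elements into one array.
    assert(s1 ~ s2 == "abcdef");
    // An array literal nests the operands instead.
    assert([s1, s2] == ["abc", "def"]);

    // Two scalars cannot be concatenated: neither operand is an array.
    static assert(!__traits(compiles, c1 ~ c2));

    // Alternatives that do compile:
    dchar[] pair = [c1, c2];      // array literal of dchar
    string joined = text(c1, c2); // converts each argument to UTF-8 text
    string appended;
    appended ~= c1;               // string ~= dchar encodes while appending
    appended ~= c2;

    assert(pair == "xy"d);
    assert(joined == "xy" && appended == "xy");
    writeln(joined);
}
```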
Jonathan M Davis:
> Well, then in comparing python 3 with D, [...]

If you want to discuss Python2/Python3 I think that the Python newsgroup is a better place. I know this sounds like a bit of a rough answer, but Python in the end is OT here, and most people here show some ignorance about Python matters.

-----------------------

Daniel Gibson:
> Can't you just use byte[] for that? If you're 100% sure your string only contains ASCII characters, you can just cast it to byte[], feed that into algorithms and cast it back to char[] afterwards, I guess.

ubyte[] sounds better :-) (Yes, I'd like D to use sbyte/ubyte names).

Yes, that's what I sometimes do. The usage of ubyte[] is the last possible solution I suggested in my first post on this dual thread:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=117206

But that strategy needs casts, I don't like casts a lot, and I don't know how much SafeD supports casts (and you may want to use SafeD for the typically small script-like programs used for some intermediate processing of biological information). I think that generally it's better to use strategies that avoid casts. dsimcha's AsciiString may be able to reduce the need for casts.

-----------------------

Kagamin:
> Why they're chars but not numbers?

I presume it's mostly a matter of taste. There's no need to use chars here, but in scripting languages (especially Tcl) you sometimes use strings/chars even in situations where in C you want to use just numbers. Strings in Python are very handy to use, safe, and compact in both memory and visual representation on screen.

-----------------------

Steven Schveighoffer:
> Fortunately, you can get the results you wish with the bracket notation:
>
>     auto x = [s1, s2];

Right, thank you.

Bye,
bearophile
Sep 13 2010
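A sketch of the ubyte[] strategy without explicit casts in user code, assuming a current Phobos that provides std.string.representation and std.string.assumeUTF (neither is mentioned in this thread). The helper name shuffledAscii is hypothetical. The conversions are still reinterpretations done inside the library, and whether this is accepted in SafeD code is not verified here.

```d
import std.algorithm.searching : all;
import std.random : randomShuffle;
import std.stdio : writeln;
import std.string : assumeUTF, representation;

// Hypothetical helper (not part of Phobos): shuffle an ASCII-only string.
char[] shuffledAscii(string s)
{
    // Only meaningful for ASCII: shuffling the bytes of multi-byte UTF-8
    // sequences would produce invalid text.
    assert(s.all!(c => c < 0x80), "input must be ASCII");
    ubyte[] bytes = s.dup.representation; // view a mutable copy as ubyte[]
    randomShuffle(bytes);                 // ubyte[] is a random-access range
    return bytes.assumeUTF;               // view the bytes as char[] again
}

void main()
{
    writeln(shuffledAscii("0123456789"));
}
```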
On Monday, September 13, 2010 10:45:48 bearophile wrote:
> Jonathan M Davis:
>> Well, then in comparing python 3 with D, [...]
>
> If you want to discuss about Python2/Python3 I think that the Python newsgroup is a better place. I know this sounds like a bit rough answer, but Python in the end is OT here, and most people here show some ignorance about Python matters.

I wasn't really trying to discuss python 2 vs 3 so much as point out that while you were lamenting the issues with porting python 2 code to D, it looks like the situation that you described for python 3 is essentially the same as for D, so it's a non-issue with regards to porting python 3 code.

- Jonathan M Davis
Sep 13 2010