digitalmars.D - char[] + utf-8 + canFind == bug?
- Andrea Fontana (9/9) Nov 22 2011 I've some problems (again) with UTF-8. Try this code:
- Andrei Alexandrescu (4/6) Nov 22 2011 This will truncate the multi-byte characters. It should be a
- Andrea Fontana (3/12) Nov 22 2011 I guess I should use wchar instead of char. :)
- Jonathan M Davis (6/7) Nov 22 2011 Individual characters really should be processed as dchars in the general
- Andrea Fontana (17/26) Nov 22 2011 dchar works but simple solution doesn't.
- Andrei Alexandrescu (5/6) Nov 22 2011 Use string/auto here, or .dup on the string.
- Jacob Carlborg (4/11) Nov 22 2011 Hasn't this already been reported?
- Jonathan M Davis (15/32) Nov 22 2011 ge,V) if
I've some problems (again) with UTF-8. Try this code:

    char[] chars = ['à','è','ì'];
    chars.canFind('è');

It doesn't work:

    std.utf.UTFException std/utf.d(644): Invalid UTF-8 sequence (at index 1)

But this one works:

    string[] chars = ["à","è","ì"];
    chars.canFind("è");

I'm using dmd/druntime/phobos downloaded from github today.
Nov 22 2011
On 11/22/11 10:28 AM, Andrea Fontana wrote:
> I've some problems (again) with UTF-8. Try this code:
> char[] chars = ['à','è','ì'];

This will truncate the multi-byte characters. It should be a compile-time error.

Andrei
Nov 22 2011
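A minimal sketch of the truncation Andrei describes, assuming a current dmd and Phobos. The helper name `utf8Units` is hypothetical and not from the thread; it only reports how many UTF-8 code units (`char`s) one code point needs, which is why 'è' cannot sit in a single `char` slot.

```d
import std.utf : codeLength;

// Hypothetical helper: number of UTF-8 code units needed to encode one code point.
size_t utf8Units(dchar c)
{
    return codeLength!char(c);
}

void main()
{
    // ASCII fits in one char; 'è' (U+00E8) needs two, so a single char cannot hold it.
    assert(utf8Units('a') == 1);
    assert(utf8Units('è') == 2);

    // The literal 'è' takes the smallest character type it fits in, which is wchar
    // (this matches the "(char[],wchar)" in the error message later in the thread).
    static assert(is(typeof('è') == wchar));

    // Storing code points as dchar keeps each character whole.
    dchar[] chars = ['à', 'è', 'ì'];
    assert(chars.length == 3);
}
```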
I guess I should use wchar instead of char. :)

On Tue, 22/11/2011 at 10.31 -0600, Andrei Alexandrescu wrote:
> On 11/22/11 10:28 AM, Andrea Fontana wrote:
>> I've some problems (again) with UTF-8. Try this code:
>> char[] chars = ['à','è','ì'];
>
> This will truncate the multi-byte characters. It should be a
> compile-time error.
>
> Andrei
Nov 22 2011
On Tuesday, November 22, 2011 17:38:36 Andrea Fontana wrote:
> I guess I should use wchar instead of char. :)

Individual characters really should be processed as dchars in the general case. There's a simple solution here though:

    char[] chars = "àèì";

- Jonathan M Davis
Nov 22 2011
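A rough sketch of the "process characters as dchars" advice, assuming a current Phobos in which `canFind` accepts a string haystack and a `dchar` needle. The helper names `containsChar` and `countCodePoints` are hypothetical, used only for illustration. The haystack is declared as `string` here so that the literal can be assigned directly.

```d
import std.algorithm : canFind;

// Hypothetical helper: search a UTF-8 string for one code point.
bool containsChar(string s, dchar needle)
{
    return s.canFind(needle);
}

// Hypothetical helper: count code points by letting foreach decode to dchar.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "àèì";

    // The needle is a whole code point, so the multi-byte sequence is matched as a unit.
    assert(containsChar(s, 'è'));
    assert(!containsChar(s, 'ò'));

    // Six UTF-8 code units, but only three characters.
    assert(s.length == 6);
    assert(countCodePoints(s) == 3);
}
```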
dchar works but simple solution doesn't.

code:

    char[] chars = "òà";
    chars.canFind('à');

It says:

    Error: cannot implicitly convert expression ("\xc3\xb2\xc3\xa0") of type string to char[]
    Error: template std.algorithm.canFind(alias pred = "a == b",Range,V) if (is(typeof(find!(pred)(range,value)))) does not match any function template declaration
    Error: template std.algorithm.canFind(alias pred = "a == b",Range,V) if (is(typeof(find!(pred)(range,value)))) cannot deduce template function from argument types !()(char[],wchar)

On Tue, 22/11/2011 at 08.49 -0800, Jonathan M Davis wrote:
> On Tuesday, November 22, 2011 17:38:36 Andrea Fontana wrote:
>> I guess I should use wchar instead of char. :)
>
> Individual characters really should be processed as dchars in the general
> case. There's a simple solution here though:
>
> char[] chars = "àèì";
>
> - Jonathan M Davis
Nov 22 2011
On 11/22/11 11:02 AM, Andrea Fontana wrote:
> char[] chars = "òà";

Use string/auto here, or .dup on the string.

I filed http://d.puremagic.com/issues/show_bug.cgi?id=6988 on your behalf. Thanks for sharing!

Andrei
Nov 22 2011
On 2011-11-22 18:14, Andrei Alexandrescu wrote:
> On 11/22/11 11:02 AM, Andrea Fontana wrote:
>> char[] chars = "òà";
>
> Use string/auto here, or .dup on the string.
>
> I filed http://d.puremagic.com/issues/show_bug.cgi?id=6988 on your behalf.
> Thanks for sharing!
>
> Andrei

Hasn't this already been reported?

--
/Jacob Carlborg
Nov 22 2011
On Tuesday, November 22, 2011 18:02:53 Andrea Fontana wrote:
> dchar works but simple solution doesn't.
>
> code:
>
> char[] chars = "òà";
> chars.canFind('à');
>
> It says:
>
> Error: cannot implicitly convert expression ("\xc3\xb2\xc3\xa0") of type
> string to char[]
> Error: template std.algorithm.canFind(alias pred = "a == b",Range,V) if
> (is(typeof(find!(pred)(range,value)))) does not match any function
> template declaration
> Error: template std.algorithm.canFind(alias pred = "a == b",Range,V) if
> (is(typeof(find!(pred)(range,value)))) cannot deduce template function
> from argument types !()(char[],wchar)

Ah. Yes. String literals are immutable (at least in Linux). So, you'd need to dup it if you want a mutable char[] instead of a string. The normal case is to use a string though, so unless you actually want to mutate the characters in the array (which is frequently an iffy thing to do with char[], since you have to worry about not screwing up the code points), you should use string.

- Jonathan M Davis
Nov 22 2011
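The string/.dup advice from Andrei and Jonathan can be sketched roughly as follows, assuming a current dmd. The helper name `mutableCopy` is hypothetical; `std.utf.validate` is used only to check that the edited array is still well-formed UTF-8.

```d
import std.utf : validate;

// Hypothetical helper: get a mutable char[] from an immutable string by copying it.
char[] mutableCopy(string s)
{
    return s.dup;
}

void main()
{
    // A string literal does not convert implicitly to a mutable char[].
    static assert(!__traits(compiles, { char[] c = "òà"; }));

    // The normal case: keep it as a string (auto infers string as well).
    string fixed = "òà";
    auto inferred = "òà";
    static assert(is(typeof(inferred) == string));

    // Only when mutation is really needed: work on a copy.
    char[] editable = mutableCopy(fixed);
    assert(editable == fixed);

    // Replace both code units of 'ò' with the two code units of 'è',
    // so the array stays valid UTF-8.
    editable[0 .. 2] = "è";
    validate(editable);           // throws UTFException if the edit broke the encoding
    assert(editable == "èà");
    assert(fixed == "òà");        // the original string is untouched
}
```

Overwriting a single code unit of a multi-byte character would leave an invalid sequence, which is the "screwing up the code points" risk mentioned above.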