digitalmars.D.learn - extended characterset output
- anonymous (23/23) Apr 07 2022 What's the proper way to output all characters in the extended
- Ali Çehreli (45/56) Apr 08 2022 It is not easy to answer because there are a number of concepts here
- anonymous (12/54) Apr 08 2022 I tried that. It didn't work.
- Ali Çehreli (3/5) Apr 08 2022 Some distributions install an old gdc. What version is yours?
- anonymous (11/17) Apr 08 2022 Not sure actually. I just did "apt install gdc" and assumed the
- anonymous (5/8) Apr 08 2022 [snip]
What's the proper way to output all characters in the extended character set?

```d
void main()
{
    foreach(char c; 0 .. 256)
    {
        write(isControl(c) ? '.' : c);
    }
}
```

Expected output:

```
................................
```

Actual output:

```
................................
```

Works as expected in Python.

Thanks
Apr 07 2022
On 4/7/22 23:13, anonymous wrote:

> What's the proper way to output all characters in the extended character set?

It is not easy to answer because there are a number of concepts here that may make it trivial or complicated. The configuration of the output device matters. Is it set to Windows-1252 or are you using Unicode strings in Python?

> ```d
> void main()
> {
>     foreach(char c; 0 .. 256)
> ```

'char' is wrong there because 'char' has a very special meaning in D: a UTF-8 code unit. That is not a full Unicode character in many cases, especially in the "extended" set.

I think your problem will be solved simply by replacing 'char' with 'dchar' there:

```d
foreach (dchar c; ...
```

However, isControl() below won't work because isControl() only knows about the ASCII table. It would miss the unprintable characters above 127.

> ```d
>     {
>         write(isControl(c) ? '.' : c);
>     }
> }
> ```

This works:

```d
import std.stdio;

bool isPrintableLatin1(dchar value) {
  if (value < 32) {
    return false;
  }

  if (value > 126 && value < 161) {
    return false;
  }

  return true;
}

void main() {
  foreach (dchar c; 0 .. 256) {
    write(isPrintableLatin1(c) ? c : '.');
  }

  writeln();

  // import std.encoding;
  // foreach(ubyte c; 0 .. 256) {
  //   if (isPrintableLatin1(c)) {
  //     Latin1Char[1] from = [ cast(Latin1Char)c ];
  //     string to;
  //     transcode(from, to);
  //     write(to);
  //   } else {
  //     write('.');
  //   }
  // }
  // writeln();
}
```

I left some code commented-out, which I experimented with. (That works as well.)

Ali
Apr 08 2022
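
The point about 'char' being a UTF-8 code unit can be checked directly. What follows is a minimal sketch, not code from the thread: the helper name `utf8Bytes` is hypothetical, and it assumes `std.utf.encode` is used to turn a single code point into its UTF-8 code units.

```d
import std.stdio;
import std.utf : encode;

/// Hypothetical helper (not from the thread): returns the UTF-8 code
/// units that encode a single code point.
ubyte[] utf8Bytes(dchar c)
{
    char[4] buf;                      // a code point needs at most 4 code units
    immutable len = encode(buf, c);   // number of code units written
    return cast(ubyte[]) buf[0 .. len].dup;
}

void main()
{
    // 'A' (U+0041) fits in one code unit; U+00E9 needs two.
    foreach (dchar c; "A\u00E9"d)
    {
        writef("U+%04X ->", cast(uint) c);
        foreach (b; utf8Bytes(c))
        {
            writef(" %02X", b);
        }
        writeln();
    }
}
```

A value in the range 128 to 255 stored in a single 'char' is therefore not a complete character on a UTF-8 terminal, whereas a 'dchar' holds the whole code point and is encoded as needed when written.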
On Friday, 8 April 2022 at 08:36:33 UTC, Ali Çehreli wrote:

> On 4/7/22 23:13, anonymous wrote:
>
> > What's the proper way to output all characters in the extended characterset?
>
> It is not easy to answer because there are a number of concepts here that may make it trivial or complicated. The configuration of the output device matters. Is it set to Windows-1252 or are you using Unicode strings in Python?

I'm running Ubuntu and my default language is en_US.UTF-8.

> > ```d
> > void main()
> > {
> >     foreach(char c; 0 .. 256)
> > ```
>
> 'char' is wrong there because 'char' has a very special meaning in D: A UTF-8 code unit. Not a full Unicode character in many cases, especially in the "extended" set. I think your problem will be solved simply by replacing 'char' with 'dchar' there: foreach (dchar c; ...

I tried that. It didn't work.

> However, isControl() below won't work because isControl() only knows about the ASCII table. It would miss the unprintable characters above 127.

Oh okay, that may have been the reason.

> > ```d
> >     {
> >         write(isControl(c) ? '.' : c);
> >     }
> > }
> > ```
>
> This works:
>
> ```d
> import std.stdio;
>
> bool isPrintableLatin1(dchar value) {
>   if (value < 32) {
>     return false;
>   }
>
>   if (value > 126 && value < 161) {
>     return false;
>   }
>
>   return true;
> }
>
> void main() {
>   foreach (dchar c; 0 .. 256) {
>     write(isPrintableLatin1(c) ? c : '.');
>   }
> ```

Nope... running this code, I get a bunch of digits as the output. The dots don't even show up. Maybe I'm drunk or lacking sleep.

Weird, I got this strange feeling that this problem stemmed from the compiler I'm using (GDC), so I installed DMD. Would you believe everything worked fine afterwards? That includes the original version where I used isControl and 'dchar' instead of 'char'. I wonder why that is?

Thanks Ali.
Apr 08 2022
On 4/8/22 02:51, anonymous wrote:

> Weird, I got this strange feeling that this problem stemmed from the compiler I'm using (GDC)

Some distributions install an old gdc. What version is yours?

Ali
Apr 08 2022
On Friday, 8 April 2022 at 15:06:41 UTC, Ali Çehreli wrote:

> On 4/8/22 02:51, anonymous wrote:
>
> > Weird, I got this strange feeling that this problem stemmed from the compiler I'm using (GDC)
>
> Some distributions install an old gdc. What version is yours?
>
> Ali

Not sure actually. I just did "apt install gdc" and assumed it was the latest available. Let me check. Here's the version output (10.3.0?):

```
anon ymous:~/$ gdc --version
gdc (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```
Apr 08 2022
On Friday, 8 April 2022 at 08:36:33 UTC, Ali Çehreli wrote:

[snip]

> However, isControl() below won't work because isControl() only knows about the ASCII table. It would miss the unprintable characters above 127.

[snip]

This actually works because I'm using std.uni.isControl() instead of std.ascii.isControl().
Apr 08 2022
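
The difference between the two isControl() functions mentioned in the last post can be seen on a character above 127. This is a rough sketch rather than code from the thread: the helper name `printableOrDot` is hypothetical, and the example assumes the documented behaviour that std.ascii.isControl() only covers the ASCII control characters while std.uni.isControl() covers the Unicode general category Cc (which includes U+0080 to U+009F).

```d
import std.stdio;
static import std.ascii;
static import std.uni;

/// Hypothetical helper (not from the thread): replaces Unicode control
/// characters with a dot and passes every other code point through.
dchar printableOrDot(dchar c)
{
    return std.uni.isControl(c) ? '.' : c;
}

void main()
{
    // U+0085 (NEXT LINE) is a control character outside the ASCII range.
    dchar nel = '\u0085';
    assert(!std.ascii.isControl(nel)); // the ASCII version does not flag it
    assert(std.uni.isControl(nel));    // the Unicode version does

    // Both functions agree inside the ASCII range.
    assert(std.ascii.isControl('\n') && std.uni.isControl('\n'));
    assert(!std.ascii.isControl('A') && !std.uni.isControl('A'));

    // The loop from the original question, with dchar and std.uni.isControl.
    foreach (dchar c; 0 .. 256)
    {
        write(printableOrDot(c));
    }
    writeln();
}
```

With both modules imported statically, each call names its module explicitly, which avoids any ambiguity about which isControl() is in use.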