digitalmars.D.learn - UTF-8 char and write(f)ln
- Vindex9 (41/41) Mar 04 Program:
- Ali Çehreli (8/26) Mar 05 It shouldn't and does not work in my environment (both inside an Emacs
- Vindex9 (5/7) Mar 05 I used Terminator. I tried other terminal emulators and they
- Ali Çehreli (28/35) Mar 05 The program just outputs characters to its stdout. One way to see this
- Vindex9 (5/7) Mar 05 Further experiments showed inconsistent behavior. The oddities
Program:

```d
import std.stdio;

void main()
{
    string s = "ταυ";
    foreach(i, elem; s) {
        writefln("%s %s '%s'", i, cast(int)elem, elem);
        writefln("%s", elem);
    }
}
```

Output:

```
0 207 '�'
1 132 '�'
τ
2 206 '�'
3 177 '�'
α
4 207 '�'
5 133 '�'
υ
```

How does the second writefln know about the context and can adequately output a character on every other iteration?

However, if you do it this way (see below), the output is very strange with arbitrary line breaks.

```d
foreach(i, elem; s) {
    writefln("'%s', %s", elem, elem);
}
```

Output:

```
'�', �'�', �
'�', �'�', �
'�', �'�', �
```
Mar 04
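For context on the numbers in that output: `foreach` over a `string` with an untyped or `char` loop variable visits UTF-8 code units, not characters, which is why each Greek letter shows up as two values (207 132 for τ, and so on). The sketch below is not from the thread; the helper names `codeUnits` and `codePoints` are made up for illustration. It shows the same string walked both ways, and asking for `dchar` makes `foreach` decode whole code points.

```d
import std.stdio;

// Hypothetical helper: collects the raw UTF-8 code units of s.
int[] codeUnits(string s)
{
    int[] units;
    foreach (char c; s)      // char: one code unit (byte) per iteration
        units ~= c;
    return units;
}

// Hypothetical helper: collects the decoded code points of s.
dchar[] codePoints(string s)
{
    dchar[] points;
    foreach (dchar c; s)     // dchar: foreach decodes UTF-8 on the fly
        points ~= c;
    return points;
}

void main()
{
    string s = "ταυ";

    writeln(codeUnits(s));   // [207, 132, 206, 177, 207, 133]

    // With dchar, the index is the offset of the first code unit
    // of each character, so it advances by 2 for these letters.
    foreach (i, dchar c; s)
        writefln("%s '%s'", i, c);
}
```

Writing a single `char` such as 207 to stdout therefore emits half of a two-byte sequence; what appears on screen is up to the terminal.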
On 3/4/25 12:09 AM, Vindex9 wrote:

> Program:
> ```d
> import std.stdio;
>
> void main()
> {
>     string s = "ταυ";
>     foreach(i, elem; s) {
>         writefln("%s %s '%s'", i, cast(int)elem, elem);
>         writefln("%s", elem);
>     }
> }
> ```
> Output:
> ```
> 0 207 '�'
> 1 132 '�'
> τ
> ```
> [...]
> How does the second writefln know about the context and can adequately output a character on every other iteration?

It shouldn't and does not work in my environment (both inside an Emacs buffer and inside a Linux terminal). I think you are running your program in an environment where 132 is mapped to τ, etc. Perhaps a "code page" setting is helping (or hurting) you there?

Ali
Mar 05
On Wednesday, 5 March 2025 at 17:56:25 UTC, Ali Çehreli wrote:

> It shouldn't and does not work in my environment (both inside an Emacs buffer and inside a Linux terminal).

I used Terminator. I tried other terminal emulators and they behaved differently (output unrecognized bits of characters as question marks). Apparently Terminator has some weird buffering.

Sorry to bother you.
Mar 05
On 3/5/25 11:59 AM, Vindex9 wrote:

> On Wednesday, 5 March 2025 at 17:56:25 UTC, Ali Çehreli wrote:
>> It shouldn't and does not work in my environment (both inside an Emacs buffer and inside a Linux terminal).

The program just outputs characters to its stdout. One way to see this process is to redirect 'stdout' to a file:

    $ my_program > my_output

Then, when you open file 'my_output' in a hex editor, you should see that the program did output just a single char with value e.g. 132. There are no other UTF-8 characters right after it, so I wouldn't expect 'τ' to be formed on the output.

> I used Terminator. I tried other terminal emulators and they behaved differently (output unrecognized bits of characters as question marks). Apparently Terminator has some weird buffering.

They probably keep state for Unicode characters but don't reset it. (I don't know whether they are required to.)

Could you please try the following program to see whether it prints τ for all tau arrays below?

```d
import std.stdio;
import std.algorithm;

void main()
{
    char[][] taus = [ [ 207, 132 ], [ 207, 0, 132 ], [ 207, 'a', 132 ] ];

    foreach (i, tau; taus) {
        write(i, ": ");
        tau.each!(write);
        writeln();
    }
}
```

For me, only the first one is a τ in a Unicode environment. You may see 3 taus under Terminator.

> Sorry to bother you.

Not at all! I assume everybody here finds these topics very interesting like I do. :)

Ali
Mar 05
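The hex-editor check suggested above can also be approximated inside D. The following is a minimal sketch, not something posted in the thread: `isValidUtf8` and `hexDump` are hypothetical helper names, and the byte sequences mirror the tau arrays from the test program (97 is 'a'). It uses `std.utf.validate`, which throws `UTFException` on malformed input, to show that only the uninterrupted pair 207 132 is a well-formed encoding of τ.

```d
import std.format : format;
import std.stdio;
import std.utf : UTFException, validate;

// Hypothetical helper: true if bytes form well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        // validate throws UTFException on malformed sequences.
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

// Hypothetical helper: space-separated hex view, like a hex editor row.
string hexDump(const(ubyte)[] bytes)
{
    return format("%(%02X %)", bytes);
}

void main()
{
    const(ubyte)[][] samples = [
        [207, 132],        // complete two-byte sequence for τ
        [207],             // lead byte only
        [132],             // continuation byte only
        [207, 0, 132],     // pair interrupted by a NUL
        [207, 97, 132],    // pair interrupted by 'a'
    ];

    foreach (sample; samples)
        writefln("%-10s valid UTF-8: %s", hexDump(sample), isValidUtf8(sample));
}
```

A terminal that still renders τ for the interrupted sequences is presumably holding on to the pending lead byte, which fits the guess above about decoder state not being reset.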
On Wednesday, 5 March 2025 at 21:26:32 UTC, Ali Çehreli wrote:

> They probably keep state for Unicode characters but don't reset it. (I don't know whether they are required to.)

Further experiments showed inconsistent behavior. The oddities are well reproduced with the terminal plugin for neovim ('akinsho/toggleterm.nvim'). So things aren't that interesting anymore.
Mar 05