D - other languages for output.writeLine
- Y.Tomino (43/46) Nov 22 2003 Hello.
- Y.Tomino (11/11) Nov 22 2003 Sorry, I mistook editing.
- Walter (7/53) Nov 22 2003 I'm puzzled why it's necessary to convert to wide char and then back to
- Y.Tomino (22/25) Nov 23 2003 Because WriteFile can't output unicode letters to console.
- Y.Tomino (3/7) Nov 23 2003 mstring m = toMBCS(w);
- Hauke Duden (27/40) Nov 23 2003 This will only work if the current system codepage is UTF-8, since
- Y.Tomino (8/29) Nov 23 2003 Sorry, It's my editing mistake. My toUTF16(mstring) is not used.
- Matthew Wilson (3/9) Nov 23 2003 That would be my expectation of any unimplemented API function on Win9x ...
- Hauke Duden (9/12) Nov 23 2003 Maybe, maybe not. In my experience, when you're dealing with the Windows...
- Matthew Wilson (10/21) Nov 23 2003 ERROR_CALL_NOT_IMPLEMENTED
- Hauke Duden (19/22) Nov 24 2003 Well, that can only be true for functions that already existed when the
- Y.Tomino (9/12) Nov 24 2003 Even in the case of NT/2000/XP, WriteConsoleW may fail if handle of
- Matthew Wilson (17/37) Nov 24 2003 Sure. I don't think anyone's suggesting otherwise.
- Hauke Duden (26/61) Nov 24 2003 The point I was trying to make is that you cannot generally assume that
- Matthew Wilson (8/14) Nov 24 2003 The alternative is to have the requisite amount of equivalent code build
- Raiko (8/27) Nov 23 2003 Just to jump in for a second.
- Matthew Wilson (6/30) Nov 23 2003 You're not out of place! :-)
- Hauke Duden (10/11) Nov 24 2003 Unfortunately, that would require every D application to ship with the
- Julio César Carrascal Urquijo (2/4) Nov 24 2003 There's always ICU, which is included in Parrot (Perl 6's engine).
Hello.

DMD accepts Unicode identifiers when the source file is written in UTF-8, but we can't output non-ASCII letters (Japanese, etc.). This code fixes that, so non-ASCII letters can be written to the console from UTF-8 source code.

YT

When DMD 0.74 was released, Walter wrote:
> That is a problem, I'm not sure what to do about it. One thing I have been looking for is a mapping from Shift-JIS to unicode. Do you have such a table?

----
typedef char[] mstring; //multi-byte encoding string

wchar[] toUTF16(mstring s)
{
    wchar[] result;
    // code page 0 is CP_ACP, the current ANSI code page
    result.length = MultiByteToWideChar(0, 0, s, s.length, null, 0);
    MultiByteToWideChar(0, 0, s, s.length, result, result.length);
    return result;
}

class Console : File
{
    this(HANDLE _handle, FileMode _mode){
        super(_handle, _mode);
    }

    override void writeString(char[] s)
    {
        if(s.length > 0){
            DWORD written;
            wchar[] w = toUTF16(s);
            if(WriteConsoleW(handle, &w[0], w.length, &written, null) == FALSE){
                mstring m = toMBCS(w);
                if(WriteConsoleA(handle, &m[0], m.length, &written, null) == FALSE){
                    super.writeString(m); // for redirect
                }
            }
        }
    }

    override void write(char[] s)
    {
        super.write(s.length);
        writeExact(&s[0], s.length * char.size); // for binary
    }

    static this()
    {
        std.stream.stdout = new Console(std.stream.stdout.handle(), FileMode.Out);
        std.stream.stderr = new Console(std.stream.stderr.handle(), FileMode.Out);
    }
}
Nov 22 2003
Sorry, I made an editing mistake. toMBCS is here.

----
mstring toMBCS(wchar[] s)
{
    mstring result;
    // code page 0 is CP_ACP, the current ANSI code page
    result.length = WideCharToMultiByte(0, 0, s, s.length, null, 0, null, null);
    WideCharToMultiByte(0, 0, s, s.length, result, result.length, null, null);
    return result;
}
Nov 22 2003
I'm puzzled why it's necessary to convert to wide char and then back to multi byte?

"Y.Tomino" <demoonlit inter7.jp> wrote in message news:bpp6od$1lfm$1 digitaldaemon.com...
> Hello.
> DMD accepts the unicode identifier when source file is written with UTF-8. But we can't output non-ascii letters (Japanese, etc). This code fix it to be able to output non-ascii letters to console with UTF-8 source code.
> YT
>
> When DMD 0.74 released, Walter wrote.
>> That is a problem, I'm not sure what to do about it. One thing I have been looking for is a mapping from Shift-JIS to unicode. Do you have such a table?
> ----
> typedef char[] mstring; //multi-byte encoding string
> <snip>
Nov 22 2003
Because WriteFile can't output Unicode letters to the console, while WriteConsoleW works correctly. String literals in the source code are UTF-8, so they are first converted to UTF-16 for WriteConsoleW.

But WriteConsoleW doesn't work on Windows 95/98/Me. The Microsoft Platform SDK says:
> Implemented as Unicode and ANSI versions on Windows NT/2000/XP. Also supported by Microsoft Layer for Unicode.

So it calls WriteConsoleA if WriteConsoleW fails. WriteConsoleA's argument must be a multi-byte string. A multi-byte string is not UTF-8, so it is necessary to convert with WideCharToMultiByte. Since Unicode has many more characters than MBCS (Shift-JIS), it tries WriteConsoleW first.

And when output is redirected ( C:\>myexe > output.txt ), the console API may fail, so it has to call super.writeString. But for redirected output, a multi-byte encoded text file is the natural result, as with other programs. Therefore it passes "m" instead of "s" to super.writeString.

Thanks.
YT

"Walter" <walter digitalmars.com> wrote in message news:bppk03$28l6$1 digitaldaemon.com...
> I'm puzzled why it's necessary to convert to wide char and then back to multi byte?
Nov 23 2003
Sorry, WriteConsoleA is the same as WriteFile in this case, so it's unnecessary.

> mstring m = toMBCS(w);
> if(WriteConsoleA(handle, &m[0], m.length, &written, null) == FALSE){
>     super.writeString(m); // for redirect
> }

mstring m = toMBCS(w);
super.writeString(m); // for 95/98/Me and redirect
Nov 23 2003
Y.Tomino wrote:
> wchar[] toUTF16(mstring s)
> {
>     wchar[] result;
>     result.length = MultiByteToWideChar(0, 0, s, s.length, null, 0);
>     MultiByteToWideChar(0, 0, s, s.length, result, result.length);
>     return result;
> }
> <snip>
> override void writeString(char[] s)
> {
>     if(s.length > 0){
>         DWORD written;
>         wchar[] w = toUTF16(s);

This will only work if the current system codepage is UTF-8, since MultiByteToWideChar assumes that the input string is in the current code page. Passing CP_UTF8 to MultiByteToWideChar won't help either, because that is only supported on Win98 and up.

Seems to me that the only way to do this is to manually convert the string from UTF-8 to UTF-16 (not that big a deal). The Win32 functions won't help you much because there's absolutely no Unicode support on Win95.

> if(WriteConsoleW(handle, &w[0], w.length, &written, null) == FALSE){

This call might be a little dangerous. WriteConsoleW is not supported on Win9x, so there's no guarantee that it won't cause a crash on some systems or return an undefined result (or is there some explicit guarantee somewhere in the docs?).

It would probably be better to check whether the OS is an NT variant and call the W and A versions accordingly. Something like:

OSVERSIONINFO osVersion;
GetVersionEx(&osVersion);
if(osVersion.dwPlatformId==VER_PLATFORM_WIN32_NT)
    WriteConsoleW(...);
else
{
    mstring m = toMBCS(w);
    WriteConsoleA(...);
}

Hauke
Nov 23 2003
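To illustrate the manual UTF-8 to UTF-16 conversion Hauke suggests, here is a minimal sketch. It is not code from the thread: it is written against current D (D2) rather than the 2003 compiler, and the name utf8ToUtf16 is made up for illustration. It rejects malformed lead and continuation bytes but, as a simplification, does not reject overlong encodings or encoded surrogates.

```d
// Sketch only: hand-rolled UTF-8 -> UTF-16 conversion with no Win32 calls.
// utf8ToUtf16 is a hypothetical name, not a Phobos function.
wchar[] utf8ToUtf16(const(char)[] s)
{
    wchar[] result;
    size_t i = 0;
    while (i < s.length)
    {
        uint c = s[i];
        uint extra; // number of continuation bytes that must follow

        if (c < 0x80)                { extra = 0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; c &= 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; c &= 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; c &= 0x07; }
        else throw new Exception("invalid UTF-8 lead byte");

        if (i + extra >= s.length)
            throw new Exception("truncated UTF-8 sequence");

        // Fold in the low six bits of each continuation byte.
        foreach (k; 1 .. extra + 1)
        {
            uint b = s[i + k];
            if ((b & 0xC0) != 0x80)
                throw new Exception("invalid UTF-8 continuation byte");
            c = (c << 6) | (b & 0x3F);
        }
        i += extra + 1;

        if (c > 0x10FFFF)
            throw new Exception("code point out of Unicode range");

        if (c < 0x10000)
        {
            result ~= cast(wchar) c; // fits in one UTF-16 code unit
        }
        else
        {
            // Code points above U+FFFF become a surrogate pair.
            c -= 0x10000;
            result ~= cast(wchar)(0xD800 | (c >> 10));
            result ~= cast(wchar)(0xDC00 | (c & 0x3FF));
        }
    }
    return result;
}
```

In real code the library routine std.utf.toUTF16 is probably the better choice; the sketch only shows that the conversion does not depend on the system code page.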
Sorry, it's my editing mistake. My toUTF16(mstring) is not used. The toUTF16 called from writeString is std.utf.toUTF16(char[]), because D's typedef is strong. (I mistakenly copied my wrong toUTF16 instead of toMBCS :-)

> wchar[] toUTF16(mstring s)
> {
>     wchar[] result;
>     result.length = MultiByteToWideChar(0, 0, s, s.length, null, 0);
>     MultiByteToWideChar(0, 0, s, s.length, result, result.length);
>     return result;
> }
> <snip>
> override void writeString(char[] s)
> {
>     if(s.length > 0){
>         DWORD written;
>         wchar[] w = toUTF16(s);
>
> This will only work if the current system codepage is UTF-8, since MultiByteToWideChar assumes that the input string is in the current code page. Passing CP_UTF8 to MultiByteToWideChar won't help either, because that is only supported on Win98 and up.

> This call might be a little dangerous. WriteConsoleW is not supported on Win9x, so there's no guarantee that it won't cause a crash on some systems or return an undefined result (or is there some explicit guarantee somewhere in the docs?).

I think the ~W APIs return FALSE and GetLastError() = ERROR_CALL_NOT_IMPLEMENTED on Win9x... Will it crash or give an undefined result?

YT
Nov 23 2003
>> This call might be a little dangerous. WriteConsoleW is not supported on Win9x, so there's no guarantee that it won't cause a crash on some systems or return an undefined result (or is there some explicit guarantee somewhere in the docs?).
> I think ~W API return FALSE and GetLastError() = ERROR_CALL_NOT_IMPLEMENTED on Win9x...

That would be my expectation of any unimplemented API function on Win9x (as long as it actually exists, of course)
Nov 23 2003
Y.Tomino wrote:
> I think ~W API return FALSE and GetLastError() = ERROR_CALL_NOT_IMPLEMENTED on Win9x... Will it crash or undefined result ?

Maybe, maybe not. In my experience, when you're dealing with the Windows API you had better not rely on anything that is not explicitly stated in the documentation. Otherwise there will quite often be some obscure combination of Windows version, system language and system DLL versions that will violate your assumption.

So, since testing on all possible Windows configurations is close to impossible, I usually stick to the documented stuff.

Hauke
Nov 23 2003
"Hauke Duden" <H.NS.Duden gmx.net> wrote in message news:bprg1c$1s8k$1 digitaldaemon.com...
> Y.Tomino wrote:
>> I think ~W API return FALSE and GetLastError() = ERROR_CALL_NOT_IMPLEMENTED on Win9x... Will it crash or undefined result ?
> Maybe, maybe not. In my experience, when you're dealing with the Windows API you should better not rely on anything that is not explicitly stated in the documentation. Otherwise there will quite often be some obscure combination of Windows version, system language and system DLL versions that will violate your assumption.

It is my understanding that all unimplemented functions in the Win32 API for a given operating system cause the thread error to be set to ERROR_CALL_NOT_IMPLEMENTED.

> So, since testing on all possible Windows configuations is close to impossible I usually stick to the documented stuff.

Your caution is worthy, and I agree in most cases. However, I think in this case it is safe to go with GetLastError.

Cheers
Matthew
Nov 23 2003
Matthew Wilson wrote:
> It is my understanding that all unimplemented functions in the Win32 for a given operating system cause the thread error to be set to ERROR_CALL_NOT_IMPLEMENTED.

Well, that can only be true for functions that already existed when the operating system was shipped, right?

But I agree, if the "Ansi" version is supported, then the missing Unicode function will probably return NOT_IMPLEMENTED (or some other error - you can never be sure!).

However, the Unicode function might fail for other reasons as well and maybe the ANSI version doesn't. Could be a simple case of not having enough free memory for the Unicode strings, but just enough for the ANSI version. An automated fallback might cause inconsistency within the program and its data (e.g. mixed ANSI and Unicode data in a file or something similar). If you go the fallback route you'd have to at least check the error code. If you want to be on the safe side, that is.

I find it easier to just check for NTness. Since this boolean doesn't change, it can be checked once at startup and then stored, so you won't have to call GetVersionEx every time you have to decide between Ansi and Unicode versions. This might be something that could be done by Phobos - something like std.os.windows.isWinNT().

Hauke
Nov 24 2003
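The "check once and store" idea Hauke describes could look roughly like the sketch below. It is an illustration in current D (D2), not code from the thread. The function name isWinNT is the one Hauke proposes and is hypothetical here, and the non-Windows branch exists only so the sketch compiles everywhere. Note that GetVersionEx expects dwOSVersionInfoSize to be filled in before the call, which the short snippet earlier in the thread leaves out.

```d
version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical helper: asks the OS once and remembers the answer.
    bool isWinNT()
    {
        // Static locals are thread-local in D2, so each thread asks once.
        static bool checked = false;
        static bool result = false;

        if (!checked)
        {
            OSVERSIONINFOA info;
            // The size field must be set before calling GetVersionEx.
            info.dwOSVersionInfoSize = OSVERSIONINFOA.sizeof;
            // The A variant is used so the check itself also works on Win9x.
            if (GetVersionExA(&info))
                result = (info.dwPlatformId == VER_PLATFORM_WIN32_NT);
            checked = true;
        }
        return result;
    }
}
else
{
    // Assumption for the sketch: off Windows there is no W/A split to decide.
    bool isWinNT() { return false; }
}
```

A caller would then branch on isWinNT() to pick WriteConsoleW or the ANSI path, without calling GetVersionEx again.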
> But I agree, if the "Ansi" version is supported, then the missing Unicode function will probably return NOT_IMPLEMENTED (or some other error - you can never be sure!).

Even in the case of NT/2000/XP, WriteConsoleW may fail if the standard output handle was redirected (GetLastError() = ERROR_INVALID_HANDLE). WriteConsoleA may fail, too.

A simple way is this: if WriteConsoleW fails, pass the ANSI (MBCS) converted string to WriteFile. WriteFile can write an ANSI string to both the console and a redirected file.

Thanks.
YT
Nov 24 2003
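Combining this fallback with Hauke's advice to check the error code first, the decision could be sketched as a small pure function. This is an illustration in current D (D2), not code from the thread: NextStep and afterWriteConsoleW are hypothetical names, and the two error codes are the ones the posters mention, with their winerror.h values. Whether Win9x really reports ERROR_CALL_NOT_IMPLEMENTED for WriteConsoleW is the thread's assumption, not something the sketch verifies.

```d
// Win32 error codes, values as defined in winerror.h.
enum uint ERROR_INVALID_HANDLE = 6;
enum uint ERROR_CALL_NOT_IMPLEMENTED = 120;

// Hypothetical names, for illustration only.
enum NextStep
{
    done,          // WriteConsoleW succeeded, nothing more to do
    writeFileAnsi, // convert to the ANSI code page and call WriteFile
    reportError    // unexpected failure: do not fall back silently
}

// succeeded: result of WriteConsoleW; lastError: GetLastError() after it.
NextStep afterWriteConsoleW(bool succeeded, uint lastError)
{
    if (succeeded)
        return NextStep.done;

    switch (lastError)
    {
        case ERROR_CALL_NOT_IMPLEMENTED: // stubbed W function (assumed Win9x case)
        case ERROR_INVALID_HANDLE:       // output redirected to a file or pipe
            return NextStep.writeFileAnsi;
        default:
            return NextStep.reportError;
    }
}
```

Falling back only on the two expected codes keeps an unrelated failure from quietly switching the output to ANSI.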
>> It is my understanding that all unimplemented functions in the Win32 for a given operating system cause the thread error to be set to ERROR_CALL_NOT_IMPLEMENTED.
> Well, that can only be true for functions that already existed when the operating system was shipped, right?

Sure. I don't think anyone's suggesting otherwise. I don't understand your point.

> But I agree, if the "Ansi" version is supported, then the missing Unicode function will probably return NOT_IMPLEMENTED (or some other error - you can never be sure!).

Naturally a particular function may be incorrectly written. What I'm saying is that it is a design feature of Win9x that a stubbed (as opposed to entirely missing) function will set the NOT_IMPL value to the thread error.

> However, the Unicode function might fail for other reasons as well and maybe the ANSI version doesn't. Could be a simple case of not having enough free memory for the Unicode strings, but just enough for the ANSI version. An automated fallback might cause inconsistency within the program and its data (e.g. mixed ANSI and Unicode data in a file or something similar). If you go the fallback route you'd have to at least check the error code. If you want to be on the safe side, that is.

This doesn't make any kind of sense to me. Why would anyone call a function without allocating the appropriate amount of memory, other than through their own incompetence? And why would such incompetence only manifest when doing Unicode programming, and not ANSI?

> I find it easier to just check for NTness. Since this boolean doesn't change, it can be checked once at startup and then stored, so you won't have to call GetVersionEx every time you have to decide between Ansi and Unicode versions. This might be something that could be done by Phobos - something like std.os.windows.isWinNT().

That's entirely true. In fact, this would be more appropriate as a robust and consistent implementation.

But, given that, why not simply use MSLU, and take all the hassles from we poor overworked D people and utilise the industry, late in the day though it may be, of Microsoft. The ng for MSLU is well serviced, the library is free and redistributable, it is easy to use, and works well.

Matthew
Nov 24 2003
Matthew Wilson wrote:
>>> It is my understanding that all unimplemented functions in the Win32 for a given operating system cause the thread error to be set to ERROR_CALL_NOT_IMPLEMENTED.
>> Well, that can only be true for functions that already existed when the operating system was shipped, right?
> Sure. I don't think anyone's suggesting otherwise. I don't understand your point.

The point I was trying to make is that you cannot generally assume that all unimplemented functions return ERROR_CALL_NOT_IMPLEMENTED. This is not really that much of an issue for Unicode functions that have an implemented Ansi version, but there are other functions that only exist on NT that may not have a stub on Win9x. I guess I'm just saying that a consistent way to handle these issues would be preferable to trying to deduce which functions are "stub-unimplemented" as opposed to non-existent.

>> However, the Unicode function might fail for other reasons as well and maybe the ANSI version doesn't. Could be a simple case of not having enough free memory for the Unicode strings, but just enough for the ANSI version. An automated fallback might cause inconsistency within the program and its data (e.g. mixed ANSI and Unicode data in a file or something similar). If you go the fallback route you'd have to at least check the error code. If you want to be on the safe side, that is.
> This doesn't make any kind of sense to me. Why would anyone call a function without allocating the appropriate amount of memory, other than through their own incompetence? And why would such incompetence only manifest when doing Unicode programming, and not ANSI?

Simple example: you have 5000 bytes of free disk space and want to write a 4000 character string to a file, using an imaginary WriteStringToFileA/W function. Your system is Win2000, so WriteStringToFileW exists. However, the call to WriteStringToFileW will fail because this implementation needs 8000 bytes of disk space. The Ansi version will succeed, though, since it only needs 4000 bytes. If you automatically fall back to the Ansi version without checking the error code, then you end up writing Ansi data into a file that was supposed to hold Unicode data.

>> I find it easier to just check for NTness. Since this boolean doesn't change, it can be checked once at startup and then stored, so you won't have to call GetVersionEx every time you have to decide between Ansi and Unicode versions. This might be something that could be done by Phobos - something like std.os.windows.isWinNT().
> That's entirely true. In fact, this would be more appropriate as a robust and consistent implementation. But, given that, why not simply use MSLU, and take all the hassles from we poor overworked D people and utilise the industry, late in the day though it may be, of Microsoft. The ng for MSLU is well serviced, the library is free and redistributable, it is easy to use, and works well.

Because AFAIK the MSLU is not installed on any Win9x system by default. Certainly not on Win95. So you'd have to ship it with every application.

For some applications that may be acceptable, but for others it might not. For example, it wouldn't be possible to write a ZIP self-extractor in D, because the .exe file would need an additional DLL to extract its contents.

Hauke
Nov 24 2003
> Because AFAIK the MSLU is not installed on any Win9x system by default.

Correct

> Certainly not on Win95. So you'd have to ship it with every application.

True. And I certainly acknowledge the problems this causes.

> For some applications that may be acceptable, but for others it might not. For example, it wouldn't be possible to write a ZIP self-extractor in D, because the .exe file would need an additional DLL to extract its contents.

The alternative is to have the requisite amount of equivalent code built into the library. This is an approach I've taken often. It's a swings & roundabouts deal. I would certainly prefer the statically bound approach, but I'm aware of what a huge job it would be to make this work.

Matthew
Nov 24 2003
Hauke Duden wrote:
> Y.Tomino wrote:
>> I think ~W API return FALSE and GetLastError() = ERROR_CALL_NOT_IMPLEMENTED on Win9x... Will it crash or undefined result ?
> Maybe, maybe not. In my experience, when you're dealing with the Windows API you should better not rely on anything that is not explicitly stated in the documentation. Otherwise there will quite often be some obscure combination of Windows version, system language and system DLL versions that will violate your assumption.
> So, since testing on all possible Windows configuations is close to impossible I usually stick to the documented stuff.
> Hauke

Just to jump in for a second. A lot of Unicode APIs are supported in Win9x if you have the Unicode layer, e.g. WriteConsoleW (from the Platform SDK docs):

Windows Me/98/95: WriteConsoleW is supported by the Microsoft Layer for Unicode. To use this, you must add certain files to your application, as outlined in Microsoft Layer for Unicode on Windows Me/98/95 Systems.

Sorry for being out of place :)
Nov 23 2003
You're not out of place! :-)

Using MSLU might be an option. It's redistributable, and pretty reliable. (In fact, the December issue of Windows Developer Network contains an interesting article on the issue, by one of our foremost authors ...)

Cheers
Matthew

> Just to jump in for a second. Alot of Unicode APIs are supported in Win9x if you have the Unicode layer ie.. WriteConsoleW (from the Platform SDK docs)
> Windows Me/98/95: WriteConsoleW is supported by the Microsoft Layer for Unicode. To use this, you must add certain files to your application, as outlined in Microsoft Layer for Unicode on Windows Me/98/95 Systems.
> Sorry for being out of place :)
> <snip>
Nov 23 2003
Raiko wrote:
> Alot of Unicode APIs are supported in Win9x if you have the Unicode layer

Unfortunately, that would require every D application to ship with the MSLU DLL. It's pretty small by today's standards, granted, but I don't think that it should be required.

Besides, the MSLU does have some quirks. There are quite a lot of bugs in there when it comes to error handling or rarely used functions. And Microsoft doesn't really support it well either.

And, of course, much of the GUI stuff is not included in the MSLU (Common Controls!).

Hauke
Nov 24 2003
> Unicode. To use this, you must add certain files to your application, as outlined in Microsoft Layer for Unicode on Windows Me/98/95 Systems.

There's always ICU, which is included in Parrot (Perl 6's engine).

http://oss.software.ibm.com/icu/userguide/index.html
Nov 24 2003