digitalmars.D.learn - How to ensure string compatibility in D?

reply FrankLike <1150015857 qq.com> writes:
Hi, everyone,
   In C++, _T guarantees that the program does not need to be 
modified when switching from an ASCII encoding to a Unicode 
encoding. What do I need to do in D?

Thanks.
Jan 22
parent reply Olivier Pisano <olivier.pisano laposte.net> writes:
On Tuesday, 22 January 2019 at 13:55:30 UTC, FrankLike wrote:
 Hi,everyone,
   In C++, _T can guarantee that when converting from ascii 
 encoding type to unicode encoding type, the program does not 
 need to be modified. What do I need to do in D?

 Thanks.
Hi,

_T is not relevant to C++, but to Windows programming.

In D, there is only Unicode. The language doesn't manipulate strings encoded in Windows local code pages.

char means UTF-8 (Unicode encoded in 8-bit units).
wchar means UTF-16 (Unicode encoded in 16-bit units, what Windows documentation calls "Unicode").
dchar means UTF-32 (Unicode encoded in 32-bit units).

When manipulating data encoded in a Windows local code page, use the ubyte[] type.
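As a small illustration of the three character types described above, the following sketch stores the same text in each string type and prints its length in code units. The sample text is an arbitrary choice made for this example.

```d
import std.stdio : writeln;

void main()
{
    // The same text in each of D's three Unicode string types.
    // \u00E9 is the letter e with an acute accent.
    string  s8  = "h\u00E9llo";   // UTF-8,  immutable(char)[]
    wstring s16 = "h\u00E9llo"w;  // UTF-16, immutable(wchar)[]
    dstring s32 = "h\u00E9llo"d;  // UTF-32, immutable(dchar)[]

    // .length counts code units, not characters:
    // the accented letter needs two UTF-8 code units.
    writeln(s8.length);   // 6
    writeln(s16.length);  // 5
    writeln(s32.length);  // 5

    // The code unit sizes match the encodings.
    static assert(char.sizeof == 1 && wchar.sizeof == 2 && dchar.sizeof == 4);
}
```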
Jan 22
parent reply FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 14:07:48 UTC, Olivier Pisano wrote:
 On Tuesday, 22 January 2019 at 13:55:30 UTC, FrankLike wrote:
 In D, there is only Unicode. The language doesn't manipulate 
 strings encoded in Windows local code-pages.
For example:

std::wstring strTest(_T("d://"));
UINT nRes = ::GetDriveType(strTest.c_str());

This works in C++. But in D:

// This version works:

import std.stdio;
import std.string;
import std.conv;
import win32.winbase;

void main()
{
    string strA_Z = "CD";
    auto type = GetDriveType((to!string(strA_Z[0]) ~ ":\\").toStringz);
    writeln(to!string(strA_Z[0]) ~ " is ", type);
}

// This version fails to compile:

import core.sys.windows.windows;
import std.stdio;
import std.string;
import std.conv;

void main()
{
    string strA_Z = "CD";
    auto type = GetDriveType((to!string(strA_Z[0]) ~ ":\\").toStringz);
    writeln(to!string(strA_Z[0]) ~ " is ", type);
}

// Error info:

slicea2.d(9): Error: function core.sys.windows.winbase.GetDriveTypeW(const(wchar)*) is not callable using argument types (immutable(char)*)
slicea2.d(9):        cannot pass argument toStringz(to(strA_Z[0]) ~ ":\\") of type immutable(char)* to parameter const(wchar)*

Is there an error in "core.sys.windows.windows"?

Thank you.
Jan 22
parent reply FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 16:13:57 UTC, FrankLike wrote:
 On Tuesday, 22 January 2019 at 14:07:48 UTC, Olivier Pisano 
 wrote:
Is there an error in "core.sys.windows.windows"? Thank you.
Jan 22
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
Use "mystring"w, notice the w after the closing quote.
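A minimal sketch of what the w suffix does, runnable on any platform. It relies on the fact that D string literals are zero-terminated and convert implicitly to a pointer; that holds for literals only, not for strings built at run time. The path used is just an example value.

```d
import std.stdio : writeln;

void main()
{
    // The w suffix makes the literal a wstring (UTF-16).
    auto root = "C:\\"w;
    static assert(is(typeof(root) == wstring));

    // A literal converts implicitly to a pointer and is zero-terminated,
    // so it can be handed to a parameter of type const(wchar)*.
    const(wchar)* p = "C:\\"w;
    writeln(p[0 .. 3]);     // C:\
    writeln(p[3] == '\0');  // true
}
```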
Jan 22
next sibling parent FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 16:18:17 UTC, Adam D. Ruppe wrote:
 Use "mystring"w, notice the w after the closing quote.
In C++, the "GetDriveType" function works automatically thanks to "_T", but how do I do that in D?
Jan 22
prev sibling parent reply FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 16:18:17 UTC, Adam D. Ruppe wrote:
 Use "mystring"w, notice the w after the closing quote.
Or does toStringz not work like c_str() in C++?
Jan 22
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 22 January 2019 at 16:47:45 UTC, FrankLike wrote:
 Or does toStringz not work like c_str() in C++?
toStringz creates a char*, but you need a wchar*.
Jan 22
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 22, 2019 12:05:32 PM MST Stefan Koch via Digitalmars-d-learn wrote:
 toStringz creates a char*, but you need a wchar*.
std.utf.toUTF16z or toUTFz can do that for you, though if your string is already a wstring, then you can also just concatenate '\0' to it. The big advantage of toUTF16z is that it will also convert strings of other character types rather than just wstrings. So, you can write your program using proper UTF-8 strings and then only convert to UTF-16 for the Windows stuff when you have to.

https://dlang.org/phobos/std_utf.html#toUTF16z
https://dlang.org/phobos/std_utf.html#toUTFz

- Jonathan M Davis
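A short sketch of that approach: the program keeps an ordinary UTF-8 string and converts it with std.utf.toUTF16z only at the point where a zero-terminated UTF-16 pointer is needed. The variable names are made up for this example, and no Windows call is made, so it runs anywhere.

```d
import std.stdio : writeln;
import std.utf : toUTF16z;

void main()
{
    // Build the path at run time as a normal UTF-8 string.
    string letters = "CD";
    string root = letters[0 .. 1] ~ ":\\";

    // Convert to zero-terminated UTF-16 only at the API boundary.
    const(wchar)* p = root.toUTF16z;

    writeln(p[0 .. 3]);  // C:\
    writeln(p[3] == 0);  // true: the terminator is present
}
```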
Jan 22
parent reply bauss <jj_1337 live.dk> writes:
On Tuesday, 22 January 2019 at 19:14:43 UTC, Jonathan M Davis wrote:
 std.utf.toUTF16z or toUTFz can do that for you, though if your 
 string is already a wstring, then you can also just concatenate 
 '\0' to it.

 [...]
Is there a reason we cannot implement toStringz like:

immutable(TChar)* toStringz(TChar = char)(scope const(TChar)[] s) @trusted pure nothrow;

// Couldn't find a way to get the char type of a string, so couldn't make the following generic:
immutable(char)* toStringz(return scope string s) @trusted pure nothrow;
immutable(wchar)* toStringz(return scope wstring s) @trusted pure nothrow;
immutable(dchar)* toStringz(return scope dstring s) @trusted pure nothrow;
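On the point about getting the character type of a string: one possible way is to let the compiler deduce it from the argument. The sketch below uses a hypothetical helper named zeroTerminated, which is not part of Phobos and is not the toStringz implementation; unlike toStringz it always allocates a copy.

```d
import std.traits : isSomeChar;

// Hypothetical helper, not part of Phobos.
// C is deduced as char, wchar or dchar from the argument.
const(C)* zeroTerminated(C)(const(C)[] s)
if (isSomeChar!C)
{
    // Always allocates a copy with a trailing zero code unit.
    return (s ~ cast(C) 0).ptr;
}

void main()
{
    const(char)*  a = zeroTerminated("ab");
    const(wchar)* b = zeroTerminated("ab"w);
    const(dchar)* c = zeroTerminated("ab"d);

    // Each result carries a terminator after its two code units.
    assert(a[2] == 0 && b[2] == 0 && c[2] == 0);
}
```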
Jan 22
next sibling parent FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 21:49:00 UTC, bauss wrote:
 Is there a reason we cannot implement toStringz like:

 immutable(TChar)* toStringz(TChar = char)(scope const(TChar)[] s) @trusted pure nothrow;

 [...]
The implementation in "core.sys.windows.windows.winbase" is a good choice.

For example:

import core.sys.windows.windows;
import std.stdio;
import std.string;
import std.conv;

void main()
{
    auto strA_Z = "CD"w;
    auto type = GetDriveType((to!wstring(strA_Z[0]) ~ ":\\"w).tos);
    writeln(to!wstring(strA_Z[0]) ~ " is ", type);
}

private auto tos(wstring str)
{
    version (ANSI)
    {
        writeln("ANSI");
        return cast(const(char)*)(str);
    }
    else
    {
        writeln("Unicode");
        return cast(const(wchar)*)(str);
    }
}

private auto tos(string str)
{
    version (ANSI)
    {
        writeln("ANSI");
        return cast(const(char)*)(str);
    }
    else
    {
        writeln("Unicode");
        return cast(const(wchar)*)(str);
    }
}

It works OK.
Jan 22
prev sibling next sibling parent FrankLike <1150015857 qq.com> writes:
On Tuesday, 22 January 2019 at 21:49:00 UTC, bauss wrote:
 Is there a reason we cannot implement toStringz like:

 [...]
For example:

import core.sys.windows.windows;
import std.stdio;
import std.string;
import std.conv;

void main()
{
    auto strA_Z = "CD"w;
    auto type = GetDriveType(tos(to!wstring(strA_Z[0]) ~ ":\\"));
    writeln(to!wstring(strA_Z[0]) ~ " is ", type);
}

private auto tos(T)(T str)
{
    version (ANSI)
    {
        writeln("ANSI");
        return cast(const(char)*)(str);
    }
    else
    {
        writeln("Unicode");
        return cast(const(wchar)*)(str);
    }
}

It works OK.
Jan 22
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 22, 2019 2:49:00 PM MST bauss via Digitalmars-d-learn wrote:
 Is there a reason we cannot implement toStringz like:

 [...]
toUTFz is the generic solution. toStringz exists specifically for UTF-8 strings, and it exists primarily because it attempts to avoid actually appending or allocating to the string by checking to see if there's a null character after the end of the string (which is done because that's always the case with string literals).

If you don't care about it trying to avoid that allocation, then there arguably isn't much point to toStringz, and you might as well just do (str ~ '\0').ptr

- Jonathan M Davis
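To make the comparison concrete, here is a small sketch of the two ways of getting a zero-terminated UTF-8 pointer mentioned above. Whether toStringz actually avoids the copy depends on the input and on the Phobos version, so the example only checks that both results are terminated. The string value is an arbitrary example.

```d
import std.stdio : writeln;
import std.string : toStringz;

void main()
{
    string name = "drive";

    // toStringz returns a zero-terminated UTF-8 pointer
    // and may be able to skip the copy.
    immutable(char)* a = name.toStringz;

    // The manual form always allocates a new array with the terminator.
    const(char)* b = (name ~ '\0').ptr;

    writeln(a[5] == '\0');  // true
    writeln(b[5] == '\0');  // true
}
```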
Jan 23
parent reply FrankLike <1150015857 qq.com> writes:
On Wednesday, 23 January 2019 at 10:44:51 UTC, Jonathan M Davis 
wrote:
 On Tuesday, January 22, 2019 2:49:00 PM MST bauss via 
 Digitalmars-d-learn wrote:
 toUTFz is the generic solution. toStringz exists specifically
Error: template std.utf.toUTFz cannot deduce function from argument types !()(string), candidates are:
E:\D\DMD2\WINDOWS\BIN\..\..\src\phobos\std\utf.d(3070): std.utf.toUTFz(P)

I have solved the problem in this way:

import core.sys.windows.windows;
import std.stdio;
import std.string;
import std.conv;

void main()
{
    auto strA_Z = "CD"w;
    auto type = GetDriveType(tos(to!wstring(strA_Z[0]) ~ ":\\"));
    writeln(to!wstring(strA_Z[0]) ~ " is ", type);
}

private auto tos(T)(T str)
{
    version (ANSI)
    {
        writeln("ANSI");
        return cast(const(char)*)(str);
    }
    else
    {
        writeln("Unicode");
        return cast(const(wchar)*)(str);
    }
}

Thanks.
Jan 23
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, January 23, 2019 5:42:55 AM MST FrankLike via Digitalmars-d-learn wrote:
 Error: template std.utf.toUTFz cannot deduce function from argument types !()(string), candidates are:
 E:\D\DMD2\WINDOWS\BIN\..\..\src\phobos\std\utf.d(3070): std.utf.toUTFz(P)
As the documentation shows, toUTFz requires a target type just like std.conv.to does. toUTF16z is a convenience wrapper around toUTFz which specifies the target type as const(wchar)*.
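For illustration, a minimal sketch of calling toUTFz with an explicit target type, next to the toUTF16z shorthand. The path is an example value and the code makes no Windows call, so it runs on any platform.

```d
import std.utf : toUTFz, toUTF16z;

void main()
{
    string root = "C:\\";

    // toUTFz needs the target pointer type as a template argument.
    const(wchar)* w = toUTFz!(const(wchar)*)(root);
    const(char)*  c = toUTFz!(const(char)*)(root);

    // toUTF16z is shorthand for the const(wchar)* case.
    const(wchar)* w2 = toUTF16z(root);

    // All three results are zero-terminated after three code units.
    assert(w[3] == 0 && c[3] == 0 && w2[3] == 0);
    assert(w[0 .. 3] == w2[0 .. 3]);
}
```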
 I have solved the problem in this way:

 import core.sys.windows.windows;
 import std.stdio;
 import std.string;
 import std.conv;

 void main()
 {
   auto    strA_Z ="CD"w;
   auto type = GetDriveType(tos(to!wstring(strA_Z[0])~":\\"));
   writeln(to!wstring(strA_Z[0])~" is ",type);
 }

 private auto tos(T)(T str)
 {
   version (ANSI)
   {
   writeln("ANSI");
   return cast(const(char)*)(str);
   }
   else
   {
   writeln("Unicode");
   return cast(const(wchar)*)(str);

   }
 }

 Thanks.
std.conv.to will allow you to convert between string and wstring, but for calling C functions, you still need the strings to be zero-terminated unless the function specifically takes an argument indicating the number of characters in the string. Strings in D are not zero-terminated, so std.conv.to is not going to produce strings that work with C functions. std.conv.to and std.utf.toUTFz solve different problems.

Also, strings of char in D are UTF-8, _not_ ANSI, so passing them to any of the A functions from Windows is not going to work correctly. If you want to do that, you need to use toMBSz and fromMBSz from std.windows.charset. But really, there's no reason at this point to ever use the A functions. There are frequently issues with them that the W functions don't have, and the W functions actually support Unicode. The only real reason to use the A functions would be to use an OS like Windows 98 which didn't have the W functions, and D doesn't support such OSes.

So, I would strongly discourage you from doing anything with the A functions, let alone trying to write your code so that it uses either the A or W functions depending on some argument. That's an old Windows-ism that I wouldn't even advise using in C/C++ at this point. It's just begging for bugs. And since D doesn't have the macros for all of the various Windows functions for swapping between the A and W versions of the functions like C/C++ does, you have to explicitly call one or the other in D anyway, making it really hard to call the wrong one (unlike in C/C++, where screwing up how your project is compiled can result in accidentally using the A functions instead of the W functions).

This really sounds like you're trying to duplicate something from C/C++ that doesn't make sense in D and really shouldn't be duplicated in D.

- Jonathan M Davis
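Putting the advice above together, one possible shape for the original program is sketched below: keep UTF-8 strings in the program, call the W function by name, and convert with toUTF16z at the call site. The helper driveRoot is a hypothetical name introduced for this example, and the Windows call is wrapped in version (Windows) so the file still compiles elsewhere.

```d
import std.conv : text;
import std.stdio : writeln;
import std.utf : toUTF16z;

// Hypothetical helper: builds a root path such as C:\ from a drive letter.
string driveRoot(char letter)
{
    return text(letter, ":\\");
}

void main()
{
    version (Windows)
    {
        import core.sys.windows.winbase : GetDriveTypeW;

        foreach (char letter; "CD")
        {
            string root = driveRoot(letter);
            // Call the W function explicitly with zero-terminated UTF-16.
            auto type = GetDriveTypeW(root.toUTF16z);
            writeln(root, " is ", type);
        }
    }
    else
    {
        writeln("GetDriveTypeW is only available on Windows.");
    }
}
```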
Jan 23
parent FrankLike <1150015857 qq.com> writes:
On Wednesday, 23 January 2019 at 14:12:09 UTC, Jonathan M Davis 
wrote:
 On Wednesday, January 23, 2019 5:42:55 AM MST FrankLike via
 std.conv.to will allow you to convert between string and 
 wstring, but for calling C functions, you still need the 
 strings to be zero-terminated unless the function specifically 
 takes an argument indicating the number of characters in the 
 string. Strings in D are not zero-terminated, so std.conv.to is 
 not going to produce strings that work with C functions. 
 std.conv.to and std.utf.toUTFz solve different problems.

 [...]
Thank you.
Jan 23