digitalmars.D - Let's schedule WinAPI ASCII functions for deprecation!
- Denis Shelomovskij (22/22) May 22 2012 Since Win9x isn't supported any more why do we have ASCII WinAPI
- Dmitry Olshansky (12/32) May 22 2012 Yes, let them burn! Burn, burn, burn!
- Dmitry Olshansky (8/13) May 22 2012 forgot to mention that my GSOC project has support for legacy encodings
- Roman D. Boiko (4/17) May 22 2012 Dmitry, your project looks really cool.
- Roman D. Boiko (3/6) May 22 2012 Especially I liked "...policy based design, thus exposing all of
- Stewart Gordon (12/17) May 23 2012 That's just as easy in almost any language. It's part of why so many we...
- Denis Shelomovskij (6/6) May 22 2012 LPTSTR issue (it aliases char*) is already filled by Martin Nowak:
- Martin Nowak (2/8) May 22 2012 Given that it only requires a 'w' suffix for literals it's a good choice...
- Dmitry Olshansky (8/17) May 22 2012 http://stackoverflow.com/questions/7950271/windows-uses-utf-16-as-its-in...
- Walter Bright (2/4) May 22 2012 Yes. Windows internally is all 16 bit Unicode.
- Trass3r (1/1) May 22 2012 Yeah let 'em burn!
- Gor Gyolchanyan (6/7) May 22 2012 Kill it! Kill it with fire!!!
- Walter Bright (9/21) May 22 2012 First off, I agree that druntime and phobos must not use the A functions...
- Dmitry Olshansky (9/40) May 22 2012 Again correct. The trick is that the way *A functions are provided is in...
- Denis Shelomovskij (22/28) May 24 2012 The key point is what does it mean "interface"?
- Mehrdad (1/1) May 22 2012 I hope this includes SNN.lib, which also uses ANSI functions...
- Kagamin (9/10) May 23 2012 Well, you can't fix C because C explicitly ignores string
- Stewart Gordon (8/16) May 23 2012 A lot of C functions do. Indeed, this is one of the considerations made...
- Jacob Carlborg (5/6) May 23 2012 Since C doesn't really have a concept of encodings it would be whatever
- Regan Heath (5/9) May 24 2012 All the more reason to use byte/ubyte as D's equivalent to C's char.
- Michael (4/7) May 23 2012 +1. For me LoadLibraryA works well.
- Dmitry Olshansky (10/17) May 23 2012 Nope. Quoting random top hit from google:
- Michael (1/2) May 23 2012 I know it ;) But it's platform specific kung-fu.
- Dmitry Olshansky (4/6) May 23 2012 It's the only game in M$ town ;)
- Regan Heath (8/10) May 24 2012 And, if you start to dig a bit things can get a bit hairy in places:
- Michael (3/3) May 24 2012 I knew it till an .net era. Main line is even Windows may handle
Since Win9x isn't supported any more, why do we still have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)?

Reasons against *A functions:
* using any such function is unsafe (with very rare exceptions like LoadLibraryA("ntdll")), because the inability to encode non-ASCII characters in the OEM encoding will almost always give results the programmer cannot predict (simple test: you, reader, what will happen?);
* in D it's too easy to make a mistake by passing a UTF-8 string pointer to such a function, because D has no string types other than UTF ones, and eliminating such functions is the only solution unless an ASCII string type is created;
* it performs worse, because Windows has to convert the ASCII string to UTF-16 first.

And yes, druntime already has encoding bugs because of using such functions.

P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).

--
Денис В. Шеломовский
Denis V. Shelomovskij
May 22 2012
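The first two reasons in the opening post can be illustrated with a minimal sketch. It is written in Python only because that runs anywhere, and it does not call WinAPI. It assumes an ANSI code page of cp1251 and a hypothetical file name. The helper `seen_by_a_function` is an invented model of how an *A function reads its argument: as bytes in the active code page, not as UTF-8.

```python
# Assumption: the system's ANSI code page is cp1251 (Cyrillic); yours may differ.
ANSI_CODE_PAGE = "cp1251"


def seen_by_a_function(raw: bytes) -> str:
    """Hypothetical model: an *A function reads bytes as ANSI code page text."""
    return raw.decode(ANSI_CODE_PAGE, errors="replace")


name = "файл.txt"               # hypothetical file name
as_utf8 = name.encode("utf-8")  # the bytes a UTF-8 string holds in memory

# UTF-8 bytes handed to an *A function are read as different characters.
misread = seen_by_a_function(as_utf8)

# Pure ASCII has the same bytes in both encodings, so it survives.
ascii_only = seen_by_a_function("ntdll".encode("utf-8"))

# Characters outside the code page cannot be represented at all.
lossy = "日本語".encode(ANSI_CODE_PAGE, errors="replace")

print(misread == name)        # False
print(ascii_only == "ntdll")  # True
print(lossy)                  # b'???'
```

This is why a call such as LoadLibraryA("ntdll") happens to work while a non-ASCII file name does not: only the ASCII subset is shared between UTF-8 and the code page.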
On 22.05.2012 22:11, Denis Shelomovskij wrote:
> Since Win9x isn't supported any more, why do we still have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)?
>
> Reasons against *A functions:
> * using any such function is unsafe (with very rare exceptions like LoadLibraryA("ntdll")), because the inability to encode non-ASCII characters in the OEM encoding will almost always give results the programmer cannot predict (simple test: you, reader, what will happen?);
> * in D it's too easy to make a mistake by passing a UTF-8 string pointer to such a function, because D has no string types other than UTF ones, and eliminating such functions is the only solution unless an ASCII string type is created;
> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.
>
> And yes, druntime already has encoding bugs because of using such functions.

Yes, let them burn! Burn, burn, burn!

Seriously. For those who are bent on compatibility, *A functions are also:
- security disasters
- limited in more than just one way: 256 max path, and so on and so forth

And last but not least:
- *W functions were supported on Win98+ Second Edition with an official add-on - Unicode Layer for Windows ;)

Not to mention that the OEM encoding was never supported properly by D.

> P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).

--
Dmitry Olshansky
May 22 2012
> P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).

Forgot to mention that my GSOC project has support for legacy encodings as its secondary goal. Check out:
TODOs, synopsis & status: https://github.com/blackwhale/phobos/wiki/GSOC-Unicode-support/tree/gsoc-uni
Original proposal:

--
Dmitry Olshansky
May 22 2012
On Tuesday, 22 May 2012 at 18:39:46 UTC, Dmitry Olshansky wrote:
>> P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).
>
> Forgot to mention that my GSOC project has support for legacy encodings as its secondary goal. Check out:
> TODOs, synopsis & status: https://github.com/blackwhale/phobos/wiki/GSOC-Unicode-support/tree/gsoc-uni
> Original proposal:

Dmitry, your project looks really cool. As for the topic, I would vote for that, too, but I don't have enough knowledge to understand all possible consequences...
May 22 2012
On Tuesday, 22 May 2012 at 18:43:58 UTC, Roman D. Boiko wrote:
> Dmitry, your project looks really cool. As for the topic, I would vote for that, too, but I don't have enough knowledge to understand all possible consequences...

I especially liked "...policy based design, thus exposing all of relevant tradeoffs".
May 22 2012
On 22/05/2012 19:24, Dmitry Olshansky wrote:
<snip>
>> * in D it's too easy to make a mistake by passing a UTF-8 string pointer to such a function

That's just as easy in almost any language. It's part of why so many websites have character encoding bugs.

<snip>
> And last but not least:
> - *W functions were supported on Win98+ Second Edition with an official add-on - Unicode Layer for Windows ;)
<snip>

I've heard of MS Layer for Unicode - don't know if that's what you meant or you're talking about something else.

From what I recall reading, MSLU had the problem that EXEs have to be explicitly built to depend on it. So a typical app targeted at Win2000 and above wouldn't work with it, and you can't (at least easily) make an app detect whether Unicode is available and use it if it's there.

Stewart.
May 23 2012
The LPTSTR issue (it aliases char*) has already been filed by Martin Nowak:

Issue 8132 - LPTSTR always aliases to LPSTR
http://d.puremagic.com/issues/show_bug.cgi?id=8132

--
Денис В. Шеломовский
Denis V. Shelomovskij
May 22 2012
> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.

Is that a fact?

> P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).

Given that it only requires a 'w' suffix for literals it's a good choice.
May 22 2012
On 22.05.2012 23:32, Martin Nowak wrote:
>> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.
>
> Is that a fact?

http://stackoverflow.com/questions/7950271/windows-uses-utf-16-as-its-internal-encoding-what-exactly-does-this-mean

The second answer sheds some light on the topic.

From what I know of Windows NT, the kernel doesn't even use Z-strings (zero-terminated) most of the time. Everything that can be called a syscall uses a variation of L-strings (length-counted) with 16-bit chars.

>> P.S. Let's finally solve encoding problems that should have been solved 10 years ago! By the way, Git+TortoiseGit still has encoding problems on Windows and it is awful (see its changelog).
>
> Given that it only requires a 'w' suffix for literals it's a good choice.

--
Dmitry Olshansky
May 22 2012
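The difference between Z-strings and L-strings mentioned above can be sketched as follows, assuming "L-string" means a length-counted string. The layout here is a simplified illustration (a 16-bit byte length followed by UTF-16LE data) loosely modelled on the NT UNICODE_STRING idea; the real structure holds two lengths and a pointer to the buffer. All function names are hypothetical.

```python
import struct


def z_string(text: str) -> bytes:
    """Zero-terminated UTF-16LE string: a 16-bit NUL marks the end."""
    return text.encode("utf-16-le") + b"\x00\x00"


def l_string(text: str) -> bytes:
    """Counted UTF-16LE string: 16-bit byte length, then the data (simplified)."""
    data = text.encode("utf-16-le")
    return struct.pack("<H", len(data)) + data


def read_z(buf: bytes) -> str:
    # Scan 16-bit units until the terminator is found.
    units = []
    for i in range(0, len(buf), 2):
        unit = buf[i:i + 2]
        if unit == b"\x00\x00":
            break
        units.append(unit)
    return b"".join(units).decode("utf-16-le")


def read_l(buf: bytes) -> str:
    # The length is known up front, so no scanning is needed.
    (length,) = struct.unpack_from("<H", buf, 0)
    return buf[2:2 + length].decode("utf-16-le")


name = "a\x00b"  # a string with an embedded NUL character

print(repr(read_z(z_string(name))))  # 'a' (cut off at the embedded NUL)
print(repr(read_l(l_string(name))))  # 'a\x00b' (preserved)
```

A counted string gives its length without a scan and can carry embedded NULs, which a zero-terminated string cannot.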
On 5/22/2012 12:32 PM, Martin Nowak wrote:
>> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.
>
> Is that a fact?

Yes. Windows internally is all 16 bit Unicode.
May 22 2012
On Wed, May 23, 2012 at 12:31 AM, Trass3r <un known.com> wrote:
> Yeah let 'em burn!

Kill it! Kill it with fire!!! +1

--
Bye,
Gor Gyolchanyan.
May 22 2012
On 5/22/2012 11:11 AM, Denis Shelomovskij wrote:
> Since Win9x isn't supported any more, why do we still have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)?
>
> Reasons against *A functions:
> * using any such function is unsafe (with very rare exceptions like LoadLibraryA("ntdll")), because the inability to encode non-ASCII characters in the OEM encoding will almost always give results the programmer cannot predict (simple test: you, reader, what will happen?);
> * in D it's too easy to make a mistake by passing a UTF-8 string pointer to such a function, because D has no string types other than UTF ones, and eliminating such functions is the only solution unless an ASCII string type is created;
> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.
>
> And yes, druntime already has encoding bugs because of using such functions.

First off, I agree that druntime and phobos must not use the A functions without a very, very good reason.

Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be.

The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.
May 22 2012
On 23.05.2012 0:41, Walter Bright wrote:
> On 5/22/2012 11:11 AM, Denis Shelomovskij wrote:
>> Since Win9x isn't supported any more, why do we still have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)?
>>
>> Reasons against *A functions:
>> * using any such function is unsafe (with very rare exceptions like LoadLibraryA("ntdll")), because the inability to encode non-ASCII characters in the OEM encoding will almost always give results the programmer cannot predict (simple test: you, reader, what will happen?);
>> * in D it's too easy to make a mistake by passing a UTF-8 string pointer to such a function, because D has no string types other than UTF ones, and eliminating such functions is the only solution unless an ASCII string type is created;
>> * it performs worse, because Windows has to convert the ASCII string to UTF-16 first.
>>
>> And yes, druntime already has encoding bugs because of using such functions.
>
> First off, I agree that druntime and phobos must not use the A functions without a very, very good reason.

Right.

> Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be.
>
> The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.

Again correct. The trick is that the way *A functions are provided is in fact a wrong edit! Their signatures are basically saying "hello, I'm an explicit Win32 API multi-byte string binding and I accept a UTF-8 string" ... WTF?!

The fact that they are horribly outdated makes this the perfect moment to both fix the issue and get rid of junk.

--
Dmitry Olshansky
May 22 2012
23.05.2012 0:41, Walter Bright wrote:
> Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be.
>
> The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.

The key point is: what does "interface" mean?

The ability to load a DLL and get symbols from it is enough to use every function. Is it an interface? You say "no".

It's common in C/C++ to use WinAPI functions without A/W postfixes, because the preprocessor defines them according to your preferences. Is it an interface? You say "no".

Functions like C's memmove are deprecated in VC headers on Windows because they are unsafe. Is it an interface? You say "no".

WinAPI functions are more than just C definitions: they have IDL that allows the user to avoid pointers and exit code checking. Is it an interface? You say "no". There are no such macros in Windows headers even for dmc, and there is no talk at all of generating good D wrappers for WinAPI functions based on their IDL.

*A functions are in WinAPI headers obviously for backward compatibility only. Are their definitions an interface? You say "yes".

And I completely disagree with the last 2 points. I just want to show that this "principle" isn't as well-shaped as it can look at first sight.

--
Денис В. Шеломовский
Denis V. Shelomovskij
May 24 2012
I hope this includes SNN.lib, which also uses ANSI functions...
May 22 2012
On Wednesday, 23 May 2012 at 04:01:05 UTC, Mehrdad wrote:
> I hope this includes SNN.lib, which also uses ANSI functions...

Well, you can't fix C, because C explicitly ignores string encoding and thoughtlessly passes strings around without any transcoding. The D bindings, though, suggest that C functions accept UTF-8 strings, which leads to the assumption that those functions will act properly on UTF-8 strings. I'd say that's a bug in the bindings: C strings are specified to be in C encoding, not UTF-8 encoding. I think conversion from a D string to a C string should require at least a cast.
May 23 2012
On 23/05/2012 15:16, Kagamin wrote:
<snip>
> Well, you can't fix C, because C explicitly ignores string encoding and thoughtlessly passes strings around without any transcoding. The D bindings, though, suggest that C functions accept UTF-8 strings

A lot of C functions do. Indeed, this is one of the considerations made in the design of UTF-8.

> which leads to the assumption that those functions will act properly on UTF-8 strings. I'd say that's a bug in the bindings: C strings are specified to be in C encoding,

What is "C encoding"?

> not UTF-8 encoding. I think conversion from a D string to a C string should require at least a cast.

Several people have dealt with this by using byte or ubyte as D's equivalent of the C char type.

Stewart.
May 23 2012
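The idea discussed above, that text should not silently become a C string, can be seen in a language that already enforces it. The sketch below uses Python's standard ctypes module purely as an analogy; it is not D code and says nothing about how the D bindings are or should be written. The sample word is arbitrary.

```python
import ctypes

text = "naïve"  # arbitrary sample text with one non-ASCII character

# A text string is rejected where a C char* is expected.
try:
    ctypes.c_char_p(text)
    accepted = True
except TypeError:
    accepted = False

# The caller must pick an encoding explicitly, which makes the choice visible.
as_utf8 = ctypes.c_char_p(text.encode("utf-8"))
as_latin1 = ctypes.c_char_p(text.encode("latin-1"))

print(accepted)         # False
print(as_utf8.value)    # b'na\xc3\xafve'
print(as_latin1.value)  # b'na\xefve'
```

Treating C char data as raw bytes (byte/ubyte in D terms) has the same effect: the same text yields different byte sequences depending on the encoding, and the type system forces the programmer to say which one is meant.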
On 2012-05-23 20:34, Stewart Gordon wrote:
> What is "C encoding"?

Since C doesn't really have a concept of encodings it would be whatever a given application/library decides it is.

--
/Jacob Carlborg
May 23 2012
On Wed, 23 May 2012 20:54:44 +0100, Jacob Carlborg <doob me.com> wrote:
> On 2012-05-23 20:34, Stewart Gordon wrote:
>> What is "C encoding"?
>
> Since C doesn't really have a concept of encodings it would be whatever a given application/library decides it is.

All the more reason to use byte/ubyte as D's equivalent to C's char.

R

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2012
In WinAPI we have LoadLibraryA/W, but not GetProcAddressA/W, because PE COFF limitations exist.

Walter Bright wrote:
> The user can decide what to use or not use from it.

+1. For me LoadLibraryA works well.

> 256 max path

It's an FS limitation.
May 23 2012
On 23.05.2012 23:29, Michael wrote:
> In WinAPI we have LoadLibraryA/W, but not GetProcAddressA/W, because PE COFF limitations exist.
>
>> Walter Bright wrote:
>> The user can decide what to use or not use from it.
>
> +1. For me LoadLibraryA works well.
>
>> 256 max path
>
> It's an FS limitation.

Nope. Quoting a random top hit from Google:

Individual components of a filename (i.e. each subdirectory along the path, and the final filename) are limited to 255 characters, and the total path length is limited to approximately 32,000 characters. However, you should generally try to limit path lengths to below 260 characters (MAX_PATH) when possible. See http://msdn.microsoft.com/en-us/library/aa365247.aspx for full details.

--
Dmitry Olshansky
May 23 2012
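The limits quoted above can be turned into a small checker. This is a sketch, not a WinAPI call: `classify_path` and its return strings are hypothetical. The value 32767 for the "approximately 32,000" figure is an assumption taken from the MSDN page linked in the post, and lengths are counted in Python characters rather than UTF-16 code units, which is a simplification.

```python
MAX_PATH = 260        # classic limit; includes the terminating NUL
MAX_COMPONENT = 255   # per-component limit quoted in the post
EXTENDED_MAX = 32767  # assumption: exact value behind "approximately 32,000"


def classify_path(path: str) -> str:
    """Classify a Windows-style path against the limits quoted in the thread."""
    components = [c for c in path.replace("/", "\\").split("\\") if c]
    if any(len(c) > MAX_COMPONENT for c in components):
        return "component too long"
    if len(path) + 1 <= MAX_PATH:      # +1 for the terminating NUL
        return "fits MAX_PATH"
    if len(path) + 1 <= EXTENDED_MAX:
        return "needs long-path handling"
    return "too long"


short = "C:\\Windows\\System32\\ntdll.dll"
deep = "C:\\" + "\\".join(["d" * 200] * 3)  # 605 characters in total

print(classify_path(short))  # fits MAX_PATH
print(classify_path(deep))   # needs long-path handling
```

The middle category is where the A and W functions differ in practice: the longer limit is only reachable through the wide-character API.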
> approximately 32,000 characters...

I know it ;) But it's platform specific kung-fu.
May 23 2012
On 24.05.2012 0:13, Michael wrote:
>> approximately 32,000 characters...
>
> I know it ;) But it's platform specific kung-fu.

It's the only game in M$ town ;)

--
Dmitry Olshansky
May 23 2012
On Wed, 23 May 2012 21:13:47 +0100, Michael <pr m1xa.com> wrote:
>> approximately 32,000 characters...
>
> I know it ;) But it's platform specific kung-fu.

And, if you start to dig a bit, things can get a bit hairy in places:
http://blogs.msdn.com/b/bclteam/archive/2007/02/13/long-paths-in-net-part-1-of-3-kim-hamilton.aspx
http://blogs.msdn.com/b/bclteam/archive/2007/03/26/long-paths-in-net-part-2-of-3-long-path-workarounds-kim-hamilton.aspx
http://blogs.msdn.com/b/bclteam/archive/2008/07/07/long-paths-in-net-part-3-of-3-redux-kim-hamilton.aspx

R

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2012
I've known that since before the .NET era. The main point is that even Windows may handle it the wrong way. WinAPI is an interface "as is". So let the user decide whether to use it or not.
May 24 2012