digitalmars.D.bugs - [Issue 8020] New: std.stdio can't open UTF16 file names in Windows

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8020

           Summary: std.stdio can't open UTF16 file names in Windows
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: Windows
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: Oleg.Kuporosov gmail.com



00:02:15 PDT ---
File() and open()/popen() assume they receive only ASCII or UTF-8 file names.
Windows file systems use UTF-16 names, so in practice portability is limited
to ASCII names.

These APIs could probably also accept wstring to satisfy this
enhancement.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 03 2012
d-bugmail puremagic.com writes:


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla digitalmars.com



00:51:22 PDT ---
UTF-8 supports the full Unicode character set, not just ASCII.

May 03 2012
d-bugmail puremagic.com writes:




04:54:32 PDT ---
The problem is that Windows doesn't support UTF-8 file names. A file created
by some third-party app with a UTF-16 name will not match the UTF-8 name
passed through std.stdio.
http://d.puremagic.com/issues/show_bug.cgi?id=7648 clearly shows that, although
I think it is not a bug, just an OS limitation.

May 03 2012
d-bugmail puremagic.com writes:


Dmitry Olshansky <dmitry.olsh gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmitry.olsh gmail.com



07:33:42 PDT ---
I assumed it just transcodes UTF-8 into UTF-16 before trying to contact the OS
on win32. Apparently that's not the case.

May 03 2012
d-bugmail puremagic.com writes:




06:05:24 PDT ---
Dmitry, we should not assume the name string is in UTF-8; it may also be in
some other 8-bit code page supported by Windows, like 125x and so on.
Such encoding should be handled by the application itself.
What I think we need is File/open/popen( wstring, string mode ), which would
take care of UTF-16 names. Surprisingly, I found some references in the DMC
includes to _wfopen taking wchar_t, which should help exactly here.

May 04 2012
d-bugmail puremagic.com writes:




07:48:07 PDT ---

> Dmitry, we should not assume the name string is in UTF-8; it may also be in
> some other 8-bit code page supported by Windows, like 125x and so on.
> Such encoding should be handled by the application itself.
Nope, char is a UTF-8 code unit, period. See TDPL, the language spec, etc. Legacy one-byte encodings should be passed around as bytes/ubytes or whatever. BTW, NTFS is UTF-16 (or a subset of it).
> What I think we need is File/open/popen( wstring, string mode ), which would
> take care of UTF-16 names. Surprisingly, I found some references in the DMC
> includes to _wfopen taking wchar_t, which should help exactly here.
Then someone just needs to rig the current std.file to call toUTF16/toUTFz (see std.utf) and forward the result to the right _wfopen on win32. UTF-16 has been the de facto standard in Windows for a looong time. This is all just embarrassing.
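
For illustration only, the transcode-then-forward idea suggested above can be sketched in C against the C runtime. This is a minimal sketch, not the Phobos implementation: the helper names utf8_to_utf16 and fopen_utf8 are hypothetical, the only function taken from this thread is _wfopen, and the cast to wchar_t assumes the 16-bit wchar_t used on Windows.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/* Hypothetical helper: decode zero-terminated UTF-8 into UTF-16 code units.
   Returns a malloc'd, zero-terminated buffer (caller frees), or NULL on
   malformed input or allocation failure. *out_len (if non-NULL) receives
   the number of code units, excluding the terminator. */
static uint16_t *utf8_to_utf16(const char *src, size_t *out_len)
{
    /* Smallest code point allowed for each sequence length (rejects overlongs). */
    static const uint32_t min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)src;
    /* A UTF-16 encoding never needs more code units than the UTF-8 has bytes. */
    uint16_t *out = malloc((strlen(src) + 1) * sizeof *out);
    size_t len = 0;

    if (!out)
        return NULL;
    while (*p) {
        uint32_t cp;
        int extra, i, ok = 1;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else                          { cp = 0; extra = 0; ok = 0; }
        p++;
        for (i = 0; ok && i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)   /* also catches truncated input */
                ok = 0;
            else
                cp = (cp << 6) | (uint32_t)(*p & 0x3F);
        }
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            free(out);
            return NULL;
        }
        if (cp < 0x10000) {
            out[len++] = (uint16_t)cp;
        } else {                       /* encode as a surrogate pair */
            cp -= 0x10000;
            out[len++] = (uint16_t)(0xD800 | (cp >> 10));
            out[len++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    out[len] = 0;
    if (out_len)
        *out_len = len;
    return out;
}

/* Hypothetical wrapper: open a file whose name is given in UTF-8. */
static FILE *fopen_utf8(const char *name, const char *mode)
{
#ifdef _WIN32
    /* Assumption: wchar_t is 16 bits wide, as it is on Windows. */
    uint16_t *wname = utf8_to_utf16(name, NULL);
    uint16_t *wmode = utf8_to_utf16(mode, NULL);
    FILE *f = NULL;

    if (wname && wmode)
        f = _wfopen((const wchar_t *)wname, (const wchar_t *)wmode);
    free(wname);
    free(wmode);
    return f;
#else
    return fopen(name, mode);          /* other platforms take the bytes as-is */
#endif
}
```

A caller would then write something like fopen_utf8("\xC3\xA9.txt", "rb") to open a file named "é.txt". Outside Windows the sketch simply falls back to plain fopen.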
May 04 2012
d-bugmail puremagic.com writes:


Denis Shelomovskij <verylonglogin.reg gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |verylonglogin.reg gmail.com
            Version|unspecified                 |D2
         Resolution|                            |DUPLICATE
           Severity|enhancement                 |major



11:54:47 MSD ---

> Then someone just needs to rig the current std.file to call toUTF16/toUTFz...
std.file works fine with non-ASCII strings. This is a std.stdio issue.
> ...and forward the result to the right _wfopen...
And std.file uses the plain WinAPI, not its buggy wrapper from the Digital Mars C runtime.
> ...This is all just embarrassing.
Yes, but std.stdio is even worse than you think (e.g. it can be 100x slower than direct C function calls, as bearophile noted about rawWrite).

*** This issue has been marked as a duplicate of issue 7648 ***
Jul 06 2012