digitalmars.D.bugs - [Issue 7393] New: Which character code does wchar be, UTF-16BE or UTF-16LE?

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393

           Summary: Which character code does wchar be, UTF-16BE or
                    UTF-16LE?
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: websites
        AssignedTo: nobody puremagic.com
        ReportedBy: zan77137 nifty.com



It is not clear whether wchar is UTF-16LE, UTF-16BE, or system-dependent.
The same question applies to dchar.

The specification should state this clearly.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




The current implementation is system-dependent.

In addition, the specification for interfacing to C makes it clear that wchar
corresponds to wchar_t (2 bytes).
http://www.d-programming-language.org/interfaceToC.html

I think this information should be easier to find in the specification.

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393


Kenji Hara <k.hara.pg gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |diagnostic



IMO, the simple way is to add the following two rows to the 'Basic Types' table in
http://www.d-programming-language.org/abi.html:

wchar     16 bit unsigned value (same as ushort)
dchar     32 bit unsigned value (same as uint)

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




10:51:04 PST ---
C does not specify the size of wchar_t. On Windows, wchar_t is 2 bytes, but on
Linux it is 4.
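
The size difference described above can be checked from D itself. The following is a minimal sketch, assuming a D2 compiler whose runtime provides the `wchar_t` alias in `core.stdc.stddef`; the `version` branches only cover Windows and Linux, the two platforms named in this comment, and other targets are simply not asserted.

```d
import core.stdc.stddef : wchar_t;
import std.stdio : writefln;

// D fixes the width of its own character types on every target.
static assert(char.sizeof == 1);
static assert(wchar.sizeof == 2);
static assert(dchar.sizeof == 4);

void main()
{
    // wchar_t follows the C compiler of the target platform,
    // so its size differs between Windows and Linux.
    version (Windows)
        static assert(wchar_t.sizeof == wchar.sizeof);
    else version (linux)
        static assert(wchar_t.sizeof == dchar.sizeof);

    writefln("wchar: %s bytes, dchar: %s bytes, wchar_t: %s bytes",
             wchar.sizeof, dchar.sizeof, wchar_t.sizeof);
}
```

In other words, it is `wchar_t` that varies between platforms, not `wchar` or `dchar`.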

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393





> C does not specify the size of wchar_t. On Windows, wchar_t is 2 bytes, but on
> Linux it is 4.

Yes, I know, and he knows it.

The original question is "Does the representation of wchar and dchar values
depend on system endianness?". The http://d-programming-language.org/abi.html
page says "The endianness (byte order) of the layout of the data will conform to
the endianness of the target machine.", but the "Basic Types" table that follows
does not mention the char, wchar, and dchar types.

So I said to him on Twitter, "D's wchar type is the same as C's wchar_t on a
32-bit system, and wchar_t depends on system endianness. So D's wchar also
follows system endianness", but he could not believe that.

At least, I think the lack of descriptions about character types in the abi page
Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




12:17:16 PST ---

> IMO, The simple way is adding following two rows into 'Basic Types' table in
> http://www.d-programming-language.org/abi.html:
> wchar     16 bit unsigned value (same as ushort)
> dchar     32 bit unsigned value (same as uint)

They are already described in type.html.

I don't think it is necessary to say whether they are BE or LE, any more than it
is necessary to say for a uint which order the bytes come in. The least
significant byte is obtained with w&0xFF, the most significant with w>>8.
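
The distinction between a value and its byte layout can be sketched in D as follows. This is only an illustration under stated assumptions: `lowByte` and `highByte` are hypothetical helper names introduced here, and the explicit conversions use `nativeToLittleEndian` and `nativeToBigEndian` from Phobos' `std.bitmanip`. Masking and shifting operate on the value and give the same result on every machine, whereas the in-memory bytes follow the target's byte order.

```d
import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;
import std.system : Endian, endian;

// Hypothetical helpers: operate on the value, not on its memory layout.
ubyte lowByte(wchar w)  { return cast(ubyte)(w & 0xFF); }
ubyte highByte(wchar w) { return cast(ubyte)(w >> 8); }

void main()
{
    wchar w = '\u3042'; // HIRAGANA LETTER A

    // Same answer regardless of the machine's endianness.
    assert(lowByte(w) == 0x42);
    assert(highByte(w) == 0x30);

    // Byte order only becomes visible when the raw memory is inspected.
    ubyte[2] raw = *cast(ubyte[2]*) &w;
    if (endian == Endian.littleEndian)
        assert(raw[0] == 0x42 && raw[1] == 0x30);
    else
        assert(raw[0] == 0x30 && raw[1] == 0x42);

    // For files or network data, choose the byte order explicitly.
    ubyte[2] le = nativeToLittleEndian(w); // UTF-16LE code unit
    ubyte[2] be = nativeToBigEndian(w);    // UTF-16BE code unit
    assert(le[0] == 0x42 && le[1] == 0x30);
    assert(be[0] == 0x30 && be[1] == 0x42);
}
```

UTF-16LE versus UTF-16BE therefore only matters when code units are serialized to or read from a byte stream; within a program a `wchar` is simply a 16-bit value.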
Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




Commit pushed to master at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/a48c43c9e3b7dc57092c1a72c1e019c46178f11b
fix Issue 7393 - Which character code does wchar be, UTF-16BE or UTF-16LE?

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




Commit pushed to dmd-1.x at https://github.com/D-Programming-Language/dmd

https://github.com/D-Programming-Language/dmd/commit/999ef822efdca31c6983ea89805b178526c53c3d
fix Issue 7393 - Which character code does wchar be, UTF-16BE or UTF-16LE?

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393




12:43:42 PST ---
Ignore, those fixes are meant for 4371.

Jan 28 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7393


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |monarchdodra gmail.com
         Resolution|                            |INVALID


Nov 02 2013