D.gnu - OS X bug: universal alpha identifiers
- Anders F Björklund (17/40) Jan 24 2005 Programs with non-ascii identifiers do not
- Thomas Kuehne (8/48) Jan 27 2005 Added to DStress as
- Anders F Björklund (9/19) Jan 27 2005 If you are feeling like testing or something,
- Thomas Kuehne (6/17) Jan 27 2005 This should be clarified. I suppose that "digits" are only "0123456789"...
- Anders F Björklund (6/15) Jan 27 2005 But I'm also thinking that a "digit" here meant [0-9]...
Programs with non-ASCII identifiers do not link under Mac OS X 10.3 using GDC 0.10... The mangled names are used as labels for the assembler, which then chokes on the UTF-8:

unialpha.d:

    void anders() {}
    void björklund() {}
    void main() { anders(); björklund(); }

Gives the error:

    /var/tmp//ccW4WWE0.s:44:Invalid mnemonic '?rklundFZv'

Here's the disasm:

    bl __D8unialpha10björklundFZv

Similar errors for variables:

unialpha2.d:

    int anders;
    int björklund;
    void main() { anders = 1; björklund = 2; }

gdc:

    /var/tmp//cc7ur4wd.s:31:Parameter syntax error (parameter 3)
    /var/tmp//cc7ur4wd.s:31:Invalid mnemonic '?rklundi-L1$pb)'
    /var/tmp//cc7ur4wd.s:32:Parameter error: expression must be absolute (parameter 2)
    /var/tmp//cc7ur4wd.s:32:Invalid mnemonic '?rklundi-L1$pb)(r9)'

asm:

    addis r9,r31,ha16(__D9unialpha210björklundi-L1$pb)
    la r9,lo16(__D9unialpha210björklundi-L1$pb)(r9)

Not sure how this can be fixed without changing the way that D mangles the names... Both programs compile just fine on Linux.

--anders

PS: Assembler is:

    Apple Computer, Inc. version cctools-525.obj~1, GNU assembler version 1.38

http://www.opensource.apple.com/darwinsource/DevToolsAug2004/cctools-525/
Jan 24 2005
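A minimal sketch of how raw UTF-8 ends up inside the assembler label, assuming the mangling pattern that the quoted assembler output suggests: `_D`, then each name prefixed by its length, then `FZv` for a `void` function without parameters. The helpers `mangleVoidFunc` and `nonAsciiBytes` are hypothetical names made up for this illustration; they are not part of GDC or of any library. The extra leading underscore seen in the assembler output is presumably the usual Darwin symbol prefix and is left out here.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical helper (not a GDC API): builds the mangled name of a
// module-level `void name()` function, following the pattern visible
// in the assembler output quoted in the post.
string mangleVoidFunc(string moduleName, string funcName)
{
    // .length on a D string counts UTF-8 code units (bytes),
    // not characters, so "björklund" contributes 10, not 9.
    return "_D" ~ to!string(moduleName.length) ~ moduleName
                ~ to!string(funcName.length) ~ funcName ~ "FZv";
}

// Hypothetical helper: counts the bytes that fall outside 7-bit ASCII.
size_t nonAsciiBytes(string s)
{
    size_t n = 0;
    foreach (char c; s)      // iterate code units, not code points
        if (c >= 0x80)
            ++n;
    return n;
}

void main()
{
    auto plain = mangleVoidFunc("unialpha", "anders");
    auto uni   = mangleVoidFunc("unialpha", "bj\u00F6rklund");

    writeln(plain, "  non-ASCII bytes: ", nonAsciiBytes(plain));
    writeln(uni,   "  non-ASCII bytes: ", nonAsciiBytes(uni));

    // The length prefix is a byte count, matching the "10" in the post.
    assert(uni == "_D8unialpha10bj\u00F6rklundFZv");
    assert(nonAsciiBytes(uni) == 2);
}
```

The two non-ASCII bytes are the UTF-8 encoding of `ö`; an assembler that only accepts 7-bit characters in labels would stop at exactly that point, which is consistent with the `'?rklundFZv'` fragment in the error message.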
Added to DStress as

http://dstress.kuehne.cn/run/unicode_03.d
http://dstress.kuehne.cn/run/unicode_04.d
http://dstress.kuehne.cn/run/unicode_05.d
http://dstress.kuehne.cn/run/unicode_06.d
http://dstress.kuehne.cn/run/unicode_07.d

Thomas
Jan 27 2005
Thomas Kuehne wrote:

> Added to DStress as
> http://dstress.kuehne.cn/run/unicode_03.d
> http://dstress.kuehne.cn/run/unicode_04.d
> http://dstress.kuehne.cn/run/unicode_05.d
> http://dstress.kuehne.cn/run/unicode_06.d
> http://dstress.kuehne.cn/run/unicode_07.d

If you are feeling like testing or something, here are the rest of the Universal Alphas:
http://www.algonet.se/~afb/d/universalalphas/

> Identifiers start with a letter, _, or unicode alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.)

http://www.digitalmars.com/d/lex.html#identifier

I think Walter officially acknowledged the phrase "unicode alpha" as just a typo for universal... (the meaning is that it can't start with a digit)

--anders
Jan 27 2005
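The identifier rule quoted above can be sketched as a small checker. This is only an illustration under loud assumptions: "letter" is taken to mean [a-zA-Z] and "digit" to mean [0-9], and `isUniversalAlpha` is a hypothetical stand-in that covers just a few Latin ranges instead of the full C99 table. All function names are made up for this example and are not taken from the D frontend.

```d
import std.stdio : writeln;

// Assumption: "letter" in the quoted rule means plain ASCII letters.
bool isAsciiLetter(dchar c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Assumption: "digit" in the quoted rule means plain ASCII digits.
bool isAsciiDigit(dchar c)
{
    return c >= '0' && c <= '9';
}

// Hypothetical stand-in for the C99 universal alpha table.
// Only a small Latin subset is listed here; the real table is far larger.
bool isUniversalAlpha(dchar c)
{
    return (c >= 0x00C0 && c <= 0x00D6)
        || (c >= 0x00D8 && c <= 0x00F6)
        || (c >= 0x00F8 && c <= 0x00FF);
}

// Start: letter, _, or universal alpha.
// Rest:  letter, _, digit, or universal alpha.
bool isIdentifier(string s)
{
    bool first = true;
    foreach (dchar c; s)     // decodes UTF-8 into code points
    {
        bool startOk = isAsciiLetter(c) || c == '_' || isUniversalAlpha(c);
        if (first)
        {
            if (!startOk)
                return false;
            first = false;
        }
        else if (!(startOk || isAsciiDigit(c)))
        {
            return false;
        }
    }
    return !first;           // an empty string is not an identifier
}

void main()
{
    writeln(isIdentifier("bj\u00F6rklund"));
    writeln(isIdentifier("9lives"));

    assert(isIdentifier("bj\u00F6rklund"));
    assert(isIdentifier("_x1"));
    assert(!isIdentifier("9lives"));   // cannot start with a digit
}
```

Under this reading the only difference between the first position and the rest is whether a digit is accepted, which matches the remark that the wording just means an identifier can't start with a digit.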
Anders F Björklund wrote in news:ctairr$1ngb$1 digitaldaemon.com:

> If you are feeling like testing or something, here are the rest of the Universal Alphas:
> http://www.algonet.se/~afb/d/universalalphas/

I've only been testing the name mangling, so it shouldn't matter which scripts I check.

>> Identifiers start with a letter, _, or unicode alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.)
>
> http://www.digitalmars.com/d/lex.html#identifier
>
> I think Walter officially acknowledged the phrase "unicode alpha" as just a typo for universal... (the meaning is that it can't start with a digit)

This should be clarified. I suppose that "digits" are only "0123456789" - there are loads of other digits in Unicode. Why is an ancient (1999) version used in the documentation? I've tried codepoints that are assigned in the current standard but weren't in the 1999 one, and as you might have guessed, even currently reserved codepoints weren't caught by the frontend...

Thomas
Jan 27 2005
Thomas Kuehne wrote:

>> Identifiers start with a letter, _, or unicode alpha, and are followed by any number of letters, _, digits, or universal alphas. Universal alphas are as defined in ISO/IEC 9899:1999(E) Appendix D. (This is the C99 Standard.)
>
> This should be clarified. I suppose that "digits" are only "0123456789" - there are loads of other digits in Unicode.

Yes, the quoted C99 standard (which isn't all that "ancient") used:

Digits: 0660-0669, 06F0-06F9, 0966-096F, 09E6-09EF, 0A66-0A6F, 0AE6-0AEF, 0B66-0B6F, 0BE7-0BEF, 0C66-0C6F, 0CE6-0CEF, 0D66-0D6F, 0E50-0E59, 0ED0-0ED9, 0F20-0F33

But I'm also thinking that a "digit" here meant [0-9]... I think a "letter" to Walter is just [a-zA-Z], as well? And I agree, it would be a lot easier to just say that.

--anders
Jan 27 2005
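To make the two readings of "digit" concrete, here is a minimal sketch that puts the ranges quoted in the last post next to a plain [0-9] test. The table is copied from the post as written; the names `CodeRange`, `c99Digits`, `isC99Digit` and `isAsciiDigit` are hypothetical and exist only for this illustration.

```d
import std.stdio : writeln;

// Hypothetical helper type: an inclusive range of code points.
struct CodeRange
{
    uint lo;
    uint hi;
}

// The "Digits" ranges exactly as listed in the post above.
immutable CodeRange[] c99Digits = [
    CodeRange(0x0660, 0x0669), CodeRange(0x06F0, 0x06F9),
    CodeRange(0x0966, 0x096F), CodeRange(0x09E6, 0x09EF),
    CodeRange(0x0A66, 0x0A6F), CodeRange(0x0AE6, 0x0AEF),
    CodeRange(0x0B66, 0x0B6F), CodeRange(0x0BE7, 0x0BEF),
    CodeRange(0x0C66, 0x0C6F), CodeRange(0x0CE6, 0x0CEF),
    CodeRange(0x0D66, 0x0D6F), CodeRange(0x0E50, 0x0E59),
    CodeRange(0x0ED0, 0x0ED9), CodeRange(0x0F20, 0x0F33),
];

// True if the code point falls in one of the listed ranges.
bool isC99Digit(dchar c)
{
    foreach (r; c99Digits)
        if (c >= r.lo && c <= r.hi)
            return true;
    return false;
}

// The narrower reading suggested in the thread: only [0-9].
bool isAsciiDigit(dchar c)
{
    return c >= '0' && c <= '9';
}

void main()
{
    dchar arabicIndicZero = '\u0660';

    writeln(isC99Digit(arabicIndicZero), " ", isAsciiDigit(arabicIndicZero));
    writeln(isC99Digit('7'), " ", isAsciiDigit('7'));

    // U+0660 is in the quoted table but is not an ASCII digit;
    // '7' is an ASCII digit but is not in the quoted table.
    assert(isC99Digit(arabicIndicZero) && !isAsciiDigit(arabicIndicZero));
    assert(!isC99Digit('7') && isAsciiDigit('7'));
}
```

The two predicates do not overlap at all for the quoted table, so whichever one the documentation means changes which identifiers are accepted after the first character.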