digitalmars.D - toString issue
- Johan Granberg (11/11) Sep 29 2006 As a result of the discussion about char[] above I have been converting
- Vladimir Kulev (2/4) Sep 30 2006 I agree, and the same about toHash. Naming consistency is the right thin...
- Hasan Aljudy (4/9) Sep 30 2006 I totally disagree, what consistency are you talking about?
- Johan Granberg (5/18) Sep 30 2006 The prefix is not important the name collision issue is. The problem is
- Vladimir Kulev (6/9) Sep 30 2006 This methods are implied for all objects, so you can use them as well as
- Sean Kelly (3/15) Sep 30 2006 How about toUtf8() for classes and structs :-)
- Chris Nicholson-Sauls (5/25) Sep 30 2006 Gets my vote. Note that Mango classes typically already do this (with t...
- Charlie (2/29) Oct 01 2006
- Hasan Aljudy (43/48) Oct 01 2006 I think there's a fundamental problem with the way D deals with strings.
- Derek Parnell (12/40) Oct 02 2006 foreach(int i, dchar c; text)
- Hasan Aljudy (9/45) Oct 02 2006 I know, but that's still a work-around. What if you need to iterate back...
- Oskar Linde (3/19) Oct 02 2006 see std.utf.decode and std.utf.stride.
- Hasan Aljudy (5/26) Oct 02 2006 I have .. and I know the functions are all there. but hey, the C
As a result of the discussion about char[] above, I have been converting some of my code from dchar[] to char[], but that reminded me of an issue I have with the current state of Phobos. In Object there is the method toString, which happens to have the same name as the COMMONLY used function std.string.toString. This causes Object's toString to shadow std.string's toString inside class methods. I know that an FQN (fully qualified name) can be used as a workaround, but it makes the code unnecessarily hard to read, and I think that name clashes such as this should be avoided in the standard library.

PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.
Sep 29 2006
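The shadowing described in the post above can be reproduced in a few lines. This is only a sketch in present-day D, not the 2006 Phobos under discussion: the free function `toString(int)` is a hypothetical stand-in for std.string.toString, and the class `Counter` is invented for illustration. It uses the module scope operator (a leading dot) as the workaround; spelling out the fully qualified name, as the post mentions, has the same effect.

```d
import std.conv : to;

// Hypothetical free function standing in for the 2006-era
// std.string.toString(int); it is not part of present-day Phobos.
string toString(int n)
{
    return to!string(n);
}

// Hypothetical class used only to show the collision.
class Counter
{
    int value;

    this(int value)
    {
        this.value = value;
    }

    override string toString()
    {
        // Writing toString(value) here would not compile: name lookup
        // finds the member toString first, and it takes no arguments.
        // The leading dot restarts the lookup at module scope instead.
        string digits = .toString(value);
        return "Counter(" ~ digits ~ ")";
    }
}
```

The extra qualification is exactly the readability cost the proposal is trying to avoid.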
Johan Granberg wrote:
> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.

I agree, and the same goes for toHash. Naming consistency is the right thing.
Sep 30 2006
Vladimir Kulev wrote:
> Johan Granberg wrote:
>> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.
> I agree, and the same goes for toHash. Naming consistency is the right thing.

I totally disagree. What consistency are you talking about? toString and toHash are *not* operators, so prefixing them with op is misleading and inconsistent.
Sep 30 2006
Hasan Aljudy wrote:
> Vladimir Kulev wrote:
>> Johan Granberg wrote:
>>> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.
>> I agree, and the same goes for toHash. Naming consistency is the right thing.
> I totally disagree. What consistency are you talking about? toString and toHash are *not* operators, so prefixing them with op is misleading and inconsistent.

The prefix is not important; the name collision issue is. The problem is that two commonly used identifiers collide, and the use of an op prefix is one way to solve that (and would open the way for making them operators if desired at some later time).
Sep 30 2006
Hasan Aljudy wrote:
> I totally disagree. What consistency are you talking about? toString and toHash are *not* operators, so prefixing them with op is misleading and inconsistent.

These methods are implied for all objects, so you can use them just like other unary operators such as ~, except that there are no special symbols for them. Anyway, the collision between Object.toString and std.string.toString should be resolved, and renaming the second one would also suit me.
Sep 30 2006
Johan Granberg wrote:
> As a result of the discussion about char[] above, I have been converting some of my code from dchar[] to char[], but that reminded me of an issue I have with the current state of Phobos. In Object there is the method toString, which happens to have the same name as the COMMONLY used function std.string.toString. This causes Object's toString to shadow std.string's toString inside class methods. I know that an FQN (fully qualified name) can be used as a workaround, but it makes the code unnecessarily hard to read, and I think that name clashes such as this should be avoided in the standard library.
>
> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.

How about toUtf8() for classes and structs :-)

Sean
Sep 30 2006
Sean Kelly wrote:
> Johan Granberg wrote:
>> As a result of the discussion about char[] above, I have been converting some of my code from dchar[] to char[], but that reminded me of an issue I have with the current state of Phobos. In Object there is the method toString, which happens to have the same name as the COMMONLY used function std.string.toString. This causes Object's toString to shadow std.string's toString inside class methods. I know that an FQN (fully qualified name) can be used as a workaround, but it makes the code unnecessarily hard to read, and I think that name clashes such as this should be avoided in the standard library.
>>
>> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.
> How about toUtf8() for classes and structs :-)
> Sean

Gets my vote. Note that Mango classes typically already do this (with toString just calling toUtf8 in most cases), and provide toUtf16/toUtf32 counterparts. It is indeed effective. :)

-- Chris Nicholson-Sauls
Sep 30 2006
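The naming scheme from the two posts above might look roughly like this. It is a sketch in present-day D, not Mango's actual source: the class `Label` and its field are hypothetical, and only the method names toUtf8/toUtf16/toUtf32 come from the thread. The conversions use std.utf.toUTF16 and std.utf.toUTF32 from current Phobos.

```d
import std.utf : toUTF16, toUTF32;

// Hypothetical class; only the method names follow the convention
// described in the thread.
class Label
{
    private string text;

    this(string text)
    {
        this.text = text;
    }

    // One accessor per encoding, so the caller states what it wants.
    string toUtf8()
    {
        return text;
    }

    wstring toUtf16()
    {
        return toUTF16(text);
    }

    dstring toUtf32()
    {
        return toUTF32(text);
    }

    // toString simply forwards to the UTF-8 variant.
    override string toString()
    {
        return toUtf8();
    }
}
```

Because the encoding-specific names differ from toString, calls to a free toString function from inside such a class would only collide with the single forwarding method.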
Gets my vote too; it's also more descriptive than 'toString'.

Chris Nicholson-Sauls wrote:
> Sean Kelly wrote:
>> Johan Granberg wrote:
>>> As a result of the discussion about char[] above, I have been converting some of my code from dchar[] to char[], but that reminded me of an issue I have with the current state of Phobos. In Object there is the method toString, which happens to have the same name as the COMMONLY used function std.string.toString. This causes Object's toString to shadow std.string's toString inside class methods. I know that an FQN (fully qualified name) can be used as a workaround, but it makes the code unnecessarily hard to read, and I think that name clashes such as this should be avoided in the standard library.
>>>
>>> PROPOSAL: change all methods in Object to have some prefix. For toString I suggest opString, as the op prefix is already in use.
>> How about toUtf8() for classes and structs :-)
>> Sean
> Gets my vote. Note that Mango classes typically already do this (with toString just calling toUtf8 in most cases), and provide toUtf16/toUtf32 counterparts. It is indeed effective. :)
> -- Chris Nicholson-Sauls
Oct 01 2006
Sean Kelly wrote:
> How about toUtf8() for classes and structs :-)
> Sean

I think there's a fundamental problem with the way D deals with strings. The spec claims that D natively supports strings through char[] and, at the same time, claims that D fully supports Unicode. The fundamental issue is that UTF-8 is one encoding for Unicode strings, but it's not always the best choice. Phobos mostly deals only with char[], and mixing code that uses wchar[] with code that uses char[] isn't very straightforward.

Consider the simple case of reading a text file and detecting "words". To detect a word, you must first recognize letters; no, not English letters, but letters of any language, and for that purpose we have the isUniAlpha function. Now, if you encode the string as char[], then how are you gonna determine whether or not the next character is a Unicode alpha? The following definitely shouldn't work:

    //assuming text is char[]
    for( int i = 0; i < text.length; i++ )
    {
        bool isLetter = isUniAlpha( text[i] );
        ....
    }

because isUniAlpha takes a dchar parameter and, of course, because a single char doesn't necessarily encode a Unicode character by itself; if you're dealing with non-English text, then most likely a single char will hold only part of the encoding for that letter. Surprisingly, the compiler allows this kind of code, but that's not the point. The point is that this code will never work, because char[] is not a very good way to hold a Unicode string. Of course there are ways around this, but they are still just "workarounds".

Should you choose wchar[] (or dchar[]) to represent strings, you will get into all kinds of trouble dealing with Phobos. The standard library always deals with strings using char[]; this includes std.string and std.regexp, and even the Exception class. So, if you're using wchar[] to represent strings and you want to throw an exception, you can't just pass the string along as it is, because the compiler will complain (can't cast wchar[] to char[]). You'll need toUtf8( myString ), and your code can quickly become full of calls to toUtf* functions.

Personally, I think D needs a proper String class built into the language and the standard library. Or, at least, casting between the different encodings should be seamless to the coder: just let the compiler call the appropriate toUtf* function and allow implicit casting.
Oct 01 2006
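The explicit conversion complained about in the post above looks roughly like this in present-day D. This is a sketch under assumptions: the helper `fail` is hypothetical, and std.utf.toUTF8 is the current Phobos spelling of the toUtf8 call named in the post.

```d
import std.utf : toUTF8;

// Hypothetical helper: raise an error whose message is held as UTF-16.
void fail(const(wchar)[] message)
{
    // Exception expects a UTF-8 string, so the UTF-16 message has to be
    // transcoded by hand; there is no implicit conversion.
    throw new Exception(toUTF8(message));
}
```

Every boundary between UTF-16 code and a char[]-based API needs a call like this, which is the clutter the post objects to.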
On Mon, 02 Oct 2006 00:52:44 -0600, Hasan Aljudy wrote:
> Sean Kelly wrote:
>> How about toUtf8() for classes and structs :-)
>> Sean
> I think there's a fundamental problem with the way D deals with strings. The spec claims that D natively supports strings through char[] and, at the same time, claims that D fully supports Unicode. The fundamental issue is that UTF-8 is one encoding for Unicode strings, but it's not always the best choice. Phobos mostly deals only with char[], and mixing code that uses wchar[] with code that uses char[] isn't very straightforward.
>
> Consider the simple case of reading a text file and detecting "words". To detect a word, you must first recognize letters; no, not English letters, but letters of any language, and for that purpose we have the isUniAlpha function. Now, if you encode the string as char[], then how are you gonna determine whether or not the next character is a Unicode alpha? The following definitely shouldn't work:
>
>     //assuming text is char[]
>     for( int i = 0; i < text.length; i++ )
>     {
>         bool isLetter = isUniAlpha( text[i] );
>         ....
>     }

    foreach(int i, dchar c; text)
    {
        bool isLetter = isUniAlpha( c );
        ...
    }

--
Derek (skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
2/10/2006 5:10:26 PM
Oct 02 2006
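The foreach form suggested above can be wrapped into a small function to show the difference between code units and code points. This is a sketch in present-day D: `countLetters` is a hypothetical name, and std.uni.isAlpha is assumed here as the current equivalent of the isUniAlpha function named in the thread.

```d
import std.uni : isAlpha;

// Counts alphabetic code points. Declaring the loop variable as dchar
// makes foreach decode the UTF-8 in text one code point at a time,
// instead of handing out individual code units.
size_t countLetters(const(char)[] text)
{
    size_t letters = 0;
    foreach (dchar c; text)
    {
        if (isAlpha(c))
            ++letters;
    }
    return letters;
}
```

For a string such as "a\u00F1b", `.length` reports four code units while the loop visits three code points, which is why indexing with `text[i]` cannot be fed to a function that expects a whole character.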
Derek Parnell wrote:
> On Mon, 02 Oct 2006 00:52:44 -0600, Hasan Aljudy wrote:
>> Sean Kelly wrote:
>>> How about toUtf8() for classes and structs :-)
>>> Sean
>> I think there's a fundamental problem with the way D deals with strings. The spec claims that D natively supports strings through char[] and, at the same time, claims that D fully supports Unicode. The fundamental issue is that UTF-8 is one encoding for Unicode strings, but it's not always the best choice. Phobos mostly deals only with char[], and mixing code that uses wchar[] with code that uses char[] isn't very straightforward.
>>
>> Consider the simple case of reading a text file and detecting "words". To detect a word, you must first recognize letters; no, not English letters, but letters of any language, and for that purpose we have the isUniAlpha function. Now, if you encode the string as char[], then how are you gonna determine whether or not the next character is a Unicode alpha? The following definitely shouldn't work:
>>
>>     //assuming text is char[]
>>     for( int i = 0; i < text.length; i++ )
>>     {
>>         bool isLetter = isUniAlpha( text[i] );
>>         ....
>>     }
>
>     foreach(int i, dchar c; text)
>     {
>         bool isLetter = isUniAlpha( c );
>         ...
>     }

I know, but that's still a workaround. What if you need to iterate back and forth? You're gonna need to convert it to dchar[] (or wchar[]).

However, that brings up a good point: notice how foreach lets you iterate over a string by Unicode characters (a.k.a. code points)? Shouldn't this kind of iteration be supported outside of foreach as well? Sure, I know you can write your own String class and even an iterator, but that just proves that string support isn't really/fully built in.
Oct 02 2006
Hasan Aljudy wrote:
> Derek Parnell wrote:
>>     foreach(int i, dchar c; text)
>>     {
>>         bool isLetter = isUniAlpha( c );
>>         ...
>>     }
> I know, but that's still a workaround. What if you need to iterate back and forth? You're gonna need to convert it to dchar[] (or wchar[]).
>
> However, that brings up a good point: notice how foreach lets you iterate over a string by Unicode characters (a.k.a. code points)? Shouldn't this kind of iteration be supported outside of foreach as well?

See std.utf.decode and std.utf.stride.

/Oskar
Oct 02 2006
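As a rough illustration of the pointer above, a char[] can be walked in either direction without converting it to dchar[]. This is a sketch against present-day Phobos: the function names are hypothetical, and std.utf.strideBack is an assumption beyond the two functions named in the post (std.utf.decode advances an index past one code point; strideBack gives the length of the code point ending at an index).

```d
import std.utf : decode, strideBack;

// Collects code points front to back. decode reads the code point that
// starts at index i and advances i past it.
dchar[] codePointsForward(const(char)[] text)
{
    dchar[] result;
    size_t i = 0;
    while (i < text.length)
        result ~= decode(text, i);
    return result;
}

// Collects code points back to front. strideBack tells how many code
// units the code point ending just before index i occupies.
dchar[] codePointsBackward(const(char)[] text)
{
    dchar[] result;
    size_t i = text.length;
    while (i > 0)
    {
        i -= strideBack(text, i);
        size_t j = i; // decode advances its index, so work on a copy
        result ~= decode(text, j);
    }
    return result;
}
```

This works, but the index bookkeeping stays in user code, which is the point the reply below makes about library functions versus a built-in string type.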
Oskar Linde wrote:
> Hasan Aljudy wrote:
>> Derek Parnell wrote:
>>>     foreach(int i, dchar c; text)
>>>     {
>>>         bool isLetter = isUniAlpha( c );
>>>         ...
>>>     }
>> I know, but that's still a workaround. What if you need to iterate back and forth? You're gonna need to convert it to dchar[] (or wchar[]).
>>
>> However, that brings up a good point: notice how foreach lets you iterate over a string by Unicode characters (a.k.a. code points)? Shouldn't this kind of iteration be supported outside of foreach as well?
> See std.utf.decode and std.utf.stride.
> /Oskar

I have, and I know the functions are all there. But hey, the C standard library also has all sorts of string processing functions. I'm talking about the "built-in" string type, which doesn't really exist, even though the spec claims it does.
Oct 02 2006