digitalmars.D - Rename std.string.toStringz?

reply Jonathan M Davis <jmdavisProg gmx.com> writes:
Okay. I have an open pull request whose main goal is to rename various string 
and character functions so that they're properly camelcased (as has been 
discussed previously in this group): 
https://github.com/D-Programming-Language/phobos/pull/101

All in all, I think that the renaming in there is fairly obvious and 
non-controversial. However, we have the problem of std.string.toStringz. toStringz 
isn't properly camelcased (at least, _I_ would definitely argue that it 
isn't). Currently (in that pull request), I have it renamed to toStringZ (with 
the old version still around and scheduled to be deprecated, so no code would 
break immediately if it were merged in). But some have said that they consider 
toStringz to be properly camelcased (apparently they view stringz as a special 
word indicating a zero-terminated string). Another suggestion was to rename it 
to toCString (which is arguably much more obvious for newbies). In addition to 
that, we have std.utf.toUTF16z, which matches the naming scheme that toStringz 
has, so if we rename toStringz, we should probably rename toUTF16z as well (to 
toWCString?). Certainly, there's no consensus on what to do with the name of 
toStringz.

Now, toStringz is probably one of the most heavily used string functions in 
Phobos. If we rename it, a _lot_ of code is going to have to be changed. So, 
if we rename it, we need to give it a name which most people would consider 
better than toStringz and worth the consistency that we gain with regards to 
the naming of functions in Phobos. So, the question is, should we

1. Keep toStringz as it is (as well as toUTF16z) and either consider stringz 
to be some sort of word unique to the D community or just admit that we're not 
going to camelcase it because it would break too much code to do so.

2. Just camelcase it properly and rename it to toStringZ (and probably rename 
toUTF16z to toUTF16Z). Code will have to be changed, but the function is still 
immediately recognizable to long time D programmers.

3. Rename it to toCString (probably renaming toUTF16z to something like 
toWCString), so it's then more recognizable to newbies, but it'll take some 
getting used to for everyone else (and of course require lots of code to be 
changed).

4. Rename toStringz to something else which is properly camelcased.

I don't like leaving toStringz as it is because of its casing, but I also 
don't want to cause code breakage without general agreement on the replacement 
name. It's just too important of a function to change the name of willy-nilly. 
So, I'm looking to see what everyone else thinks.

Thoughts? Opinions?

- Jonathan M Davis
Jun 15 2011
next sibling parent Peter Alexander <peter.alexander.au gmail.com> writes:
On 16/06/11 7:30 AM, Jonathan M Davis wrote:
 Thoughts? Opinions?

 - Jonathan M Davis
My preference is on a change to toCString. I've never liked toStringz, and it seems that a common question among newbies is "how do I pass this string to C libraries?". I think changing it to toCString will make it more obvious.
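
For readers arriving with exactly that newbie question, a minimal sketch of passing a D string to a C function could look like the following. It uses the existing std.string.toStringz and the C strlen binding from core.stdc.string; cLength is a hypothetical helper name used only for illustration.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical helper: the length that C's strlen reports for a D string.
size_t cLength(string s)
{
    // D strings are slices and carry no terminator of their own;
    // toStringz returns a pointer to zero-terminated data (it may copy).
    immutable(char)* p = toStringz(s);
    return strlen(p);
}

unittest
{
    string whole = "hello world";
    assert(cLength("hello") == 5);
    // A slice of a longer string still ends where the slice ends.
    assert(cLength(whole[0 .. 5]) == 5);
}
```

Whatever name the function ends up with, the call site only changes in spelling; the pointer handed to C is the same.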
Jun 16 2011
prev sibling next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 
 3. Rename it to toCString (probably renaming toUTF16z to something like 
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3. Cheers, - Daniel
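
A rough sketch of what "rename plus keep an alias" might look like, assuming the new name were toCString. Neither toCString nor oldToStringz exists in Phobos; both are hypothetical names for illustration, and the new function here simply forwards to the real std.string.toStringz.

```d
import core.stdc.string : strcmp;
import std.string : toStringz;

// Hypothetical new name; it only forwards to the existing function.
immutable(char)* toCString(const(char)[] s)
{
    return toStringz(s);
}

// Hypothetical stand-in for the old name, kept so that existing callers
// still compile. Using it makes the compiler emit a deprecation message
// rather than an error (unless deprecations are configured as errors).
deprecated("use toCString instead")
alias oldToStringz = toCString;

unittest
{
    assert(strcmp(toCString("abc"), "abc") == 0);
}
```

The alias costs nothing at run time, so the only open question is how long the old spelling stays around.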
Jun 16 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Daniel Gibson wrote:
 Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 3. Rename it to toCString (probably renaming toUTF16z to something like
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3. Cheers, - Daniel
+1. But make it 'toCWString'. :o) Timon
Jun 16 2011
next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Timon Gehr Wrote:

 Daniel Gibson wrote:
 Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 3. Rename it to toCString (probably renaming toUTF16z to something like
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3. Cheers, - Daniel
+1. But make it 'toCWString'. :o) Timon
+1
Jun 16 2011
prev sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 16 Jun 2011 13:00:23 +0100, Timon Gehr <timon.gehr gmx.ch> wrote:

 Daniel Gibson wrote:
 Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 3. Rename it to toCString (probably renaming toUTF16z to something like
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3.
+1
 +1. But make it 'toCWString'. :o)
Or toWString (looks nicer to me).

Or toCStr and toWStr (to save 3 characters on each).

winnt.h defines many similar types:

typedef __nullterminated WCHAR *NWPSTR, *LPWSTR, *PWSTR;
typedef __nullterminated PWSTR *PZPWSTR;
typedef __nullterminated CONST PWSTR *PCZPWSTR;
typedef __nullterminated WCHAR UNALIGNED *LPUWSTR, *PUWSTR;
typedef __nullterminated CONST WCHAR *LPCWSTR, *PCWSTR;
typedef __nullterminated PCWSTR *PZPCWSTR;
typedef __nullterminated CONST WCHAR UNALIGNED *LPCUWSTR, *PCUWSTR;

so variations on any of those make sense.

.. I think the toStringz probably came from the winnt.h defines using z..

typedef __nullterminated CHAR *NPSTR, *LPSTR, *PSTR;
typedef __nullterminated PSTR *PZPSTR;
typedef __nullterminated CONST PSTR *PCZPSTR;
typedef __nullterminated CONST CHAR *LPCSTR, *PCSTR;
typedef __nullterminated PCSTR *PZPCSTR;

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Jun 16 2011
parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 16, 11 23:09, Regan Heath wrote:
 On Thu, 16 Jun 2011 13:00:23 +0100, Timon Gehr <timon.gehr gmx.ch> wrote:

 Daniel Gibson wrote:
 Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 3. Rename it to toCString (probably renaming toUTF16z to something like
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3.
+1
 +1. But make it 'toCWString'. :o)
Or toWString (looks nicer to me).
A potential problem is that an immutable(wchar)[] is wstring.
 Or toCStr and toWStr (to save 3 characters on each).

 winnt.h defines many similar types:

 typedef __nullterminated WCHAR *NWPSTR, *LPWSTR, *PWSTR;
 typedef __nullterminated PWSTR *PZPWSTR;
 typedef __nullterminated CONST PWSTR *PCZPWSTR;
 typedef __nullterminated WCHAR UNALIGNED *LPUWSTR, *PUWSTR;
 typedef __nullterminated CONST WCHAR *LPCWSTR, *PCWSTR;
 typedef __nullterminated PCWSTR *PZPCWSTR;
 typedef __nullterminated CONST WCHAR UNALIGNED *LPCUWSTR, *PCUWSTR;

 so variations on any of those make sense.

 .. I think the toStringz probably came from the winnt.h defines using z..

 typedef __nullterminated CHAR *NPSTR, *LPSTR, *PSTR;
 typedef __nullterminated PSTR *PZPSTR;
 typedef __nullterminated CONST PSTR *PCZPSTR;
 typedef __nullterminated CONST CHAR *LPCSTR, *PCSTR;
 typedef __nullterminated PCSTR *PZPCSTR;
Stringz means "Z"ero-terminated "String". The usage predates Windows NT.
Jun 16 2011
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 16.06.2011 20:07, KennyTM~ wrote:
 On Jun 16, 11 23:09, Regan Heath wrote:
 On Thu, 16 Jun 2011 13:00:23 +0100, Timon Gehr <timon.gehr gmx.ch> 
 wrote:

 Daniel Gibson wrote:
 Am 16.06.2011 08:30, schrieb Jonathan M Davis:

 3. Rename it to toCString (probably renaming toUTF16z to something 
 like
 toWCString), so it's then more recognizable to newbies
This + keep around aliases for the old names until D3.
+1
 +1. But make it 'toCWString'. :o)
Or toWString (looks nicer to me).
A potential problem is that an immutable(wchar)[] is wstring.
 Or toCStr and toWStr (to save 3 characters on each).

 winnt.h defines many similar types:

 typedef __nullterminated WCHAR *NWPSTR, *LPWSTR, *PWSTR;
 typedef __nullterminated PWSTR *PZPWSTR;
 typedef __nullterminated CONST PWSTR *PCZPWSTR;
 typedef __nullterminated WCHAR UNALIGNED *LPUWSTR, *PUWSTR;
 typedef __nullterminated CONST WCHAR *LPCWSTR, *PCWSTR;
 typedef __nullterminated PCWSTR *PZPCWSTR;
 typedef __nullterminated CONST WCHAR UNALIGNED *LPCUWSTR, *PCUWSTR;

 so variations on any of those make sense.

 .. I think the toStringz probably came from the winnt.h defines using 
 z..

 typedef __nullterminated CHAR *NPSTR, *LPSTR, *PSTR;
 typedef __nullterminated PSTR *PZPSTR;
 typedef __nullterminated CONST PSTR *PCZPSTR;
 typedef __nullterminated CONST CHAR *LPCSTR, *PCSTR;
 typedef __nullterminated PCSTR *PZPCSTR;
Stringz means "Z"ero-terminated "String". The usage predates Windows NT.
I suspect its original name was Z-string, and indeed its usage started much earlier (e.g. it was used in PC BIOS). To be more precise, there was a notation for ASCII strings that comes from having two types of strings:

L-string - Length is in the first byte (so up to 255 characters)
Z-string - Zero byte at the end (or ASCIIZ)

Anyway, I could live with any of toStringz, toZString, toStringZ, toCString.

-- 
Dmitry Olshansky
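
The two layouts can be sketched in a few lines of D. The helper names makeLString and makeZString are hypothetical and exist only to show the byte layout; they are not Phobos functions.

```d
// L-string: the first byte holds the length, so at most 255 characters fit.
ubyte[] makeLString(const(char)[] s)
{
    assert(s.length <= 255, "length must fit in one byte");
    auto result = new ubyte[s.length + 1];
    result[0] = cast(ubyte) s.length;
    result[1 .. $] = cast(const(ubyte)[]) s;
    return result;
}

// Z-string: the characters followed by a single zero byte.
ubyte[] makeZString(const(char)[] s)
{
    auto result = new ubyte[s.length + 1]; // new arrays start zero-filled
    result[0 .. s.length] = cast(const(ubyte)[]) s;
    return result;                         // the last byte is still 0
}

unittest
{
    auto l = makeLString("abc");
    auto z = makeZString("abc");
    assert(l[0] == 3 && l[1] == 'a');
    assert(z[0] == 'a' && z[3] == 0);
}
```

Both forms need one extra byte; they differ in where it goes and in whether the length is known without scanning.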
Jun 16 2011
prev sibling next sibling parent reply Dejan Lekic <dejan.lekic tiscali.co.uk> writes:
I am against the change for ... social reasons.

Simply put, the D community is used to toStringz . I might be wrong, but 
I think we are all familiar with this function and use it on a daily 
basis. :)

However, if you really, really want to stick to the coding convention, 
and decide to rename it anyway, I would go for toCString ...

Kind regards

Dejan Lekic
Jun 16 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 16 Jun 2011 04:47:50 -0400, Dejan Lekic  
<dejan.lekic tiscali.co.uk> wrote:

 I am against the change for ... social reasons.

 Simply put, the D community is used to toStringz . I might be wrong, but  
 I think we are all familiar with this function and use it on a daily  
 basis. :)
I agree with this assessment. I'll add that toStringz is memorable -- I remember how to use it and write it instantly. I'm not sure why, but I think it's because its name is really unlikely to occur in any other context.

I don't agree that toStringz is incorrectly camel cased, and I think toCString is not as descriptive, because it's identifying the language where zero-terminated strings are from, not that the string is zero terminated (signified well by the z).

I would like to change toUTF16z to toWStringz (and likewise for dstrings). You haven't listed this as an option. I see a large inconsistency there.

-Steve
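
For comparison, here is a small sketch of the wide counterpart as it exists today, std.utf.toUTF16z, which returns a zero-terminated const(wchar)*. The helper wLength is hypothetical; it just counts UTF-16 code units up to the terminator.

```d
import std.utf : toUTF16z;

// Hypothetical helper: counts 16-bit code units before the terminating zero.
size_t wLength(const(wchar)* p)
{
    size_t n = 0;
    while (p[n] != 0)
        ++n;
    return n;
}

unittest
{
    // "é" takes two UTF-8 code units but only one UTF-16 code unit.
    assert("héllo".length == 6);
    assert(wLength(toUTF16z("héllo")) == 5);
}
```

So the function both transcodes and terminates, which is part of why its name follows the UTF naming rather than the string one.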
Jun 16 2011
next sibling parent Mafi <mafi example.org> writes:
Am 16.06.2011 16:33, schrieb Steven Schveighoffer:
 On Thu, 16 Jun 2011 04:47:50 -0400, Dejan Lekic
 <dejan.lekic tiscali.co.uk> wrote:

 I am against the change for ... social reasons.

 Simply put, the D community is used to toStringz . I might be wrong,
 but I think we are all familiar with this function and use it on a
 daily basis. :)
I agree with this assessment. I'll add that toStringz is memorable -- I remember how to use it and write it instantly. I'm not sure why, but I think it's because it's name is really unlikely to occur in any other context. I don't agree that toStringz is incorrectly camel cased, and I think toCString is not as descriptive, because it's identifying the language where zero-terminated strings are from, not that the string is zero terminated (signified well by the zero). I would like to change toUTF16z to toWStringz (and likewise for dstrings). You haven't listed this as an option. I see a large inconsistency there.
I completely agree!
 -Steve
I think stringz is a name as itself and then having a camel-cased toStringX is correct IMO.
Jun 16 2011
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-16 07:33, Steven Schveighoffer wrote:
 On Thu, 16 Jun 2011 04:47:50 -0400, Dejan Lekic
 
 <dejan.lekic tiscali.co.uk> wrote:
 I am against the change for ... social reasons.
 
 Simply put, the D community is used to toStringz . I might be wrong, but
 I think we are all familiar with this function and use it on a daily
 basis. :)
I agree with this assessment. I'll add that toStringz is memorable -- I remember how to use it and write it instantly. I'm not sure why, but I think it's because it's name is really unlikely to occur in any other context. I don't agree that toStringz is incorrectly camel cased
I'm afraid that I don't understand this view at all, given that string is a word and stringz isn't, though there are a few people that have expressed this view now.
 and I think
 toCString is not as descriptive, because it's identifying the language
 where zero-terminated strings are from, not that the string is zero
 terminated (signified well by the zero).
 
 I would like to change toUTF16z to toWStringz (and likewise for
 dstrings).  You haven't listed this as an option.  I see a large
 inconsistency there.
Well, my concern at this point is really toStringz, not toUTF16z. If toStringz is renamed, then toUTF16z should follow suit. If it isn't, then perhaps toUTF16z should still be renamed, but my pull request doesn't do much with std.utf, so messing with toUTF16z isn't really the goal. It's just a side effect of renaming toStringz. So, if we keep toStringz, we may very well still rename toUTF16z, but the real question is whether we want to rename toStringz.

- Jonathan M Davis
Jun 16 2011
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
Jonathan M Davis  wrote:
 I'm afraid that I don't understand this view at all, given that
 string is a word and stringz isn't, though there are a few people
 that have expressed this view now.
If we had a function for to!ulong in this style, would you call it toULong or toUlong? I'd expect the latter - the word is "ulong", a single unit, not "U Long". D1 also did it this way: http://digitalmars.com/d/1.0/phobos/std_conv.html
Jun 16 2011
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-06-16 09:21, Adam D. Ruppe wrote:
 Jonathan M Davis wrote:
 I'm afraid that I don't understand this view at all, given that
 string is a word and stringz isn't, though there are a few people
 that have expressed this view now.
If we had a function for to!ulong in this style, would you call it toULong or toUlong? I'd expect the latter - the word is "ulong", a single unit, not "U Long". D1 also did it this way: http://digitalmars.com/d/1.0/phobos/std_conv.html
I'd probably end up calling it toULong, since U stands for unsigned and it looks a lot better that way, but since ulong is a type name, I could easily see it being named toUlong. However, I don't see what that has to do with toStringz. stringz is neither a type nor a word. The _only_ place that stringz is used AFAIK is in the name toStringz, where the z presumably stands for zero, as in zero-terminated string. If stringz were a type in D, then yeah, toStringz would be properly camelcased, but stringz isn't a type.

- Jonathan M Davis
Jun 16 2011
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
Jonathan M Davis wrote:
 stringz is neither a type nor a word.
It *is* a word. A stringz is a string that ends in zero. This family of traditional names (STRINGZ, ASCIZ, etc.) predates C itself.

On the other hand, there is *no such thing* as a StringZ. You'd never call it a "string zero". You'd call it a "zero terminated string", or maybe a "C string".

Changing it to "toCString" is completely pointless, a cost without a benefit, but at least it's not a completely nonsensical name like toStringZ.
Jun 16 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-16 12:16, Adam D. Ruppe wrote:
 Jonathan M Davis wrote:
 stringz is neither a type nor a word.
It *is* a word. A stringz is a string that ends in zero. This family of traditional names (STRINGZ, ASCIZ, etc.) predates C itself. On the other hand, there is *no such thing* as a StringZ. You'd never call it a "string zero". You'd call it a "zero terminated string", or maybe a "C string". Changing it to "toCString" is completely pointless, a cost without a benefit, but at least it's not a completely nonsensical name like toStringZ.
Well, I'd argue that stringz is just as nonsensical as stringZ. I have _never_ heard the term stringz used outside of D. Searching for it online brings up _nothing_ (not even D). And I wouldn't really consider an identifier which is completely uppercase to necessarily say anything about where the beginning and end of a word is anyway, so I wouldn't really consider that much of a precedent.

But it doesn't really matter. What matters is whether we as a group think that renaming toStringz is worthwhile, and if so, what we name it to. _No one_ thus far has liked the name toStringZ, even if they agree that toStringz should be changed. They're pretty much all voting for toCString. So, there's no way that it's going to end up as toStringZ.

I'm going to let this thread go a bit longer before I decide what I'm going to do, but from the looks of it, we're not reaching any kind of consensus on this, and I'm not going to change toStringz unless we actually reach a consensus on the matter.

The discussions on fixing the function names in Phobos (and in particular, std.string) resulted in an almost unanimous decision to fix the function names in Phobos to be properly camelcased. So, in general, it's worth making those changes. However, this particular discussion about this particular function is anything but unanimous, so unless a greater agreement is reached than is currently happening, toStringz isn't going to be changed.

- Jonathan M Davis
Jun 16 2011
parent reply Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:
On 16/06/2011 21:32, Jonathan M Davis wrote:

 I have _never_
 heard the term stringz used outside of D. Searching for it online brings up
 _nothing_ (not even D).
Have you tested what results you get when searching for "c string"? A...
Jun 17 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
Alix Pexton wrote:
 On 16/06/2011 21:32, Jonathan M Davis wrote:

 I have _never_
 heard the term stringz used outside of D. Searching for it online brings up
 _nothing_ (not even D).
Have you tested what results you get when searching for "c string"? A...
LOL.
Jun 17 2011
prev sibling next sibling parent Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:
On 16/06/2011 07:30, Jonathan M Davis wrote:

 1. Keep toStringz as it is (as well as toUTF16z) and either consider stringz
 to be some sort of word unique to the D community or just admit that we're not
 going to camelcase it because it would break too much code to do so.
I vote for no change, I like stringz as it is ^^ A...
Jun 16 2011
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
c_str
c_wstr

similarly to c_long
Jun 16 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.06.2011 13:01, schrieb Kagamin:
 c_str
 c_wstr
 
 similarly to c_long
So c_long converts a long to a c-long? (Is this a D function/type? I couldn't find it on the homepage)
Jun 16 2011
parent reply Mike Parker <aldacron gmail.com> writes:
On 6/16/2011 7:59 PM, Daniel Gibson wrote:
 Am 16.06.2011 13:01, schrieb Kagamin:
 c_str
 c_wstr

 similarly to c_long
So c_long converts a long to a c-long? (Is this a D function/type? I couldn't find it on the homepage)
It's not a conversion function, but an alias. It's declared, along with c_ulong, in core.stdc.config.
Jun 16 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.06.2011 14:56, schrieb Mike Parker:
 On 6/16/2011 7:59 PM, Daniel Gibson wrote:
 Am 16.06.2011 13:01, schrieb Kagamin:
 c_str
 c_wstr

 similarly to c_long
So c_long converts a long to a c-long? (Is this a D function/type? I couldn't find it on the homepage)
It's not a conversion function, but an alias. It's declared, along with c_ulong, in core.stdc.config.
So it has nothing to do with toStringz and similar names would be confusing.
Jun 16 2011
next sibling parent Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

 It's not a conversion function, but an alias. It's declared, along with
 c_ulong, in core.stdc.config.
So it has nothing to do with toStringz and similar names would be confusing.
Not for c++ people.
Jun 16 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

 So c_long converts a long to a c-long?
 (Is this a D function/type? I couldn't find it on the homepage)
It's not a conversion function, but an alias. It's declared, along with c_ulong, in core.stdc.config.
So it has nothing to do with toStringz and similar names would be confusing.
Maybe struct c_str{} to!c_str();
Jun 16 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.06.2011 20:30, schrieb Kagamin:
 Daniel Gibson Wrote:
 
 So c_long converts a long to a c-long?
 (Is this a D function/type? I couldn't find it on the homepage)
It's not a conversion function, but an alias. It's declared, along with c_ulong, in core.stdc.config.
So it has nothing to do with toStringz and similar names would be confusing.
May be struct c_str{} to!c_str();
No.
The whole point of toStringz() is that it returns a string that can be
fed to normal C functions that work on strings.
And C functions expect a "string" to be a char* (or wchar*) pointing to
a block of memory containing the string and terminated by '\0'.
The functionality of toStringz() should not change.
This is just about the name.

Cheers,
- Daniel
Jun 16 2011
parent reply Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

 No.
 The whole point of toStringz() is that it returns a string that can be
 fed to normal C functions that work on strings.
 And C functions expect a "string" to be a char* (or wchar*) pointing to
 a block of memory containing the string and terminated by '\0'.
 The functionality of toStringz() should not change.
 This is just about the name.
Why don't you like to!c_str(); ?
Jun 17 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 17.06.2011 09:03, schrieb Kagamin:
 Daniel Gibson Wrote:
 
 No.
 The whole point of toStringz() is that it returns a string that can be
 fed to normal C functions that work on strings.
 And C functions expect a "string" to be a char* (or wchar*) pointing to
 a block of memory containing the string and terminated by '\0'.
 The functionality of toStringz() should not change.
 This is just about the name.
Why don't you like to!c_str(); ?
What is to!c_str() supposed to return?
To be a useful alternative to toStringz() it needs to be

  char* to!c_str(string s) (or immutable(char)* or something)
i.e. the related toImpl looks like
  char* toImpl(c_str, string)(string s)
=> 3 types! (char*, c_str, string)

But the signature of toImpl is
  T toImpl(T, S)(S s)
so the related to's signature is
  T to(T)(S s)
or something like that.
This means that the return type T is the same type you instantiate to with.
That means to!c_str(string s) will return a c_str struct and not a char*.

And, as I explained in my previous post, C functions want a char*, not a struct c_str.

Cheers,
- Daniel
Jun 17 2011
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 17.06.2011 14:34, Daniel Gibson wrote:
 Am 17.06.2011 09:03, schrieb Kagamin:
 Daniel Gibson Wrote:

 No.
 The whole point of toStringz() is that it returns a string that can be
 fed to normal C functions that work on strings.
 And C functions expect a "string" to be a char* (or wchar*) pointing to
 a block of memory containing the string and terminated by '\0'.
 The functionality of toStringz() should not change.
 This is just about the name.
Why don't you like to!c_str(); ?
What is to!c_str() supposed to return? To be a useful alternative to toStringz() it needs to be char* to!c_str(string s) (or immutable(char)* or something) i.e. the related toImpl looks like char* toImpl(c_str, string)(string s) => 3 types! (char*, c_str, string) But the signature of toImpl is T toImpl(T, S)(S s) so the related to's signature is T to(T)(S s) or something like that. This means, that the return type T is the same type you instantiate to with. That means to!c_str(string s) will return a c_str struct and not a char* And, as I explained in my previous post, C functions want a char* not a struct c_str.
Fixable with alias this, but still I dislike this c_str artifact. It looks like you are converting type, while all it does is ensuring that there is enough 0 bytes past the end of string.

-- 
Dmitry Olshansky
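
The "fixable with alias this" idea could be sketched roughly as below. CStr is a hypothetical wrapper type, not something in Phobos, and the sketch constructs it directly instead of going through std.conv.to, so it says nothing about how toImpl would have to be written.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper around a zero-terminated pointer.
struct CStr
{
    immutable(char)* ptr;

    this(const(char)[] s)
    {
        ptr = toStringz(s); // zero-terminated data for C
    }

    // Lets a CStr be passed wherever the underlying pointer is accepted.
    alias ptr this;
}

unittest
{
    // The struct converts implicitly to the pointer that strlen expects.
    assert(strlen(CStr("abc")) == 3);
}
```

This makes the C call compile, but it also shows the objection: a whole type is introduced just to carry a pointer that a plain function could have returned.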
Jun 17 2011
parent Kagamin <spam here.lot> writes:
Dmitry Olshansky Wrote:

 Fixable with alias this, but still I dislike this c_str artifact.
 It looks like you are converting type, while all it does is ensuring 
 that there is enough 0 bytes past the end of string.
It should convert if you pass it, say, dchar[]. toStringz should also support conversion to mutable string.
Jun 17 2011
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

 What is to!c_str() supposed to return?
C string obviously.
 To be a useful alternative to toStringz() it needs to be
 
   char* to!c_str(string s) (or immutable(char)* or something)
 i.e. the related toImpl looks like
   char* toImpl(c_str, string)(string s)
 => 3 types! (char*, c_str, string)
 
 But the signature of toImpl is
   T toImpl(T, S)(S s)
 so the related to's signature is
   T to(T)(S s)
 or something like that.
I thought D templates allow specialization, so it should be possible to specialize toImpl for c_str like

  auto toImpl(T,S)(S s) if(is(T==c_str)) { return ... }
Jun 17 2011
parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 17.06.2011 15:16, schrieb Kagamin:
 Daniel Gibson Wrote:
 
 What is to!c_str() supposed to return?
C string obviously.
 To be a useful alternative to toStringz() it needs to be

   char* to!c_str(string s) (or immutable(char)* or something)
 i.e. the related toImpl looks like
   char* toImpl(c_str, string)(string s)
 => 3 types! (char*, c_str, string)

 But the signature of toImpl is
   T toImpl(T, S)(S s)
 so the related to's signature is
   T to(T)(S s)
 or something like that.
I thought, D templates allow specialization, so it should be possible to specialize toImpl for c_str like auto toImpl(T,S)(S s) if(T==c_str) { return ... }
This would be inconsistent with all other to implementations, as far as I know.
Jun 17 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

   char* to!c_str(string s) (or immutable(char)* or something)
 i.e. the related toImpl looks like
   char* toImpl(c_str, string)(string s)
 => 3 types! (char*, c_str, string)
btw, if we're talking about C api, the return type must not be char*, because c string character encoding is not utf-8, but char* implies utf-8.
Jun 17 2011
next sibling parent Daniel Gibson <metalcaedes gmail.com> writes:
Am 17.06.2011 15:21, schrieb Kagamin:
 Daniel Gibson Wrote:
 
   char* to!c_str(string s) (or immutable(char)* or something)
 i.e. the related toImpl looks like
   char* toImpl(c_str, string)(string s)
 => 3 types! (char*, c_str, string)
btw, if we're talking about C api, the return type must not be char*, because c string character encoding is not utf-8, but char* implies utf-8.
Of course we're talking about C api, that's the whole point of toStringz.

What about C functions that *do* deal with UTF8? And C functions that don't care about encoding? I don't think you usually use different types in C for ASCII, 8bit encodings and UTF8.
Jun 17 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-06-17 06:21, Kagamin wrote:
 Daniel Gibson Wrote:
   char* to!c_str(string s) (or immutable(char)* or something)
 
 i.e. the related toImpl looks like
 
   char* toImpl(c_str, string)(string s)
 
 => 3 types! (char*, c_str, string)
btw, if we're talking about C api, the return type must not be char*, because c string character encoding is not utf-8, but char* implies utf-8.
The C string encoding that the system expects depends on your locale and the function that you're calling. For the most part, it _is_ UTF-8 - at least on Linux.

- Jonathan M Davis
Jun 17 2011
prev sibling next sibling parent Mike Parker <aldacron gmail.com> writes:
On 6/16/2011 3:30 PM, Jonathan M Davis wrote:

 1. Keep toStringz as it is (as well as toUTF16z) and either consider stringz
 to be some sort of word unique to the D community or just admit that we're not
 going to camelcase it because it would break too much code to do so.
My vote goes for this one.
Jun 16 2011
prev sibling next sibling parent KennyTM~ <kennytm gmail.com> writes:
+1 toCString
+0 toStringz
-1 toStringZ & toUTF16Z
Jun 16 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Wed, 15 Jun 2011 23:30:03 -0700, Jonathan M Davis wrote:

 Okay. I have an open pull request whose main goal is to rename various
 string and character functions so that they're properly camelcased (as
 has been discussed previous in this group):
 https://github.com/D-Programming- Language/phobos/pull/101
 
 All in all, I think that the renaming in there is fairly obvious and
 non- controversial. However, we have the problem of
 std.string.toStringz. [...]
As I said in the pull request, I'd like to keep toStringz the way it is. -Lars
Jun 16 2011
prev sibling next sibling parent reply Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Jonathan M Davis Wrote:

 
 1. Keep toStringz as it is (as well as toUTF16z) and either consider stringz 
 to be some sort of word unique to the D community or just admit that we're not 
 going to camelcase it because it would break too much code to do so.
 
vote++.

1) If it ain't broke, don't fix it. Too much disruption for too little gain.

2) In a language that uses "enum" to mean "manifest constant", worrying about an upper or lower case z is straining at a gnat and swallowing a camel.

Paul
Jun 16 2011
next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 16 Jun 2011 13:44:07 -0400, Paul D. Anderson wrote:

 2) In a language that uses "enum" to mean "manifest constant", worrying
 about an upper or lower case z is straining at a gnat and swallowing a
 camel.
:D
Jun 16 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-06-16 10:44, Paul D. Anderson wrote:
 Jonathan M Davis Wrote:
 1. Keep toStringz as it is (as well as toUTF16z) and either consider
 stringz to be some sort of word unique to the D community or just admit
 that we're not going to camelcase it because it would break too much
 code to do so.
 vote++.
 1) If it ain't broke, don't fix it. Too much disruption for too little gain.
 2) In a language that uses "enum" to mean "manifest constant", worrying
 about an upper or lower case z is straining at a gnat and swallowing a camel.
It has been made pretty clear in discussions in this group before that in the general case, the consensus is that we want Phobos' function names to consistently follow Phobos' naming conventions (which means camelcased starting with a lowercase letter in the case of functions), even if it means breaking code in the short run in order to fix it. So, following that, if toStringz isn't properly camelcased (and I really don't understand anyone who thinks that it is), then it should be renamed. Whether it's the biggest problem in the language or library or not is irrelevant.

Based on past discussions in this group, one would think that most people would want toStringz to be changed to be properly camelcased. However, it _is_ a function which is used a _lot_ and changing it will break a lot of code, so if we change the name, it needs to be worth doing so. I'm just trying to find out if the community at large thinks that it's worth changing toStringz to be properly camelcased given the cost, and if so, whether it's best to just camelcase it properly (toStringZ) or to rename it entirely.

It's clear that in the general case, the community believes that it's worth it. The question is whether they believe that it's worth it in this particular case. I don't find whether there are other, bigger issues in the language or library to be particularly relevant unless they affect this particular issue.

- Jonathan M Davis
Jun 16 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
It would be cool if with each release we distributed a tool that can
at least do partial renaming of old function names to new ones. This
should ease porting a codebase to a new version of DMD/Phobos. Think
about how Python has the 2to3 tool, except our tool might just do a
minimal search & replace between two small versions.
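
As a rough illustration of what such a minimal search & replace could look like, here is a sketch in Python. Nothing here is an existing DMD/Phobos tool: the function name rename_identifiers and the RENAMES table are hypothetical, and the new names in the table are just the proposals from this thread, not names Phobos has adopted.

```python
import re

# Hypothetical rename table. The right-hand names are only proposals
# from this thread, not confirmed Phobos names.
RENAMES = {
    "toStringz": "toCString",
    "toUTF16z": "toWCString",
}


def rename_identifiers(source, renames=RENAMES):
    """Replace whole-identifier occurrences of old names with new ones.

    This is plain text substitution: it does not parse D, so it will
    also rewrite matches inside comments and string literals.
    """
    if not renames:
        return source
    # \b on both sides keeps names like "mytoStringz_old" untouched.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(old) for old in renames) + r")\b"
    )
    return pattern.sub(lambda m: renames[m.group(0)], source)


if __name__ == "__main__":
    print(rename_identifiers("auto p = toStringz(msg);"))
    # prints: auto p = toCString(msg);
```

Since it only matches text, this would be the "partial" kind of renaming: good enough for a mechanical first pass, with the compiler's deprecation messages catching whatever it misses or gets wrong.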
Jun 16 2011
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
Daniel Gibson Wrote:

 This would be inconsistent with all other to implementations, as far as
 I know.
Inconsistency is the very reason for template specialization to exist.
Jun 17 2011
parent David Nadlinger <see klickverbot.at> writes:
On 6/17/11 3:24 PM, Kagamin wrote:
 Daniel Gibson Wrote:

 This would be inconsistent with all other to implementations, as far as
 I know.
Inconsistency is the very reason for template specialization to exist.
No, it would be inconsistent from a user's point of view, since for all other types, to!Foo(xyz) returns a Foo, and not something else.

David
Jun 17 2011
prev sibling parent Kagamin <spam here.lot> writes:
 No, it would be inconsistent from a user's point of view, since for all 
 other types, to!Foo(xyz) returns a Foo, and not something else.

 David
Well, yes, to!c_str(xyz) returns a C string, and not something else.
Jun 20 2011