digitalmars.D - Idea: Introduce zero-terminated string specifier
- Andrej Mitrovic (18/18) Sep 28 2012 I've noticed I'm having to do a lot of to!string calls when I want to
- Alex Rønne Petersen (8/26) Sep 28 2012 While the idea is reasonable, the problem then becomes that if you
- Adam D. Ruppe (9/12) Sep 28 2012 That's the same risk with to!string(), yes? We aren't really
- Piotr Szturmaj (4/12) Oct 01 2012 Why not specialize current "%s" for character pointer types so it will
- Jakob Ovrum (3/12) Oct 01 2012 It's not safe to assume that pointers to characters are generally
- Piotr Szturmaj (2/14) Oct 01 2012 Yes, but programmer should know what he's passing anyway.
- Paulo Pinto (7/28) Oct 01 2012 The thinking "the programmer should" only works in one man teams.
- Piotr Szturmaj (23/45) Oct 01 2012 I experienced such team at my previous work and I know what you mean. My...
- Johannes Pfau (7/60) Oct 01 2012 If CString implemented a toString method (probably the variant taking a
- Piotr Szturmaj (12/38) Oct 01 2012 I reworked this example to form a forward range:
- Andrej Mitrovic (8/12) Oct 01 2012 I don't think you can reliably do that because of semantics w.r.t.
- Piotr Szturmaj (4/19) Oct 02 2012 I think that align(1) structs that wrap a single value should be treated...
- Andrej Mitrovic (2/4) Oct 01 2012 missing "in another way" there.
- Jonathan M Davis (26/41) Oct 01 2012 aks
- Piotr Szturmaj (6/33) Oct 01 2012 Imagine you're serializing great amount of text when some of the text
- Steven Schveighoffer (25/35) Oct 01 2012 to!string necessarily allocates, I think that is not a small problem.
- deadalnix (5/5) Sep 30 2012 If you know that a string is 0 terminated, you can easily create a slice...
- Paulo Pinto (6/12) Sep 30 2012 +1
- Vladimir Panteleev (7/10) Sep 30 2012 The problem is that, unsurprisingly, most C APIs (not just libc,
- Vladimir Panteleev (2/8) Sep 30 2012 That's what to!string already does.
- Muhtar (5/15) Sep 30 2012 I agree with you...
- deadalnix (2/11) Oct 01 2012 How does to!string know that the string is 0 terminated ?
- Vladimir Panteleev (2/16) Oct 01 2012 By convention (it doesn't).
- deadalnix (2/17) Oct 01 2012 It is unsafe as hell oO
- Vladimir Panteleev (3/9) Oct 01 2012 Forcing the programmer to put strlen calls everywhere in his code
- deadalnix (3/13) Oct 02 2012 I make the library safer. If the programmer manipulate unsafe construct
- Paulo Pinto (5/21) Oct 04 2012 Trusting the programmer is what brought upon us the wrath of
- Andrej Mitrovic (4/9) Sep 30 2012 What does that have to do with writef()? You can call to!string, but
- Paulo Pinto (10/26) Sep 30 2012 You should anyway wrap those APIs not to pollute D call with
- Rob T (12/14) Oct 01 2012 I have to agree, esp when it applies to pointers.
- Walter Bright (8/13) Oct 01 2012 Of course, using strlen() is always going to be unsafe. But having %zs
- Steven Schveighoffer (31/46) Oct 01 2012 What about %s just working with zero-terminated strings?
- Walter Bright (6/13) Oct 02 2012 As a matter of principle, I really don't like gobs of Phobos functions
- Andrei Alexandrescu (7/15) Oct 02 2012 Well there are some possible reasons. Clearly useful functionality
- Steven Schveighoffer (15/23) Oct 02 2012 This, arguably, is one of the most important aspects of C to support.
- David Nadlinger (5/10) Oct 02 2012 I didn't look it up, so I could be making quite a fool of myself
- Steven Schveighoffer (10/20) Oct 02 2012 On Tue, 02 Oct 2012 15:17:42 -0400, David Nadlinger ...
- David Nadlinger (4/21) Oct 02 2012 Well, make it to!char(char*) then! ;)
- David Nadlinger (4/5) Oct 02 2012 Oh dear, this doesn't get better: Of course, I've meant to write
- Steven Schveighoffer (17/21) Oct 02 2012 On Tue, 02 Oct 2012 15:35:47 -0400, David Nadlinger ...
- Regan Heath (6/10) Oct 03 2012 :D
- Steven Schveighoffer (12/21) Oct 03 2012 Almost what I was thinking.
- Regan Heath (12/34) Oct 04 2012 True.
- deadalnix (2/17) Oct 02 2012 Exactly my point.
- Andrej Mitrovic (16/28) Oct 02 2012 How does it hide anything if you have to explicitly mark the format
- Jakob Ovrum (4/27) Oct 02 2012 writefln cannot be @safe if it has to support an unsafe format
- Andrej Mitrovic (5/8) Oct 02 2012 Ah damn I completely forgot about @safe. I tend to avoid recent features...
- H. S. Teoh (9/13) Oct 02 2012 [...]
- Vladimir Panteleev (9/18) Sep 30 2012 I just checked and std.conv.to always allocates a copy, even when
- Jonathan M Davis (8/20) Oct 02 2012 The format string is a runtime argument, so nothing can be proven about ...
- H. S. Teoh (15/37) Oct 02 2012 [...]
- Jakob Ovrum (10/22) Oct 02 2012 It doesn't matter if the argument is known at compile-time or
- Jonathan M Davis (7/13) Oct 02 2012 Yeah. You basically _never_ just mark @trusted and call it a day. You on...
I've noticed I'm having to do a lot of to!string calls when I want to call the versatile writef() function. So I was thinking, why not introduce a special zero-terminated string specifier which would both alleviate the need to call to!string and would probably save on needless memory allocation. If all we want to do is print something, why waste time duplicating a string?

Let's say we call the new specifier %zs (we can debate for the actual name):

    extern(C) const(void)* GetName(); // e.g. some C api functions..
    extern(C) const(void)* GetLastName();

Before:

    writefln("Name %s, Last Name %s", to!string(GetName()), to!string(GetLastName()));

After:

    writefln("Name %zs, Last Name %zs", GetName(), GetLastName());

Of course in this simple case you could just use printf(), but remember that writef() is much more versatile and allows you to specify %s to match any type. It would be great to match printf's original meaning of %s with another specifier.
Sep 28 2012
On 29-09-2012 04:08, Andrej Mitrovic wrote:
> I've noticed I'm having to do a lot of to!string calls when I want to call the versatile writef() function. So I was thinking, why not introduce a special zero-terminated string specifier which would both alleviate the need to call to!string and would probably save on needless memory allocation. If all we want to do is print something, why waste time duplicating a string?
>
> Let's say we call the new specifier %zs (we can debate for the actual name):
>
>     extern(C) const(void)* GetName(); // e.g. some C api functions..
>     extern(C) const(void)* GetLastName();
>
> Before:
>
>     writefln("Name %s, Last Name %s", to!string(GetName()), to!string(GetLastName()));
>
> After:
>
>     writefln("Name %zs, Last Name %zs", GetName(), GetLastName());
>
> Of course in this simple case you could just use printf(), but remember that writef() is much more versatile and allows you to specify %s to match any type. It would be great to match printf's original meaning of %s with another specifier.

While the idea is reasonable, the problem then becomes that if you accidentally pass a non-zero terminated char* to %sz, all hell breaks loose just like with printf.

--
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Sep 28 2012
On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
> While the idea is reasonable, the problem then becomes that if you accidentally pass a non-zero terminated char* to %sz, all hell breaks loose just like with printf.

That's the same risk with to!string(), yes? We aren't really losing anything by adding it.

Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address. I think this should be simply disallowed. If you want that, you can use %x, and if you want it printed, that's where the new %z comes in.
Sep 28 2012
Adam D. Ruppe wrote:
> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>> While the idea is reasonable, the problem then becomes that if you accidentally pass a non-zero terminated char* to %sz, all hell breaks loose just like with printf.
>
> That's the same risk with to!string(), yes? We aren't really losing anything by adding it.
>
> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.

Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
Oct 01 2012
On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
> Adam D. Ruppe wrote:
>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>
> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.

It's not safe to assume that pointers to characters are generally null terminated.
Oct 01 2012
Jakob Ovrum wrote:
> On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
>> Adam D. Ruppe wrote:
>>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>>
>> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
>
> It's not safe to assume that pointers to characters are generally null terminated.

Yes, but programmer should know what he's passing anyway.
Oct 01 2012
On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:
> Jakob Ovrum wrote:
>> On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
>>> Adam D. Ruppe wrote:
>>>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>>>
>>> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
>>
>> It's not safe to assume that pointers to characters are generally null terminated.
>
> Yes, but programmer should know what he's passing anyway.

The thinking "the programmer should" only works in one man teams.

As soon as you start having teams with disparate programming knowledge among team members, you can forget everything about "the programmer should".

--
Paulo
Oct 01 2012
Paulo Pinto wrote:
> On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:
>> Jakob Ovrum wrote:
>>> On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
>>>> Adam D. Ruppe wrote:
>>>>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>>>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>>>>
>>>> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
>>>
>>> It's not safe to assume that pointers to characters are generally null terminated.
>>
>> Yes, but programmer should know what he's passing anyway.
>
> The thinking "the programmer should" only works in one man teams. As soon as you start having teams with disparate programming knowledge among team members, you can forget everything about "the programmer should".

I experienced such a team at my previous work and I know what you mean. My original thought was based on telling writef that I want to print a null-terminated string rather than an address.

to!string will surely work, but it implies double iteration, one in to!string to calculate length (seeking for 0 char) and one in writef (printing). With long strings this is suboptimal. What about something like this:

    struct CString(T)
        if (isSomeChar!T)
    {
        T* str;
    }

    @property auto cstring(S : T*, T)(S str)
        if (isSomeChar!T)
    {
        return CString!T(str);
    }

    string test = "abc";
    immutable(char)* p = test.ptr;

    writefln("%s", p.cstring); // prints "abc"

Here the char pointer type is "annotated" as null terminated string and writefln can use this information.
Oct 01 2012
On Mon, 01 Oct 2012 13:22:46 +0200, Piotr Szturmaj <bncrbme jadamspam.pl> wrote:
> Paulo Pinto wrote:
>> On Monday, 1 October 2012 at 09:42:08 UTC, Piotr Szturmaj wrote:
>>> Jakob Ovrum wrote:
>>>> On Monday, 1 October 2012 at 09:17:52 UTC, Piotr Szturmaj wrote:
>>>>> Adam D. Ruppe wrote:
>>>>>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>>>>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>>>>>
>>>>> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
>>>>
>>>> It's not safe to assume that pointers to characters are generally null terminated.
>>>
>>> Yes, but programmer should know what he's passing anyway.
>>
>> The thinking "the programmer should" only works in one man teams. As soon as you start having teams with disparate programming knowledge among team members, you can forget everything about "the programmer should".
>
> I experienced such team at my previous work and I know what you mean. My original thoughts was based on telling writef that I want print a null-terminated string rather than address.
>
> to!string will surely work, but it implies double iteration, one in to!string to calculate length (seeking for 0 char) and one in writef (printing). With long strings this is suboptimal. What about something like this:
>
>     struct CString(T)
>         if (isSomeChar!T)
>     {
>         T* str;
>     }
>
>     @property auto cstring(S : T*, T)(S str)
>         if (isSomeChar!T)
>     {
>         return CString!T(str);
>     }
>
>     string test = "abc";
>     immutable(char)* p = test.ptr;
>
>     writefln("%s", p.cstring); // prints "abc"
>
> Here the char pointer type is "annotated" as null terminated string and writefln can use this information.

If CString implemented a toString method (probably the variant taking a sink delegate), this would already work.

I'm not sure about performance though: Isn't writing out bigger buffers a lot faster than writing single chars? You could print every char individually, but wouldn't a p[0 .. strlen(p)] usually be faster?
Oct 01 2012
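A minimal sketch of the sink-based toString mentioned above, restricted to plain char for brevity. The names CString and cstring follow Piotr's example; the body is an assumption about how it could be filled in, not code posted in the thread. It hands the whole slice to the sink in one call, which is the "bigger buffer" variant rather than char-by-char output.

```d
import core.stdc.string : strlen;
import std.format : format;
import std.stdio : writefln;

/// Wrapper that marks a char pointer as a zero-terminated string.
struct CString
{
    const(char)* ptr;

    /// Called by std.format for "%s"; passes the text to the sink as one slice.
    /// The caller must guarantee that ptr is null or zero-terminated.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        if (ptr !is null)
            sink(ptr[0 .. strlen(ptr)]);
    }
}

/// Hypothetical helper so that p.cstring works through UFCS.
CString cstring(const(char)* p)
{
    return CString(p);
}

void main()
{
    // D string literals are zero-terminated, so .ptr is a valid C string here.
    const(char)* p = "abc".ptr;
    writefln("%s", p.cstring);
}
```

The string is still walked twice (once by strlen, once when written), but no copy is allocated.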
Johannes Pfau wrote:
>>     struct CString(T)
>>         if (isSomeChar!T)
>>     {
>>         T* str;
>>     }
>>
>>     @property auto cstring(S : T*, T)(S str)
>>         if (isSomeChar!T)
>>     {
>>         return CString!T(str);
>>     }
>>
>>     string test = "abc";
>>     immutable(char)* p = test.ptr;
>>
>>     writefln("%s", p.cstring); // prints "abc"
>>
>> Here the char pointer type is "annotated" as null terminated string and writefln can use this information.
>
> If CString implemented a toString method (probably the variant taking a sink delegate), this would already work.

I reworked this example to form a forward range: http://dpaste.dzfl.pl/7ab1eeec

The major advantage over "%zs" is that it could be used anywhere, not only with writef(). For example C binding writers may change:

    extern(C) char* getstr();

to

    extern(C) cstring getstr();

so the string may be immediately used with writef();

> I'm not sure about performance though: Isn't writing out bigger buffers a lot faster than writing single chars? You could print every char individually, but wouldn't a p[0 .. strlen(p)] usually be faster?

I think it internally prints single characters anyway. At least it must test each character if it's not zero valued. strlen() does that.
Oct 01 2012
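The linked paste is not reproduced in this thread, so the following is only a rough sketch of what a forward range over a C string could look like. CStringRange and its members are hypothetical names, and only plain char is handled. The range tests each character against zero as it advances, so no separate strlen pass is needed.

```d
import std.algorithm.comparison : equal;
import std.range.primitives : isForwardRange;
import std.stdio : writeln;

/// Forward range that walks a zero-terminated string one char at a time.
/// The caller must guarantee that ptr is null or zero-terminated.
struct CStringRange
{
    const(char)* ptr;

    @property bool empty() const { return ptr is null || *ptr == '\0'; }
    @property char front() const { return *ptr; }
    void popFront() { ++ptr; }
    @property CStringRange save() const { return CStringRange(ptr); }
}

static assert(isForwardRange!CStringRange);

void main()
{
    // A string literal is zero-terminated, so its .ptr is safe to walk.
    auto r = CStringRange("abc".ptr);
    writeln(equal(r, "abc"));
}
```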
On 10/1/12, Piotr Szturmaj <bncrbme jadamspam.pl> wrote:
> For example C binding writers may change:
>
>     extern(C) char* getstr();
>
> to
>
>     extern(C) cstring getstr();

I don't think you can reliably do that because of semantics w.r.t. passing parameters on the stack vs in registers based on whether a type is a pointer or not. I've had this sort of bug when wrapping C++ where the C++ compiler was passing a parameter in one way but the D compiler expected the parameters to be passed, simply because I tried to be clever and fake a return type. See:

http://forum.dlang.org/thread/mailman.1547.1346632732.31962.d.gnu puremagic.com#post-mailman.1557.1346690320.31962.d.gnu:40puremagic.com
Oct 01 2012
Andrej Mitrovic wrote:
> On 10/1/12, Piotr Szturmaj <bncrbme jadamspam.pl> wrote:
>> For example C binding writers may change:
>>
>>     extern(C) char* getstr();
>>
>> to
>>
>>     extern(C) cstring getstr();
>
> I don't think you can reliably do that because of semantics w.r.t. passing parameters on the stack vs in registers based on whether a type is a pointer or not. I've had this sort of bug when wrapping C++ where the C++ compiler was passing a parameter in one way but the D compiler expected the parameters to be passed, simply because I tried to be clever and fake a return type. See:
>
> http://forum.dlang.org/thread/mailman.1547.1346632732.31962.d.gnu puremagic.com#post-mailman.1557.1346690320.31962.d.gnu:40puremagic.com

I think that align(1) structs that wrap a single value should be treated as its type. After all they have the same size and representation. I don't know how this works now, though.
Oct 02 2012
On 10/1/12, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
> but the D compiler expected the parameters to be passed

missing "in another way" there.
Oct 01 2012
On Monday, October 01, 2012 11:18:16 Piotr Szturmaj wrote:
> Adam D. Ruppe wrote:
>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>> While the idea is reasonable, the problem then becomes that if you accidentally pass a non-zero terminated char* to %sz, all hell breaks loose just like with printf.
>>
>> That's the same risk with to!string(), yes? We aren't really losing anything by adding it.
>>
>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>
> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.

Honestly? One of Phobos' best features is the fact that %s works for _everything_. Specializing it for _anything_ would be horrible. It would also break a _ton_ of code. Who even uses %d, %f, etc. if they don't need to use format specifiers? It's just way simpler to always use %s.

I'm not completely against the idea of %zs, but I confess that I have to wonder what someone is doing if they really need to print zero-terminated strings all that often in D for anything other than quick debugging (in which case to!string works just fine), since only stuff directly interacting with C code will even care. And if it's really that big a deal, and you're constantly interacting with C code like that, you can always use the appropriate C function - printf - and then it's a non-issue.

- Jonathan M Davis
Oct 01 2012
Jonathan M Davis wrote:
> On Monday, October 01, 2012 11:18:16 Piotr Szturmaj wrote:
>> Adam D. Ruppe wrote:
>>> On Saturday, 29 September 2012 at 02:11:12 UTC, Alex Rønne Petersen wrote:
>>>> While the idea is reasonable, the problem then becomes that if you accidentally pass a non-zero terminated char* to %sz, all hell breaks loose just like with printf.
>>>
>>> That's the same risk with to!string(), yes? We aren't really losing anything by adding it.
>>>
>>> Also this reminds me of the utter uselessness of the current behavior of "%s" and a pointer - it prints the address.
>>
>> Why not specialize current "%s" for character pointer types so it will print null terminated strings? It's always possible to cast to void* to print an address.
>
> Honestly? One of Phobos' best features is the fact that %s works for _everything_. Specializing it for _anything_ would be horrible. It would also break a _ton_ of code. Who even uses %d, %f, etc. if they don't need to use format specifiers? It's just way simpler to always use %s.

OK, I think you're right.

> I'm not completely against the idea of %zs, but I confess that I have to wonder what someone is doing if they really need to print zero-terminated strings all that often in D for anything other than quick debugging (in which case to!string works just fine), since only stuff directly interacting with C code will even care. And if it's really that big a deal, and you're constantly interacting with C code like that, you can always use the appropriate C function - printf - and then it's a non-issue.

Imagine you're serializing great amount of text when some of the text come from a C library (as null-terminated char*) and you're using format() with %s specifiers. Direct handling of C strings would be just faster because it avoids double iteration.
Oct 01 2012
On Mon, 01 Oct 2012 05:54:30 -0400, Jonathan M Davis <jmdavisProg gmx.com> wrote:
> I'm not completely against the idea of %zs, but I confess that I have to wonder what someone is doing if they really need to print zero-terminated strings all that often in D for anything other than quick debugging (in which case to!string works just fine)

to!string necessarily allocates, I think that is not a small problem.

I think %s should treat char * as if it is zero-terminated.

Invariably, you will have two approaches to this problem:

1. writefln("%s", mycstring); => 0xptrlocation
2. hm.., I guess I'll just use to!string => vulnerable to non-zero-terminated strings!

or

2. hm.., to!string will allocate, I guess I'll just use writefln("%s", mycstring[0..strlen(mycstring)]); => vulnerable to non-zero-terminated strings!

So how is forcing the user to use one of these methods any safer? I don't see any casts in there...

> , since only stuff directly interacting with C code will even care. And if it's really that big a deal, and you're constantly interacting with C code like that, you can always use the appropriate C function - printf - and then it's a non-issue.

Nobody should ever *ever* use printf, unless you are debugging druntime.

It's not a non-issue. printf has no type checking whatsoever. Using it means 1) non-typechecked code (i.e., accidentally pass an int instead of a string, or forget to pass an arg for a specifier, and you've crashed your code), and 2) you have locked yourself into using C's streams (something I hope to remedy in the future).

Besides, it doesn't *gain* you anything over having writef(ln) just support char *.

Bottom line -- if to!string(arg) is supported, writefln("%s", arg) should be supported, and do the same thing.

-Steve
Oct 01 2012
If you know that a string is 0 terminated, you can easily create a slice from it as follow :

    char* myZeroTerminatedString;
    char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];

It is clean and avoid to modify the stdlib in an unsafe way.
Sep 30 2012
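The two lines in the post above read as shorthand rather than compilable D (the second declaration has no variable name). A runnable sketch of the same idea might look like this; sliceOf is a hypothetical helper name, not an existing Phobos function.

```d
import core.stdc.string : strlen;
import std.stdio : writefln;

/// Hypothetical helper: view a zero-terminated C string as a D slice.
/// No copy is made. The caller must guarantee that the terminator exists,
/// which is why the function is marked @system.
const(char)[] sliceOf(const(char)* p) @system
{
    return p is null ? null : p[0 .. strlen(p)];
}

void main()
{
    // D string literals are zero-terminated, so .ptr is a valid C string here.
    const(char)* name = "Andrej".ptr;
    writefln("Name %s", sliceOf(name));
}
```

The slice aliases the C buffer, so it is only valid for as long as the C side keeps that memory alive and unchanged.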
On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>
>     char* myZeroTerminatedString;
>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>
> It is clean and avoid to modify the stdlib in an unsafe way.

+1

We don't need to preserve C's design errors regarding strings and vectors.

--
Paulo
Sep 30 2012
On Sunday, 30 September 2012 at 18:58:11 UTC, Paulo Pinto wrote:
> +1
>
> We don't need to preserve C's design errors regarding strings and vectors.

The problem is that, unsurprisingly, most C APIs (not just libc, but also most C libraries and OS APIs) use zero-terminated strings. The philosophy of ignoring the existence of C strings throughout all of D makes working with such APIs needlessly verbose (and sometimes annoying, as D code will compile and produce unexpected results).
Sep 30 2012
On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>
>     char* myZeroTerminatedString;
>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>
> It is clean and avoid to modify the stdlib in an unsafe way.

That's what to!string already does.
Sep 30 2012
On Sunday, 30 September 2012 at 19:58:16 UTC, Vladimir Panteleev wrote:
> On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>>     char* myZeroTerminatedString;
>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>
>> It is clean and avoid to modify the stdlib in an unsafe way.
>
> That's what to!string already does.

I agree with you...
Sep 30 2012
On 30/09/2012 21:58, Vladimir Panteleev wrote:
> On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>>     char* myZeroTerminatedString;
>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>
>> It is clean and avoid to modify the stdlib in an unsafe way.
>
> That's what to!string already does.

How does to!string know that the string is 0 terminated ?
Oct 01 2012
On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:
> On 30/09/2012 21:58, Vladimir Panteleev wrote:
>> On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
>>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>>
>>>     char* myZeroTerminatedString;
>>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>>
>>> It is clean and avoid to modify the stdlib in an unsafe way.
>>
>> That's what to!string already does.
>
> How does to!string know that the string is 0 terminated ?

By convention (it doesn't).
Oct 01 2012
On 01/10/2012 13:29, Vladimir Panteleev wrote:
> On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:
>> On 30/09/2012 21:58, Vladimir Panteleev wrote:
>>> On Sunday, 30 September 2012 at 18:31:00 UTC, deadalnix wrote:
>>>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>>>
>>>>     char* myZeroTerminatedString;
>>>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>>>
>>>> It is clean and avoid to modify the stdlib in an unsafe way.
>>>
>>> That's what to!string already does.
>>
>> How does to!string know that the string is 0 terminated ?
>
> By convention (it doesn't).

It is unsafe as hell oO
Oct 01 2012
On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:
> On 01/10/2012 13:29, Vladimir Panteleev wrote:
>> On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:
>>> How does to!string know that the string is 0 terminated ?
>>
>> By convention (it doesn't).
>
> It is unsafe as hell oO

Forcing the programmer to put strlen calls everywhere in his code is not any safer.
Oct 01 2012
On 01/10/2012 22:33, Vladimir Panteleev wrote:
> On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:
>> On 01/10/2012 13:29, Vladimir Panteleev wrote:
>>> On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:
>>>> How does to!string know that the string is 0 terminated ?
>>>
>>> By convention (it doesn't).
>>
>> It is unsafe as hell oO
>
> Forcing the programmer to put strlen calls everywhere in his code is not any safer.

I make the library safer. If the programmer manipulate unsafe construct (like c strings) it is up to the programmer to ensure safety, not the lib.
Oct 02 2012
On Tuesday, 2 October 2012 at 13:07:46 UTC, deadalnix wrote:
> On 01/10/2012 22:33, Vladimir Panteleev wrote:
>> On Monday, 1 October 2012 at 12:12:52 UTC, deadalnix wrote:
>>> On 01/10/2012 13:29, Vladimir Panteleev wrote:
>>>> On Monday, 1 October 2012 at 10:56:36 UTC, deadalnix wrote:
>>>>> How does to!string know that the string is 0 terminated ?
>>>>
>>>> By convention (it doesn't).
>>>
>>> It is unsafe as hell oO
>>
>> Forcing the programmer to put strlen calls everywhere in his code is not any safer.
>
> I make the library safer. If the programmer manipulate unsafe construct (like c strings) it is up to the programmer to ensure safety, not the lib.

Trusting the programmer is what brought upon us the wrath of security exploits via buffer overflows.

--
Paulo
Oct 04 2012
On 9/30/12, deadalnix <deadalnix gmail.com> wrote:
> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>
>     char* myZeroTerminatedString;
>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>
> It is clean and avoid to modify the stdlib in an unsafe way.

What does that have to do with writef()? You can call to!string, but that's beside the point. The point was getting rid of this verbosity when using C APIs.
Sep 30 2012
On Sunday, 30 September 2012 at 20:27:16 UTC, Andrej Mitrovic wrote:
> On 9/30/12, deadalnix <deadalnix gmail.com> wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>>     char* myZeroTerminatedString;
>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>
>> It is clean and avoid to modify the stdlib in an unsafe way.
>
> What does that have to do with writef()? You can call to!string, but that's beside the point. The point was getting rid of this verbosity when using C APIs.

You should anyway wrap those APIs not to pollute D call with lower level APIs.

As such I don't find the verbosity, as you put it, that much of an issue.

Then again, I favor the Pascal family of languages for systems programming.

--
Paulo
Sep 30 2012
On Monday, 1 October 2012 at 06:58:41 UTC, Paulo Pinto wrote:
> You should anyway wrap those APIs not to pollute D call with lower level APIs.

I have to agree, esp when it applies to pointers.

We should not forget that one of the objectives of D is to make coding "safe" by getting rid of the need to use pointers and other unsafe features. It encourages safe practice by making safe practice much easier to do than using unsafe practice. It however allows unsafe practice where necessary, but the programmer has to intentionally do something extra to make that happen.

I think the suggestion of introducing a null string specifier fundamentally goes against the objectives of D, and if introduced will ultimately degrade the quality of the language.

--rt
Oct 01 2012
On 9/30/2012 11:31 AM, deadalnix wrote:
> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>
>     char* myZeroTerminatedString;
>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>
> It is clean and avoid to modify the stdlib in an unsafe way.

Of course, using strlen() is always going to be unsafe. But having %zs is equally unsafe for the same reason. deadalnix's example shows that adding a new format specifier %zs adds little value, but it gets much worse.

Since %zs is inherently unsafe, it hides such unsafety in a commonly used library function, which will infect everything else that transitively calls writefln with unsafety.

This makes %zs an unacceptable feature.
Oct 01 2012
On Mon, 01 Oct 2012 21:13:47 -0400, Walter Bright <newshound1 digitalmars.com> wrote:
> On 9/30/2012 11:31 AM, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>>     char* myZeroTerminatedString;
>>     char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>
>> It is clean and avoid to modify the stdlib in an unsafe way.
>
> Of course, using strlen() is always going to be unsafe. But having %zs is equally unsafe for the same reason. deadalnix's example shows that adding a new format specifier %zs adds little value, but it gets much worse.
>
> Since %zs is inherently unsafe, it hides such unsafety in a commonly used library function, which will infect everything else that transitively calls writefln with unsafety.
>
> This makes %zs an unacceptable feature.

What about %s just working with zero-terminated strings?

I was going to argue this point, but I just thought of a very very good counter-case for this.

    string x = "abc".idup; // no zero-terminator!
    writefln("%s", x.ptr);

What we don't want is for writefln to try and interpret the pointer as a C string. Not only is it bad, but even the code seems to suggest "Hey, this should print a pointer!"

The large underlying issue here is that C considers char * to be a zero-terminated string, and D considers it to be a pointer. This means any code which uses C calls heavily will have to awkwardly dance between both worlds.

I think there is some value in providing something that is *not* common to do the above work (convert char * to char[]). Hm...

    @system char[] zstr(char *s) { return s[0..strlen(s)]; }

provides:

    writefln("%s", zstr(s));

vs.

    writefln("%zs", s);

Arguably, nobody uses %zs, so even though writefln is common, the specifier is not. However, we can't require an import to use a bizarre specifier, and you can't link un-@safe code to a specifier, so the zstr concept is far superior in requiring the user to know what he is doing, and having the compiler enforce that.

Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?

-Steve
Oct 01 2012
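The zstr helper from the post above can be sketched as a complete program. This is only a minimal sketch, not Phobos code: the name zstr comes from the post, while the use of core.stdc.string.strlen, the inout qualifier, and the null check are assumptions added here for illustration.

```d
import core.stdc.string : strlen;

// Hypothetical helper following the post above (not part of Phobos).
// It is @system because it has to trust the caller that `s` points at a
// zero-terminated buffer; strlen keeps reading until it finds the '\0'.
@system inout(char)[] zstr(inout(char)* s)
{
    // Assumption made for this sketch: null is treated as an empty string.
    return s is null ? null : s[0 .. strlen(s)];
}

void main()
{
    import std.stdio : writefln;

    const(char)* p = "hello".ptr; // D string literals are zero-terminated
    writefln("%s", zstr(p));      // the slice shares memory with p, no copy
}
```

Because the helper is marked @system, an @safe caller cannot use it directly, which is the compiler-enforced property the post argues for.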
On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:
> However, we can't require an import to use a bizarre specifier, and you can't link un-@safe code to a specifier, so the zstr concept is far superior in requiring the user to know what he is doing, and having the compiler enforce that.

Yup.

> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?

As a matter of principle, I really don't like gobs of Phobos functions that are literally one liners. Phobos should not become a mile wide but inch deep library of trivia. It should consist of non-trivial, useful, and relatively deep functions.
Oct 02 2012
On 10/2/12 4:09 AM, Walter Bright wrote:
> On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:
>> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?
> As a matter of principle, I really don't like gobs of Phobos functions that are literally one liners. Phobos should not become a mile wide but inch deep library of trivia. It should consist of non-trivial, useful, and relatively deep functions.

Well there are some possible reasons. Clearly useful functionality that's nontrivial deserves being abstracted in a function. On the other hand, even a short function is valuable if frequent enough and deserving of a name. We have e.g. s.strip even though it's equivalent to s.stripLeft.stripRight.

Andrei
Oct 02 2012
On Tue, 02 Oct 2012 04:09:43 -0400, Walter Bright <newshound1 digitalmars.com> wrote:
> On 10/1/2012 7:22 PM, Steven Schveighoffer wrote:
>> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?
> As a matter of principle, I really don't like gobs of Phobos functions that are literally one liners. Phobos should not become a mile wide but inch deep library of trivia. It should consist of non-trivial, useful, and relatively deep functions.

This, arguably, is one of the most important aspects of C to support. There are lots of C functions which provide C strings. Yes, we don't want to promote using C strings, but to have one point of conversion so you *can* use safe strings is a good thing. In other words, the sooner you convert your zero-terminated strings to char slices, the better off you are. And if we label it @system code, it can't be misused in @safe code.

Why support zero-terminated strings as literals if it wasn't important? You could argue that things like system calls which return zero-terminated strings are as safe to use as string literals which you know have zero terminated values.

The only other alternative is to wrap those C functions with D ones that convert to char[]. I don't find this any more appealing.

-Steve
Oct 02 2012
On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven Schveighoffer wrote:
> @system char[] zstr(char *s) { return s[0..strlen(s)]; }
> […]
> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?

I didn't look it up, so I could be making quite a fool of myself right now, but doesn't to!string(char*) provide exactly that?

David
Oct 02 2012
On Tue, 02 Oct 2012 15:17:42 -0400, David Nadlinger <see klickverbot.at> wrote:
> On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven Schveighoffer wrote:
>> @system char[] zstr(char *s) { return s[0..strlen(s)]; }
>> […]
>> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?
> I didn't look it up, so I could be making quite a fool of myself right now, but doesn't to!string(char*) provide exactly that?

string is immutable. Must allocate.

You fool :)

just kidding, honest mistake.

-Steve
Oct 02 2012
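The difference between the two approaches in the exchange above can be made concrete. The following is a small sketch under stated assumptions: the buffer and variable names are invented for illustration, and the point is only that to!string has to produce an independent immutable copy while a slice aliases the original memory.

```d
import core.stdc.string : strlen;
import std.conv : to;

void main()
{
    // A mutable, zero-terminated buffer standing in for memory owned by C.
    char[6] buf = "hello\0";
    char* p = buf.ptr;

    string copy = to!string(p);      // immutable result, so a fresh copy
    char[] view = p[0 .. strlen(p)]; // slice: same memory, no allocation

    buf[0] = 'j';                    // mutate the underlying buffer

    assert(copy == "hello");         // the copy is unaffected
    assert(view == "jello");         // the view sees the change
}
```

The slice is cheaper, but it is only valid for as long as the C side keeps the buffer alive and unchanged, which is part of why the helper belongs in @system code.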
On Tuesday, 2 October 2012 at 19:31:33 UTC, Steven Schveighoffer wrote:
> On Tue, 02 Oct 2012 15:17:42 -0400, David Nadlinger <see klickverbot.at> wrote:
>> On Tuesday, 2 October 2012 at 02:22:33 UTC, Steven Schveighoffer wrote:
>>> @system char[] zstr(char *s) { return s[0..strlen(s)]; }
>>> […]
>>> Does it make sense for Phobos to provide such a shortcut in an obscure header somewhere? Like std.cstring? Or should we just say "roll your own if you need it"?
>> I didn't look it up, so I could be making quite a fool of myself right now, but doesn't to!string(char*) provide exactly that?
> string is immutable. Must allocate. You fool :)

Well, make it to!char(char*) then! ;)

David
Oct 02 2012
On Tuesday, 2 October 2012 at 19:34:31 UTC, David Nadlinger wrote:
> Well, make it to!char(char*) then! ;)

Oh dear, this doesn't get better: Of course, I meant to write »to!(char[])(char*)«.

David
Oct 02 2012
On Tue, 02 Oct 2012 15:35:47 -0400, David Nadlinger <see klickverbot.at> wrote:
> On Tuesday, 2 October 2012 at 19:34:31 UTC, David Nadlinger wrote:
>> Well, make it to!char(char*) then! ;)
> Oh dear, this doesn't get better: Of course, I've meant to write »to!(char[])(char*)«.

Right. I agree, this should not allocate (I think someone said it does, but it's probably not necessary to).

But still, what looks better?

auto x = SomeSystemCallThatReturnsACString();
writefln("%s", to!(char[])(x));
writefln("%s", zstr(x));

I want something easy to type, and not too difficult to visually parse.

In fact, a better solution would be to define a C string type (other than char *), and just pretend those system calls return that. Then support that C string type in writef.

-Steve
Oct 02 2012
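The idea of a dedicated C string type that plain %s understands can be sketched with a small wrapper struct. Everything here is an assumption made for illustration: the name CString, its single field, and the decision to print nothing for null are not taken from any existing library. The sketch relies only on std.format calling a toString overload that accepts a sink delegate.

```d
import core.stdc.string : strlen;
import std.format : format;

// Hypothetical wrapper type; name and layout are assumptions of this sketch.
struct CString
{
    const(char)* ptr;

    // std.format picks up a toString that writes into a sink delegate,
    // so the characters are forwarded without building a new string.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        if (ptr !is null)
            sink(ptr[0 .. strlen(ptr)]);
    }
}

void main()
{
    auto s = CString("from C".ptr); // literal, so zero-terminated
    assert(format("%s", s) == "from C");
}
```

With such a type, the unsafe step is constructing the wrapper from a raw pointer, while formatting it needs no special specifier.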
On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> In fact, a better solution would be to define a C string type (other than char *), and just pretend those system calls return that. Then support that C string type in writef.
>
> -Steve

:D
http://comments.gmane.org/gmane.comp.lang.d.general/97793
Oct 03 2012
On Wed, 03 Oct 2012 08:37:14 -0400, Regan Heath <regan netmail.co.nz> wrote:
> On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer <schveiguy yahoo.com> wrote:
>> In fact, a better solution would be to define a C string type (other than char *), and just pretend those system calls return that. Then support that C string type in writef.
>>
>> -Steve
> :D
> http://comments.gmane.org/gmane.comp.lang.d.general/97793

Almost what I was thinking. :)

Though, at that point, I don't think we need a special specifier for writef. %s works.

However, looking at the vast reach of these changes, I wonder if it's worth it. That's a lot of prototypes to C functions that have to change, and a large compiler change (treating string literals as CString instead of char *), just so C strings print out with writef.

Not to mention code that will certainly break...

-Steve
Oct 03 2012
On Thu, 04 Oct 2012 01:05:14 +0100, Steven Schveighoffer <schveiguy yahoo.com> wrote:
> On Wed, 03 Oct 2012 08:37:14 -0400, Regan Heath <regan netmail.co.nz> wrote:
>> On Tue, 02 Oct 2012 21:44:11 +0100, Steven Schveighoffer <schveiguy yahoo.com> wrote:
>>> In fact, a better solution would be to define a C string type (other than char *), and just pretend those system calls return that. Then support that C string type in writef.
>>>
>>> -Steve
>> :D
>> http://comments.gmane.org/gmane.comp.lang.d.general/97793
> Almost what I was thinking. :) Though, at that point, I don't think we need a special specifier for writef. %s works.

True.

> However, looking at the vast reach of these changes, I wonder if it's worth it. That's a lot of prototypes to C functions that have to change, and a large compiler change (treating string literals as CString instead of char *), just so C strings print out with writef.

That's not the only motivation. The change brings more type safety in general and should help to catch bugs, like for example the common one made by people just starting out with D (from a C/C++ background).

> Not to mention code that will certainly break...

Some code will definitely stop compiling, but it's debatable as to whether this code is not already "broken" to some degree. It's likely not as safe/robust as it could be.

R
Oct 04 2012
On 02/10/2012 03:13, Walter Bright wrote:
> On 9/30/2012 11:31 AM, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>> char* myZeroTerminatedString;
>> char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>>
>> It is clean and avoid to modify the stdlib in an unsafe way.
> Of course, using strlen() is always going to be unsafe. But having %zs is equally unsafe for the same reason.
>
> deadalnix's example shows that adding a new format specifier %zs adds little value, but it gets much worse. Since %zs is inherently unsafe, it hides such unsafety in a commonly used library function, which will infect everything else that transitively calls writefln with unsafety.
>
> This makes %zs an unacceptable feature.

Exactly my point.
Oct 02 2012
On 10/2/12, Walter Bright <newshound1 digitalmars.com> wrote:
> On 9/30/2012 11:31 AM, deadalnix wrote:
>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>
>> char* myZeroTerminatedString;
>> char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
> Since %zs is inherently unsafe, it hides such unsafety in a commonly used library function, which will infect everything else that transitively calls writefln with unsafety. This makes %zs an unacceptable feature.

How does it hide anything if you have to explicitly mark the format specifier as %zs? It would be documented, just like it's documented that passing pointers to garbage-collected memory to the C side is inherently unsafe.

> deadalnix's example shows that adding a new format specifier %zs adds little value.

It adds convenience, which is an important trait in this day and age. If that's not a concern, why is printf a symbol you can get your hands on as soon as you import std.stdio?

And if safety is a concern why is printf used in Phobos at all? I count 427 lines of printf calls in Phobos and 843 lines in Druntime (druntime might have a good excuse since it shouldn't import Phobos functions). Many of these calls in Phobos are not simple D string literal printf calls either.

Btw, some weeks ago when dstep was announced you were jumping for joy and were instantly proposing language changes to add better support for wrapping C. But asking for better library support is somehow controversial. I don't understand the double-standard.
Oct 02 2012
On Tuesday, 2 October 2012 at 21:30:35 UTC, Andrej Mitrovic wrote:
> On 10/2/12, Walter Bright <newshound1 digitalmars.com> wrote:
>> On 9/30/2012 11:31 AM, deadalnix wrote:
>>> If you know that a string is 0 terminated, you can easily create a slice from it as follow :
>>>
>>> char* myZeroTerminatedString;
>>> char[] myZeroTerminatedString[0 .. strlen(myZeroTerminatedString)];
>> Since %zs is inherently unsafe, it hides such unsafety in a commonly used library function, which will infect everything else that transitively calls writefln with unsafety. This makes %zs an unacceptable feature.
> How does it hide anything if you have to explicitly mark the format specifier as %zs? It would be documented, just like it's documented that passing pointers to garbage-collected memory to the C side is inherently unsafe.

writefln cannot be @safe if it has to support an unsafe format specifier. It's "hidden" because it affects every call to writefln, even if it doesn't use the unsafe format specifier.
Oct 02 2012
On 10/3/12, Jakob Ovrum <jakobovrum gmail.com> wrote:
> writefln cannot be @safe if it has to support an unsafe format specifier. It's "hidden" because it affects every call to writefln, even if it doesn't use the unsafe format specifier.

Ah damn I completely forgot about @safe. I tend to avoid recent features.. OK then I think my arguments are moot. Nevertheless I can always define a helper function for my own purposes I guess.

Sorry Walter for not taking @safe into account. :)
Oct 02 2012
On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
> On 10/3/12, Jakob Ovrum <jakobovrum gmail.com> wrote:
>> writefln cannot be @safe if it has to support an unsafe format specifier. It's "hidden" because it affects every call to writefln, even if it doesn't use the unsafe format specifier.
[...]

Hmm, this seems to impose unnecessary limitations on @safe. I guess the current language doesn't allow for a "conditionally-safe" tag where something can be implicitly marked @safe if it's provable at compile-time that it's safe?

T

--
Elegant or ugly code as well as fine or rude sentences have something in common: they don't depend on the language. -- Luca De Vitis
Oct 02 2012
On Saturday, 29 September 2012 at 02:07:38 UTC, Andrej Mitrovic wrote:
> I've noticed I'm having to do a lot of to!string calls when I want to call the versatile writef() function. So I was thinking, why not introduce a special zero-terminated string specifier which would both alleviate the need to call to!string and would probably save on needless memory allocation. If all we want to do is print something, why waste time duplicating a string?

I just checked and std.conv.to always allocates a copy, even when constness doesn't require it. It should not reallocate when constness doesn't change, or is a safe conversion (e.g. immutable -> const).

A discussion on a related topic (formatting of C strings results in unexpected behavior) is here:
http://d.puremagic.com/issues/show_bug.cgi?id=8384
Sep 30 2012
On Tuesday, October 02, 2012 18:21:30 H. S. Teoh wrote:
> On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
>> On 10/3/12, Jakob Ovrum <jakobovrum gmail.com> wrote:
>>> writefln cannot be @safe if it has to support an unsafe format specifier. It's "hidden" because it affects every call to writefln, even if it doesn't use the unsafe format specifier.
> [...]
> Hmm, this seems to impose unnecessary limitations on @safe. I guess the current language doesn't allow for a "conditionally-safe" tag where something can be implicitly marked @safe if it's provable at compile-time that it's safe?

The format string is a runtime argument, so nothing can be proven about it at compile time. If you want any kind of @safe inference, you need to use a template. If writefln took the format string as a template argument and generated different code (which was @safe or not depending on what it did) based on what was in the format string, then inference could take place, but otherwise no.

- Jonathan M Davis
Oct 02 2012
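The inference described in the post above can be illustrated on a toy scale. This is a sketch under stated assumptions, not a proposal for writefln: the function render and its two specifiers are hypothetical, and the only language feature relied on is that D infers @safe or @system separately for each instantiation of a function template.

```d
import core.stdc.string : strlen;

// Hypothetical function: `spec` stands in for a format specifier that is
// known at compile time because it is passed as a template argument.
const(char)[] render(string spec, T)(T arg)
{
    static if (spec == "zs")
        return arg[0 .. strlen(arg)]; // pointer slicing: inferred @system
    else static if (spec == "s")
        return arg;                   // plain slice: inferred @safe
    else
        static assert(0, "unsupported specifier: " ~ spec);
}

// Compiles, because the "s" instantiation is inferred @safe.
@safe const(char)[] safeCaller(string s)
{
    return render!"s"(s);
}

// The "zs" instantiation is only callable from @system or @trusted code.
@system const(char)[] systemCaller(const(char)* p)
{
    return render!"zs"(p);
}

void main()
{
    assert(safeCaller("abc") == "abc");
    assert(systemCaller("abc".ptr) == "abc");
}
```

The cost noted in the thread still applies: the specifier has to be a compile-time value, so a format string built at runtime could not be handled this way.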
On Tue, Oct 02, 2012 at 07:50:09PM -0700, Jonathan M Davis wrote:
> On Tuesday, October 02, 2012 18:21:30 H. S. Teoh wrote:
>> On Wed, Oct 03, 2012 at 03:07:14AM +0200, Andrej Mitrovic wrote:
>>> On 10/3/12, Jakob Ovrum <jakobovrum gmail.com> wrote:
>>>> writefln cannot be @safe if it has to support an unsafe format specifier. It's "hidden" because it affects every call to writefln, even if it doesn't use the unsafe format specifier.
>> [...]
>> Hmm, this seems to impose unnecessary limitations on @safe. I guess the current language doesn't allow for a "conditionally-safe" tag where something can be implicitly marked @safe if it's provable at compile-time that it's safe?
> The format string is a runtime argument, so nothing can be proven about it at compile time. If you want any kind of @safe inference, you need to use a template. If writefln took the format string as a template argument and generated different code (which was @safe or not depending on what it did) based on what was in the format string, then inference could take place, but otherwise no.
[...]

Yes that's what I mean. If the format string is known at compile-time and known to involve only @safe code, then this would work. Something like this might work if CTFE is used to parse the format string piecemeal (i.e., translate something like writefln("%d %s",x,y) into write!int(x); write!string(" "); write!string(y)). The safe instances of write!T(...) will be marked @safe.

But it does seem like a lot of work just so we can use @safe, though. I suppose we could just use @trusted and call it a day.

T

--
Claiming that your operating system is the best in the world because more people use it is like saying McDonalds makes the best food in the world. -- Carl B. Constantine
Oct 02 2012
On Wednesday, 3 October 2012 at 05:04:01 UTC, H. S. Teoh wrote:
> Yes that's what I mean. If the format string is known at compile-time and known to involve only @safe code, then this would work. Something like this might work if CTFE is used to parse the format string piecemeal (i.e., translate something like writefln("%d %s",x,y) into write!int(x); write!string(" "); write!string(y)). The safe instances of write!T(...) will be marked @safe.

It doesn't matter if the argument is known at compile-time or not, because there's no way to know that without receiving the format string as a template parameter, in which case it must *always* be known at compile-time (runtime format string would not be supported), and then the syntax is no longer writefln("%d %s", x, y). Obviously, such a change is not acceptable.

> I suppose we could just use @trusted and call it a day.

No, that would be abusing @trusted. The function would no longer be safe, *because it contains possibly unsafe code*. @trusted is for safe functions that the compiler cannot prove safe.
Oct 02 2012
On Wednesday, October 03, 2012 07:35:23 Jakob Ovrum wrote:
>> I suppose we could just use @trusted and call it a day.
> No, that would be abusing @trusted. The function would no longer be safe, *because it contains possibly unsafe code*. @trusted is for safe functions that the compiler cannot prove safe.

Yeah. You basically _never_ just mark @trusted and call it a day. You only mark something @trusted if you've verified that _everything_ that that function does which is @system is done in a way that's ultimately @safe. In particular, marking much of anything which is templated as @trusted is almost always just plain wrong.

- Jonathan M Davis
Oct 02 2012
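For contrast with the misuse warned about above, here is a sketch of a function where @trusted looks defensible. The helper untilZero is hypothetical and not from Phobos. The assumption behind the annotation is that memchr never reads more than buf.length bytes, so no argument an @safe caller can pass leads to an out-of-bounds read, unlike an unbounded strlen on a raw pointer.

```d
import core.stdc.string : memchr;

// Hypothetical helper. The body uses @system operations (.ptr, a C call,
// pointer arithmetic), but all of them stay inside the bounds of `buf`,
// which is what the @trusted annotation promises to callers.
@trusted const(char)[] untilZero(const(char)[] buf)
{
    if (buf.length == 0)
        return buf; // avoid handing a possibly null pointer to memchr

    auto hit = cast(const(char)*) memchr(buf.ptr, 0, buf.length);
    return hit is null ? buf : buf[0 .. cast(size_t)(hit - buf.ptr)];
}

@safe void main()
{
    assert(untilZero("abc\0def") == "abc");
    assert(untilZero("no terminator") == "no terminator");
}
```

A zstr that takes a bare char* could not honestly be marked this way, because its correctness depends on a property of the argument that the function itself cannot check.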