www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - arrays: if(null == [ ])

reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
Hi! I have a small question:
Is the test for a null array equivalent to a test for zero-length array?
This is particularly interesting for strings.
For instance, I could return an empty string from a toString-like function
and the empty string would be printed, but If I returned a null string,
that would indicate, that there is no string representation and it would
cause some default string to be printed.
So, the question is, if a null array is any different from an empty array?

-- 
Bye,
Gor Gyolchanyan.
May 14 2012
next sibling parent reply simendsjo <simendsjo gmail.com> writes:
On Mon, 14 May 2012 12:08:17 +0200, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 Hi! I have a small question:
 Is the test for a null array equivalent to a test for zero-length array?
 This is particularly interesting for strings.
 For instance, I could return an empty string from a toString-like  
 function
 and the empty string would be printed, but If I returned a null string,
 that would indicate, that there is no string representation and it would
 cause some default string to be printed.
 So, the question is, if a null array is any different from an empty  
 array?
This passes. null and [] is a null string, but "" gives a non-null string.
Tested on dmd-head and 2.059.

    void main() {
        string s1 = null;
        assert(s1 is null);
        assert(s1.length == 0);
        assert(s1.ptr is null);
        assert(s1 == []);
        assert(s1 == "");

        string s2 = [];
        assert(s2 is null);
        assert(s2.length == 0);
        assert(s2.ptr is null);
        assert(s2 == []);
        assert(s2 == "");

        string s3 = "";
        assert(s3 !is null);
        assert(s3.length == 0);
        assert(s3.ptr !is null);
        assert(s3 == []);
        assert(s3 == "");
    }
May 14 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
So, null arrays and empty arrays are always the same, except for an empty
string, which is a valid non-null array of characters with length 0, right?

-- Bye, Gor Gyolchanyan.
May 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/05/2012 12:49, Gor Gyolchanyan a écrit :
 So, null arrays and empty arrays are always the same, except for an
 empty string, which is a valid non-null array of characters with length
 0, right?
If it is the current behavior, it deserves a WAT!
May 14 2012
next sibling parent reply simendsjo <simendsjo gmail.com> writes:
On Mon, 14 May 2012 13:51:40 +0200, deadalnix <deadalnix gmail.com> wrote:

 Le 14/05/2012 12:49, Gor Gyolchanyan a écrit :
 So, null arrays and empty arrays are always the same, except for an
 empty string, which is a valid non-null array of characters with length
 0, right?
If it is the current behavior, it deserves a WAT!
It is, according to my tests. It's quite a gotcha. So check for .length or == "" or == [] if you need "null or empty", and use "is null" for null/[].
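The two checks recommended above can be sketched as a pair of small helpers. This is only an illustration: the names isNullOrEmpty and isNullArray are made up for this sketch and are not part of Phobos.

```d
// Hypothetical helper: true for null, [] and "" alike (length check only).
bool isNullOrEmpty(T)(T[] arr)
{
    return arr.length == 0;
}

// Hypothetical helper: true only for null/[] (identity check).
bool isNullArray(T)(T[] arr)
{
    return arr is null;
}

void main()
{
    string a = null;
    string b = "";

    // Both count as "null or empty".
    assert(isNullOrEmpty(a));
    assert(isNullOrEmpty(b));

    // Only the null string passes the identity check.
    assert(isNullArray(a));
    assert(!isNullArray(b));
}
```

Which helper to call depends on whether the caller cares about the null/empty distinction at all; most code probably only needs the length check.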
May 14 2012
parent Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
I think any kind of null array should be different from an array of zero
length.

-- Bye, Gor Gyolchanyan.
May 14 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/14/2012 01:51 PM, deadalnix wrote:
 Le 14/05/2012 12:49, Gor Gyolchanyan a écrit :
 So, null arrays and empty arrays are always the same, except for an
 empty string, which is a valid non-null array of characters with length
 0, right?
If it is the current behavior, it deserves a WAT!
I agree, but it is explained easily. Built-in string literals are always zero-terminated, therefore an empty string literal must point into accessible memory. I'd like to have [] !is null as well, so that null can reliably be used as a sentinel value.
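A short sketch of that explanation, assuming the behavior reported earlier in the thread (an empty literal has a non-null .ptr) and relying on the zero terminator that follows every string literal:

```d
void main()
{
    string empty = "";
    assert(empty.length == 0);

    // The literal still has to live somewhere: its pointer refers to
    // the terminating zero byte that follows the (empty) contents.
    assert(empty.ptr !is null);
    assert(*empty.ptr == '\0');

    // A null string has no storage at all.
    string nothing = null;
    assert(nothing.ptr is null);
}
```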
May 14 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 14 May 2012 18:07:24 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/14/2012 01:51 PM, deadalnix wrote:
 Le 14/05/2012 12:49, Gor Gyolchanyan a écrit :
 So, null arrays and empty arrays are always the same, except for an
 empty string, which is a valid non-null array of characters with length
 0, right?
If it is the current behavior, it deserves a WAT!
I agree, but it is explained easily. Built-in string literals are always
 zero-terminated, therefore an empty string literal must point into
 accessible memory. I'd like to have [] !is null as well, so that null
 can reliably be used as a sentinel value.
This would mean either a) allocating memory for a 0 length array, or b) pointing it at non-null but non-heap memory.

a) is certainly out of the question.

b) is possible, but I still think we should discourage using null as a sentinel, it leads to confusing code.

Regardless, we should fix if(!arr) to mean if(!arr.length).

-Steve
May 16 2012
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Regardless, we should fix if(!arr) to mean if(!arr.length).
This seems a nice idea (and Python programmers will be thankful, because they are used to empty collections/strings to be false). Is this request in Bugzilla? Are people opposed to this little D breaking change? Bye, bearophile
May 16 2012
next sibling parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 16:23, bearophile wrote:
 Steven Schveighoffer:

 Regardless, we should fix if(!arr) to mean if(!arr.length).
This seems a nice idea (and Python programmers will be thankful, because they are used to empty collections/strings to be false). Is this request in Bugzilla? Are people opposed to this little D breaking change? Bye, bearophile
I would be very happy about this change, too. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent "Tobias Pankrath" <panke tzi.de> writes:
On Wednesday, 16 May 2012 at 14:26:49 UTC, Alex Rønne Petersen 
wrote:
 On 16-05-2012 16:23, bearophile wrote:
 Steven Schveighoffer:

 Regardless, we should fix if(!arr) to mean if(!arr.length).
This seems a nice idea (and Python programmers will be thankful, because they are used to empty collections/strings to be false). Is this request in Bugzilla? Are people opposed to this little D breaking change? Bye, bearophile
I would be very happy about this change, too.
I ran into this, too.
May 16 2012
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2012 04:23 PM, bearophile wrote:
  Is this request in Bugzilla?
http://d.puremagic.com/issues/show_bug.cgi?id=4733 http://d.puremagic.com/issues/show_bug.cgi?id=7539 The first report happens to be yours =).
May 16 2012
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, May 16, 2012 09:18:38 Steven Schveighoffer wrote:
 but I still think we should discourage using null as a
 sentinel, it leads to confusing code.
If null were actually properly differentiated from empty, then this wouldn't be a problem, but it's not. It _should_ be possible to treat null as a sentinel. The fact that it causes issues is a major flaw in the language IMHO. But given that flaw, it does very quickly become error-prone to use null as a sentinel. In general, I'd say that the only reasonable place to do so is when returning an array (and especially a string) from a function. The return value can then be immediately checked with is null before it has the chance to have something happen to it which could cause it to be empty but non-null. - Jonathan M Davis
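A minimal sketch of that pattern. The function describe and its return values are hypothetical, chosen only to show a null sentinel being checked right at the call site, and how a later slice yields an array that is empty but not null:

```d
// Hypothetical lookup: null means "no representation at all",
// "" means "the representation is the empty string".
string describe(int id)
{
    if (id < 0)
        return null;
    if (id == 0)
        return "";
    return "item";
}

void main()
{
    // Checked immediately after the call, the sentinel is reliable.
    assert(describe(-1) is null);
    assert(describe(0) !is null);

    // Once the value has been sliced, "empty" no longer implies "null".
    string s = describe(1);
    string rest = s[$ .. $];
    assert(rest.length == 0);
    assert(rest !is null);
    assert(rest == ""); // == compares contents, so it still equals ""
}
```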
May 16 2012
parent "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 17 May 2012 00:08:49 +0100, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, May 16, 2012 09:18:38 Steven Schveighoffer wrote:
 but I still think we should discourage using null as a
 sentinel, it leads to confusing code.
If null were actually properly differentiated from empty, then this wouldn't be a problem, but it's not. It _should_ be possible to treat null as a sentinel. The fact that it causes issues is a major flaw in the language IMHO. But given that flaw, it does very quickly become error-prone to use null as a sentinel. In general, I'd say that the only reasonable place to do so is when returning an array (and especially a string) from a function. The return value can then be immediately checked with is null before it has the chance to have something happen to it which could cause it to be empty but non-null.
I want to re-re-re-register my dismay in the situation also. For me it always comes back to.. I can do it with a pointer.. a pointer! Why not an array?!? R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
May 17 2012
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 16 May 2012 09:18:38 -0400
schrieb "Steven Schveighoffer" <schveiguy yahoo.com>:

 Regardless, we should fix if(!arr) to mean if(!arr.length).
 
 -Steve
I'm still using 2.057 (GDC). My mental model of D tells me: A reference type's pointer is implicitly converted to bool, when used inside an if-expression. These are equal statements for any reference type:

    if(reference_type is null)
    if(!(reference_type !is null))
    if(!reference_type)

As an example, I use these semantics and they feel correct to me. Look at this example where a solution set is expressed as a long[]:

    long[] empty_solution = [];
    assert(empty_solution); // the solution set is empty (a = a + 42)

    long[] no_solution = null;
    assert(!no_solution); // a solution is not computable (a = a)

The shortest form is also the most basic one: Do we have a solution set at all? Once I'm past the 'existence' check, I'd look at the length:

    if (solution.length == 1) ...

In other use cases you make no distinction between "is null" and "length == 0". For those it is ok to check "if (arr.length)", but I want to make you aware that both cases exist and I think the way it worked in 2.057 was consistent. Now with 2.059 I have to turn a solution set into a structure with a flag like 'solved'. The language got less expressive here.

If that's how it is going to stay then yes, "if(arr)" should really mean "if(arr.length)", because the only time it is not the same is when the language accidentally exposes the implementation detail that an empty string actually needs memory. :)

-- Marco
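One possible shape for the "structure with a flag" mentioned above is std.typecons.Nullable, which carries the "solved" flag alongside the array. This is only a sketch: the function solve and its parameter are hypothetical, and Nullable is a swapped-in technique, not something the post proposes.

```d
import std.typecons : Nullable, nullable;

// Hypothetical solver: an unset Nullable means "not computable",
// a set Nullable holding a zero-length array means "empty solution set".
Nullable!(long[]) solve(bool computable)
{
    if (!computable)
        return Nullable!(long[]).init;

    long[] none; // computable, but there are no solutions
    return nullable(none);
}

void main()
{
    auto noSolution = solve(false);
    auto emptySolution = solve(true);

    assert(noSolution.isNull);
    assert(!emptySolution.isNull);
    assert(emptySolution.get.length == 0);
}
```

The existence check and the length check stay separate, without depending on whether an empty array happens to have a null pointer.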
May 23 2012
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Mon, 14 May 2012 12:08:17 +0200, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 Hi! I have a small question:
 Is the test for a null array equivalent to a test for zero-length array?
 This is particularly interesting for strings.
 For instance, I could return an empty string from a toString-like  
 function
 and the empty string would be printed, but If I returned a null string,
 that would indicate, that there is no string representation and it would
 cause some default string to be printed.
 So, the question is, if a null array is any different from an empty  
 array?
The two are different, yes. [] == null actually compares length. If you want to know if the length is zero, use arr.length. If you want to know if it points to null, check if arr.ptr is null.
May 14 2012
prev sibling next sibling parent reply FeepingCreature <default_357-line yahoo.de> writes:
On 05/14/12 12:08, Gor Gyolchanyan wrote:
 Hi! I have a small question:
 Is the test for a null array equivalent to a test for zero-length array?
 This is particularly interesting for strings.
 For instance, I could return an empty string from a toString-like function and
the empty string would be printed, but If I returned a null string, that would
indicate, that there is no string representation and it would cause some
default string to be printed.
 So, the question is, if a null array is any different from an empty array?
 
 -- 
 Bye,
 Gor Gyolchanyan.
The more interesting question, imo, is how the behavior of 'if (string)' (ie. bool conversion) should be defined. To my knowledge, it checks for ptr, which can lead to some confusion since "" == null, 'if ("")' is true but 'if (null)' is false! I think this behavior is still best though, for the simple reason that people think (or ought to think) of arrays as "pointers with length", so it makes sense that 'if (string)' tests the pointer. It's not intuitively obvious if you consider strings as sequences of characters, but it's obvious if you consider them as D arrays.
May 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/05/2012 12:42, FeepingCreature a écrit :
 On 05/14/12 12:08, Gor Gyolchanyan wrote:
 Hi! I have a small question:
 Is the test for a null array equivalent to a test for zero-length array?
 This is particularly interesting for strings.
 For instance, I could return an empty string from a toString-like function and
the empty string would be printed, but If I returned a null string, that would
indicate, that there is no string representation and it would cause some
default string to be printed.
 So, the question is, if a null array is any different from an empty array?

 --
 Bye,
 Gor Gyolchanyan.
The more interesting question, imo, is how the behavior of 'if (string)' (ie. bool conversion) should be defined. To my knowledge, it checks for ptr, which can lead to some confusion since "" == null, 'if ("")' is true but 'if (null)' is false! I think this behavior is still best though, for the simple reason that people think (or ought to think) of arrays as "pointers with length", so it makes sense that 'if (string)' tests the pointer. It's not intuitively obvious if you consider strings as sequences of characters, but it's obvious if you consider them as D arrays.
A good solution would be to set the pointer to 0 when the length is set to 0.
May 14 2012
parent reply travert phare.normalesup.org (Christophe) writes:
deadalnix , dans le message (digitalmars.D:167258), a écrit :
 A good solution would be to set the pointer to 0 when the length is set 
 to 0.
String literals are zero-terminated. "" cannot point to 0x0, unless we drop this rule. Maybe we should...
May 14 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
I think the zero-terminated literal shtick is pointless. Literals are
rarely passed to C functions, so we gotta use the std.utf.toUTFz anyway.

-- Bye, Gor Gyolchanyan.
May 14 2012
parent reply Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 14-05-2012 15:21, Gor Gyolchanyan wrote:
 I thing the zero-terminated literal shtick is pointless. Literals are
 rarely passed to C functions, so we gotta use the std.utf.toUTFz  anyway.

 On Mon, May 14, 2012 at 5:03 PM, Christophe
 <travert phare.normalesup.org <mailto:travert phare.normalesup.org>> wrote:

     deadalnix , dans le message (digitalmars.D:167258), a écrit :
      > A good solution would be to set the pointer to 0 when the length
     is set
      > to 0.

     String literal are zero-terminated. "" cannot point to 0x0,
     unless we drop this rule. Maybe we should...




 --
 Bye,
 Gor Gyolchanyan.
This is very false. I invite you to read almost any module in druntime. You'll find that it makes heavy use of printf debugging. That being said, dropping the null-termination rule when passing strings to non-const(char)* parameters/variables/etc would be sane enough (I think). -- - Alex
May 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/05/2012 19:38, Alex Rønne Petersen a écrit :
 On 14-05-2012 15:21, Gor Gyolchanyan wrote:
 I thing the zero-terminated literal shtick is pointless. Literals are
 rarely passed to C functions, so we gotta use the std.utf.toUTFz anyway.

 On Mon, May 14, 2012 at 5:03 PM, Christophe
 <travert phare.normalesup.org <mailto:travert phare.normalesup.org>>
 wrote:

 deadalnix , dans le message (digitalmars.D:167258), a écrit :
 A good solution would be to set the pointer to 0 when the length
is set
 to 0.
String literal are zero-terminated. "" cannot point to 0x0, unless we drop this rule. Maybe we should... -- Bye, Gor Gyolchanyan.
This is very false. I invite you to read almost any module in druntime. You'll find that it makes heavy use of printf debugging. That being said, dropping the null-termination rule when passing strings to non-const(char)* parameters/variables/etc would be sane enough (I think).
This looks like bad practice to me. C strings and D strings are different beasts, and we have toStringz. It is kind of dumb to create a WAT in the language because the druntime devs made mistakes. It has to be fixed.
May 15 2012
next sibling parent reply travert phare.normalesup.org (Christophe) writes:
deadalnix , dans le message (digitalmars.D:167404), a écrit :
 This looks like bad practice to me. C strings and D strings are
 different beasts, and we have toStringz.
C strings and D strings are different, but it's not a bad idea to have string *literals* that work for both C and D strings; otherwise using printf will lead to a bug each time the programmer forgets the trailing \0.
 It is kind of dumb to create a WAT in the language because the druntime devs
 made mistakes. It has to be fixed.
You can't rely on an empty string being null, since you must be able to reserve space at the end of the array, and/or the string could be the result of popping elements off a full string.
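A small sketch of the difference between a literal and an arbitrary D string when handed to C, using strlen from core.stdc.string and toStringz from std.string. The variable names are illustrative only.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string full = "hello world";
    string part = full[0 .. 5]; // a slice: same memory, shorter length

    // The literal is followed by a zero byte, so C sees the right length.
    assert(strlen(full.ptr) == full.length);

    // The slice shares that memory, so C reads on to the literal's
    // terminator and sees all 11 characters instead of 5.
    assert(strlen(part.ptr) == full.length);

    // toStringz yields a zero-terminated version of the slice itself.
    assert(strlen(toStringz(part)) == part.length);
}
```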
May 15 2012
next sibling parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Tue, May 15, 2012 at 7:51 PM, Christophe <travert phare.normalesup.org> wrote:

 using printf will lead to a bug each time the programmer forgets the
 trailing \0.
First of all, printf shouldn't be used! There's writef and it's superior
to printf in every way!
Second of all, if the zero-termination of literals is to be removed,
the literals will no longer be accepted as a pointer to a character.
The appropriate type mismatch error will force the user to use toUTF8z
to get the zero-terminated utf-8 version of the original string.
In case it's a literal, one could use the compile-time version of
toUTF8z to avoid run-time overhead.
This all doesn't sound like a bad idea to me. I don't see any security
or performance flaws in this scheme.
--
Bye,
Gor Gyolchanyan.
May 15 2012
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 15.05.2012 20:19, Gor Gyolchanyan wrote:
 On Tue, May 15, 2012 at 7:51 PM, Christophe
 <travert phare.normalesup.org <mailto:travert phare.normalesup.org>> wrote:

     using printf will lead to a bug each time the programmer forget the
     trailing
     \0.


 First of all, printf shouldn't be used! There's writef and it's superior
 to printf in every way!
 Second of all, if the zero-termination of literals is to be removed,
 the literals will no longer be accepted as a pointer to a character.
 The appropriate type mismatch error will force the user to use toUTF8z
 to get the zero-terminated utf-8 version of the original string.
 In case it's a literal, one could use the compile-time version of
 toUTF8z to avoid run-time overhead.
 This all doesn't sound like a bad idea to me. I don't see any security
 or performance flaws in this scheme.
Moreover compiler can do some extra string pooling iff zero termination goes away. Like: "Hello World!" & "Hello" sharing the same piece of ROM. -- Dmitry Olshansky
May 15 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 15 May 2012 at 16:22:22 UTC, Dmitry Olshansky wrote:
 Moreover compiler can do some extra string pooling iff zero 
 termination goes away. Like:
 "Hello World!" & "Hello" sharing the same piece of ROM.
Has anyone actually done some research on how much space this could actually save in practice? David
May 15 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Tue, May 15, 2012 at 9:23 PM, David Nadlinger <see klickverbot.at> wrote:

 On Tuesday, 15 May 2012 at 16:22:22 UTC, Dmitry Olshansky wrote:

 Moreover compiler can do some extra string pooling iff zero termination
 goes away. Like:
 "Hello World!" & "Hello" sharing the same piece of ROM.
Has anyone actually done some research on how much space this could actually save in practice? David
It can't be accurately measured, because the number of string literals available at a single compiler pass is vastly varying. -- Bye, Gor Gyolchanyan.
May 15 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 15 May 2012 at 17:30:53 UTC, Gor Gyolchanyan wrote:
 It can't be accurately measured, because the number of string 
 literals
 available at a single compiler pass is vastly varying.
Of course the actual amount varies from application to application, but it should be possible to obtain some ballpark figures for usual and for string-heavy applications… David
May 15 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 15/05/2012 18:19, Gor Gyolchanyan a écrit :
 On Tue, May 15, 2012 at 7:51 PM, Christophe
 <travert phare.normalesup.org <mailto:travert phare.normalesup.org>> wrote:

     using printf will lead to a bug each time the programmer forget the
     trailing
     \0.


 First of all, printf shouldn't be used! There's writef and it's superior
 to printf in every way!
 Second of all, if the zero-termination of literals is to be removed,
 the literals will no longer be accepted as a pointer to a character.
 The appropriate type mismatch error will force the user to use toUTF8z
 to get the zero-terminated utf-8 version of the original string.
 In case it's a literal, one could use the compile-time version of
 toUTF8z to avoid run-time overhead.
 This all doesn't sound like a bad idea to me. I don't see any security
 or performance flaws in this scheme.
 --
 Bye,
 Gor Gyolchanyan.
May God hear you!
May 15 2012
parent reply Andrew Wiley <wiley.andrew.j gmail.com> writes:
On Tue, May 15, 2012 at 11:46 AM, deadalnix <deadalnix gmail.com> wrote:

May God hear you!
Unfortunately, using writef/writefln would make DRuntime depend on Phobos, which is unacceptable.
May 15 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 15/05/2012 21:57, Andrew Wiley a écrit :
 Unfortunately, using writef/writefln would make DRuntime depend on
 Phobos, which is unacceptable.
druntime isn't supposed to printf stuff.
May 15 2012
parent Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 00:20, deadalnix wrote:
druntime isn't supposed to printf stuff.
It's called debugging. ;) -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
prev sibling next sibling parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 15-05-2012 18:19, Gor Gyolchanyan wrote:
 On Tue, May 15, 2012 at 7:51 PM, Christophe
 <travert phare.normalesup.org <mailto:travert phare.normalesup.org>> wrote:

     using printf will lead to a bug each time the programmer forget the
     trailing
     \0.


 First of all, printf shouldn't be used! There's writef and it's superior
 to printf in any way!
Nope. write* perform GC allocation.
 Second of all, if the zero-termination of literals is to be removed,
 the literals will no longer be accepted as a pointer to a character.
 The appropriate type mismatch error will force the user to use toUTF8z
 to get the zero-terminated utf-8 version of the original string.
 In case it's a literal, one could use the compile-time version of
 toUTF8z to avoid run-time overhead.
 This all doesn't sound like a bad idea to me. I don't see any security
 or performance flaws in this scheme.
 --
 Bye,
 Gor Gyolchanyan.
You're assuming everyone uses Phobos. This is not the case. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 15 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
 On Tue, May 15, 2012 at 9:16 PM, Alex Rønne Petersen <alex lycus.org> wrote:
 Nope. write* perform GC allocation.
1. Give me the top 3 use cases where GC allocation is intolerable when writing to an output stream.
2. writef and friends could get cousins like nogcwritef and nogcwritefln. (see comments below)
3. GC can be turned off and gc-allocated memory can be GC.free'd.
4. printf could get wrapped to take d-strings by malloc-ing new buffers for the c-strings if necessary.
 On Tue, May 15, 2012 at 9:16 PM, Alex Rønne Petersen <alex lycus.org> wrote:
 You're assuming everyone uses Phobos. This is not the case.
I'm assuming everyone is sane, because Phobos is called "the standard library" for a damned good reason. For those who don't - they're welcome to use whatever they want and convert d-strings to c-strings any way they choose if necessary.
-- Bye, Gor Gyolchanyan.
May 15 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 15-05-2012 19:29, Gor Gyolchanyan wrote:
  >> On Tue, May 15, 2012 at 9:16 PM, Alex Rønne Petersen <alex lycus.org
 <mailto:alex lycus.org>> wrote:
  >> Nope. write* perform GC allocation.

 1. Give me the top 3 use cases, where GC allocation is intolerable when
 writing to an output stream.
Who the hell am I or you to dictate use cases? But a few off the top of my head: 1) Building without a GC *at all* (and yes, it is possible). 2) When writing high performance tools for doing UNIX-style program output piping.
 2. writef and friends could get cousins like nogcwritef and
 nogcwritefln. (see comments below)
Patches welcome.
 3. GC can be turned off and gc-allocated memory can be GC.free'd.
Yes, if you alter the implementation. You can't free it from the call site. Again, patches welcome.
 4. printf could get wrapped to take d-strings by malloc-ing new buffers
 for the c-strings if necessary.
And just about every C function taking strings, ever. This is a bad strategy, and makes D's C interoperability worse.
  >> On Tue, May 15, 2012 at 9:16 PM, Alex Rønne Petersen <alex lycus.org
 <mailto:alex lycus.org>> wrote:
  >> You're assuming everyone uses Phobos. This is not the case.

 I'm assuming everyone is sane, because Phobos is called "the standard
 library" for a damned good reason. For those who don't - they're welcome
 to use whatever they want and convert d-strings to c-strings any way
 they choose if necessary.
Yes, let's cripple a systems language for the comfort of the non-systems programmers. Excellent idea.
 --
 Bye,
 Gor Gyolchanyan.
PS: Removing null-terminated string literals is not going to fix the array slice corner cases by itself. I think this discussion is fairly pointless. If you want to fix slices, go all the way with it, not just half the way. Besides, this is probably not going to change anyway. We're focusing on stabilizing the language, not changing it. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 15 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 15/05/2012 20:34, Alex Rønne Petersen a écrit :
 Besides, this is probably not going to change anyway. We're focusing on
 stabilizing the language, not changing it.
Auto-casting arrays to pointers has always been a design mistake. It is a silent fallback to the unsafe world, which is what we want to avoid. It has no benefit, because using .ptr isn't really complex and makes the transition obvious. This has been raised many times in the past as an issue, and it fits nicely here. Having \0-terminated strings in D where they have no use is quite dumb.
May 15 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2012 12:29 AM, deadalnix wrote:
 Le 15/05/2012 20:34, Alex Rønne Petersen a écrit :
 Besides, this is probably not going to change anyway. We're focusing on
 stabilizing the language, not changing it.
Auto-casting arrays to pointers has always been a design mistake. It is a silent fallback to the unsafe world, which is what we want to avoid.
Getting a pointer to the beginning of a zero-terminated string literal is perfectly safe.
 This has no benefit because using .ptr isn't really complex and make the
 transition obvious.

 This has been raised many time in the past as being an issue, and it fit
 nicely here.
This is a compile time error: int[] arr; int* p=arr; What exactly are you asking for?
 Having \0 terminated string in D were it has no usage is quite dumb.
What you don't seem to get is that it actually has usage.
May 16 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 16/05/2012 12:10, Timon Gehr a écrit :
 On 05/16/2012 12:29 AM, deadalnix wrote:
 Le 15/05/2012 20:34, Alex Rønne Petersen a écrit :
 Besides, this is probably not going to change anyway. We're focusing on
 stabilizing the language, not changing it.
Auto-casting arrays to pointers has always been a design mistake. It is a silent fallback to the unsafe world, which is what we want to avoid.
Getting a pointer to the beginning of a zero-terminated string literal is perfectly safe.
 This has no benefit because using .ptr isn't really complex and make the
 transition obvious.

 This has been raised many time in the past as being an issue, and it fit
 nicely here.
This is a compile time error: int[] arr; int* p=arr; What exactly are you asking for?
void foo(const(char)*); foo("bar"); isn't a compile-time error.
 Having \0 terminated string in D were it has no usage is quite dumb.
What you don't seem to get is that it actually has usage.
I understand that. I want to propose something more subtle. Arrays in D are already typed according to what they are assigned to (int[] foo = [1, 2] and immutable(int)[] foo = [1, 2] are both possible). Isn't it possible to \0-terminate strings when they are used as char* and not when they are used as arrays?
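For readers following along, the distinction argued over in the last few posts (literals convert to a pointer implicitly, array variables do not) can be checked with a small sketch. The function foo mirrors the declaration quoted above; its empty body and the variable names are only assumptions for illustration.

```d
// Stand-in for a C function taking a C string (empty body, illustration only).
void foo(const(char)* p) {}

void main()
{
    int[] arr;
    string s = "bar";

    // A dynamic array variable does not convert to a pointer implicitly.
    static assert(!__traits(compiles, { int* p = arr; }));
    static assert(!__traits(compiles, foo(s)));

    // A string literal does convert.
    static assert(__traits(compiles, foo("bar")));

    // For a variable, the pointer has to be requested explicitly.
    foo(s.ptr);
}
```

So the implicit conversion under discussion applies to string literals only; anything held in a variable needs an explicit .ptr (or toStringz).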
May 16 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 22:00, deadalnix wrote:
 Le 16/05/2012 12:10, Timon Gehr a écrit :
 On 05/16/2012 12:29 AM, deadalnix wrote:
 Le 15/05/2012 20:34, Alex Rønne Petersen a écrit :
 Besides, this is probably not going to change anyway. We're focusing on
 stabilizing the language, not changing it.
Auto-casting arrays to pointers has always been a design mistake. It is a silent fallback to the unsafe world, which is what we want to avoid.
Getting a pointer to the beginning of a zero-terminated string literal is perfectly safe.
 This has no benefit because using .ptr isn't really complex and make the
 transition obvious.

 This has been raised many time in the past as being an issue, and it fit
 nicely here.
This is a compile time error: int[] arr; int* p=arr; What exactly are you asking for?
void foo(const(char)*); foo("bar"); isn't a compile-time error.
And shouldn't be. Working with C APIs or just working on druntime would be a nightmare.
 Having \0 terminated string in D were it has no usage is quite dumb.
What you don't seem to get is that it actually has usage.
I understand that. I want to propose something more subtle. Arrays in D are already typed according to what they are assigned to (int[] foo = [1, 2] and immutable(int)[] foo = [1, 2] are both possible). Isn't it possible to \0-terminate strings when they are used as char* and not when they are used as arrays?
Theoretically, yes, practically, not really. void myLog(string msg) { printf(msg); } -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 16:19:36 -0400, Alex Rønne Petersen <alex lycus.org> wrote:

 Theoretically, yes, practically, not really.

 void myLog(string msg)
 {
      printf(msg);
 }
Wait, this should be an error. You need toStringz there. -Steve
May 16 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 22:42, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 16:19:36 -0400, Alex Rønne Petersen <alex lycus.org>
 wrote:

 Theoretically, yes, practically, not really.

 void myLog(string msg)
 {
 printf(msg);
 }
Wait, this should be an error. You need toStringz there. -Steve
Sorry, I meant: void myLog(string msg) { printf(msg.ptr); } (Which works as expected because string literals are null-terminated. This is also how things work when you pass a string literal to a const(char)* value; it just does "literal".ptr.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 17:06:41 -0400, Alex Rønne Petersen <alex lycus.org> wrote:

 On 16-05-2012 22:42, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 16:19:36 -0400, Alex Rønne Petersen <alex lycus.org>
 wrote:

 Theoretically, yes, practically, not really.

 void myLog(string msg)
 {
 printf(msg);
 }
Wait, this should be an error. You need toStringz there. -Steve
Sorry, I meant: void myLog(string msg) { printf(msg.ptr); } (Which works as expected because string literals are null-terminated.
 This is also how things work when you pass a string literal to a
 const(char)* value; it just does "literal".ptr.)
No, it doesn't:

myLog("abc"[0..1]); // prints abc instead of the requested a

string is not necessarily a literal. A literal has a special polysemous type, and special properties. An ordinary string does not.

-Steve
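A rough way to see the slicing problem without relying on printf output is to measure what C would read with strlen. This is only a sketch; the variable names are made up, and the slice is taken from a runtime variable so the compiler cannot fold it into a fresh literal.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string literal = "abc";
    string slice = literal[0 .. 1]; // "a", but it still points into "abc\0"

    assert(slice.length == 1);

    // C code ignores the slice length and reads up to the literal's '\0'.
    assert(strlen(slice.ptr) == 3);

    // toStringz hands back a pointer to a properly terminated "a".
    assert(strlen(slice.toStringz) == 1);
}
```

The slice itself is a perfectly valid D string; the trouble only starts once its .ptr is handed to code that looks for a terminator.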
May 16 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 23:09, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 17:06:41 -0400, Alex Rønne Petersen <alex lycus.org>
 wrote:

 On 16-05-2012 22:42, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 16:19:36 -0400, Alex Rønne Petersen <alex lycus.org>
 wrote:

 Theoretically, yes, practically, not really.

 void myLog(string msg)
 {
 printf(msg);
 }
Wait, this should be an error. You need toStringz there. -Steve
Sorry, I meant: void myLog(string msg) { printf(msg.ptr); } (Which works as expected because string literals are null-terminated. This is also how things work when you pass a string literal to a const(char)* value; it just does "literal".ptr.)
No, it doesn't:

myLog("abc"[0..1]); // prints abc instead of the requested a

string is not necessarily a literal. A literal has a special polysemous type, and special properties. An ordinary string does not. -Steve
I was referring to: myLog("abc"); When you start bringing slicing into the mix, you're bound to make C interop harder and more error-prone because of null-termination. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 11:22:48PM +0200, Alex Rønne Petersen wrote:
 On 16-05-2012 23:09, Steven Schveighoffer wrote:
On Wed, 16 May 2012 17:06:41 -0400, Alex Rønne Petersen <alex lycus.org>
wrote:
[...]
void myLog(string msg)
{
printf(msg.ptr);
}

(Which works as expected because string literals are null-terminated.
This is also how things work when you pass a string literal to a
const(char)* value; it just does "literal".ptr.)
No, it doesn't:

myLog("abc"[0..1]); // prints abc instead of the requested a

string is not necessarily a literal. A literal has a special polysemous type, and special properties. An ordinary string does not. -Steve
I was referring to: myLog("abc"); When you start bringing slicing into the mix, you're bound to make C interop harder and more error-prone because of null-termination.
[...] I think his point is that myLog is poorly written because it declares itself to have a string parameter, yet does not function properly when called with a string that isn't NULL-terminated (e.g. a string slice). T -- Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
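One possible way to write a myLog that behaves for any string, slices included, is to pass the length to printf explicitly via "%.*s", so the C side never goes looking for a '\0'. The following is a sketch under that assumption, not what anyone in the thread proposed verbatim; render is a hypothetical helper added only so the result can be inspected.

```d
import core.stdc.stdio : printf, snprintf;

// Hypothetical rewrite of myLog: the precision argument carries the length.
void myLog(string msg)
{
    printf("%.*s\n", cast(int) msg.length, msg.ptr);
}

// Same idea, but formatting into a caller-supplied buffer.
const(char)[] render(char[] buf, string msg)
{
    if (buf.length == 0)
        return null;
    int n = snprintf(buf.ptr, buf.length, "%.*s", cast(int) msg.length, msg.ptr);
    if (n < 0)
        return null;
    // snprintf reports the untruncated length; clamp to what was stored.
    if (n >= buf.length)
        n = cast(int) buf.length - 1;
    return buf[0 .. n];
}

void main()
{
    char[16] buf;
    string s = "abc";
    assert(render(buf[], s[0 .. 1]) == "a");
    myLog(s[0 .. 1]);
}
```

Compared with toStringz this avoids the allocation, at the cost of only working for C functions that accept a length.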
May 16 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 16/05/2012 22:19, Alex Rønne Petersen a écrit :
 On 16-05-2012 22:00, deadalnix wrote:
 Le 16/05/2012 12:10, Timon Gehr a écrit :
 On 05/16/2012 12:29 AM, deadalnix wrote:
 Le 15/05/2012 20:34, Alex Rønne Petersen a écrit :
 Besides, this is probably not going to change anyway. We're
 focusing on
 stabilizing the language, not changing it.
Auto-casting arrays to pointers has always been a design mistake. It is a silent fallback to the unsafe world, which is what we want to avoid.
Getting a pointer to the beginning of a zero-terminated string literal is perfectly safe.
 This has no benefit because using .ptr isn't really complex and make
 the
 transition obvious.

 This has been raised many time in the past as being an issue, and it
 fit
 nicely here.
This is a compile time error: int[] arr; int* p=arr; What exactly are you asking for?
void foo(const(char)*); foo("bar"); isn't a compile-time error.
And shouldn't be. Working with C APIs or just working on druntime would be a nightmare.
 Having \0 terminated string in D were it has no usage is quite dumb.
What you don't seem to get is that it actually has usage.
I understand that. I want to propose something more subtle. Arrays in D are already typed according to what they are assigned to (int[] foo = [1, 2] and immutable(int)[] foo = [1, 2] are both possible). Isn't it possible to \0-terminate strings when they are used as char* and not when they are used as arrays?
Theoretically, yes, practically, not really. void myLog(string msg) { printf(msg); }
This is exactly why arrays shouldn't fall back to pointers silently. Your code here is flawed and unsafe. You NEED a toStringz here.
May 16 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/16/2012 11:08 PM, deadalnix wrote:
 Le 16/05/2012 22:19, Alex Rønne Petersen a écrit :
  ...
 Theoretically, yes, practically, not really.

 void myLog(string msg)
 {
     printf(msg);
 }
This is exactly why arrays shouldn't fall back to pointers silently. Your code here is flawed and unsafe. You NEED a toStringz here.
It is not unsafe, it is invalid.
May 16 2012
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2012 06:19 PM, Gor Gyolchanyan wrote:
 On Tue, May 15, 2012 at 7:51 PM, Christophe
 <travert phare.normalesup.org> wrote:

     using printf will lead to a bug each time the programmer forget the
     trailing
     \0.


 First of all, printf shouldn't be used!
First of all, 'is' shouldn't be used to compare built-in arrays!
 There's writef and it's superior to printf in any way!
No it is not! printf and scanf are so much faster than writef/readf that it is relevant! The poor performance of writef/readf makes it embarrassing for a university to use D as a teaching language!
 Second of all, if the zero-termination of literals are to be removed,
 the literals will no longer be accepted as a pointer to a character.
 The appropriate type mismatch error will force the user to use toUTF8z
 to get the zero-terminated UTF-8 version of the original string.
 In case it's a literal, one could use the compile-time version of
 toUTF8z to avoid run-time overhead.
 This all doesn't sound like a bad idea to me. I don't see any security
 or performance flaws in this scheme.
There are none in the current scheme.
May 15 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 15/05/2012 17:51, Christophe a écrit :
 deadalnix , dans le message (digitalmars.D:167404), a écrit :
 This looks to me like a bad practice. C string and D string are
 different beasts, and we have toStringz .
C string and D string are different, but it's not a bad idea to have string *literals* that works for both C and D strings, otherwise using printf will lead to a bug each time the programmer forget the trailing \0.
 It is kind of dumb to create a WAT is the language because druntime dev
 did mistakes. It have to be fixed.
You can't rely on an empty string to be null since you must be able to reserve place at the end of the array, and or the string could be the result of poping a full string.
This is why I said to set the pointer to null when the string length is SET to 0, not whenever it is 0. So it would be nulled when I create an empty slice, or do arr.length = 0.
May 15 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2012 06:45 PM, deadalnix wrote:
 Le 15/05/2012 17:51, Christophe a écrit :
 deadalnix , dans le message (digitalmars.D:167404), a écrit :
 This looks to me like a bad practice. C string and D string are
 different beasts, and we have toStringz .
C string and D string are different, but it's not a bad idea to have string *literals* that works for both C and D strings, otherwise using printf will lead to a bug each time the programmer forget the trailing \0.
 It is kind of dumb to create a WAT is the language because druntime dev
 did mistakes. It have to be fixed.
You can't rely on an empty string to be null since you must be able to reserve place at the end of the array, and or the string could be the result of poping a full string.
This is why I said to set the pointer to null when the string length is SET to 0, not whenever it is 0. So it would be nulled when I create an empty slice, or do arr.length = 0.
Might as well start Microsoft Flight Simulator when the length is set to 9797. If you want to clear the contents of the array, use arr=null or arr=[].
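To make the difference between an empty slice and a null array concrete, here is a minimal sketch (variable names are illustrative). It only checks slicing and assignment of null; what arr.length = 0 does to the pointer is deliberately left out, since that depends on the runtime.

```d
void main()
{
    int[] arr = [1, 2, 3];

    // An empty slice still points into the original storage, so it is not null.
    int[] emptySlice = arr[0 .. 0];
    assert(emptySlice.length == 0);
    assert(emptySlice.ptr is arr.ptr);
    assert(emptySlice !is null);

    // Yet it compares equal to null, because == only looks at the contents.
    assert(emptySlice == null);

    // Assigning null drops the reference to the storage entirely.
    arr = null;
    assert(arr is null);
    assert(arr.length == 0);
}
```

This is why comparing built-in arrays with 'is' answers a different question than comparing them with ==.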
May 15 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 15/05/2012 17:51, Christophe a écrit :
 deadalnix , dans le message (digitalmars.D:167404), a écrit :
 This looks to me like a bad practice. C string and D string are
 different beasts, and we have toStringz .
C string and D string are different, but it's not a bad idea to have string *literals* that works for both C and D strings, otherwise using printf will lead to a bug each time the programmer forget the trailing \0.
Due to slicing, it is already unsafe to pass a D string to C code. The main problem is arrays casting silently to pointers, which makes the error easy to make. Fixing the problem for literals isn't going to solve it at all. The real solution is toStringz.
May 15 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 May 2012 18:31:26 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 15/05/2012 17:51, Christophe a écrit :
 deadalnix , dans le message (digitalmars.D:167404), a écrit :
 This looks to me like a bad practice. C string and D string are
 different beasts, and we have toStringz .
C string and D string are different, but it's not a bad idea to have string *literals* that works for both C and D strings, otherwise using
 printf will lead to a bug each time the programmer forget the trailing
 \0.
Due to slicing, it is already unsafe to pass a D string to C code. The
 main problem is array casting silently to pointers, making the error
 easy to do.
How so? strings are immutable, and literals are *truly* immutable.
 Fixing the problem for literal isn't going to solve it at all.

 The real solution is toStringz
toStringz can allocate a new block in order to ensure 0 gets added. This is ludicrous! You are trying to tell me that any time I want to call a C function with a string literal, I have to first heap-allocate it, even though I *know* it's safe. I don't see a "problem" anywhere. The current system is perfect for what it needs to do. -Steve
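As a sketch of the literal case described here: a string literal converts to const(char)* directly, and an unsliced literal held in a string variable has its '\0' just past its length, so no copy is needed to hand it to C. The names below are only for illustration.

```d
import core.stdc.string : strlen;

void main()
{
    // Implicit conversion, allowed because the right-hand side is a literal.
    const(char)* p = "hello";
    assert(strlen(p) == 5);
    assert(p[5] == '\0');

    // The terminator is not counted in the D length.
    string s = "hello";
    assert(s.length == 5);

    // It sits right after the last character of an unsliced literal.
    assert(s.ptr[s.length] == '\0');
}
```

Note that the last assertion is only guaranteed for a whole literal; it says nothing about slices or strings built at run time.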
May 16 2012
next sibling parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Wed, May 16, 2012 at 5:25 PM, Steven Schveighoffer
<schveiguy yahoo.com>wrote:

 On Tue, 15 May 2012 18:31:26 -0400, deadalnix <deadalnix gmail.com> wrote:
  Le 15/05/2012 17:51, Christophe a écrit :
 deadalnix , dans le message (digitalmars.D:167404), a écrit :

 This looks to me like a bad practice. C string and D string are
 different beasts, and we have toStringz .
C string and D string are different, but it's not a bad idea to have string *literals* that works for both C and D strings, otherwise using printf will lead to a bug each time the programmer forget the trailing \0.
Due to slicing, it is already unsafe to pass a D string to C code. The main problem is array casting silently to pointers, making the error easy
 to do.
How so? strings are immutable, and literals are *truly* immutable. Fixing the problem for literal isn't going to solve it at all.
 The real solution is toStringz
toStringz can allocate a new block in order to ensure 0 gets added. This is ludicrous! You are trying to tell me that any time I want to call a C function with a
 string literal, I have to first heap-allocate it, even though I *know* it's
 safe.

 I don't see a "problem" anywhere.  The current system is perfect for what
 it needs to do.

 -Steve
Aside from the string problem the very existence of this debate exposes a fundamental flaw in the entire software engineering industry: heavy usage of ancient crap. If some library is so damned hard to refresh, then something's terribly wrong with it. It's about damned time ancient libraries are thrown away. -- Bye, Gor Gyolchanyan.
May 16 2012
next sibling parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 16:04, Gor Gyolchanyan wrote:
 Aside from the string problem the very existence of this debate exposes
 a fundamental flaw in the entire software engineering industry: heavy
 usage of ancient crap.
 If some library is so damned hard to refresh, then something's terribly
 wrong with it. It's about damned time ancient libraries are thrown away.

 --
 Bye,
 Gor Gyolchanyan.
I... don't think that's a very pragmatic view. Yes, software sucks. Deal with it, etc. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Wed, May 16, 2012 at 6:10 PM, Alex Rønne Petersen <alex lycus.org> wrote:

I... don't think that's a very pragmatic view. Yes, software sucks. Deal with it, etc. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Deal with it? That's the attitude that made it this way in the first place. If you like having software this way till the end of times - be my guest. I for one will not tolerate this unacceptably obsolete software. If you want it to stop being this bad - you're welcome to join me in the effort to put an end to this. It seems impossible only because nobody actually tried doing anything and all everybody does is complain about ancient stuff still requiring compatibility. With some effort that can be changed. Ancient libraries still require compatibility not because it's a rule, but because there are people who use them. They use them because there are no alternatives. If some people deliberately refuse to embrace the progress - it's their damned problem. -- Bye, Gor Gyolchanyan.
May 16 2012
parent Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 16:25, Gor Gyolchanyan wrote:
 Deal with it? That's the attitude that made it this way in the first
 place. If you like having software this way till the end of times - be
 my guest. I for one will not tolerate this unacceptably obsolete
 software. If you want it to stop being this bad - you're welcome to join
 me in the effort to put an end to this. It seems impossible only because
 nobody actually tried doing anything and all everybody does is complain
 about ancient stuff still requiring compatibility. With some effort that
 can be changed. Ancient libraries still require compatibility not
 because it's a rule, but because there are people who use them. They use
 them because there are no alternatives. If some people deliberately
 refuse to embrace the progress - it's their damned problem.

 --
 Bye,
 Gor Gyolchanyan.
C support and interoperability has always been a goal of D; and I don't see that changing. That's not saying that reimplementing these libraries in D is a bad idea - in fact, it would make everyone's lives easier. So by all means, do that. But I'm using some libraries such as libgc (the Boehm-Demers-Weiser GC) and libffi (foreign function interface for C) that would take eons to port, audit, test, ... and I have a project that depends on them that I need to work on. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 10:04:50 -0400, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 On Wed, May 16, 2012 at 5:25 PM, Steven Schveighoffer
 <schveiguy yahoo.com>wrote:
 I don't see a "problem" anywhere.  The current system is perfect for  
 what
 it needs to do.
Aside from the string problem the very existence of this debate exposes a fundamental flaw in the entire software engineering industry: heavy usage of ancient crap. If some library is so damned hard to refresh, then something's terribly wrong with it. It's about damned time ancient libraries are thrown away.
It's quite difficult to "throw out" OS libraries that you need ;) printf is hardly the only C interface that requires null-terminated strings. D is a pragmatic language, not an ideological one. -Steve
May 16 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Wed, May 16, 2012 at 7:22 PM, Steven Schveighoffer
<schveiguy yahoo.com>wrote:

 On Wed, 16 May 2012 10:04:50 -0400, Gor Gyolchanyan <
 gor.f.gyolchanyan gmail.com> wrote:

  On Wed, May 16, 2012 at 5:25 PM, Steven Schveighoffer
 <schveiguy yahoo.com>wrote:
 I don't see a "problem" anywhere.  The current system is perfect for what
 it needs to do.
Aside from the string problem the very existence of this debate exposes a fundamental flaw in the entire software engineering industry: heavy usage of ancient crap. If some library is so damned hard to refresh, then something's terribly wrong with it. It's about damned time ancient libraries are thrown away.
It's quite difficult to "throw out" OS libraries that you need ;) printf is hardly the only C interface that requires null-terminated strings. D is a pragmatic language, not an ideological one. -Steve
Dear Steven and Alex. By no means am I saying that every ancient technology should be thrown out at once. That would be technological suicide. What I mean is that, knowing the technology is ancient, we should at least put some effort into gradually moving away from it. If it needs to be done - it needs to be done. If it happens to be expensive to do - oh, well. I understand that human resources are limited, but hanging on to ancient technology for _too_ long is a death wish for any new technology. -- Bye, Gor Gyolchanyan.
May 16 2012
parent reply Alex Rønne Petersen <alex lycus.org> writes:
On 16-05-2012 18:12, Gor Gyolchanyan wrote:
 Dear Steven and Alex. By no means am I saying that every ancient technology
 should be thrown out at once. That would be technological suicide. What I
 mean is that, knowing the technology is ancient, we should at least
 put some effort into gradually moving away from it. If it needs to be done -
 it needs to be done. If it happens to be expensive to do - oh, well. I
 understand that human resources are limited, but hanging on to ancient
 technology for _too_ long is a death wish for any new technology.

 --
 Bye,
 Gor Gyolchanyan.
Yes, but the thing is, throwing out null-terminated strings is not something you do gradually - you have to do it from one day to another. It's such a simple feature that you either have it or you don't. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Wed, May 16, 2012 at 8:16 PM, Alex Rønne Petersen <alex lycus.org> wrote:

Yes, but the thing is, throwing out null-terminated strings is not something you do gradually - you have to do it from one day to another. It's such a simple feature that you either have it or you don't. -- Alex Rønne Petersen alex lycus.org http://lycus.org
if("" != []) assert("".length != 0);

Will this fail?

--
Bye,
Gor Gyolchanyan.
May 16 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 12:21:27 -0400, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 if("" != []) assert("".length != 0);

 Will this fail?
No. Ambiguities only come into play when you use 'is'. I highly recommend not using 'is' for arrays unless you really have a good reason, since two slices can be 'equal' but 'point at different instances'.

For example:

auto str = "abcabc";
assert(str[0..3] == str[3..$]); // pass
assert(str[0..3] is str[3..$]); // fail

which is very counterintuitive.

-Steve
May 16 2012
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 01:07:54PM -0400, Steven Schveighoffer wrote:
[...]
 For example:
 
 auto str = "abcabc";
 assert(str[0..3] == str[3..$]); // pass
 assert(str[0..3] is str[3..$]); // fail
 
 which is very counterintuitive.
[...]
I don't find that counterintuitive at all. To me, 'is' concerns memory identity: are the two things actually one and the same _in memory_? (In this case, no, because they are different chunks of memory that just happen to contain the same values.)

Whereas '==' concerns logical identity: do the two things represent the same logical entity? (In this case, yes, these two arrays contain exactly the same elements.)

I'd argue that 99% of the time, what you want is logical identity (i.e., ==), not memory identity.

T

--
Stop staring at me like that! You'll offend... no, you'll hurt your eyes!
May 16 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 13:16:36 -0400, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Wed, May 16, 2012 at 01:07:54PM -0400, Steven Schveighoffer wrote:
 [...]
 For example:

 auto str = "abcabc";
 assert(str[0..3] == str[3..$]); // pass
 assert(str[0..3] is str[3..$]); // fail

 which is very counterintuitive.
[...] I don't find that counterintuitive at all. To me, 'is' concerns memory identity: are the two things actually one and the same _in memory_? (In this case, no, because they are different chunks of memory that just happens to contain the same values.) Whereas '==' concerns logical identity: do the two things represent the same logical entity? (In this case, yes, these two arrays contain exactly the same elements.) I'd argue that 99% of the time, what you want is logical identity (i.e., ==), not memory identity.
What's counterintuitive is that if you use null as a 'special marker', you use == in most cases, except for the one case where you want to 'check for the special marker', in which case you *have* to use 'is'. -Steve
May 16 2012
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 01:36:27PM -0400, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 13:16:36 -0400, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Wed, May 16, 2012 at 01:07:54PM -0400, Steven Schveighoffer wrote:
[...]
For example:

auto str = "abcabc";
assert(str[0..3] == str[3..$]); // pass
assert(str[0..3] is str[3..$]); // fail

which is very counterintuitive.
[...] I don't find that counterintuitive at all. To me, 'is' concerns memory identity: are the two things actually one and the same _in memory_? (In this case, no, because they are different chunks of memory that just happens to contain the same values.) Whereas '==' concerns logical identity: do the two things represent the same logical entity? (In this case, yes, these two arrays contain exactly the same elements.) I'd argue that 99% of the time, what you want is logical identity (i.e., ==), not memory identity.
What's counter intuitive is if you use null as a 'special marker', then you use == in most cases, but that one case where you want to 'check for the special marker', in which case you *have* to use is.
[...]
It depends upon one's mental model of what an array is.

If you think of an array as a container that exists apart from its contents, then you'd expect null != [] because null means even the container itself doesn't exist, whereas [] means the container exists but contains nothing.

However, if you regard the array simply as the sum total of its contents, then you'd expect null == [] because there is no container to speak of, either there are elements, or there are none. When there are no elements, there is also no array (or equivalently, the array is empty). Therefore, null and [] are the same thing.

It seems that D takes the latter view, at least as far as == is concerned. Thus, to distinguish between null and [], one has to bypass == and use 'is' (i.e., open up the hood of the mental model of an array, and look into its actual implementation).

T

--
Ignorance is bliss... but only until you suffer the consequences!
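
A minimal sketch of the two views, assuming a reasonably recent D compiler. The helper names sameContents and sameMemory are made up for illustration and are not part of any library. With == a null string and "" cannot be told apart; only 'is' exposes the difference:

```d
// Hypothetical helper: true when the two strings have equal length and contents.
bool sameContents(string a, string b) { return a == b; }

// Hypothetical helper: true when both slices have the same pointer and length.
bool sameMemory(string a, string b) { return a is b; }

void main()
{
    string fromNull = null;
    string fromLiteral = "";

    // "Sum of contents" view: no elements on either side, so they are equal.
    assert(sameContents(fromNull, fromLiteral));

    // "Container" view: the literal points at real memory, null does not.
    assert(!sameMemory(fromNull, fromLiteral));
    assert(fromNull is null);
    assert(fromLiteral !is null);
}
```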
May 16 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 13:52:57 -0400, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Wed, May 16, 2012 at 01:36:27PM -0400, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 13:16:36 -0400, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

On Wed, May 16, 2012 at 01:07:54PM -0400, Steven Schveighoffer wrote:
[...]
For example:

auto str = "abcabc";
assert(str[0..3] == str[3..$]); // pass
assert(str[0..3] is str[3..$]); // fail

which is very counterintuitive.
[...] I don't find that counterintuitive at all. To me, 'is' concerns memory identity: are the two things actually one and the same _in memory_? (In this case, no, because they are different chunks of memory that just happens to contain the same values.) Whereas '==' concerns logical identity: do the two things represent the same logical entity? (In this case, yes, these two arrays contain exactly the same elements.) I'd argue that 99% of the time, what you want is logical identity (i.e., ==), not memory identity.
What's counter intuitive is if you use null as a 'special marker', then you use == in most cases, but that one case where you want to 'check for the special marker', in which case you *have* to use is.
[...] It depends upon one's mental model of what an array is. If you think of an array as a container that exists apart from its contents, then you'd expect null != [] because null means even the container itself doesn't exist, whereas [] means the container exists but contains nothing. However, if you regard the array simply as the sum total of its contents, then you'd expect null == [] because there is no container to speak of, either there are elements, or there are none. When there are no elements, there is also no array (or equivalently, the array is empty). Therefore, null and [] are the same thing. It seems that D takes the latter view, at least as far as == is concerned. Thus, to distinguish between null and [], one has to bypass == and use 'is' (i.e., open up the hood of the mental model of an array, and look into its actual implementation).
Part of the source of this confusion is that D slices are not actually arrays or containers. They reference, they don't contain. -Steve
May 16 2012
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 02:03:44PM -0400, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 13:52:57 -0400, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
[...]
It depends upon one's mental model of what an array is.

If you think of an array as a container that exists apart from its
contents, then you'd expect null != [] because null means even the
container itself doesn't exist, whereas [] means the container exists
but contains nothing.

However, if you regard the array simply as the sum total of its
contents, then you'd expect null == [] because there is no container
to speak of, either there are elements, or there are none. When there
are no elements, there is also no array (or equivalently, the array
is empty). Therefore, null and [] are the same thing.

It seems that D takes the latter view, at least as far as == is
concerned. Thus, to distinguish between null and [], one has to
bypass == and use 'is' (i.e., open up the hood of the mental model of
an array, and look into its actual implementation).
Part of the source of this confusion is that D slices are not actually arrays or containers. They reference, they don't contain.
[...] Right, which is why D's arrays (or rather, slices) are closer to the second model than the first. Actually, arrays only exist in the GC, right? Even an "explicitly declared" array is really just a slice, that just happens to reference the contents of that chunk of GC memory. Except static arrays, of course, but we're not worried about those here. T -- Never trust an operating system you don't have source for! -- Martin Schulze
May 16 2012
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 14:32:44 -0400, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 Right, which is why D's arrays (or rather, slices) are closer to the
 second model than the first. Actually, arrays only exist in the GC,
 right? Even an "explicitly declared" array is really just a slice, that
 just happens to reference the contents of that chunk of GC memory.
Yes. -Steve
May 16 2012
prev sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 16-05-2012 19:36, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 13:16:36 -0400, H. S. Teoh <hsteoh quickfur.ath.cx>
 wrote:

 On Wed, May 16, 2012 at 01:07:54PM -0400, Steven Schveighoffer wrote:
 [...]
 For example:

 auto str = "abcabc";
 assert(str[0..3] == str[3..$]); // pass
 assert(str[0..3] is str[3..$]); // fail

 which is very counterintuitive.
[...] I don't find that counterintuitive at all. To me, 'is' concerns memory identity: are the two things actually one and the same _in memory_? (In this case, no, because they are different chunks of memory that just happens to contain the same values.) Whereas '==' concerns logical identity: do the two things represent the same logical entity? (In this case, yes, these two arrays contain exactly the same elements.) I'd argue that 99% of the time, what you want is logical identity (i.e., ==), not memory identity.
What's counter intuitive is if you use null as a 'special marker', then you use == in most cases, but that one case where you want to 'check for the special marker', in which case you *have* to use is. -Steve
I guess we can conclude that one should not use 'null' or 'is' for arrays unless absolutely necessary. '[]' and '==' should probably do for the majority of code. -- Alex Rønne Petersen alex lycus.org http://lycus.org
May 16 2012
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, May 16, 2012 19:52:07 Alex Rønne Petersen wrote:
 I guess we can conclude that one should not use 'null' or 'is' for
 arrays unless absolutely necessary. '[]' and '==' should probably do for
 the majority of code.
The only reason to use 'is' is if you're checking for identity rather than equality; == checks for equality. It should be clear when you need one or the other. null and [] are essentially equivalent, so it doesn't really matter which you use. However, I'd argue that using == with null or [] is a bad move, because it tends to show a lack of understanding: it's natural for people to try to check whether something is null by comparing with ==, and that _doesn't work_. So, if you want to check whether an array is empty, use empty or length == 0, whereas if you want to check whether an array is null, then use 'is null'. But aside from the clarity issues surrounding checking whether an array is empty by using == null or == [], I think that it's quite clear when == or 'is' should be used. If it's not, it's because you don't understand the differences between the two. - Jonathan M Davis
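
A small sketch of that advice applied to the toString-like case from the start of the thread. The function render and the "<default>" placeholder are hypothetical names chosen for this example; empty is assumed to come from std.range.primitives:

```d
import std.range.primitives : empty;

// Hypothetical consumer of a toString-like result: a null string means
// "no representation", anything else (including "") is printed as-is.
string render(string repr)
{
    if (repr is null)       // identity check: only a null slice matches
        return "<default>";
    return repr;            // may legitimately be the empty string
}

void main()
{
    assert(render(null) == "<default>");
    assert(render("").empty);          // "" is empty but not null
    assert(render("abc") == "abc");

    // empty (or length == 0) answers "is it empty?" for null and "" alike.
    string n = null;
    string e = "";
    assert(n.empty && e.empty);
    assert(n.length == 0 && e.length == 0);
}
```

As others in the thread point out, relying on null as a marker is fragile, so this is only an illustration of which operator answers which question.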
May 16 2012
prev sibling next sibling parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Wed, May 16, 2012 at 9:07 PM, Steven Schveighoffer
<schveiguy yahoo.com>wrote:

 On Wed, 16 May 2012 12:21:27 -0400, Gor Gyolchanyan <
 gor.f.gyolchanyan gmail.com> wrote:


 if("" != []) assert("".length != 0);

 Will this fail?
No. Ambiguities only come into play when you use 'is'. I highly recommend not using 'is' for arrays unless you really have a good reason, since two slices can be 'equal' but 'point at different instances'. For example: auto str = "abcabc"; assert(str[0..3] == str[3..$]); // pass assert(str[0..3] is str[3..$]); // fail which is very counterintuitive. -Steve
Doesn't assert("".length != 0) look extremely counter-intuitive? -- Bye, Gor Gyolchanyan.
May 16 2012
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 13:17:17 -0400, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 On Wed, May 16, 2012 at 9:07 PM, Steven Schveighoffer
 <schveiguy yahoo.com>wrote:

 On Wed, 16 May 2012 12:21:27 -0400, Gor Gyolchanyan <
 gor.f.gyolchanyan gmail.com> wrote:


 if("" != []) assert("".length != 0);

 Will this fail?
No. Ambiguities only come into play when you use 'is'. I highly recommend not using 'is' for arrays unless you really have a good reason, since two slices can be 'equal' but 'point at different instances'. For example: auto str = "abcabc"; assert(str[0..3] == str[3..$]); // pass assert(str[0..3] is str[3..$]); // fail which is very counterintuitive. -Steve
Doesn't assert("".length != 0) look extremely counter-intuitive?
That assert would always fail if the if statement ever succeeded. It doesn't look counter-intuitive, it looks like a bug! You basically said:

if(0) assert("".length != 0);

-Steve
May 16 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 16, 2012 at 09:17:17PM +0400, Gor Gyolchanyan wrote:
 On Wed, May 16, 2012 at 9:07 PM, Steven Schveighoffer
 <schveiguy yahoo.com>wrote:
 
 On Wed, 16 May 2012 12:21:27 -0400, Gor Gyolchanyan <
 gor.f.gyolchanyan gmail.com> wrote:


 if("" != []) assert("".length != 0);

 Will this fail?
No. Ambiguities only come into play when you use 'is'. I highly recommend not using 'is' for arrays unless you really have a good reason, since two slices can be 'equal' but 'point at different instances'.
[...]
 Doesn't assert("".length != 0) look extremely counter-intuitive?
[...]
Code:

import std.stdio;
void main() {
	writeln("" == []);
	writeln("" != []);
	writeln("".length);
}

Output:

true
false
0

Where's the problem?

T

--
Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
May 16 2012
prev sibling parent reply travert phare.normalesup.org (Christophe Travert) writes:
"Steven Schveighoffer", in message (digitalmars.D:167556), wrote:
 toStringz can allocate a new block in order to ensure 0 gets added.  This  
 is ludicrous!
 
 You are trying to tell me that any time I want to call a C function with a  
 string literal, I have to first heap-allocate it, even though I *know*  
 it's safe.
How about "mystring\0".ptr ?
May 18 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 18 May 2012 11:05:21 -0400, Christophe Travert  
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer", in message (digitalmars.D:167556), wrote:
 toStringz can allocate a new block in order to ensure 0 gets added.   
 This
 is ludicrous!

 You are trying to tell me that any time I want to call a C function  
 with a
 string literal, I have to first heap-allocate it, even though I *know*
 it's safe.
How about "mystring\0".ptr ?
AKA "mystring" :) I'm sorry, I don't see the reason to require this. All for the sake of making "" a null slice. I find the net gain quite trivial. -Steve
May 18 2012
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Friday, May 18, 2012 11:18:46 Steven Schveighoffer wrote:
 On Fri, 18 May 2012 11:05:21 -0400, Christophe Travert
 
 <travert phare.normalesup.org> wrote:
 "Steven Schveighoffer", in message (digitalmars.D:167556), wrote:
 
 toStringz can allocate a new block in order to ensure 0 gets added.
 This
 is ludicrous!
 
 You are trying to tell me that any time I want to call a C function
 with a
 string literal, I have to first heap-allocate it, even though I *know*
 it's safe.
How about "mystring\0".ptr ?
AKA "mystring" :) I'm sorry, I don't see the reason to require this. All for the sake of making "" a null slice. I find the net gain quite trivial.
And I find the net gain to be negative, since the fact that "" is non-null is _useful_. - Jonathan M Davis
May 18 2012
parent travert phare.normalesup.org (Christophe Travert) writes:
"Jonathan M Davis", in message (digitalmars.D:167901), wrote:
 On Friday, May 18, 2012 11:18:46 Steven Schveighoffer wrote:
 On Fri, 18 May 2012 11:05:21 -0400, Christophe Travert
 
 <travert phare.normalesup.org> wrote:
 "Steven Schveighoffer", in message (digitalmars.D:167556), wrote:
 
 toStringz can allocate a new block in order to ensure 0 gets added.
 This
 is ludicrous!
 
 You are trying to tell me that any time I want to call a C function
 with a
 string literal, I have to first heap-allocate it, even though I *know*
 it's safe.
How about "mystring\0".ptr ?
AKA "mystring" :) I'm sorry, I don't see the reason to require this. All for the sake of making "" a null slice. I find the net gain quite trivial.
And I find the net gain to be negative, since the fact that "" is non-null is _useful_. - Jonathan M Davis
I'm not saying "" should point to null. I'm saying that people who claim they have to heap-allocate (via toStringz) each time they call a C function are just wrong. I tend to accept the point that automatically zero-terminating strings to make calls to C functions easier is not such a good idea, but I have no problem with [] !is "". Empty strings should be tested with empty. -- Christophe
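
A minimal sketch of the two cases being argued about, assuming core.stdc.string.strlen and std.string.toStringz from the standard distribution. The wrapper cLength is a hypothetical name used only here. A literal can be passed to a C function directly, while a string built at run time goes through toStringz, which may allocate:

```d
import core.stdc.string : strlen;
import std.string : toStringz;

// Hypothetical wrapper: the length C sees for an arbitrary D string.
// toStringz makes sure a terminating '\0' is present (it may copy to do so).
size_t cLength(string s)
{
    return strlen(s.toStringz);
}

void main()
{
    // A string literal converts implicitly to const(char)* and is
    // zero-terminated, so no toStringz call is needed here.
    assert(strlen("mystring") == 8);

    // A string built at run time carries no terminator guarantee.
    char[] buf = ['a', 'b', 'c'];
    string runtime = buf.idup;
    assert(cLength(runtime) == 3);
    assert(cLength("") == 0);
}
```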
May 22 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/15/2012 10:39 AM, deadalnix wrote:
 On 14/05/2012 19:38, Alex Rønne Petersen wrote:
 On 14-05-2012 15:21, Gor Gyolchanyan wrote:
 I think the zero-terminated literal shtick is pointless. Literals are
 rarely passed to C functions, so we gotta use the std.utf.toUTFz anyway.

 On Mon, May 14, 2012 at 5:03 PM, Christophe
 <travert phare.normalesup.org> wrote:

 deadalnix, in message (digitalmars.D:167258), wrote:
 A good solution would be to set the pointer to 0 when the length is
 set to 0.
String literals are zero-terminated. "" cannot point to 0x0, unless we drop this rule. Maybe we should... -- Bye, Gor Gyolchanyan.
This is very false. I invite you to read almost any module in druntime. You'll find that it makes heavy use of printf debugging. That being said, dropping the null-termination rule when passing strings to non-const(char)* parameters/variables/etc would be sane enough (I think).
This looks to me like a bad practice. C strings and D strings are different beasts, and we have toStringz.
It is not. Claiming valid use cases are bad practice does not help the discussion. It is disrespectful and patronising.
 It is kind of dumb to create a WAT in the language because druntime devs
 made mistakes.
The conclusion is based on a wrong premise therefore it is meaningless.
 It has to be fixed.
It can be fixed better by making (null !is []) hold instead of making ("" is null) hold.
May 15 2012
parent Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 15 May 2012 20:23:53 +0200
schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 05/15/2012 10:39 AM, deadalnix wrote:
 On 14/05/2012 19:38, Alex Rønne Petersen wrote:
 [...]
 That being said, dropping the null-termination rule when passing strings
 to non-const(char)* parameters/variables/etc would be sane enough (I
 think).
I thought the same.
 This looks to me like a bad practice. C strings and D strings are
 different beasts, and we have toStringz.
When we talk about literal strings they can be both. void foo(const(char)*) will turn a string literal into a '\0'-terminated C string. void foo(string) will use D string (slices). It is a special case for C interop. There may already be a few others in the language.
 It is not. Claiming valid use cases are bad practice does not help the
 discussion. It is disrespectful and patronising.
Alex' use case of this feature: void log(string text) { printf(text); } made me think the same: "That's a bad idea...". This runs against a D programmer's understanding of a string as a pointer and length without explicit termination. If you just had this in mind: printf("abc"); that looks kosher.
 It is kind of dumb to create a WAT in the language because druntime devs
 made mistakes.
Yes, let's have zero appended when literals are used as char* parameters. I just grepped druntime for [f]printf and except for one ternary operator and one (tbuf.ptr, tbuf.length, ...) the format specifiers were all literals. No concatenations and slicing (thank god ;) ) or lookups and storage in a string before use. If these 759 printf statements reflect the general case, then there should be no problem with the proposal. (The one use of the ternary operator can probably be changed to an if-else.)
 The conclusion is based on a wrong premise therefore it is meaningless.

 It has to be fixed.
 It can be fixed better by making (null !is []) hold instead of making ("" is null) hold.
It's easy. Revert to 2.057. I agree with you, but I'm ignorant of the reasons behind the change. It could have meant an allocation for empty arrays (the way it was implemented in 2.057) or that the GC cannot collect a large memory block because a slice of length 0 is holding a reference to it.

--
Marco
May 23 2012
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 14 May 2012 06:08:17 -0400, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 Hi! I have a small question:
 Is the test for a null array equivalent to a test for zero-length array?
== tests for length and content equivalence. 'is' tests for both pointer and length equivalence (and therefore, content equality is implied).

There is a large confusion with null arrays. A null array is simply an empty array that happens to be pointing to null. Other than that, it is equivalent to an empty array, and should be treated as such. One can use the idea that "null arrays are special", but it leads to likely confusing semantics, where an empty array is different from a null array.

if(arr) should IMO succeed iff length > 0. That is one of the main reasons of the confusion.

Note that [] is a request to the runtime to build an empty array. The runtime detects this, and rather than consuming a heap allocation to build nothing, it simply returns a null-pointed array. This is 100% the right decision, and I don't think anyone would ever convince me (or Andrei or Walter) otherwise.
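
To make the pointer/length distinction concrete, here is a short sketch. The function classify and its return strings are hypothetical, written only for this illustration:

```d
import std.stdio : writeln;

// Hypothetical helper: reports which of the three situations a slice is in.
string classify(const(int)[] arr)
{
    if (arr is null)        // pointer is null and length is 0
        return "null";
    if (arr.length == 0)    // length is 0 but the pointer refers to real memory
        return "empty, non-null";
    return "non-empty";
}

void main()
{
    int[] backing = [1, 2, 3];

    writeln(classify(null));            // a null slice
    writeln(classify(backing[0 .. 0])); // empty slice into backing's memory
    writeln(classify(backing));         // three elements

    // With ==, the first two cases cannot be told apart.
    int[] none = null;
    assert(none == backing[0 .. 0]);
}
```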
 This is particularly interesting for strings.
 For instance, I could return an empty string from a toString-like  
 function
 and the empty string would be printed, but If I returned a null string,
 that would indicate, that there is no string representation and it would
 cause some default string to be printed.
These are the confusing semantics I was referring to ;) I would recommend we try to avoid this kind of distinction wherever possible.
 So, the question is, if a null array is any different from an empty  
 array?
I would say it technically is different, but you should treat it as equivalent unless you have a really really good reason not to. It's just another empty array which happens to be pointing at 0. -Steve
May 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array. The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100% the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do! The question is why an array of length 0 isn't nulled. It leads to confusing semantics here, and can keep alive memory that can't be accessed.
May 14 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array. The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100% the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do! The question is why an array of length 0 isn't nulled. It leads to confusing semantics here, and can keep alive memory that can't be accessed.

int[] arr;
arr.reserve(10000);
assert(arr.length == 0);

-Steve
May 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 14/05/2012 21:53, Steven Schveighoffer wrote:
 On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array. The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100% the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do ! The question is why an array of length 0 isn't nulled ? It lead to confusing semantic here, and can keep alive memory that can't be accessed.
int[] arr; arr.reserve(10000); assert(arr.length == 0); -Steve
The length isn't set to 0 here. You obviously don't want that to be nulled.
May 15 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 15 May 2012 04:42:10 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 14/05/2012 21:53, Steven Schveighoffer wrote:
 On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array. The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100%
 the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do! The question is why an array of length 0 isn't nulled. It leads to confusing semantics here, and can keep alive memory that can't be accessed.
int[] arr; arr.reserve(10000); assert(arr.length == 0); -Steve
The length isn't set to 0 here. You obviously don't want that to be nulled.
The assert disagrees with you :) -Steve
May 16 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 16/05/2012 15:12, Steven Schveighoffer wrote:
 On Tue, 15 May 2012 04:42:10 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 14/05/2012 21:53, Steven Schveighoffer wrote:
 On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array. The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100%
 the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do ! The question is why an array of length 0 isn't nulled ? It lead to confusing semantic here, and can keep alive memory that can't be accessed.
int[] arr; arr.reserve(10000); assert(arr.length == 0); -Steve
The length isn't set to 0 here. You obviously don't want that to be nulled.
The assert disagrees with you :) -Steve
The length IS 0. It IS 0 before the call to reserve. It is never SET to 0.
May 16 2012
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 16 May 2012 17:11:58 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 16/05/2012 15:12, Steven Schveighoffer wrote:
 On Tue, 15 May 2012 04:42:10 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 21:53, Steven Schveighoffer wrote:
 On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array.
 The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100%
 the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do! The question is why an array of length 0 isn't nulled. It leads to confusing semantics here, and can keep alive memory that can't be accessed.
int[] arr; arr.reserve(10000); assert(arr.length == 0); -Steve
The length isn't set to 0 here. You obviously don't want that to be nulled.
The assert disagrees with you :) -Steve
The length IS 0. It IS 0 before the call to reserve. It is never SET to 0.
OK, so it's allowed to be 0 and not-null. Doesn't this lead to the confusing semantics you were talking about? What about this?

int[] arr;
arr.reserve(10000);
int[] arr2 = [1,2,3];
arr2 = arr; // now length has been *set* to 0, should it also be nulled?

But I want arr2 and arr to point at the same thing, maybe I'm not using arr anymore. Maybe I returned it from a function, and I no longer have access to arr.

-Steve
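
The scenario can be checked with a short sketch, assuming the reserve and capacity properties that druntime provides for slices. The helper reservedSlice is a hypothetical name standing in for "a function that returns the reserved array":

```d
// Hypothetical helper: returns an empty slice with room reserved for
// at least n elements, so the caller no longer has access to the local arr.
int[] reservedSlice(size_t n)
{
    int[] arr;
    arr.reserve(n);
    return arr;
}

void main()
{
    int[] arr = reservedSlice(10_000);

    // Length 0, yet the slice refers to a real memory block.
    assert(arr.length == 0);
    assert(arr.capacity >= 10_000);
    assert(arr !is null);

    // Assigning "sets" arr2's length to 0, but nulling it here would
    // throw away the reserved block that both slices should share.
    int[] arr2 = [1, 2, 3];
    arr2 = arr;
    assert(arr2.length == 0);
    assert(arr2 is arr);
}
```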
May 16 2012
parent deadalnix <deadalnix gmail.com> writes:
On 16/05/2012 23:15, Steven Schveighoffer wrote:
 On Wed, 16 May 2012 17:11:58 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 16/05/2012 15:12, Steven Schveighoffer wrote:
 On Tue, 15 May 2012 04:42:10 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 21:53, Steven Schveighoffer wrote:
 On Mon, 14 May 2012 15:30:25 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 14/05/2012 16:37, Steven Schveighoffer wrote:
 Note that [] is a request to the runtime to build an empty array.
 The
 runtime detects this, and rather than consuming a heap allocation to
 build nothing, it simply returns a null-pointed array. This is 100%
 the
 right decision, and I don't think anyone would ever convince me (or
 Andrei or Walter) otherwise.
Obviously this is the right thing to do ! The question is why an array of length 0 isn't nulled ? It lead to confusing semantic here, and can keep alive memory that can't be accessed.
int[] arr; arr.reserve(10000); assert(arr.length == 0); -Steve
The length isn't set to 0 here. You obviously don't want that to be nulled.
The assert disagrees with you :) -Steve
The length IS 0. It IS 0 before the call to reserve. It is never SET to 0.
OK, so it's allowed to be 0 and not-null. doesn't this lead to the confusing semantics you were talking about? What about this? int[] arr; arr.reserve(10000); int[] arr2 = [1,2,3]; arr2 = arr; // now length has been *set* to 0, should it also be nulled? But I want arr2 and arr to point at the same thing, maybe I'm not using arr anymore. Maybe I returned it from a function, and I no longer have access to arr. -Steve
That makes sense :D
May 17 2012