
digitalmars.D - Empty String == Empty Array == Empty Static Array?

AJG <AJG_member pathlink.com> writes:
Hi there,

I was under the impression that strings (char[]) were strictly compatible with
and equivalent to other D arrays. However, the empty string seems to throw a
monkey wrench into this idea. In addition, static arrays also mess things up.

// Here we go:
char[]  nul = null;
char[]  str = "";
char[]  dyn;
char[0] sta;
char[]  ini = new char[0];

printf("%d", nul.length);
printf("%d", str.length);
printf("%d", dyn.length);
printf("%d", sta.length);
printf("%d", ini.length);
// The above all print 0, so far so good.

printf(nul ? "true" : "false");
printf(str ? "true" : "false");
printf(dyn ? "true" : "false");
printf(sta ? "true" : "false");
printf(ini ? "true" : "false");
// Here we get false, true, false, true, false.

What's up with this? Why are empty strings (which are empty even by .length
accounts) and static arrays different than other empty arrays?

I have a feeling this has got to do with the internal array pointer, but I'm not
sure. At any rate, it's not very semantically intuitive. Is this done on
purpose, or can we look forward to some convergence?

Cheers,
--AJG.

PS: What exactly does passing NULL to an array parameter do?

================================
2B || !2B, that is the question.
Jun 24 2005
Stefan <Stefan_member pathlink.com> writes:
Have a look at the digitalmars.D.bugs NG.
There was recently an extensive thread that discussed this whole
(non-) issue:

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4301

Kind regards,
Stefan



Jun 25 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 25 Jun 2005 04:57:38 +0000 (UTC), AJG <AJG_member pathlink.com>  
wrote:
 I was under the impression that strings (char[]) were strictly  
 compatible with
 and equivalent to other D arrays. However, the empty string seems to  
 throw a
 monkey wrench to this idea. In addition, static arrays also mess things  
 up.

 // Here we go:
 char[]  nul = null;
 char[]  str = "";
 char[]  dyn;
 char[0] sta;
 char[]  ini = new char[0];

 printf("%d", nul.length);
 printf("%d", str.length);
 printf("%d", dyn.length);
 printf("%d", sta.length);
 printf("%d", ini.length);
 // The above all print 0, so far so good.

 printf(nul ? "true" : "false");
 printf(str ? "true" : "false");
 printf(dyn ? "true" : "false");
 printf(sta ? "true" : "false");
 printf(ini ? "true" : "false");
 // Here we get false, true, false, true, false.

 What's up with this? Why are empty strings (which are empty even by  
 .length accounts) and static arrays different than other empty arrays?

 I have a feeling this has got to do with the internal array pointer, but  
 I'm not sure.
I believe so. When you compare an array "reference" to null you end up comparing the array data pointer to null. An array is essentially a struct of the form:

    struct array {
      int length;
      void* data;
    }

(perhaps not using void, but instead the actual data type?)

Note, a static array is special. Its data member cannot be null and its length property is actually macro-replaced upon compilation. E.g.

    char[5] sta;
    char[] dyn;

    void main()
    {
      int* p = &sta.length;
      int* q = &dyn.length;
    }

    sta.d(1): constant 5 is not an lvalue
    sta.d(7): dyn.length is not an lvalue
    sta.d(7): cannot implicitly convert expression (#dyn.length) of type uint* to int*

Errors 1 & 2 show 'sta.length' replaced by '5'. Errors 3 & 4 are due to .length being a property/getter method call, not an int.

I suspect also that the reference refers directly to the data and not to an array struct (shown above).
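A minimal sketch for checking the two claims above, assuming a current D2 compiler rather than the 2005 one used in this thread. It verifies that a static array's length is a compile-time constant and that a dynamic array value is the size of a length plus a pointer. The variable names follow the post; the size check assumes the usual ABI and is not something the thread itself states.

```d
import std.stdio;

void main()
{
    char[5] sta;   // static array: the length is part of the type
    char[]  dyn;   // dynamic array: a (length, pointer) pair

    // A static array's length is known at compile time, which is why
    // taking its address is rejected ("constant 5 is not an lvalue").
    static assert(sta.length == 5);

    // Assumption: a dynamic array value is two machine words.
    static assert((char[]).sizeof == size_t.sizeof + (char*).sizeof);

    // Reading .length of a never-assigned dynamic array is safe.
    assert(dyn.length == 0);
    assert(dyn.ptr is null);

    writefln("sta.length = %s, dyn.length = %s", sta.length, dyn.length);
}
```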
 At any rate, it's not very semantically intuitive. Is this done on
 purpose, or can we look forward to some convergence?
It's a matter of asking for what you really want to know. For example...

    if (arr is null) ; // array never assigned, i.e. non-existent.
    else if (arr.length == 0) ; // array exists, no data present.
    else ; // array exists, has some data.

In many cases "arr.length == 0" is all you care about; in some, "arr is null" might be important.
 PS: What exactly does passing NULL to an array parameter do?
Creates an array (struct shown above) with data set to null and length set to 0. This is why .length always 'works' and never gives a segmentation fault for a 'null' array.

Regan
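The three-way test from this post can be wrapped in a small function. This is only a sketch, assuming a current D2 compiler; `describe` is a hypothetical helper name, and `const(char)[]` and `string` stand in for the `char[]` used in 2005, since D2 string literals are immutable.

```d
import std.stdio;

// Hypothetical helper: classifies an array the way the post suggests.
string describe(const(char)[] arr)
{
    if (arr is null)
        return "null";        // never assigned
    else if (arr.length == 0)
        return "empty";       // exists, no data present
    else
        return "has data";
}

void main()
{
    char[] nul = null;
    string emp = "";

    assert(nul.length == 0);  // .length is safe on a null array

    writeln(describe(nul));
    writeln(describe(emp));
    writeln(describe("abc"));
}
```

Passing `null` directly, as in `describe(null)`, takes the first branch, which is the case the PS asks about.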
Jun 26 2005
AJG <AJG_member pathlink.com> writes:
Hi Regan,

Thanks a ton for the post! This actually cleared everything up for me. At least
now I know what to expect from the language regarding arrays.

 PS: What exactly does passing NULL to an array parameter do?
Creates an array (struct shown above) with data set to null and length set to 0. This is why .length always 'works' and never gives a segmentation fault for a 'null' array.
I guess this is a good thing since it prevents functions expecting only an array (even if empty) from segfaulting on a null. However, it also reduces the possibilities for the programmer. Since passing null means passing "an empty array," you can't differentiate between them (safely). Right?

In addition, this is somewhat inconsistent with the way classes and objects are treated. For a class, you can pass a null, and the function will receive that null (not an "empty" class, whatever that would be) and it will segfault unless you guard against it. This is why I was originally confused with the behaviour of arrays.

Finally, a small suggestion. What if doing something like:

    if (someArray) // Do stuff.

meant implicitly checking the .length property automagically (instead of checking the internal pointer)? The former seems more useful to me, since the concept of a null array no longer exists; it's also universal across all array types. The latter, on the other hand, seems kind of implementation-ish and hackish, IMHO.

Anyway, thanks again for the help.

Cheers,
--AJG.

================================
2B || !2B, that is the question.
Jun 26 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>  
wrote:
 Hi Regan,

 Thanks a ton for the post! This actually cleared everything up for me.  
 At least now I know what to expect from the language regarding arrays.

 PS: What exactly does passing NULL to an array parameter do?
Creates an array (struct shown above) with data set to null and length set to 0. This is why .length always 'works' and never gives a segmentation fault for a 'null' array.
I guess this is a good thing since it prevents functions expecting only an array (even if empty) from segfaulting on a null.
Yep.
 However, it also reduces the
 possibilities for the programmer. Since passing null means passing "an  
 empty array," you can't differentiate between them (safely). Right?
Actually you can still differentiate between a null array and an empty array, e.g.

    if (arr is null) ; // non-existent
    else if (arr.length == 0) ; // empty
    else ; // has items

However, don't set length to 0 or you'll turn an empty array into a null array. I believe this behaviour is broken.
 In addition, this is somewhat inconsistent with the way classes and  
 objects are
 treated. For a class, you can pass a null, and the function will receive  
 that
 null (not an "empty" class, whatever that would be) and it will segfault  
 unless
 you guard against it. This is why I was originally confused with the  
 behaviour
 of arrays.
True. Were it up to me...
 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead of
 checking the internal pointer)?
That has been suggested. IMO it breaks consistency with all other types, where if (x) compares x with null.
 The former seems more useful to me, since the concept of a null array no  
 longer exists;
But it does, it's just not obvious (maybe that's a good thing?). In general cases you do not care about null, only if (arr.length == 0).

Regan
Jun 27 2005
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>  
 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead of
 checking the internal pointer)?
That has been suggested. IMO it breaks consistency with all other types, where if (x) compares x with null.
So if we can't have it, I'd strongly suggest making 'if (arr)' illegal. Otherwise massive confusion will follow.

Currently arrays are treated both as classes and structs at the same time. We can do if(arr), checking null-ness like with classes, but passing arrays to functions works as if they were structs - thus we have to use inout to modify ptr or length. Where's the consistency ??? :(

-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
Jun 27 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Mon, 27 Jun 2005 12:31:32 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>
 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead  
 of
 checking the internal pointer)?
That has been suggested. IMO it breaks consistency with all other types, where if (x) compares x with null.
So if we can't have it, I'd strongly suggest making 'if (arr)' illegal.
No thanks.
   Otherwise massive confusion will follow.
Somewhat prophetic?
 Currently arrays are treated both as classes and structs at the same  
 time.
That is because they share properties of both but are in fact neither. They are 'arrays'.
 We can do if(arr), checking null-ness like with classes but passing  
 arrays to functions works as if they were structs
The specific nature of arrays makes slicing possible. Slicing is incredibly powerful (I think you'll agree). When you slice you create a new reference to a new 'struct' where the data pointer refers to the start of the data and the length is set accordingly.

When you pass an array as an 'in' parameter you effectively slice the entire array; you basically duplicate the 'struct'. Sure, the default could have been to pass the reference, but then you lose the choice to pass as it currently does. You still have the choice of passing as a reference, simply use 'inout'. It's all about choice.

"if (arr)" compares the data pointer, but it is actually comparing the reference at the same time, as a null reference has a null data pointer. Making it silently compare the length would be confusing IMO.

The only problem I have with arrays is with "arr.length = 0;" setting the data pointer to null; this turns an existing/empty array into a null/non-existent array.
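The pass-by-value versus pass-by-reference distinction above can be sketched as follows, assuming a current D2 compiler, where the by-reference parameter class is spelled `ref` (the `inout` of this thread is its 2005 spelling). The function names are hypothetical and exist only for illustration.

```d
import std.stdio;

// Writes through the copied (pointer, length) pair; the caller sees it.
void setFirst(int[] a) { a[0] = 42; }

// Re-slices only the local copy; the caller's length is untouched.
void shrinkCopy(int[] a) { a = a[0 .. 1]; }

// Re-slices the caller's own array value.
void shrinkRef(ref int[] a) { a = a[0 .. 1]; }

void main()
{
    int[] data = [1, 2, 3];

    setFirst(data);
    assert(data[0] == 42);

    shrinkCopy(data);
    assert(data.length == 3);

    shrinkRef(data);
    assert(data.length == 1);

    // A slice is a new (pointer, length) pair over the same memory.
    int[] none = data[0 .. 0];
    assert(none.length == 0);
    assert(none.ptr is data.ptr);   // empty, yet not a null array
    assert(none !is null);

    writeln("all checks passed");
}
```

The last three assertions show an array that is empty by length while still pointing at existing data, which is the empty-but-not-null case the thread keeps returning to.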
 - thus we have to use inout to modify ptr or length. Where's the  
 consistency ??? :(
An array is not a class. An array is not a struct. An array is a unique type that has properties of both classes and structs.

Regan
Jun 27 2005
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 On Mon, 27 Jun 2005 12:31:32 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 
 Regan Heath wrote:
   Otherwise massive confusion will follow.
Somewhat prophetic?
There is some confusion at the moment. Granted that more people will find interest in D, I think I could be a prophet by profession.
 Currently arrays are treated both as classes and structs at the same  
 time.
That is because they share properties of both but are in fact neither. They are 'arrays'.
Yup. I agree with that.
 We can do if(arr), checking null-ness like with classes but passing  
 arrays to functions works as if they were structs
The specific nature of arrays makes slicing possible. Slicing is incredibly powerful (I think you'll agree). When you slice you create a new reference to a new 'struct' where the data pointer refers to the start of the data and the length is set accordingly. When you pass an array as an 'in' parameter you effectively slice the entire array; you basically duplicate the 'struct'. Sure, the default could have been to pass the reference, but then you lose the choice to pass as it currently does. You still have the choice of passing as a reference, simply use 'inout'. It's all about choice.
That is the correct behaviour IMO and I got used to it a long time ago.
 The only problem I have with arrays is with "arr.length = 0;" setting 
 the  data pointer to null, this turns an existing/empty array into a  
 null/non-existant array.
I haven't had a need to differentiate between an empty and a null array, so for me 'if (arr)' should be true when either the pointer is null or the length is == 0. Thus if (arr) checking ptr makes little sense IMO, but for you it's the desired behaviour. Thus I suggested disallowing implicit conversions from arrays to bools in order to make sure the programmer specifies exactly what he/she wants.

Some people will quickly grasp the difference between nullness and emptiness of arrays. But others will find their code buggy due to the implicit conversion. How often do you want to check the nullness of an array in your code compared to checking its emptiness? I haven't had a single reason to check for nullness yet. Thus disallowing 'if (arr)' seems like the best option IMO. Any other solutions?
 - thus we have to use inout to modify ptr or length. Where's the  
 consistency ??? :(
An array is not a class. An array is not a struct. An array is a unique type that has properties of both classes and structs.
Agreed

-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
Jun 27 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Mon, 27 Jun 2005 17:44:24 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 The only problem I have with arrays is with "arr.length = 0;" setting  
 the  data pointer to null, this turns an existing/empty array into a   
 null/non-existant array.
I haven't had a need to differentiate between an empty and a null array, so for me 'if (arr)' should be true when either the pointer is null or the length is == 0.
You mean false?
 Thus if (arr) checking ptr makes little sense IMO, but for you it's the  
 desired behaviour.
"if (arr)" does not check the ptr, it checks the reference, comparing it to null/0 just like it does for _every other type_ in D. It just happens that for arrays a null reference cannot exist, instead you get a reference to a struct with a null data pointer, thus the confusion.
 Thus I suggested disallowing implicit conversions from arrays to bool's  
 in order to make sure the programmer specifies exactly he/she wants.
I don't think arrays have a stronger argument for this than any other type in D.
 Some people will quickly grasp the difference between nullness and  
 emptiness of arrays. But others will find their code buggy due to the  
 implicit conversion.
People simply need to learn that an array is not a reference, not a struct, but a unique type with properties of both. It's a reference to a struct and cannot be null. I don't think the implicit conversion is the cause of confusion; I think the nature of arrays is. Once that is grasped, confusion vanishes.
 How often do you want to check the nullness of an array in your code  
 compared to checking its emptiness ? I haven't had a single reason to  
 check for nullness yet.
I'll admit it's much less frequent than simply wanting to know if there are any items or not.
 Thus disallowing 'if (arr)' seems like the best option IMO. Any other  
 solutions ?
Fix "arr.length = 0;" so it does not set the data ptr to null. Nothing else needs to be done IMO.

Regan
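Whether "arr.length = 0;" nulls the data pointer is easy to observe directly. The sketch below, assuming a current D2 compiler, prints what the toolchain at hand does instead of asserting an outcome, because the behaviour described in this thread belongs to the 2005 compiler and may not match yours. `cleared` is a hypothetical helper name.

```d
import std.stdio;

// Returns an array whose length was set to zero after it held data.
int[] cleared()
{
    int[] arr = [1, 2, 3];
    arr.length = 0;
    return arr;
}

void main()
{
    int[] arr = cleared();

    writeln("length      : ", arr.length);       // 0 by construction
    writeln("ptr is null : ", arr.ptr is null);  // deliberately not asserted
    writeln("arr is null : ", arr is null);      // deliberately not asserted
}
```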
Jun 27 2005
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 On Mon, 27 Jun 2005 17:44:24 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 
 Regan Heath wrote:
 I haven't had a need to differentiate between an empty and a null 
 array,  so for me 'if (arr)' should be true when either the pointer is 
 null or  the length is == 0.
You mean false?
Sorry, my bug.
 Thus if (arr) checking ptr makes little sense IMO, but for you it's 
 the  desired behaviour.
"if (arr)" does not check the ptr, it checks the reference, comparing it to null/0 just like it does for _every other type_ in D. It just happens that for arrays a null reference cannot exist, instead you get a reference to a struct with a null data pointer, thus the confusion.
And it happens to make little sense in most cases. Simple operations should be simple. 99% of the time I want to check if an array is empty. if (foo) then means 'does foo have valid data ? / does foo make any sense ?'.

    if (obj)  <-- obj reference is not null ?
    if (arr)  <-- array is not empty ?

That's the test I mean by default. Instead, by default I'm getting something that can only be a source of bugs :/

But it seems that I'm alone with my view on the subject, so I'll just shut up and go play with my toys.
 Thus I suggested disallowing implicit conversions from arrays to 
 bool's  in order to make sure the programmer specifies exactly he/she 
 wants.
I dont think arrays have a stronger argument for this than any other type in D.
No ? Arrays are the only type in D that can be asked for their members (ptr, length) when they are null (when in your reasoning, the nullness of the ptr member means nullness of the array reference). This doesn't happen to be true with classes nor structs /+ excluding static members +/
 Some people will quickly grasp the difference between nullness and  
 emptiness of arrays. But others will find their code buggy due to the  
 implicit conversion.
People simply need to learn that an array is not a reference, not a struct, but a unique type with properties of both. It's a reference to a struct and cannot be null. I dont think the implicit conversion is the cause of confusion, I think the nature of arrays is, once that is grasped confusion vanishes.
Confusion vanishes, bugs don't. Once one understands the nature of the for loop, they won't write code like:

    for (....);
    {
    }

or will they ? /+ yes, I know that dmd reports an error here +/

-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
Jun 27 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Mon, 27 Jun 2005 23:37:46 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Thus if (arr) checking ptr makes little sense IMO, but for you it's  
 the  desired behaviour.
"if (arr)" does not check the ptr, it checks the reference, comparing it to null/0 just like it does for _every other type_ in D. It just happens that for arrays a null reference cannot exist, instead you get a reference to a struct with a null data pointer, thus the confusion.
And it happens to make little sense in most cases.
The rule is simple: "if (x)" compares x with null/0. In the case of an array you're comparing the array *reference*, just like any other reference. You don't actually need to know any more than that.

Of course, arrays are not just references, they cannot be null; this fact is not important till you write "if (x.length == 0)" without checking for null, in which case bug avoided! :)

If you don't think "if (x)" is explicit enough, don't use it; use "if (x is null)" or "if (x !is null)" or "if (x.length == 0)". In fact, I'd recommend that if you're coding with more than just yourself.
 Simple operations should be simple. 99% of time I want to check if an  
 array is empty.
And it's simple: "if (x.length == 0)". Arrays cannot be null so this is safe, simple, explicit, obvious, perfect (that last one might be an exaggeration)
 if (foo) then means 'does foo have valid data ? / does foo make any  
 sense ?'.
Nope, "is foo null or 0"
 if (obj)  <-- obj reference is not null ?
Yep, "is obj null or 0"
 if (arr)  <-- array is not empty ?
Nope, "is arr null or 0" (arr being like a reference here)

Notice the pattern?

In short "if (i)" == is 'i' null or 0
What type is 'i'? Who cares! It doesn't actually matter; the same rule applies to all types (except a struct/union, for which this is illegal)
 That's the test I mean by default. Instead by default I'm getting a  
 something that can only be a source of bugs :/
Anything you do not understand (not you specifically, people in general) will be a source of bugs.
 But it seems that I'm alone with my view on the subject, so I'll just  
 shut up and go play with my toys.
You might be alone, I doubt it tho, every crazy view has at least one other supporter ;) (joke!)

Argument is a good thing so long as both parties listen and attempt to understand the other point of view. If you're not doing that, you're not arguing, you're simply saying things at someone. (again, not you specifically, people in general)
 Thus I suggested disallowing implicit conversions from arrays to  
 bool's  in order to make sure the programmer specifies exactly he/she  
 wants.
I dont think arrays have a stronger argument for this than any other type in D.
No ? Arrays are the only type in D that can be asked for their members (ptr, length) when they are null
Sure.
 (when in your reasoning, the nullness of the ptr member means nullness  
 of the array reference).
Or rather is equivalent to. A null data ptr means the array reference would be null, only it cannot be.
 This doesn't happen to be true with classes nor structs /+ excluding  
 static members +/
Sure, so what? I mean yes, they're different, but why shouldn't you be able to say "if (arr)" and check the nullness of the reference. That's what it does, effectively, even if it actually uses the data ptr and the reference itself is not null.
 Some people will quickly grasp the difference between nullness and   
 emptiness of arrays. But others will find their code buggy due to the   
 implicit conversion.
People simply need to learn that an array is not a reference, not a struct, but a unique type with properties of both. It's a reference to a struct and cannot be null. I dont think the implicit conversion is the cause of confusion, I think the nature of arrays is, once that is grasped confusion vanishes.
Confusion vanishes, bugs dont. Once one understands the nature of the for loop, they won't write code like: for (....); { } or will they ? /+ yes, I know that dmd reports an error here +/
Sure, there will always be human error, nothing you do can remove that (except to remove humans...). DMD tries to lessen it (as in the case shown above).

I don't think making if (x) will lessen human error, and it'll certainly make arrays inconsistent with all other types in D. Perhaps it's my C background but if (x) has always made sense to me.

Regan
Jun 27 2005
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 If you don't think "if (x)" is explicit enough, don't use it, use "if 
 (x  is null)" or "if (x !is null)" or "if (x.length == 0)". In fact, 
 I'd  recommend that if you're coding with more than just yourself.
I will. But that is just a convention. Just like doing 'if (const == var)' instead of 'if (var == const)' in C/C++. That doesn't prevent new coders that do not use this convention from committing the same error over and over.
 Nope, "is foo null or 0"
 
 if (obj)  <-- obj reference is not null ?
Yep, "is obj null or 0"
 if (arr)  <-- array is not empty ?
Nope, "is arr null or 0" (arr being like a reference here)
Yes ! Exactly. "is arr null" <=> "is arr.ptr null" ; "or 0" <=> "or arr.length == 0". I would really like 'if (arr)' to work that way.
 Notice the pattern?
 
 In short "if (i)" == is 'i' null or 0
 What type is 'i'? Who cares! it doesn't actually matter the same rule  
 applies to all types (except a struct/union for which this is illegal)
Ok... but from what you're saying it's apparent to me that: 'if (arr)' should mean 'if (arr.ptr !is null && arr.length > 0)'. Then it would be consistent.
 But it seems that I'm alone with my view on the subject, so I'll just  
 shut up and go play with my toys.
You might be alone, I doubt it tho, every crazy view has at least one other supporter ;) (joke!) Argument is a good thing so long as both parties listen and attempt to understand the other point of view. If you're not doing that, you're not arguing, you're simply saying things at someone. (again, not you specifically, people in general)
You're right, I'm just surprised that I don't hear any other opinions on this matter than yours (which I appreciate of course). Maybe everyone's too busy. Wait... I should be studying for my exams <lol>
 I dont think making if (x) will lessen human error, and it'll certainly  
 make arrays inconsistent with all other types in D. Perhaps it's my C  
 background but if (x) has always made sense to me.
Ok, what about the other approach, 'if (x)' checking both the nullness and emptiness at once ?

-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
Jun 27 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Tue, 28 Jun 2005 01:32:05 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 If you don't think "if (x)" is explicit enough, don't use it, use "if  
 (x  is null)" or "if (x !is null)" or "if (x.length == 0)". In fact,  
 I'd  recommend that if you're coding with more than just yourself.
I will. But that is just a convention. Just like doing 'if (const == var)' instead of 'if (var == const)' in C/C++.
Yep.
 That doesn't prevent new coders that do not use this convention from  
 committing the same error over and over.
You're right. It wasn't intended to help new coders write code, but rather to help any coder reading another coder's code. (that's a tongue twister!)
 Nope, "is foo null or 0"

 if (obj)  <-- obj reference is not null ?
Yep, "is obj null or 0"
 if (arr)  <-- array is not empty ?
Nope, "is arr null or 0" (arr being like a reference here)
Yes ! Exactly. "is arr null" <=> "is arr.ptr null" ; "or 0" <=> "or arr.length == 0". I would really like 'if (arr)' to work that way.
In D as in C null *is* 0. C doesn't make a distinction. D is the same (to a more limited degree perhaps?).

When you compare any variable 'x' using 'if', i.e. "if (x)", you're asking whether the variable 'x' is null/0; you're not comparing any property or member of 'x' with null/0. To do so would be a hidden implicit (it's not explicitly naming the member) and very unintuitive operation IMO.

Before you note that in the case of an array it's comparing the data ptr, I'll remind you that this occurs because in all cases where an array reference would be null/0, but cannot be, the data pointer *is* null/0. Compare that to the length property, which can be 0 when the reference would not be null/0, i.e.

    char[] arr = "";

In other words the data pointer *is* the reference for purposes of determining whether the reference is null/0.
 Notice the pattern?
  In short "if (i)" == is 'i' null or 0
 What type is 'i'? Who cares! it doesn't actually matter the same rule   
 applies to all types (except a struct/union for which this is illegal)
Ok... but from what you're saying it's apparent to me that: 'if (arr)' should mean 'if (arr.ptr !is null && arr.length > 0)'. Then it would be consistent.
No, to be consistent "if (x)" must always compare the variable "x" with null/0, not implicitly compare one or more of its members to null/0 or something else.

If we don't care about consistency then this change would mean that:

    char[] emp = "";
    char[] nul = null;

    if (emp) ; //false
    if (nul) ; //false

Meaning, basically, an array with no items is not ever 'true'. This might be nice as it's the most common case, but it's still inconsistent. I think I prefer consistency here.
 But it seems that I'm alone with my view on the subject, so I'll just   
 shut up and go play with my toys.
You might be alone, I doubt it tho, every crazy view has at least one other supporter ;) (joke!) Argument is a good thing so long as both parties listen and attempt to understand the other point of view. If you're not doing that, you're not arguing, you're simply saying things at someone. (again, not you specifically, people in general)
You're right, I'm just surprised that I don't hear any other opinions on this matter than yours (which I appreciate of course). Maybe everyone's too busy. Wait... I should be studying for my exams <lol>
It might be a timezone thing, I notice more posts here late night (for me). I've also noticed less posts over the last few months (exams maybe?). Worry not! I will always share my opinion ;)
 I don't think making if (x) will lessen human error, and it'll  
 certainly  make arrays inconsistent with all other types in D. Perhaps  
 it's my C  background but if (x) has always made sense to me.
Ok, what about the other approach, 'if (x)' checking both the nullness and emptiness at once?
So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work, because neither of these cases is possible:

 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0

but, if neither of those cases is possible then you can simply choose and check one or the other for null/0, as it currently does by checking ptr. I think ptr is a better choice than length for reasons mentioned above.

Regan
Jun 27 2005
parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 On Tue, 28 Jun 2005 01:32:05 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 If we don't care about consistency then this change would mean that:
   char[] emp = "";
   char[] nul = null;
 
   if (emp) ; //false
   if (nul) ; //false
 
 Meaning, basically, an array with no items is not ever 'true'. This 
 might  be nice as it's the most common case, but it's still 
 inconsistent. I think  I prefer consistency here.
Ok... but I am still stubborn and think both of these arrays are 'false', as they don't make much sense while printing :) But I agree on your point about consistency. I'll just have to set a trap in the 'if' expression code writing routine in my neural networks to signal me that I should be checking 'arr.length' instead of 'arr'. And I *hope* it will work.
 It might be a timezone thing, I notice more posts here late night (for  
 me). I've also noticed less posts over the last few months (exams 
 maybe?).  Worry not! I will always share my opinion ;)
Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But probably after tomorrow's exam +/
 Ok, what about the other approach, 'if (x)' checking both the nullness  
 and emptiness at once ?
So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work, because neither of these cases is possible:
// you probably meant "if (!x)"
 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
 
 but, if neither of those cases is possible then you can simply choose 
 and  check one or the other for null/0 as it currently does, checking ptr.
A) is possible:

  char[] foo = "";

which means I have to check foo.length instead of foo.ptr to see if it contains any useful data. In this case it contains an out-of-bounds '\0'.

--
Tomasz Stachowiak /+ a.k.a. h3r3tic +/
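Case A) and the out-of-bounds '\0' can be checked directly. The sketch below assumes a current D2 compiler (so the literal is a `string`) and relies on string literals being stored with a trailing zero for C compatibility. The helper `literalIsTerminated` is hypothetical and is only safe to call on string literals, because it deliberately reads one element past the end of the slice.

```d
import std.stdio : writeln;

// Hypothetical helper: only valid for string literals, since it reads
// the element just past the slice bounds through the raw pointer.
bool literalIsTerminated(string lit)
{
    return lit.ptr !is null && lit.ptr[lit.length] == '\0';
}

void main()
{
    string foo = "";

    writeln(foo.length);               // 0
    writeln(foo.ptr !is null);         // true: case A) from above
    writeln(literalIsTerminated(foo)); // true: the '\0' sits past the end
}
```

So the pointer of an empty literal has something to point at, even though the slice itself covers no elements.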
Jun 28 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 28 Jun 2005 10:42:13 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 On Tue, 28 Jun 2005 01:32:05 +0200, Tom S   
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 If we don't care about consistency then this change would mean that:
   char[] emp = "";
   char[] nul = null;
    if (emp) ; //false
   if (nul) ; //false
  Meaning, basically, an array with no items is not ever 'true'. This  
 might  be nice as it's the most common case, but it's still  
 inconsistent. I think  I prefer consistency here.
Ok... but I am still stubborn and think both of these arrays are 'false', as they don't make much sense while printing :)
Yeah, or any other time you're only interested in the contents of the array and not whether it exists or not.
 But I agree on your point about consistency. I'll just have to set a  
 trap in the 'if' expression code writing routine in my neural networks  
 to signal me that I should be checking 'arr.length' instead of 'arr'.  
 And I *hope* it will work.
At least it won't ever segv :)
 It might be a timezone thing, I notice more posts here late night (for   
 me). I've also noticed less posts over the last few months (exams  
 maybe?).  Worry not! I will always share my opinion ;)
Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But probably after tomorrow's exam +/
Good luck.
 Ok, what about the other approach, 'if (x)' checking both the nullness   
 and emptiness at once ?
So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work, because neither of these cases is possible:
// you probably meant "if (!x)"
Yeah, thanks.
 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply choose  
 and  check one or the other for null/0 as it currently does, checking  
 ptr.
A) is possible. char[] foo = ""
Doh! True.
 which means I have to check for foo.length instead of foo.ptr to see if  
 it contains any useful data. In this case it contains an out-of-bounds  
 '\0'.
Yeah, that's a whole other issue! (damn compatibility!)

Regan
Jun 28 2005
parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:

 It might be a timezone thing, I notice more posts here late night 
 (for   me). I've also noticed less posts over the last few months 
 (exams  maybe?).  Worry not! I will always share my opinion ;)
Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But probably after tomorrow's exam +/
Good luck.
Thanks :)
 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply 
 choose  and  check one or the other for null/0 as it currently does, 
 checking  ptr.
A) is possible. char[] foo = ""
Doh! True.
 which means I have to check for foo.length instead of foo.ptr to see 
 if  it contains any useful data. In this case it contains an 
 out-of-bounds  '\0'.
Yeah, that's a whole other issue! (damn compatibility!)
/+ I assume it was irony +/

In this case it *is* /damn/ compatibility cuz I have to check for 'if (str.length)'. Checking 'if (str)' works most of the time just like the former, but then it fails on empty strings.

--
Tomasz Stachowiak /+ a.k.a. h3r3tic +/
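One way to avoid the 'if (str)' trap described here is to route every "is there anything to print?" check through a single length-based helper. This is only a sketch of that habit, assuming a current D2 compiler; `hasContent` is a made-up name, not something from Phobos.

```d
import std.stdio : writeln;

// Hypothetical helper: true only when there is at least one element,
// no matter whether the data pointer happens to be null.
bool hasContent(const(char)[] str)
{
    return str.length > 0;
}

void main()
{
    string nul = null;
    string emp = "";
    string txt = "abc";

    writeln(hasContent(nul));         // false
    writeln(hasContent(emp));         // false: non-null pointer, no items
    writeln(hasContent(txt[0 .. 0])); // false: empty slice of real data
    writeln(hasContent(txt));         // true
}
```

The helper gives the same answer for all three flavours of "empty", which is the behaviour wanted when only the contents matter.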
Jun 28 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 28 Jun 2005 11:07:01 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply  
 choose  and  check one or the other for null/0 as it currently does,  
 checking  ptr.
A) is possible. char[] foo = ""
Doh! True.
 which means I have to check for foo.length instead of foo.ptr to see  
 if  it contains any useful data. In this case it contains an  
 out-of-bounds  '\0'.
Yeah, that's a whole other issue! (damn compatibility!)
/+ I assume it was irony +/ In this case it *is* /damn/ compatibility cuz I have to check for 'if (str.length)'. Checking 'if (str)' works most of the time just like the former but then it fails on empty strings.
Actually I was talking about C compatibility this time. IIRC hard coded/static strings have a \0 on the end to be compatible with C strings, which must end in \0.

Regardless, you can create an empty array in other ways too, e.g.

  char[] tmp = "testing123";
  char[] emp = tmp[0..0];

Regan
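The slicing example can be run as follows to see that the empty slice keeps a non-null pointer. This is a sketch assuming a current D2 compiler, which is why the literal is copied with `.dup` to get a mutable `char[]`; the helper `emptySliceOf` is a hypothetical name used only here.

```d
import std.stdio : writeln;

// Hypothetical helper: returns a zero-length slice that still points
// at the start of the original data.
char[] emptySliceOf(char[] a)
{
    return a[0 .. 0];
}

void main()
{
    char[] tmp = "testing123".dup; // literals are immutable in D2, so copy
    char[] emp = emptySliceOf(tmp);

    writeln(emp.length);         // 0
    writeln(emp.ptr is tmp.ptr); // true: same starting address as tmp
    writeln(emp.ptr is null);    // false: empty, but not null
}
```

So an empty array with a non-null pointer is not specific to string literals; any zero-length slice of existing data behaves the same way.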
Jun 28 2005
parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 Actually I was talking about C compatibility this time.
 
 IIRC hard coded/static strings have a \0 on the end to be compatible 
 with  C strings which must end in \0.
Yup I know. It happens that I've also got a C/C++ background...
 Regardless you can create an empty array in other ways too eg.
 
 char[] tmp = "testing123";
 char[] emp = tmp[0..0];
I see. Funnily, setting emp.length to 0 will null the pointer although length already is 0. A nice example of why you want it changed ;)

--
Tomasz Stachowiak /+ a.k.a. h3r3tic +/
Jun 28 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 28 Jun 2005 14:12:33 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 Actually I was talking about C compatibility this time.
  IIRC hard coded/static strings have a \0 on the end to be compatible  
 with  C strings which must end in \0.
Yup I know. It happens that I've also got a C/C++ background...
:)
 Regardless you can create an empty array in other ways too eg.
  char[] tmp = "testing123";
 char[] emp = tmp[0..0];
I see. Funnily, setting emp.length to 0 will null the pointer although length already is 0. A nice example of why you want it changed ;)
Yep, that is exactly the problem I want fixed.

Regan
Jun 28 2005