digitalmars.D - Empty String == Empty Array == Empty Static Array?

AJG (32/32) Jun 24 2005 Hi there,

Stefan (7/39) Jun 25 2005 Have a look at the digitalmars.D.bugs NG.
Regan Heath (39/70) Jun 26 2005 I believe so. When you compare an array "reference" to null you end up

AJG (25/104) Jun 26 2005 Hi Regan,

Regan Heath (16/45) Jun 27 2005 Yep.

Tom S (9/20) Jun 27 2005 So if we can't have it, I'd strongly suggest making 'if (arr)' illegal.

Regan Heath (25/44) Jun 27 2005 No thanks.

Tom S (21/59) Jun 27 2005 There is some confusion at the moment. Granted that more people will

Regan Heath (19/38) Jun 27 2005 You mean false?

Tom S (23/58) Jun 27 2005 And it happens to make little sense in most cases. Simple operations

Regan Heath (43/85) Jun 27 2005 The rule is simple "if (x)" compares x with null/0.

Tom S (17/50) Jun 27 2005 I will. But that is just a convention. Just like doing 'if (const ==

Regan Heath (41/81) Jun 27 2005 Yep.

Tom S (15/41) Jun 28 2005 Ok... but I still am stubborn and think both of these arrays are 'false'...

Regan Heath (10/47) Jun 28 2005 Yeah, or any other time you're only interested in the contents of the

Tom S (7/37) Jun 28 2005 /+ I assume it was irony +/ In this case it *is* /damn/ compatibility

Regan Heath (9/28) Jun 28 2005 Actually I was talking about C compatibility this time.

Tom S (6/14) Jun 28 2005 I see. Funnily, setting emp.length to 0 will null the pointer although

Regan Heath (5/15) Jun 28 2005 :)

AJG <AJG_member pathlink.com> writes:

Hi there,

I was under the impression that strings (char[]) were strictly compatible with
and equivalent to other D arrays. However, the empty string seems to throw a
monkey wrench to this idea. In addition, static arrays also mess things up.

// Here we go:
char[]  nul = null;
char[]  str = "";
char[]  dyn;
char[0] sta;
char[]  ini = new char[0];

printf("%d", nul.length);
printf("%d", str.length);
printf("%d", dyn.length);
printf("%d", sta.length);
printf("%d", ini.length);
// The above all print 0, so far so good.

printf(nul ? "true" : "false");
printf(str ? "true" : "false");
printf(dyn ? "true" : "false");
printf(sta ? "true" : "false");
printf(ini ? "true" : "false");
// Here we get false, true, false, true, false.

What's up with this? Why are empty strings (which are empty even by .length
accounts) and static arrays different than other empty arrays?

I have a feeling this has got to do with the internal array pointer, but I'm not
sure. At any rate, it's not very semantically intuitive. Is this done on
purpose, or can we look forward to some convergence?

Cheers,
--AJG.

PS: What exactly does passing NULL to an array parameter do?

================================
2B || !2B, that is the question.

Jun 24 2005

Stefan <Stefan_member pathlink.com> writes:

Have a look at the digitalmars.D.bugs NG.
There was recently an extensive thread that discussed this whole
(non-) issue:

http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D.bugs/4301

Kind regards,
Stefan




Jun 25 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Sat, 25 Jun 2005 04:57:38 +0000 (UTC), AJG <AJG_member pathlink.com>  
wrote:
 I was under the impression that strings (char[]) were strictly  
 compatible with
 and equivalent to other D arrays. However, the empty string seems to  
 throw a
 monkey wrench to this idea. In addition, static arrays also mess things  
 up.

 // Here we go:
 char[]  nul = null;
 char[]  str = "";
 char[]  dyn;
 char[0] sta;
 char[]  ini = new char[0];

 printf("%d", nul.length);
 printf("%d", str.length);
 printf("%d", dyn.length);
 printf("%d", sta.length);
 printf("%d", ini.length);
 // The above all print 0, so far so good.

 printf(nul ? "true" : "false");
 printf(str ? "true" : "false");
 printf(dyn ? "true" : "false");
 printf(sta ? "true" : "false");
 printf(ini ? "true" : "false");
 // Here we get false, true, false, true, false.

 What's up with this? Why are empty strings (which are empty even by  
 .length accounts) and static arrays different than other empty arrays?

 I have a feeling this has got to do with the internal array pointer, but  
 I'm not sure.

I believe so. When you compare an array "reference" to null you end up  
comparing the array data pointer to null. An array is essentially a struct  
in the form:

struct array {
   int length;
   void* data;
}

(perhaps not using void, but instead the actual data type?)

Note, a static array is special. Its data member cannot be null and its  
length property is actually macro-replaced upon compilation, e.g.

char[5] sta;
char[] dyn;

void main()
{
	int* p = &sta.length;
	int* q = &dyn.length;
}

sta.d(1): constant 5 is not an lvalue

sta.d(7): dyn.length is not an lvalue
sta.d(7): cannot implicitly convert expression (#dyn.length) of type uint*  
to int*

Errors 1 & 2 show 'sta.length' replaced by '5'.
Errors 3 & 4 are due to .length being a property/getter method call, not  
an int.
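
A minimal sketch of the same distinction, assuming present-day D2 syntax rather than the 2005 compiler used in this thread. The name `staLen` is made up for illustration. It only shows that a static array's length can be used where a compile-time constant is required, while a dynamic array's length is a run-time value that can be assigned:

```d
import std.stdio : writeln;

char[5] sta;   // static array: the length is part of the type
char[] dyn;    // dynamic array: the length is stored at run time

// Both lines are evaluated by the compiler, not at run time.
static assert(sta.length == 5);
enum staLen = sta.length;   // hypothetical name, for illustration only

void main()
{
    writeln(staLen);       // 5
    writeln(dyn.length);   // 0, dyn starts out as a null array
    dyn.length = 3;        // allowed: the dynamic length is assignable
    writeln(dyn.length);   // 3
}
```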

I suspect also that the reference refers directly to the data and not to  
an array struct (shown above).

 At any rate, it's not very semantically intuitive. Is this done on
 purpose, or can we look forward to some convergence?

It's a matter of asking for what you really want to know. For example...

   if (arr is null) ;//array never assigned, i.e. non-existent.
   else if (arr.length == 0) ;//array exists, no data present.
   else ;//array exists, has some data.

In many cases "arr.length == 0" is all you care about, in some "arr is  
null" might be important.

 PS: What exactly does passing NULL to an array parameter do?

Creates an array (struct shown above) with data set to null and length set  
to 0. This is why .length always 'works' and never gives a segmentation  
fault for a 'null' array.
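
A small sketch of the three-way check applied to a parameter, assuming present-day D2 syntax (the thread predates it). `classify` is a hypothetical helper, and `const(char)[]` is used only because string literals are immutable in current D:

```d
import std.stdio : writeln;

// Hypothetical helper: reports which of the three states an array is in.
string classify(const(char)[] arr)
{
    if (arr is null)
        return "null";        // never assigned: null pointer, length 0
    else if (arr.length == 0)
        return "empty";       // non-null pointer, but no elements
    else
        return "has data";
}

void main()
{
    const(char)[] nul = null;
    writeln(nul.length);        // 0, reading .length of a null array is safe
    writeln(classify(nul));     // null
    writeln(classify(""));      // empty
    writeln(classify("abc"));   // has data
}
```

Passing `null` to `classify` does not crash because the callee receives a pointer/length pair, not a pointer to one.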

Regan

Jun 26 2005

AJG <AJG_member pathlink.com> writes:

Hi Regan,

Thanks a ton for the post! This actually cleared everything up for me. At least
now I know what to expect from the language regarding arrays.

 PS: What exactly does passing NULL to an array parameter do?

Creates an array (struct shown above) with data set to null and length set  
to 0. This is why .length always 'works' and never gives a segmentation  
fault for a 'null' array.

I guess this is a good thing since it prevents functions expecting only an array
(even if empty) from segfaulting on a null. However, it also reduces the
possibilities for the programmer. Since passing null means passing "an empty
array," you can't differentiate between them (safely). Right?

In addition, this is somewhat inconsistent with the way classes and objects are
treated. For a class, you can pass a null, and the function will receive that
null (not an "empty" class, whatever that would be) and it will segfault unless
you guard against it. This is why I was originally confused with the behaviour
of arrays.

Finally, a small suggestion. What if when doing something like:

if (someArray) // Do stuff.

Meant implicitly checking the .length property automagically (instead of
checking the internal pointer)? The former seems more useful to me, since the
concept of a null array no longer exists; it's also universal across all array
types. The latter, on the other hand, seems kind of implementation-ish and
hackish, IMHO.

Anyway, thanks again for the help.
Cheers,
--AJG.




Jun 26 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>  
wrote:
 Hi Regan,

 Thanks a ton for the post! This actually cleared everything up for me.  
 At least now I know what to expect from the language regarding arrays.

 PS: What exactly does passing NULL to an array parameter do?

 Creates an array (struct shown above) with data set to null and length  
 set
 to 0. This is why .length always 'works' and never gives a segmentation
 fault for a 'null' array.

 I guess this is a good thing since it prevents functions expecting only  
 an array (even if empty) from segfaulting on a null.

Yep.

 However, it also reduces the
 possibilities for the programmer. Since passing null means passing "an  
 empty array," you can't differentiate between them (safely). Right?

Actually you can still differentiate between a null array and an empty  
array, eg.

if (arr is null) ;//non-existent
else if (arr.length == 0) ;//empty
else ;//has items

However don't set length to 0 or you'll turn an empty array into a null  
array. I believe this behaviour is broken.

 In addition, this is somewhat inconsistent with the way classes and  
 objects are
 treated. For a class, you can pass a null, and the function will receive  
 that
 null (not an "empty" class, whatever that would be) and it will segfault  
 unless
 you guard against it. This is why I was originally confused with the  
 behaviour
 of arrays.

True. Were it up to me...

 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead of
 checking the internal pointer)?

That has been suggested. IMO it breaks consistency with all other types  
where if (x) compares x with null.

 The former seems more useful to me, since the concept of a null array no  
 longer exists;

But it does, it's just not obvious (maybe that's a good thing?). In general  
cases you do not care about null, only whether (arr.length == 0).

Regan

Jun 27 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>  
 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead of
 checking the internal pointer)?

 
 
 That has been suggested. IMO it breaks consistency with all other types  
 where if (x) compares x with null.

So if we can't have it, I'd strongly suggest making 'if (arr)' illegal. 
  Otherwise massive confusion will follow. Currently arrays are treated 
both as classes and structs at the same time. We can do if(arr), 
checking null-ness like with classes but passing arrays to functions 
works as if they were structs - thus we have to use inout to modify ptr 
or length. Where's the consistency ??? :(



-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 27 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Mon, 27 Jun 2005 12:31:32 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 On Mon, 27 Jun 2005 05:09:02 +0000 (UTC), AJG <AJG_member pathlink.com>
 Finally, a small suggestion. What if when doing something like:

 if (someArray) // Do stuff.

 Meant implicitly checking the .length property automagically (instead  
 of
 checking the internal pointer)?

   That has been suggested. IMO it breaks consistency with all other  
 types  where if (x) compares x with null.

 So if we can't have it, I'd strongly suggest making 'if (arr)' illegal.

No thanks.

   Otherwise massive confusion will follow.

Somewhat prophetic?

 Currently arrays are treated both as classes and structs at the same  
 time.

That is because they share properties of both but are in fact neither.  
They are 'arrays'.

 We can do if(arr), checking null-ness like with classes but passing  
 arrays to functions works as if they were structs

The specific nature of arrays makes slicing possible. Slicing is  
incredibly powerful (I think you'll agree). When you slice you create a  
new reference to a new 'struct' where the data pointer refers to the start  
of the data and the length is set accordingly.

When you pass an array as an 'in' parameter you effectively slice the  
entire array, you basically duplicate the 'struct'. Sure, the default  
could have been to pass the reference, but then you lose the choice to  
pass as it currently does. You still have the choice of passing as a  
reference, simply use 'inout'. It's all about choice.
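
A rough sketch of that choice, assuming present-day D2, where the parameter storage class this thread calls 'inout' is spelled `ref`. The function names `byValue` and `byRef` are made up for illustration:

```d
import std.stdio : writeln;

// Receives a copy of the (pointer, length) pair; the elements are shared.
void byValue(int[] arr)
{
    arr[0] = 99;         // visible to the caller: same underlying data
    arr = arr[0 .. 1];   // not visible: only the local copy is re-sliced
}

// Receives the caller's own (pointer, length) pair.
void byRef(ref int[] arr)
{
    arr = arr[0 .. 1];   // visible: the caller's slice is shortened
}

void main()
{
    int[] a = [1, 2, 3];
    byValue(a);
    writeln(a);   // [99, 2, 3]
    byRef(a);
    writeln(a);   // [99]
}
```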

"if (arr)" compares the data pointer, but, it is actually comparing the  
reference at the same time as a null reference has a null data pointer.  
Making it silently compare the length would be confusing IMO.

The only problem I have with arrays is with "arr.length = 0;" setting the  
data pointer to null, this turns an existing/empty array into a  
null/non-existent array.

 - thus we have to use inout to modify ptr or length. Where's the  
 consistency ??? :(

An array is not a class.
An array is not a struct.
An array is a unique type that has properties of both classes and structs.

Regan

Jun 27 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 On Mon, 27 Jun 2005 12:31:32 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 
 Regan Heath wrote:
   Otherwise massive confusion will follow.

 
 
 Somewhat prophetic?

There is some confusion at the moment. Granted that more people will 
find interest in D, I think I could be a prophet by profession.


 Currently arrays are treated both as classes and structs at the same  
 time.

 
 
 That is because they share properties of both but are in fact neither.  
 They are 'arrays'.

Yup. I agree with that.


 We can do if(arr), checking null-ness like with classes but passing  
 arrays to functions works as if they were structs

 
 
 The specific nature of arrays makes slicing possible. Slicing is  
 incredibly powerful (I think you'll agree). When you slice you create a  
 new reference to a new 'struct' where the data pointer refers to the 
 start  of the data and the length is set accordingly.
 
 When you pass an array as an 'in' parameter you effectively slice the  
 entire array, you basically duplicate the 'struct'. Sure, the default  
 could have been to pass the reference, but then you lose the choice to  
 pass as it currently does. You still have the choice of passing as a  
 reference, simply use 'inout'. It's all about choice.

That is the correct behaviour IMO and I got used to it a long time ago.


 The only problem I have with arrays is with "arr.length = 0;" setting 
 the  data pointer to null, this turns an existing/empty array into a  
 null/non-existent array.

I haven't had a need to differentiate between an empty and a null array, 
so for me 'if (arr)' should be true when either the pointer is null or 
the length is == 0. Thus if (arr) checking ptr makes little sense IMO, 
but for you it's the desired behaviour.
Thus I suggested disallowing implicit conversions from arrays to bools 
in order to make sure the programmer specifies exactly what he/she wants.
Some people will quickly grasp the difference between nullness and 
emptiness of arrays. But others will find their code buggy due to the 
implicit conversion.
How often do you want to check the nullness of an array in your code 
compared to checking its emptiness ? I haven't had a single reason to 
check for nullness yet. Thus disallowing 'if (arr)' seems like the best 
option IMO. Any other solutions ?


 - thus we have to use inout to modify ptr or length. Where's the  
 consistency ??? :(

 
 
 An array is not a class.
 An array is not a struct.
 An array is a unique type that has properties of both classes and structs.

Agreed


-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 27 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Mon, 27 Jun 2005 17:44:24 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 The only problem I have with arrays is with "arr.length = 0;" setting  
 the  data pointer to null, this turns an existing/empty array into a   
 null/non-existent array.

 I haven't had a need to differentiate between an empty and a null array,  
 so for me 'if (arr)' should be true when either the pointer is null or  
 the length is == 0.

You mean false?

 Thus if (arr) checking ptr makes little sense IMO, but for you it's the  
 desired behaviour.

"if (arr)" does not check the ptr, it checks the reference, comparing it  
to null/0 just like it does for _every other type_ in D. It just happens  
that for arrays a null reference cannot exist, instead you get a reference  
to a struct with a null data pointer, thus the confusion.

 Thus I suggested disallowing implicit conversions from arrays to bools  
 in order to make sure the programmer specifies exactly what he/she wants.

I don't think arrays have a stronger argument for this than any other type  
in D.

 Some people will quickly grasp the difference between nullness and  
 emptiness of arrays. But others will find their code buggy due to the  
 implicit conversion.

People simply need to learn that an array is not a reference, not a  
struct, but a unique type with properties of both. It's a reference to a  
struct and cannot be null. I don't think the implicit conversion is the  
cause of confusion, I think the nature of arrays is, once that is grasped  
confusion vanishes.

 How often do you want to check the nullness of an array in your code  
 compared to checking its emptiness ? I haven't had a single reason to  
 check for nullness yet.

I'll admit it's much less frequent than simply wanting to know if there  
are any items or not.

 Thus disallowing 'if (arr)' seems like the best option IMO. Any other  
 solutions ?

Fix "arr.length = 0;" so it does not set the data ptr to null. Nothing  
else needs to be done IMO.

Regan

Jun 27 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 On Mon, 27 Jun 2005 17:44:24 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 
 Regan Heath wrote:
 I haven't had a need to differentiate between an empty and a null 
 array,  so for me 'if (arr)' should be true when either the pointer is 
 null or  the length is == 0.

 
 
 You mean false?

Sorry, my bug.


 Thus if (arr) checking ptr makes little sense IMO, but for you it's 
 the  desired behaviour.

 
 
 "if (arr)" does not check the ptr, it checks the reference, comparing 
 it  to null/0 just like it does for _every other type_ in D. It just 
 happens  that for arrays a null reference cannot exist, instead you get 
 a reference  to a struct with a null data pointer, thus the confusion.

And it happens to make little sense in most cases. Simple operations 
should be simple. 99% of time I want to check if an array is empty. if 
(foo) then means 'does foo have valid data ? / does foo make any sense ?'.
if (obj)  <-- obj reference is not null ?
if (arr)  <-- array is not empty ?
That's the test I mean by default. Instead by default I'm getting 
something that can only be a source of bugs :/
But it seems that I'm alone with my view on the subject, so I'll just 
shut up and go play with my toys.



 Thus I suggested disallowing implicit conversions from arrays to 
 bools  in order to make sure the programmer specifies exactly what he/she 
 wants.

 
 
 I don't think arrays have a stronger argument for this than any other 
 type  in D.

No ? Arrays are the only type in D that can be asked for their members 
(ptr, length) when they are null (when in your reasoning, the nullness 
of the ptr member means nullness of the array reference). This doesn't 
happen to be true with classes nor structs /+ excluding static members +/


 Some people will quickly grasp the difference between nullness and  
 emptiness of arrays. But others will find their code buggy due to the  
 implicit conversion.

 
 
 People simply need to learn that an array is not a reference, not a  
 struct, but a unique type with properties of both. It's a reference to 
 a  struct and cannot be null. I don't think the implicit conversion is 
 the  cause of confusion, I think the nature of arrays is, once that is 
 grasped  confusion vanishes.

Confusion vanishes, bugs don't. Once one understands the nature of the 
for loop, they won't write code like:

for (....);
{
}

or will they ?  /+ yes, I know that dmd reports an error here +/



-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 27 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Mon, 27 Jun 2005 23:37:46 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Thus if (arr) checking ptr makes little sense IMO, but for you it's  
 the  desired behaviour.

   "if (arr)" does not check the ptr, it checks the reference, comparing  
 it  to null/0 just like it does for _every other type_ in D. It just  
 happens  that for arrays a null reference cannot exist, instead you get  
 a reference  to a struct with a null data pointer, thus the confusion.

 And it happens to make little sense in most cases.

The rule is simple "if (x)" compares x with null/0.

In the case of an array you're comparing the array *reference*, just like  
any other reference. You don't actually need to know any more than that.

Of course, arrays are not just references, they cannot be null, this fact  
is not important till you write "if (x.length == 0)" without checking for  
null, in which case bug avoided! :)

If you don't think "if (x)" is explicit enough, don't use it, use "if (x  
is null)" or "if (x !is null)" or "if (x.length == 0)". In fact, I'd  
recommend that if you're coding with more than just yourself.

 Simple operations should be simple. 99% of time I want to check if an  
 array is empty.

And it's simple "if (x.length == 0)". Arrays cannot be null so this is  
safe, simple, explicit, obvious, perfect (that last one might be an  
exaggeration)

 if (foo) then means 'does foo have valid data ? / does foo make any  
 sense ?'.

Nope, "is foo null or 0"

 if (obj)  <-- obj reference is not null ?

Yep, "is obj null or 0"

 if (arr)  <-- array is not empty ?

Nope, "is arr null or 0" (arr being like a reference here)

Notice the pattern?

In short "if (i)" == is 'i' null or 0
What type is 'i'? Who cares! It doesn't actually matter; the same rule  
applies to all types (except a struct/union for which this is illegal)

 That's the test I mean by default. Instead by default I'm getting  
 something that can only be a source of bugs :/

Anything you do not understand (not you specifically, people in general)  
will be a source of bugs.

 But it seems that I'm alone with my view on the subject, so I'll just  
 shut up and go play with my toys.

You might be alone, I doubt it tho, every crazy view has at least one  
other supporter ;) (joke!)

Argument is a good thing so long as both parties listen and attempt to  
understand the other point of view. If you're not doing that, you're not  
arguing, you're simply saying things at someone. (again, not you  
specifically, people in general)

 Thus I suggested disallowing implicit conversions from arrays to  
 bools  in order to make sure the programmer specifies exactly what he/she  
 wants.

   I don't think arrays have a stronger argument for this than any other  
 type  in D.

 No ? Arrays are the only type in D that can be asked for their members  
 (ptr, length) when they are null

Sure.

 (when in your reasoning, the nullness of the ptr member means nullness  
 of the array reference).

Or rather is equivalent to. A null data ptr means the array reference  
would be null, only it cannot be.

 This doesn't happen to be true with classes nor structs /+ excluding  
 static members +/

Sure, so what? I mean yes, they're different, but why shouldn't you be  
able to say "if (arr)" and check the nullness of the reference. That's  
what it does, effectively, even if it actually uses the data ptr and the  
reference itself is not null.

 Some people will quickly grasp the difference between nullness and   
 emptiness of arrays. But others will find their code buggy due to the   
 implicit conversion.

   People simply need to learn that an array is not a reference, not a   
 struct, but a unique type with properties of both. It's a reference to  
 a  struct and cannot be null. I don't think the implicit conversion is  
 the  cause of confusion, I think the nature of arrays is, once that is  
 grasped  confusion vanishes.

 Confusion vanishes, bugs don't. Once one understands the nature of the  
 for loop, they won't write code like:

 for (....);
 {
 }

 or will they ?  /+ yes, I know that dmd reports an error here +/

Sure, there will always be human error, nothing you do can remove that  
(except to remove humans...). DMD tries to lessen it (as in the case shown  
above).

I don't think making if (x) check the length will lessen human error, and it'll certainly  
make arrays inconsistent with all other types in D. Perhaps it's my C  
background but if (x) has always made sense to me.

Regan

Jun 27 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 If you don't think "if (x)" is explicit enough, don't use it, use "if 
 (x  is null)" or "if (x !is null)" or "if (x.length == 0)". In fact, 
 I'd  recommend that if you're coding with more than just yourself.

I will. But that is just a convention. Just like doing 'if (const == 
var)' instead of 'if (var == const)' in C/C++. That doesn't prevent new 
coders that do not use this convention from committing the same error 
over and over.


 Nope, "is foo null or 0"
 
 if (obj)  <-- obj reference is not null ?

 
 
 Yep, "is obj null or 0"
 
 if (arr)  <-- array is not empty ?

 
 
 Nope, "is arr null or 0" (arr being like a reference here)

Yes ! Exactly. "is arr null" <=> "is arr.ptr null" ; "or 0" <=> "or 
arr.length == 0". I would really like 'if (arr)' to work that way.


 Notice the pattern?
 
 In short "if (i)" == is 'i' null or 0
 What type is 'i'? Who cares! it doesn't actually matter the same rule  
 applies to all types (except a struct/union for which this is illegal)

Ok... but from what you're saying it's apparent to me that:
'if (arr)' should mean 'if (arr.ptr !is null && arr.length > 0)'. Then 
it would be consistent.


 But it seems that I'm alone with my view on the subject, so I'll just  
 shut up and go play with my toys.

 
 
 You might be alone, I doubt it tho, every crazy view has at least one  
 other supporter ;) (joke!)
 
 Argument is a good thing so long as both parties listen and attempt to  
 understand the other point of view. If you're not doing that, you're 
 not  arguing, you're simply saying things at someone. (again, not you  
 specifically, people in general)

You're right, I'm just surprised that I don't hear any other opinions on 
this matter than yours (which I appreciate of course). Maybe everyone's 
too busy. Wait... I should be studying for my exams <lol>


 I don't think making if (x) check the length will lessen human error, and it'll certainly  
 make arrays inconsistent with all other types in D. Perhaps it's my C  
 background but if (x) has always made sense to me.

Ok, what about the other approach, 'if (x)' checking both the nullness 
and emptiness at once ?


-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 27 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 28 Jun 2005 01:32:05 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 If you don't think "if (x)" is explicit enough, don't use it, use "if  
 (x  is null)" or "if (x !is null)" or "if (x.length == 0)". In fact,  
 I'd  recommend that if you're coding with more than just yourself.

 I will. But that is just a convention. Just like doing 'if (const ==  
 var)' instead of 'if (var == const)' in C/C++.

Yep.

 That doesn't prevent new coders that do not use this convention from  
 committing the same error over and over.

You're right. It wasn't intended to help new coders write code, but rather  
help any coder reading another coder's code. (that's a tongue twister!)

 Nope, "is foo null or 0"

 if (obj)  <-- obj reference is not null ?

   Yep, "is obj null or 0"

 if (arr)  <-- array is not empty ?

   Nope, "is arr null or 0" (arr being like a reference here)

 Yes ! Exactly. "is arr null" <=> "is arr.ptr null" ; "or 0" <=> "or  
 arr.length == 0". I would really like 'if (arr)' to work that way.

In D as in C null *is* 0. C doesn't make a distinction. D is the same (to  
a more limited degree perhaps?).

When you compare any variable 'x' using 'if' i.e. "if (x)" you're saying  
is the variable 'x' null/0, you're not comparing any property or member of  
'x' with null/0. To do so would be a hidden implicit (it's not explicitly  
naming the member) and very unintuitive operation IMO.

Before you note that in the case of an array it's comparing the data ptr,  
I'll remind you that this occurs because in all cases where an array  
reference would be null/0, but cannot be, the data pointer *is* null/0.

Compare that to the length property which can be 0 when the reference  
would not be null/0. i.e.

char[] arr = "";

In other words, the data pointer *is* the reference for purposes of  
determining whether the reference is null/0.

 Notice the pattern?
  In short "if (i)" == is 'i' null or 0
 What type is 'i'? Who cares! it doesn't actually matter the same rule   
 applies to all types (except a struct/union for which this is illegal)

 Ok... but from what you're saying it's apparent to me that:
 'if (arr)' should mean 'if (arr.ptr !is null && arr.length > 0)'. Then  
 it would be consistent.

No, to be consistent "if (x)" must always compare the variable "x" with  
null/0, not implicitly compare one or more of its members to null/0 or  
something else.
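
A minimal sketch of that rule for the non-array types, assuming a current D2 compiler. `Widget` and `truthy` are hypothetical names used only for illustration; arrays are left out on purpose, since they are the case under debate here:

```d
import std.stdio;

// Hypothetical class, used only to get an object reference.
class Widget {}

// Hypothetical helper: reports how a value behaves in a boolean
// context, i.e. what "if (x)" would decide.
bool truthy(T)(T x)
{
    return x ? true : false;
}

void main()
{
    int five = 5;
    int* nullPtr = null;
    int* livePtr = &five;
    Widget noObj = null;
    Widget obj = new Widget;

    writeln(truthy(0));       // false: integer is 0
    writeln(truthy(five));    // true:  integer is non-zero
    writeln(truthy(nullPtr)); // false: pointer is null
    writeln(truthy(livePtr)); // true:  pointer is non-null
    writeln(truthy(noObj));   // false: reference is null
    writeln(truthy(obj));     // true:  reference is non-null
}
```

In each case the variable itself is compared with null/0, never one of its members.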

If we don't care about consistency then this change would mean that:
   char[] emp = "";
   char[] nul = null;

   if (emp) ; //false
   if (nul) ; //false

Meaning, basically, an array with no items is not ever 'true'. This might  
be nice as it's the most common case, but it's still inconsistent. I think  
I prefer consistency here.

 But it seems that I'm alone with my view on the subject, so I'll just   
 shut up and go play with my toys.

   You might be alone, I doubt it tho, every crazy view has at least  
 one  other supporter ;) (joke!)
  Argument is a good thing so long as both parties listen and attempt  
 to  understand the other point of view. If you're not doing that,  
 you're not  arguing, you're simply saying things at someone. (again,  
 not you  specifically, people in general)

 You're right, I'm just surprised that I don't hear any other opinions on  
 this matter than yours (which I appreciate of course). Maybe everyone's  
 too busy. Wait... I should be studying for my exams <lol>

It might be a timezone thing, I notice more posts here late night (for  
me). I've also noticed less posts over the last few months (exams maybe?).  
Worry not! I will always share my opinion ;)

 I don't think making if (x) will lessen human error, and it'll  
 certainly  make arrays inconsistent with all other types in D. Perhaps  
 it's my C  background but if (x) has always made sense to me.

 Ok, what about the other approach, 'if (x)' checking both the nullness  
 and emptiness at once ?

So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work,  
because neither of these cases is possible:

A) x.ptr!=null && x.length==0
B) x.ptr==null && x.length!=0

but, if neither of those cases is possible then you can simply choose and  
check one or the other for null/0 as it currently does, checking ptr.

I think ptr is a better choice than length for reasons mentioned above.

Regan

Jun 27 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 On Tue, 28 Jun 2005 01:32:05 +0200, Tom S  
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 If we don't care about consistency then this change would mean that:
   char[] emp = "";
   char[] nul = null;
 
   if (emp) ; //false
   if (nul) ; //false
 
 Meaning, basically, an array with no items is not ever 'true'. This 
 might  be nice as it's the most common case, but it's still 
 inconsistent. I think  I prefer consistency here.

Ok... but I still am stubborn and think both of these arrays are 'false' 
as they don't make much sense while printing :) But I agree on your point 
about consistency. I'll just have to set a trap in the 'if' expression 
code writing routine in my neural networks to signal me that I should be 
checking 'arr.length' instead of 'arr'. And I *hope* it will work.



 It might be a timezone thing, I notice more posts here late night (for  
 me). I've also noticed less posts over the last few months (exams 
 maybe?).  Worry not! I will always share my opinion ;)

Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But 
probably after the tomorrow's exam +/


 Ok, what about the other approach, 'if (x)' checking both the nullness  
 and emptiness at once ?

 
 
 So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work,  
 because neither of these cases is possible:

// you probably meant "if (!x)"


 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
 
 but, if neither of those cases is possible then you can simply choose 
 and  check one or the other for null/0 as it currently does, checking ptr.

A) is possible.

char[] foo = "";

which means I have to check for foo.length instead of foo.ptr to see if 
it contains any useful data. In this case it contains an out-of-bounds '\0'.
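
A rough sketch of the three states in question, assuming a current D2 compiler (where string literals are `string`, so `string` stands in for the `char[]` used in this thread). `describe` is a hypothetical helper, not a library function:

```d
import std.stdio;

// Hypothetical helper that names the (ptr, length) state of an array.
// Case B (null ptr, non-zero length) is not constructed here and is
// not treated specially.
string describe(const(char)[] arr)
{
    if (arr.ptr is null && arr.length == 0)
        return "null";
    if (arr.length == 0)
        return "empty, non-null ptr (case A)";
    return "has data";
}

void main()
{
    string nul = null;
    string emp = "";
    string txt = "abc";

    writeln(describe(nul)); // null
    writeln(describe(emp)); // empty, non-null ptr (case A)
    writeln(describe(txt)); // has data
}
```

Only the length check separates "has data" from the other two states; the pointer check alone puts `""` in the same group as `"abc"`.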



-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 28 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 28 Jun 2005 10:42:13 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 On Tue, 28 Jun 2005 01:32:05 +0200, Tom S   
 <h3r3tic remove.mat.uni.torun.pl> wrote:
 If we don't care about consistency then this change would mean that:
   char[] emp = "";
   char[] nul = null;
    if (emp) ; //false
   if (nul) ; //false
  Meaning, basically, an array with no items is not ever 'true'. This  
 might  be nice as it's the most common case, but it's still  
 inconsistent. I think  I prefer consistency here.

 Ok... but I still am stubborn and think both of these arrays are 'false'  
 as they don't make much sense while printing :)

Yeah, or any other time you're only interested in the contents of the  
array and not whether it exists or not.

 But I agree on your point about consistency. I'll just have to set a  
 trap in the 'if' expression code writing routine in my neural networks  
 to signal me that I should be checking 'arr.length' instead of 'arr'.  
 And I *hope* it will work.

At least it won't ever segv :)

 It might be a timezone thing, I notice more posts here late night (for   
 me). I've also noticed less posts over the last few months (exams  
 maybe?).  Worry not! I will always share my opinion ;)

 Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But  
 probably after the tomorrow's exam +/

Good luck.

 Ok, what about the other approach, 'if (x)' checking both the nullness   
 and emptiness at once ?

   So, "if(x)" <==> "if(x.ptr==null && x.length==0)". That would work,   
 because neither of these cases is possible:

 // you probably meant "if (!x)"

Yeah, thanks.

 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply choose  
 and  check one or the other for null/0 as it currently does, checking  
 ptr.

 A) is possible.

 char[] foo = "";

Doh! True.

 which means I have to check for foo.length instead of foo.ptr to see if  
 it contains any useful data. In this case it contains an out-of-bounds  
 '\0'.

Yeah, that's a whole other issue! (damn compatibility!)

Regan

Jun 28 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:

 It might be a timezone thing, I notice more posts here late night 
 (for   me). I've also noticed less posts over the last few months 
 (exams  maybe?).  Worry not! I will always share my opinion ;)


 Thanks. Get ready cuz I'll have more issues to discuss ;) /+ But  
 probably after the tomorrow's exam +/

 
 
 Good luck.

Thanks :)



 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply 
 choose  and  check one or the other for null/0 as it currently does, 
 checking  ptr.


 A) is possible.

 char[] foo = "";

 
 
 Doh! True.
 
 which means I have to check for foo.length instead of foo.ptr to see 
 if  it contains any useful data. In this case it contains an 
 out-of-bounds  '\0'.

 
 
 Yeah, that's a whole other issue! (damn compatibility!)

/+ I assume it was irony +/ In this case it *is* /damn/ compatibility 
cuz I have to check for 'if (str.length)'. Checking 'if (str)' works 
most of the time just like the former but then it fails on empty strings.


-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 28 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 28 Jun 2005 11:07:01 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 A) x.ptr!=null && x.length==0
 B) x.ptr==null && x.length!=0
  but, if neither of those cases is possible then you can simply  
 choose  and  check one or the other for null/0 as it currently does,  
 checking  ptr.


 A) is possible.

 char[] foo = "";

   Doh! True.

 which means I have to check for foo.length instead of foo.ptr to see  
 if  it contains any useful data. In this case it contains an  
 out-of-bounds  '\0'.

   Yeah, that's a whole other issue! (damn compatibility!)

 /+ I assume it was irony +/ In this case it *is* /damn/ compatibility  
 cuz I have to check for 'if (str.length)'. Checking 'if (str)' works  
 most of the time just like the former but then it fails on empty strings.

Actually I was talking about C compatibility this time.

IIRC hard-coded/static strings have a \0 on the end to be compatible with  
C strings, which must end in \0.
Regardless, you can create an empty array in other ways too, e.g.

char[] tmp = "testing123";
char[] emp = tmp[0..0];
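
A small sketch of what that slice looks like, assuming a current D2 compiler (so `string` is used in place of `char[]` for the literal). `emptySliceOf` is a hypothetical helper introduced only for this example:

```d
import std.stdio;

// Hypothetical helper: returns a zero-length slice that still points
// at the start of the original data.
string emptySliceOf(string s)
{
    return s[0 .. 0];
}

void main()
{
    string tmp = "testing123";
    string emp = emptySliceOf(tmp);
    string nul = null;

    writeln(emp.length);         // 0
    writeln(emp.ptr is tmp.ptr); // true:  the slice keeps the data pointer
    writeln(emp == nul);         // true:  == compares contents, both are empty
    writeln(emp is nul);         // false: 'is' also compares the pointers
}
```

So the slice is empty by length but is not a null array, which is case A again.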

Regan

Jun 28 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 Actually I was talking about C compatibility this time.
 
 IIRC hard coded/static strings have a \0 on the end to be compatible 
 with  C strings which must end in \0.

Yup I know. It happens that I've also got a C/C++ background...



 Regardless you can create an empty array in other ways too eg.
 
 char[] tmp = "testing123";
 char[] emp = tmp[0..0];

I see. Funnily, setting emp.length to 0 will null the pointer although 
length already is 0. A nice example of why you want it changed ;)



-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 28 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 28 Jun 2005 14:12:33 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 Actually I was talking about C compatibility this time.
  IIRC hard coded/static strings have a \0 on the end to be compatible  
 with  C strings which must end in \0.

 Yup I know. It happens that I've also got a C/C++ background...

:)

 Regardless you can create an empty array in other ways too eg.
  char[] tmp = "testing123";
 char[] emp = tmp[0..0];

 I see. Funnily, setting emp.length to 0 will null the pointer although  
 length already is 0. A nice example of why you want it changed ;)

Yep, that is exactly the problem I want fixed.

Regan

Jun 28 2005
