
digitalmars.D.bugs - An empty array: true or false?

Stewart Gordon <smjg_1998 yahoo.com> writes:
Using DMD 0.126, Windows 98SE.

cppstrings.html
----------
C++ strings use a function to determine if a string is empty:

     string str;
     if (str.empty())
         // string is empty

In D, an empty string is just null:

     char[] str;
     if (!str)
         // string is empty
----------

This bit of the documentation is hopelessly confusing the concepts of 
"null" and "empty".  If this D code is really supposed to be the 
equivalent of the C++ code immediately above it, then the compiler isn't 
seeing it this way.

In fact, this is testing whether str is null, not whether it is empty.

----------
import std.stdio;

void main() {
     char[] qwert = "";
     writefln(qwert.length);
     if (qwert) {
         writefln("qwert");
     }
     if (!qwert) {
         writefln("!qwert");
     }
     if (qwert == null) {
         writefln("qwert == null");
     }
     if (qwert != null) {
         writefln("qwert != null");
     }

     qwert = null;
     if (qwert) {
         writefln("qwert");
     }
     if (!qwert) {
         writefln("!qwert");
     }
     if (qwert == null) {
         writefln("qwert == null");
     }
     if (qwert != null) {
         writefln("qwert != null");
     }
}
----------
0
qwert
qwert == null
!qwert
qwert == null
----------

Whereas if it's meant to test for empty, then one should expect this output:

----------
0
!qwert
qwert == null
!qwert
qwert == null
----------

Same for arrays of other types.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.
Jun 14 2005
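
For anyone trying this against a current compiler rather than DMD 0.126, here is a minimal sketch of the same null-versus-empty checks. It assumes present-day D2 semantics (`string` is the D2 alias for `immutable(char)[]`), and the helper name `describe` is made up for illustration. Results under 0.126 may differ, as the test program in the post above shows.

```d
import std.stdio;

// Hypothetical helper: reports how one string looks to each kind of test.
string describe(string s)
{
    string r;
    r ~= (s is null) ? "is null" : "!is null";            // identity: compares .ptr and .length
    r ~= (s == null) ? ", == null" : ", != null";          // equality: compares contents only
    r ~= (s.length == 0) ? ", length == 0" : ", length != 0";
    return r;
}

void main()
{
    string empty = "";     // zero length, but .ptr refers to the literal
    string none = null;    // zero length, null .ptr
    writeln(describe(empty));
    writeln(describe(none));
    writeln(describe("abc"));
}
```

Under those assumptions only the identity test (`is`) separates "" from null; `==` and `.length` treat them alike.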
"Regan Heath" <regan netwin.co.nz> writes:
On Tue, 14 Jun 2005 10:29:25 +0100, Stewart Gordon <smjg_1998 yahoo.com>  
wrote:
 Using DMD 0.126, Windows 98SE.

 cppstrings.html
 ----------
 C++ strings use a function to determine if a string is empty:

      string str;
      if (str.empty())
          // string is empty

 In D, an empty string is just null:

      char[] str;
      if (!str)
          // string is empty
 ----------

 This bit of the documentation is hopelessly confusing the concepts of  
 "null" and "empty".  If this D code is really supposed to be the  
 equivalent of the C++ code immediately above it, then the compiler isn't  
 seeing it this way.

 In fact, this is testing whether str is null, not whether it is empty.

 ----------
 import std.stdio;

 void main() {
      char[] qwert = "";
      writefln(qwert.length);
      if (qwert) {
          writefln("qwert");
      }
      if (!qwert) {
          writefln("!qwert");
      }
      if (qwert == null) {
          writefln("qwert == null");
      }
      if (qwert != null) {
          writefln("qwert != null");
      }

      qwert = null;
      if (qwert) {
          writefln("qwert");
      }
      if (!qwert) {
          writefln("!qwert");
      }
      if (qwert == null) {
          writefln("qwert == null");
      }
      if (qwert != null) {
          writefln("qwert != null");
      }
 }
 ----------
 0
 qwert
 qwert == null
 !qwert
 qwert == null
 ----------

 Whereas if it's meant to test for empty, then one should expect this  
 output:

 ----------
 0
 !qwert
 qwert == null
 !qwert
 qwert == null
 ----------

 Same for arrays of other types.
I reckon the only way to test for empty is:

import std.stdio;

template isEmpty(ArrayType) {
    bool isEmpty(ArrayType[] arr)
    {
        return (arr.length == 0 && arr.ptr != null);
    }
}

int main(char [][] args)
{
    char[] string;

    writefln("Setting string to null:");
    string = null;
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string == null) writefln("and, string is the same as null");
    if (string == "") writefln("and, string is the same as \"\"");

    writefln("");
    writefln("Setting string to \"\":");
    string = "";
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string == null) writefln("and, string is the same as null");
    if (string == "") writefln("and, string is the same as \"\"");

    writefln("");
    writefln("Setting length to zero:");
    string.length = 0;
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string == null) writefln("and, string is the same as null");
    if (string == "") writefln("and, string is the same as \"\"");

    return 0;
}

(I apologise if you feel I am hijacking your thread, but this is one of
the things about D which annoys me, so...)

As you can see, "" is treated as being the same as null, and further, ""
is transformed into null when setting the length to 0. Both of these facts
lead me to believe that D treats "" and null as the same thing.

Now, the problem I have with that: in concept, an empty array is different
from a non-existent one, and both concepts are useful.

What is a non-existent array useful for, I hear you ask? Simple: it
represents the non-existence of something. In contrast, an empty array
represents the existence of it, but also the fact that it's empty.

When is that useful? Well, I can think of a simple example: a web form
contains text fields. When you receive the form data you get
"field1=value&field2=value&field3=value&&", where 'value' can be blank, as
in "field1=&field2=value&&". What this tells you is that:

a. field1 was present on the form
b. no value was entered into field1

The existence or non-existence of field1 may cause different behaviour in
the program receiving the data, i.e. if they're settings, non-existence
means "don't change the current setting" whereas "existence, but empty"
means "set it to nothing".

It has been pointed out that you can work around this deficiency with an
AA. Sure, but why should you have to?

Regan
Jun 14 2005
"Regan Heath" <regan netwin.co.nz> writes:
Adding a 3rd test, "if (string is null)". D does at least allow you to  
differentiate between a null reference and "" with 'is'. So, not all is  
lost ;)

int main(char [][] args)
{
	char[] string;

	writefln("Setting string to null:");
	string = null;
	writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");
	
	if (string is null) writefln("and, string is null");
	if (string == null) writefln("and, string == null");
	if (string == "") writefln("and, string == \"\"");

	writefln("");
	writefln("Setting string to \"\":");
	string = "";
	writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");
	
	if (string is null) writefln("and, string is null");
	if (string == null) writefln("and, string == null");
	if (string == "") writefln("and, string == \"\"");
	
	writefln("");
	writefln("Setting length to zero:");
	string.length = 0;
	writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");
	
	if (string is null) writefln("and, string is null");
	if (string == null) writefln("and, string == null");
	if (string == "") writefln("and, string == \"\"");
	
	return 0;
}

Regan

Jun 14 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
Regan Heath wrote:
<snip>
 I reckon the only way to test for empty is:
 
 import std.stdio;
 
 template isEmpty(ArrayType) {
     bool isEmpty(ArrayType[] arr)
     {
         return (arr.length == 0 && arr.ptr != null);
     }
 }
<snip>

Or equivalently

    return arr !is null && arr.length == 0;

or

    return arr !is null && arr == null;

or

    return arr !is null && arr == "";

Stewart.
Jun 14 2005
Derek Parnell <derek psych.ward> writes:
On Tue, 14 Jun 2005 23:40:52 +1200, Regan Heath wrote:


[snip]
 As you can see, "" is treated as being the same as null, and further "" is  
 transformed into null when setting the length to 0. Both of these facts  
 lead me to believe that D treats "" and null as the same thing.
 
 Now, the problem I have with that: In concept an empty array is different  
 to a non existant one, and both concepts are useful.
Preaching to the converted here. I totally agree with you. An empty string
and a non-existent string are two separate concepts and both are useful.

-- 
Derek Parnell
Melbourne, Australia
14/06/2005 10:58:56 PM
Jun 14 2005
"Ben Hinkle" <bhinkle mathworks.com> writes:
 I reckon the only way to test for empty is:

 import std.stdio;

 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 }
So an array with null ptr and 0 length is non-empty? That doesn't seem
intuitive. I think the ptr is irrelevant when testing if the array has any
elements in it or not. In fact one could argue that Walter should change
the implicit conversion of an array from returning the ptr to returning the
length so that the test
  if (!str) {...}
is equivalent to
  if (str.length != 0) {...}
instead of being equivalent to
  if (str.ptr != null) {...}
as happens today.
Jun 14 2005
"Ben Hinkle" <bhinkle mathworks.com> writes:
"Ben Hinkle" <bhinkle mathworks.com> wrote in message 
news:d8mpoc$1571$1 digitaldaemon.com...
 I reckon the only way to test for empty is:

 import std.stdio;

 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 }
 So an array with null ptr and 0 length is non-empty? That doesn't seem
 intuitive. I think the ptr is irrelevant when testing if the array has any
 elements in it or not. In fact one could argue that Walter should change
 the implicit conversion of an array from returning the ptr to returning the
 length so that the test
   if (!str) {...}
 is equivalent to
   if (str.length != 0) {...}
 instead of being equivalent to
   if (str.ptr != null) {...}
 as happens today.

I should add another option is to remove the implicit conversion entirely
and force users to say what they mean
  if (!str.length) {...}
or
  if (!str.ptr) {...}
Jun 14 2005
Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Ben Hinkle wrote:
 I should add another option is to remove the implicit conversion entirely 
 and force users to say what they mean
  if (!str.length) {...}
 or
  if (!str.ptr) {...}
Gets my vote. I got bitten once by 'if (!str) {...}'

-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/
Jun 14 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
Ben Hinkle wrote:

 I reckon the only way to test for empty is:
 
 import std.stdio;
 
 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 } 
 So an array with null ptr and 0 length is non-empty? That doesn't seem
 intuitive.

No, because there's no array here.  There are three distinct cases to
distinguish:

- null array reference
- empty array
- non-empty array

The code above is to test specifically for the empty array case, where a
null array reference might have different semantics in the context.

 I think the ptr is irrelevant when testing if the array has any 
 elements in it or not. In fact one could argue that Walter should 
 change the implicit conversion of an array from returning the ptr to 
 returning the length so that the test

_The_ implicit conversion of an array?  There are two:

(a) Conversion to a pointer, which obviously has to return the pointer.

(b) Conversion to a boolean, which is what this is about.  A boolean
can't hold either a pointer or a length, only true or false.

The only implicit conversion that could possibly return the length is
conversion to a number.  At the moment there is no conversion from an
array to a number, and to introduce one would be a cause of confusion.

  if (!str) {...}
 is equivalent to
  if (str.length != 0) {...}
 instead of being equivalent to
  if (str.ptr != null) {...}
 as happens today.

Huh?  So that non-empty is false and empty or null is true?  Sounds very
counter-intuitive.

From the bit of the docs I quoted, I would've expected

    if (str) { ... }

to be equivalent to

    if (str.length != 0) { ... }

Stewart.
Jun 14 2005
"Ben Hinkle" <bhinkle mathworks.com> writes:
"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
news:d8mved$1a1b$3 digitaldaemon.com...
 Ben Hinkle wrote:

 I reckon the only way to test for empty is:

 import std.stdio;

 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 }
 So an array with null ptr and 0 length is non-empty? That doesn't seem
 intuitive.

 No, because there's no array here.  There are three distinct cases to
 distinguish:

 - null array reference
 - empty array
 - non-empty array

 The code above is to test specifically for the empty array case, where a
 null array reference might have different semantics in the context.

I understand your definitions. IMO any isEmpty function should return true
for an input that has a null ptr and zero length (I've avoided using the
terms "array" or "empty" in that sentence to be absolutely clear). Or to
put it another way, "if a foreach does nothing then it is empty". The
function above I would call isNonNullAndEmpty or something.

[snip]
  if (!str) {...}
 is equivalent to
  if (str.length != 0) {...}
 instead of being equivalent to
  if (str.ptr != null) {...}
 as happens today.

 Huh?  So that non-empty is false and empty or null is true?  Sounds very
 counter-intuitive.

 From the bit of the docs I quoted, I would've expected

     if (str) { ... }

 to be equivalent to

     if (str.length != 0) { ... }

Yeah, you're right. I wasn't paying close enough attention. Sorry about
that.
Jun 14 2005
Derek Parnell <derek psych.ward> writes:
On Tue, 14 Jun 2005 14:12:23 -0400, Ben Hinkle wrote:

 IMO any isEmpty function should return true 
 for an input that has a null ptr and zero length.
Or in yet other words, for an array to be empty it must first be an array
(i.e. arr.ptr != 0), and the length must be zero (i.e. arr.length == 0).

There are four conditions:

    .ptr == 0   .len == 0
    .ptr == 0   .len != 0
    .ptr != 0   .len == 0
    .ptr != 0   .len != 0

By my reckoning, if the .ptr == 0 then the entity is not an array and so
these two conditions should both return NULL. If the .ptr != 0 then the
array exists and it either has zero or more elements, so if .len == 0 then
it is EMPTY, otherwise it is neither empty nor null.

The apparently ambiguous situation of both .len and .ptr being zero is the
problem area. I would pedantically call it null because there is no array.

This problem exists mainly because D sets .ptr to zero whenever the .len is
set to zero, and that is not a good idea because we lose information when
that happens. If I want to set the .ptr to zero I would have preferred that
'arr = null' or 'arr.ptr = 0' would set both to zero, and 'arr.length = 0'
just sets the current length to zero but retains the RAM allocation.

In my code, I always use 'if (arr.length == 0)' to test for an empty array
and never use 'if (!arr)'. I've been bitten by the lack of distinction
between an empty array and a null array too. And by the way, D isn't
consistent when it nullifies an array either, in that in some situations
'string = ""' does *not* set .ptr to zero and in other situations it does.

-- 
Derek Parnell
Melbourne, Australia
15/06/2005 6:36:43 AM
Jun 14 2005
"Walter" <newshound digitalmars.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message
news:2o7hnscxa0hj$.s1owhetd92i0$.dlg 40tude.net...
 This problem exists mainly because D sets .ptr to zero whenever the .len
 is set to zero,

No, it doesn't. The reason it doesn't is to support the 'reserve some space
in advance' idiom:

    array.length = 100;
    array.length = 0;

Now, array.ptr points to the start of the buffer that is at least 100 long.
Then, the array can be appended to without undergoing reallocation.

 and that is not a good idea because we loose information when
 that happens. If I want to set the .ptr to zero I would have preferred
 that 'arr = null'
That does set both to 0.
 or 'arr.ptr = 0' would set both to zero,
Having .length!=0 and .ptr==0 is not a legal configuration for an array.
 and 'arr.length = 0'
 just sets the current length to zero but retains the RAM allocation.
That's what it does now.
 In my code, I always use 'if (arr.length == 0)' to test for an empty array
 and never use 'if (!arr)'. I've been bitten by the lack of distinction
 between an empty array and a null array too. And by the way, D isn't
 consistent when it nullifies an array either, in that some situations
 'string = ""' does *not* set .ptr to zero and other situations it does.
To test for an empty or null array, test for (array.length == 0). To test for a null array, test for (array is null).
Jun 14 2005
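
The two tests Walter gives can be combined into a three-way classification. What follows is only a sketch under current D2 semantics, not something proposed in the thread; `ArrayState` and `classify` are hypothetical names introduced for illustration.

```d
import std.stdio;

enum ArrayState { isNull, empty, nonEmpty }

// Hypothetical helper built from the two tests described above.
ArrayState classify(T)(T[] arr)
{
    if (arr is null)            // identity with null: null .ptr and zero .length
        return ArrayState.isNull;
    if (arr.length == 0)        // an array exists, but it holds no elements
        return ArrayState.empty;
    return ArrayState.nonEmpty;
}

void main()
{
    int[] a;                    // never assigned
    int[] b = [1, 2, 3];
    int[] c = b[0 .. 0];        // zero-length slice of existing data, .ptr kept
    writeln(classify(a));
    writeln(classify(b));
    writeln(classify(c));
}
```

The order of the two checks matters: testing `.length == 0` first would lump the null case in with the empty one.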
"Regan Heath" <regan netwin.co.nz> writes:
On Tue, 14 Jun 2005 15:14:08 -0700, Walter <newshound digitalmars.com>  
wrote:
 "Derek Parnell" <derek psych.ward> wrote in message
 news:2o7hnscxa0hj$.s1owhetd92i0$.dlg 40tude.net...
 This problem exists mainly because D sets .ptr to zero whenever the .len
 is set to zero,

 No, it doesn't. The reason it doesn't is to support the 'reserve some
 space in advance' idiom:

     array.length = 100;
     array.length = 0;

 Now, array.ptr points to the start of the buffer that is at least 100
 long. Then, the array can be appended to without undergoing reallocation.

Then please explain these results:

import std.stdio;

template isEmpty(ArrayType) {
    bool isEmpty(ArrayType[] arr)
    {
        return (arr.length == 0 && arr.ptr != null);
    }
}

int main(char [][] args)
{
    char[] string;

    writefln("Setting string to null:");
    string = null;
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string is null) writefln("and, string is null");
    if (string == null) writefln("and, string == null");
    if (string == "") writefln("and, string == \"\"");

    writefln("");
    writefln("Setting string to \"\":");
    string = "";
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string is null) writefln("and, string is null");
    if (string == null) writefln("and, string == null");
    if (string == "") writefln("and, string == \"\"");

    writefln("");
    writefln("Setting length to zero:");
    string.length = 0;
    writefln("string is ",(isEmpty!(char)(string))?"empty":"not empty");

    if (string is null) writefln("and, string is null");
    if (string == null) writefln("and, string == null");
    if (string == "") writefln("and, string == \"\"");

    return 0;
}

Setting string to null:
string is not empty
and, string is null
and, string == null
and, string == ""

Setting string to "":
string is empty
and, string == null
and, string == ""

Setting length to zero:
string is not empty
and, string is null
and, string == null
and, string == ""

Regan
Jun 14 2005
"Ben Hinkle" <ben.hinkle gmail.com> writes:
"Walter" <newshound digitalmars.com> wrote in message 
news:d8nl8j$20dp$1 digitaldaemon.com...
 "Derek Parnell" <derek psych.ward> wrote in message
 news:2o7hnscxa0hj$.s1owhetd92i0$.dlg 40tude.net...
 This problem exists mainly because D sets .ptr to zero whenever the .len
 is set to zero,
 No, it doesn't. The reason it doesn't is to support the 'reserve some
 space in advance' idiom:

     array.length = 100;
     array.length = 0;

 Now, array.ptr points to the start of the buffer that is at least 100
 long. Then, the array can be appended to without undergoing reallocation.

Derek is correct. _d_setarraylength in internal/gc/gc.d says

    if (newlength) {
        [snip]
    }
    else {
        newdata = null;
    }
    p.data = newdata;
    p.length = newlength;

so setting length to 0 nulls the ptr. Similarly, setting from 0 to any
non-zero value ignores the ptr and always allocates a new memory block. To
see it in action try

int main() {
    int[] x = new int[10];
    printf("%p\n",x.ptr);
    x.length = 0;
    printf("%p\n",x.ptr);
    x = new int[10];
    x = x[0 .. 0]; // doesn't reset the ptr
    printf("%p\n",x.ptr);
    x.length = 10; // will make a new allocation
    printf("%p\n",x.ptr);
    return 0;
}
Jun 14 2005
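
For comparison, current D2 offers `reserve` and `capacity` on dynamic arrays, which express the 'reserve some space in advance' idiom without going through `.length`. The sketch below assumes those D2 built-ins; it says nothing about what DMD 0.126 does, and `appendsInPlace` is a hypothetical name used only here.

```d
import std.stdio;

// Hypothetical helper: reserves room for n ints, appends n values, and
// reports whether the data pointer stayed put the whole time.
bool appendsInPlace(size_t n)
{
    int[] buf;
    buf.reserve(n);                 // ask the runtime for room for at least n elements
    auto start = buf.ptr;           // remember where the reserved block begins
    foreach (i; 0 .. n)
        buf ~= cast(int) i;         // appends stay within the reserved capacity
    return buf.ptr is start && buf.length == n;
}

void main()
{
    writeln(appendsInPlace(100));
}
```

The array stays logically empty (length 0) between the `reserve` call and the first append, which is the state the length = 100 / length = 0 sequence was meant to produce.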
Derek Parnell <derek psych.ward> writes:
On Tue, 14 Jun 2005 15:14:08 -0700, Walter wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:2o7hnscxa0hj$.s1owhetd92i0$.dlg 40tude.net...
 This problem exists mainly because D sets .ptr to zero whenever the .len
 is set to zero,

 No, it doesn't. The reason it doesn't is to support the 'reserve some
 space in advance' idiom:

     array.length = 100;
     array.length = 0;

 Now, array.ptr points to the start of the buffer that is at least 100
 long. Then, the array can be appended to without undergoing reallocation.

Hmmm... then the code below produces unexpected results ...

<code>
import std.stdio;

void func(int w, char[] x, int y, bool z)
{
    char[] ok;
    char[] addr;
    char[] emp;
    char[] nul;

    ok = "Good";
    if (x.length != y)
        ok = "Bad1";
    if (z && x.ptr == null)
        ok = "Bad2";
    if (!z && x.ptr != null)
        ok = "Bad3";

    if (z)
        addr = "non-zero";
    else
        addr = " 0";

    emp = (x.length == 0 ? " empty" : "!empty");
    nul = (x is null ? " null" : "!null");

    " Expected Len=%d Addr=%s (%s) %s %s\n",
    w, x.length, cast(ulong)x.ptr, y, addr, ok, emp, nul);
}

void main()
{
    char[] b;

    func(1, "abc", 3, true);
    func(2, "", 0, true);
    func(3, "abc".dup, 3, true);
    func(4, "".dup, 0, true);
    func(5, b, 0, false);
    b = "qwerty".dup;
    func(6, b, 6, true);
    b.length = 0;
    func(7, b, 0, true);
    b = null;
    func(8, b, 0, false);

    b = "poiuyt".dup;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 0;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 100;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 0;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
}
</code>

<results using v0.126>
Expected Len=3 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Good) empty !null
Expected Len=3 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Bad2) empty null
Expected Len=0 Addr= 0 (Good) empty null
Expected Len=6 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Bad2) empty null
Expected Len=0 Addr= 0 (Good) empty null
Addr is 870fc0 with 6 elements
Addr is 0 with 0 elements
Addr is 872f00 with 100 elements
Addr is 0 with 0 elements
</results>

final tests confirm the anomaly.

 and that is not a good idea because we loose information when
 that happens. If I want to set the .ptr to zero I would have preferred
 that 'arr = null'
That does set both to 0.
Good.
 or 'arr.ptr = 0' would set both to zero,
Having .length!=0 and .ptr==0 is not a legal configuration for an array.
Agreed, and that's why whenever the pointer is set to zero then length should be too.
 and 'arr.length = 0'
 just sets the current length to zero but retains the RAM allocation.
That's what it does now.
See code example above which seems to say otherwise.
 In my code, I always use 'if (arr.length == 0)' to test for an empty array
 and never use 'if (!arr)'. I've been bitten by the lack of distinction
 between an empty array and a null array too. And by the way, D isn't
 consistent when it nullifies an array either, in that some situations
 'string = ""' does *not* set .ptr to zero and other situations it does.
To test for an empty or null array, test for (array.length == 0). To test for a null array, test for (array is null).
Walter, these are direct questions: Do you believe there is a distinction
between an empty array and a null array? And if so, will D support the two
concepts in a consistent manner?

-- 
Derek
Melbourne, Australia
15/06/2005 8:39:31 AM
Jun 14 2005
"Walter" <newshound digitalmars.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message
news:1lruav5hsvxn6$.uj97bpn5gk38$.dlg 40tude.net...
 Hmmm... then the code below produces unexpected results ...
Oh darn, looks like I've got work to do...
 Walter, these are direct questions: Do you believe there is a distinction
 between an empty array and a null array?
I've tried to eliminate the distinction.
 And if so, will D support the two
 concepts in a consistent manner?
Jun 14 2005
Derek Parnell <derek psych.ward> writes:
On Tue, 14 Jun 2005 19:50:01 -0700, Walter wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:1lruav5hsvxn6$.uj97bpn5gk38$.dlg 40tude.net...
 Hmmm... then the code below produces unexpected results ...
Oh darn, looks like I've got work to do...
Fair enough.
 Walter, these are direct questions: Do you believe there is a distinction
 between an empty array and a null array?
I've tried to eliminate the distinction.
Right, so there is a distinction but you are trying to remove it.

*DON'T* - please. It is a useful distinction and really ought not to be
artificially removed.

-- 
Derek
Melbourne, Australia
15/06/2005 12:59:35 PM
Jun 14 2005
"Walter" <newshound digitalmars.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message
news:zbvlue18tuv7.h1nffudg5kvn$.dlg 40tude.net...
 Right, so there is a distinction but you are trying to remove it.

 *DON'T* - please. It is a useful distinction and really ought not to be
 artificially removed.
What exactly is the advantage to the distinction?
Jun 17 2005
Derek Parnell <derek psych.ward> writes:
On Fri, 17 Jun 2005 18:31:33 -0700, Walter wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:zbvlue18tuv7.h1nffudg5kvn$.dlg 40tude.net...
 Right, so there is a distinction but you are trying to remove it.

 *DON'T* - please. It is a useful distinction and really ought not to be
 artificially removed.
What exactly is the advantage to the distinction?
For example, to be able to tell if something has been set to empty or has
never been set at all. To distinguish between the presence of emptiness and
the absence of anything.

-- 
Derek Parnell
Melbourne, Australia
18/06/2005 12:21:21 PM
Jun 17 2005
"Regan Heath" <regan netwin.co.nz> writes:
On Sat, 18 Jun 2005 12:23:09 +1000, Derek Parnell <derek psych.ward> wrote:
 On Fri, 17 Jun 2005 18:31:33 -0700, Walter wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:zbvlue18tuv7.h1nffudg5kvn$.dlg 40tude.net...
 Right, so there is a distinction but you are trying to remove it.

 *DON'T* - please. It is a useful distinction and really ought not to be
 artificially removed.
What exactly is the advantage to the distinction?
For example, to be able to tell if something has been set to empty or has never been set at all. To distinguish between the presence of emptiness and the absence of anything.
Exactly. For example, you have a web page which *may* contain settings A,
B, C, D, ... When the page is submitted you want to know if the setting was
present, and if so, what it was set to. So, there are 3 possible states for
each setting:

1 - not present
2 - present, set to ""
3 - present, set to <anything>

Such a post might look like:

A=text&B=&C=other+text&&

Where:
example, do the following to the previously stored setting values:

overwrite A with "text"
overwrite B with ""
overwrite C with "other text"
leave D as is

difference between B & D. (no pun intended, well maybe a little)

Compare D's arrays with C's pointer, a pointer can represent the 3 states:

char *p;
//p is set here

I think we want/need for:
- arrays in a certain state, stay in that state until intentionally changed
  (i.e. not when length is set to 0)
- a recommended/standard way of identifying the states (I think we have
  this)

eg.

char[] p = null;
//p is set here

I can't see a downside to ensuring the distinction remains, after all you
don't get a seg-fault calling:

char[] p = null;
if (p.length == 0)

already, and that doesn't need to change.

Regan

Note: There are workaround solutions (like using an AA and the 'in'
operator) but these are seldom as intuitive (for me at least, and perhaps
other C/pointer-style programmers) or as direct a solution to the problem
as simply supporting the 3 states.
Jun 17 2005
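
The AA workaround mentioned in the note above can be sketched roughly as follows, assuming current D2 and Phobos (`std.array.split`, `std.string.indexOf`). `parseForm` is a hypothetical name, and the parser is deliberately naive: it does no URL-decoding, so "+" is left as-is.

```d
import std.array : split;
import std.stdio;
import std.string : indexOf;

// Hypothetical parser for data such as "A=text&B=&C=other+text&&".
string[string] parseForm(string data)
{
    string[string] fields;
    foreach (pair; data.split("&"))
    {
        auto eq = pair.indexOf('=');
        if (eq < 0)
            continue;               // also skips the empty pieces left by "&&"
        fields[pair[0 .. eq]] = pair[eq + 1 .. $];
    }
    return fields;
}

void main()
{
    auto form = parseForm("A=text&B=&C=other+text&&");
    foreach (name; ["A", "B", "C", "D"])
    {
        if (auto value = name in form)       // 'in' yields a pointer, null when absent
            writeln(name, ": present, set to \"", *value, "\"");
        else
            writeln(name, ": not present");
    }
}
```

Here the "not present" state lives in the associative array (the key is missing) rather than in the string itself, which is exactly the indirection the post argues should not be necessary.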
parent reply Nick <Nick_member pathlink.com> writes:
Regan Heath:
I can't see a downside to ensuring the distinction remains
The downside IMHO is that it makes strings and arrays more complicated objects, having not two states (non-empty and empty) but three (non-empty, empty and null). This makes them more confusing, and I think the coding style that this encourages will be less readable and more prone to have hard-to-find bugs. But that's just my opinion.

Nick
Jun 18 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Sat, 18 Jun 2005 12:39:54 +0000 (UTC), Nick <Nick_member pathlink.com>  
wrote:
 Regan Heath:
 I can't see a downside to ensuring the distinction remains
The downside IMHO is that it makes strings and arrays more complicated objects, having not two states (non-empty and empty) but three (non-empty, empty and null). This makes them more confusing, and I think the coding style that this encourages will be less readable and more prone to have hard-to-find bugs. But that's just my opinion.
Cool. Got any examples or evidence to support this opinion?

I ask because, like I said, if you can write:

char[] a = null;
if (a.length == 0)

without getting a seg-v (due to a being null), then you can effectively treat arrays as having 2 states, not 3. So, unless you care about state you can ignore it and never have any trouble, confusion or hard-to-find bugs.

Regan
Jun 18 2005
parent reply Nick <Nick_member pathlink.com> writes:
In article <opsslpgewu23k2f5 nrage.netwin.co.nz>, Regan Heath says...
Cool. Got any examples or evidence to support this opinion?
Well, not any specific examples as such. But the problem is that when you add 'null' as a distinct state, what should happen with equality? If I do a="" and b=null, then you probably agree that a==b should still be true. But that means the two states are not really distinct after all, if you think of equality == as a proper way of defining distinct states. So arrays become "2.5"-state objects, and that might be a bit confusing.

One of the consequences is that the order in which you compare things becomes important, eg.

if(a == null) do something
else if(a == b) do something else
else do something entirely different

is NOT the same as

if(a == b) do something else
else if(a == null) do something
else do something entirely different

since b might be "", and the first test would succeed even if a is null. Do the same with classes then your program will just crash, but the above will run and appear to work correctly even though it does not.

This probably isn't the end of the world, but it gives me an uneasy feeling.
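The ordering problem can be sketched as follows, assuming a current D2 compiler where `==` on arrays compares contents, so a null array and "" compare equal. The function names `nullFirst` and `equalFirst` are hypothetical.

```d
import std.stdio;

// Tests for null before testing for equality.
string nullFirst(string a, string b)
{
    if (a == null) return "a == null";
    else if (a == b) return "a == b";
    else return "neither";
}

// Same two tests, in the opposite order.
string equalFirst(string a, string b)
{
    if (a == b) return "a == b";
    else if (a == null) return "a == null";
    else return "neither";
}

void main()
{
    string a = null;
    string b = "";

    // Both conditions hold at once, so whichever is tested first wins.
    assert(a == null);
    assert(a == b);

    writeln(nullFirst(a, b));
    writeln(equalFirst(a, b));
    assert(nullFirst(a, b) != equalFirst(a, b));
}
```

Replacing `a == null` with `a is null` does not remove the overlap for this input, since a is still both null and equal to b.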
I ask because, like I said, if you can write:

char[] a = null;
if (a.length == 0)

without getting a seg-v (due to a being null), then you can effectively
treat arrays as having 2 states, not 3. So, unless you care about state
you can ignore it and never have any trouble, confusion or hard-to-find
bugs.
Well, ok, but it really isn't a matter of whether or not _I_ am using 2 or 3 states. If the feature is there (and documented), then people will use it. I believe in the current D mantra of making bugs harder to write and easier to find. Having a semi-distinct 'null'-state in arrays doesn't go along with that goal, IMHO.

Nick
Jun 19 2005
next sibling parent Derek Parnell <derek psych.ward> writes:
On Sun, 19 Jun 2005 22:02:06 +0000 (UTC), Nick wrote:

 In article <opsslpgewu23k2f5 nrage.netwin.co.nz>, Regan Heath says...
Cool. Got any examples or evidence to support this opinion?
Well, not any specific examples as such. But the problem is that when you add 'null' as a distinct state, what should happen with equality? If I do a="" and b=null, then you probably agree that a==b should still be true.
No I would not. They are two different things. Just like ... if ("" == null) doesn't make sense.
 But that means the two states are not really distinct after all, if you think
 of equality == as a proper way of defining distinct states. So arrays become
 "2.5"-state objects, and that might be a bit confusing.
Except that I disagree with you about empty and null being equal.
 One of the consequences is that the order in which you compare things becomes
 important, eg.
 
 if(a == null) do something
 else if(a == b) do something else
 else do something entirely different
 
 is NOT the same as
 
 if(a == b) do something else
 else if(a == null) do something
 else do something entirely different
Correct. We already do such tests with class instances, for example.
 since b might be "", and the first test would succeed even if a is null.
But it shouldn't.
 Do the
 same with classes then your program will just crash, but the above will run and
 appear to work correctly even though it does not.
 
 This probably isn't the end of the world, but it gives me an uneasy feeling.
 
I ask because, like I said, if you can write:

char[] a = null;
if (a.length == 0)

without getting a seg-v (due to a being null), then you can effectively
treat arrays as having 2 states, not 3. So, unless you care about state
you can ignore it and never have any trouble, confusion or hard-to-find
bugs.
Well, ok, but it really isn't a matter of whether or not _I_ am using 2 or 3 states. If the feature is there (and documented), then people will use it.
Exactly. If empty and null were documented as two different states then people would know that and code accordingly.
 I believe in the current D mantra of making bugs harder to write and easier to
 find. Having a semi-distinct 'null'-state in arrays doesn't go along with that
 goal, IMHO.
Absolutely correct; having such a semi-state does not help anyone. However, that is *not* what we are talking about. We are talking about truly distinct states.

-- 
Derek Parnell
Melbourne, Australia
20/06/2005 8:54:21 AM
Jun 19 2005
prev sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Sun, 19 Jun 2005 22:02:06 +0000 (UTC), Nick <Nick_member pathlink.com>  
wrote:
 In article <opsslpgewu23k2f5 nrage.netwin.co.nz>, Regan Heath says...
 Cool. Got any examples or evidence to support this opinion?
Well, not any specific examples as such. But the problem is that when you add 'null' as a distinct state, what should happen with equality?
Nothing needs to change. Equality is a comparison of the value to which the reference refers, not a comparison of the reference itself. eg.

char[] p = null;
if (p is null) //compares the *reference* to null
if (p == null) //compares the *value* to which p refers with null
if (p == "")   //compares the *value* to which p refers with ""

still compare the reference itself with null I have the 3 states required.

In short:

reference is null          //does not exist
value is null, value is "" //exists, is empty
value is "bob"             //exists, is "bob"
 If I do a="" and b=null, then you probably agree that a==b should still  
 be true.
I can, another option is to make null and "" distinct states for the *value* as well. I don't think we need to go that far, in fact I think it's better we don't. <snip>
 I ask because, like I said, if you can write:

 char[] a = null;
 if (a.length == 0)

 without getting a seg-v (due to a being null), then you can effectively
 treat arrays as having 2 states, not 3. So, unless you care about state
 you can ignore it and never have any trouble, confusion or hard-to-find
 bugs.
Well, ok, but it really isn't a matter of whether or not _I_ am using 2 or 3 states. If the feature is there (and documented), then people will use it. I believe in the current D mantra of making bugs harder to write and easier to find. Having a semi-distinct 'null'-state in arrays doesn't go along with that goal, IMHO.
You're correct, which is why I'm not advocating that at all ;) Regan
Jun 20 2005
prev sibling parent Nick <Nick_member pathlink.com> writes:
In article <d8o53s$2bs0$1 digitaldaemon.com>, Walter says...
 Hmmm... then the code below produces unexpected results ...
Oh darn, looks like I've got work to do...
So you mean you weren't aware that it currently sets ptr=null when you set length=0? Or have you just changed your mind about it? :)

I mean, some of the phobos code has been written specifically to work around this feature. Look at these two excerpts from stream.d:

I think that such tricks are extremely ugly, though, so if you fix it I'll be very happy :)

Nick
Jun 18 2005
prev sibling parent Thomas Kuehne <thomas-dloop kuehne.this-is-spam.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Derek Parnell schrieb am Wed, 15 Jun 2005 09:06:41 +1000:
 On Tue, 14 Jun 2005 15:14:08 -0700, Walter wrote:

 "Derek Parnell" <derek psych.ward> wrote in message
 news:2o7hnscxa0hj$.s1owhetd92i0$.dlg 40tude.net...
 This problem exists mainly because D sets .ptr to zero whenever the .len is
 set to zero,
No, it doesn't. The reason it doesn't is to support the 'reserve some space in advance' idiom:

array.length = 100;
array.length = 0;

Now, array.ptr points to the start of the buffer that is at least 100 long. Then, the array can be appended to without undergoing reallocation.
Hmmm... then the code below produces unexpected results ...

<code>
import std.stdio;

void func(int w, char[] x, int y, bool z)
{
    char[] ok;
    char[] addr;
    char[] emp;
    char[] nul;

    ok = "Good";
    if (x.length != y)
        ok = "Bad1";
    if (z && x.ptr == null)
        ok = "Bad2";
    if (!z && x.ptr != null)
        ok = "Bad3";
    if (z)
        addr = "non-zero";
    else
        addr = " 0";
    emp = (x.length == 0 ? " empty" : "!empty");
    nul = (x is null ? " null" : "!null");

        " Expected Len=%d Addr=%s (%s) %s %s\n",
        w, x.length, cast(ulong)x.ptr, y, addr, ok, emp, nul);
}

void main()
{
    char[] b;

    func(1, "abc", 3, true);
    func(2, "", 0, true);
    func(3, "abc".dup, 3, true);
    func(4, "".dup, 0, true);
    func(5, b, 0, false);
    b = "qwerty".dup;
    func(6, b, 6, true);
    b.length = 0;
    func(7, b, 0, true);
    b = null;
    func(8, b, 0, false);

    b = "poiuyt".dup;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 0;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 100;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
    b.length = 0;
    writefln("Addr is %8x with %d elements", cast(ulong)b.ptr, b.length);
}
</code>

<results using v0.126>
Expected Len=3 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Good) empty !null
Expected Len=3 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Bad2) empty null
Expected Len=0 Addr= 0 (Good) empty null
Expected Len=6 Addr=non-zero (Good) !empty !null
Expected Len=0 Addr=non-zero (Bad2) empty null
Expected Len=0 Addr= 0 (Good) empty null
Addr is 870fc0 with 6 elements
Addr is 0 with 0 elements
Addr is 872f00 with 100 elements
Addr is 0 with 0 elements
</results>
Added to DStress as
http://dstress.kuehne.cn/run/p/ptr_10_A.d
http://dstress.kuehne.cn/run/p/ptr_10_B.d
http://dstress.kuehne.cn/run/p/ptr_10_C.d
http://dstress.kuehne.cn/run/p/ptr_10_D.d
http://dstress.kuehne.cn/run/p/ptr_10_E.d
http://dstress.kuehne.cn/run/p/ptr_10_F.d
http://dstress.kuehne.cn/run/p/ptr_10_G.d
http://dstress.kuehne.cn/run/p/ptr_10_H.d

Thomas

-----BEGIN PGP SIGNATURE-----

iD8DBQFCs88d3w+/yD4P9tIRAu6JAJ4iXGnp1Gb4lXGFNZsTWbS6QqaDRQCdH8lM
CjBwYB1cqje3PFNEpM7Gzec=
=EDZq
-----END PGP SIGNATURE-----
Jun 18 2005
prev sibling parent "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 14 Jun 2005 14:12:23 -0400, Ben Hinkle <bhinkle mathworks.com>  
wrote:
 "Stewart Gordon" <smjg_1998 yahoo.com> wrote in message
 news:d8mved$1a1b$3 digitaldaemon.com...
 Ben Hinkle wrote:

 I reckon the only way to test for empty is:

 import std.stdio;

 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 }
So an array with null ptr and 0 length is non-empty? That doesn't seem intuitive.
No, because there's no array here. There are three distinct cases to distinguish: - null array reference - empty array - non-empty array The code above is to test specifically for the empty array case, where a null array reference might have different semantics in the context.
I understand your definitions. IMO any isEmpty function should return true for an input that has a null ptr and zero length (I've avoided using the terms "array" or "empty" in that sentence to be absolutely clear). Or to put it another way "if a foreach does nothing then it is empty".
I don't think that's a valid definition (due to my definition of empty, which I'll explain below). I think all you can state is "if a foreach does nothing then there are no elements". Using the glass example again: "is a non-existent glass empty?" Logically, at least to me, the answer is "no". Remember, the existence of the object is important. By only asking whether it contains anything you're ignoring whether it exists or not; quite simply, a non-existent glass and an empty glass hold the exact same amount of liquid: none.
 The function
 above I would call isNonNullAndEmpty or something.
I wouldn't, because to me "empty" implies "existence" thus "non null".

Regan
Jun 14 2005
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Tue, 14 Jun 2005 10:31:39 -0400, Ben Hinkle <bhinkle mathworks.com>  
wrote:

 I reckon the only way to test for empty is:

 import std.stdio;

 template isEmpty(ArrayType) {
 bool isEmpty(ArrayType[] arr)
 {
 return (arr.length == 0 && arr.ptr != null);
 }
 }
So an array with null ptr and 0 length is non-empty? That doesn't seem intuitive.
Maybe, maybe not. Think about it this way: is a non-existent glass an empty glass? I don't think you'll get too many people saying yes to that. A prerequisite to being "empty" is for the thing to exist at all. If it doesn't exist it cannot be empty.

Sure, in some cases you want to treat the empty and non-existent case as the same thing; you can, using .length. Sometimes you want to treat them differently; we can, using "a is null". However, and this is the problem, D converts from one to the other when you set length to 0, and that is wrong.
 I think the ptr is irrelevant when testing if the array has any
 elements in it or not.
You're correct, but this is irrelevant. I'm not simply asking whether "the array has any elements in it". I'm first asking whether it exists or not, and then if it's got any elements. Thus the double check.
 In fact one could argue that Walter should change the
 implicit conversion of an array from returning the ptr to returning the
 length so that the test
  if (!str) {...}
 is equivalent to
  if (str.length != 0) {...}
 instead of being equivalent to
  if (str.ptr != null) {...}
 as happens today.
We could, but in my mind that would be counter intuitive. "if (a)" tests class references, pointers and intrinsics against null, or 0; if we change arrays we now have different behaviour for them as compared to everything else.

Regan
Jun 14 2005
parent reply Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 On Tue, 14 Jun 2005 10:31:39 -0400, Ben Hinkle <bhinkle mathworks.com>  
 wrote:
 In fact one could argue that Walter should change the
 implicit conversion of an array from returning the ptr to returning the
 length so that the test
  if (!str) {...}
 is equivalent to
  if (str.length != 0) {...}
 instead of being equivalent to
  if (str.ptr != null) {...}
 as happens today.
We could, but in my mind that would be counter intuitive. "if (a)" tests a class references, pointers and intrinsics against null, or 0, if we change arrays we now have different behaviour for them as compared to everything else.
Yeah, but we already have this kind of 'inconsistency':

Yields: 'expression f of type Foo does not have a boolean value'.

I prefer to perceive D's arrays as structs, so having to check their members: 'ptr' and 'length' would be pretty logical, consistent and less error prone than the current approach IMHO.

-- 
Tomasz Stachowiak /+ a.k.a. h3r3tic +/
Jun 14 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Wed, 15 Jun 2005 01:59:07 +0200, Tom S  
<h3r3tic remove.mat.uni.torun.pl> wrote:
 Regan Heath wrote:
 On Tue, 14 Jun 2005 10:31:39 -0400, Ben Hinkle <bhinkle mathworks.com>   
 wrote:
 In fact one could argue that Walter should change the
 implicit conversion of an array from returning the ptr to returning the
 length so that the test
  if (!str) {...}
 is equivalent to
  if (str.length != 0) {...}
 instead of being equivalent to
  if (str.ptr != null) {...}
 as happens today.
We could, but in my mind that would be counter intuitive. "if (a)" tests a class references, pointers and intrinsics against null, or 0, if we change arrays we now have different behaviour for them as compared to everything else.
Yeah, but we already have this kind of 'inconsistency':
A struct/union is the *only* type that this doesn't work for, because it is a value type, which cannot be implicitly compared to 0, 0.0, or null.
 Yields: 'expression f of type Foo does not have a boolean value'. I  
 prefer to perceive D's arrays as structs
Regardless, arrays are references, not structs.

char[] a;

Declares a reference to a char array.
 , so having to check their members: 'ptr' and 'length' would be pretty  
 logical, consistent and less error prone than the current approach IMHO.
An array is no different to a class with length and ptr properties. As with the class, you simply have to remember that fact.

That said, the array opCmp operator handles having a null rhs, which it then treats the same as a "" rhs; this is also part of the problem, IMO.

Regan
Jun 14 2005
prev sibling parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message 
news:d8m81l$n81$1 digitaldaemon.com...
 Using DMD 0.126, Windows 98SE.

 cppstrings.html
 ----------
 C++ strings use a function to determine if a string is empty:

     string str;
     if (str.empty())
         // string is empty

 In D, an empty string is just null:

     char[] str;
     if (!str)
         // string is empty
 ----------

 This bit of the documentation is hopelessly confusing the concepts of 
 "null" and "empty".  If this D code is really supposed to be the 
 equivalent of the C++ code immediately above it, then the compiler isn't 
 seeing it this way.

 In fact, this is testing whether str is null, not whether it is empty.
Agreed - the doc should be changed to use str.length == 0 to mean "empty" and/or that particular section in the cppstrings.html should be removed. Testing the ptr isn't the same as testing for 0 length.

OTOH since other containers will likely have an "empty" or "isEmpty" member maybe one should be added to the builtin arrays.
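For illustration, the two competing tests from this thread can be written side by side. This is only a sketch assuming a current D2 compiler; `hasNoElements` and `isNonNullAndEmpty` are hypothetical names, not members of the builtin arrays or of Phobos.

```d
// Length-only test: true for both a null array and "".
bool hasNoElements(T)(T[] arr)
{
    return arr.length == 0;
}

// The stricter test quoted elsewhere in the thread: zero length AND non-null ptr.
bool isNonNullAndEmpty(T)(T[] arr)
{
    return arr.length == 0 && arr.ptr !is null;
}

void main()
{
    string nul = null;
    string emp = "";     // a string literal, so its ptr is non-null
    string txt = "abc";

    assert(hasNoElements(nul));
    assert(hasNoElements(emp));
    assert(!hasNoElements(txt));

    assert(!isNonNullAndEmpty(nul));
    assert(isNonNullAndEmpty(emp));
    assert(!isNonNullAndEmpty(txt));
}
```

The length-only version matches the "if a foreach does nothing" reading; the stricter one matches the "a non-existent glass is not empty" reading.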
Jun 14 2005
next sibling parent "Walter" <newshound digitalmars.com> writes:
I agree. I'll fix the doc.
Jun 14 2005
prev sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
Ben Hinkle wrote:
<snip>
 Agreed - the doc should be changed to use str.length == 0 to mean "empty" 
 and/or that particular section in the cppstrings.html should be removed. 
 Testing the ptr isn't the same as testing for 0 length.
 OTOH since other containers will likely have an "empty" or "isEmpty" member 
 maybe one should be added to the builtin arrays. 
Either way, we'll still need somewhere in the spec an indication of whether using an array reference as a boolean tests for non-null or non-empty, or if it's going to become illegal.

Stewart.

-- 
My e-mail is valid but not my primary mailbox.  Please keep replies on 
the 'group where everyone may benefit.
Jun 15 2005