digitalmars.D - empty arrays - no complaints?
- Farmer (108/108) Jun 27 2004 Why are there (almost) no complaints about D's support for empty arrays?
- Sean Kelly (10/23) Jun 27 2004 Not really. I'd rather argue that D tries to make both usable and reduc...
- Farmer (26/48) Jun 27 2004 You misunderstood me, I meant that the function interface is a good one.
- Derek Parnell (15/64) Jun 27 2004 Well....the *use* of an uninitialized array it which it is assumed to be
- Kris (17/20) Jun 27 2004 What I do to handle such issues is to check the array length only. See, ...
- Regan Heath (27/50) Jun 27 2004 I think Derek is thinking more of this other example he gave:
- Andy Friesen (14/22) Jun 27 2004 I think the problem is that D arrays almost always behave like reference...
- Farmer (12/20) Jun 27 2004 Yes, this is a problem. It is a necessary evil to archive that outstandi...
- Andy Friesen (33/52) Jun 27 2004 Conceptually they are. If the length is zero, then the data pointer is
- Derek Parnell (21/82) Jun 27 2004 Huh? There are times when a zero-length array is valid and an uninitaliz...
- Regan Heath (32/86) Jun 27 2004 D allows both empty arrays *and* null arrays.
- Andy Friesen (25/57) Jun 27 2004 D arrays are implement exactly so:
- Regan Heath (36/93) Jun 27 2004 I see what you're saying... the internal data pointer for the array can ...
- Andy Friesen (15/32) Jun 28 2004 You say that as though it is self-evident that strings must absolutely,
- Regan Heath (16/50) Jun 28 2004 I have not used C++ containers. I program in C for a living, and C++ for...
- Andy Friesen (18/31) Jun 28 2004 Yeah, it's called std::string, and it's more or less the default.
- Derek Parnell (8/50) Jun 28 2004 Agreed, D doesn't seem to work that way, but isn't that the issue. Some
- Bent Rasmussen (4/6) Jun 28 2004 I must say, I kind of like that. I don't have to write a read/write prop...
- Regan Heath (14/47) Jun 28 2004 And it's crap. IMNSHO.
- Andy Friesen (6/10) Jun 28 2004 That would work, but it might be better to adjust your thinking to match...
- Regan Heath (112/121) Jun 29 2004 You may be right, so in an effort to change my thinking, pls consider
- Charlie (7/129) Jun 29 2004 ---
- Regan Heath (7/176) Jun 29 2004 an empty char[] has a length of 0.
- Andy Friesen (20/31) Jun 29 2004 In this case, I would say that the best thing to do on failure is to
- Regan Heath (32/64) Jun 29 2004 Nope. This is taken from a real life example, I have a config file with ...
- Andy Friesen (21/39) Jun 29 2004 I guess it's just a matter of preference. I don't have a problem with
- Regan Heath (30/70) Jun 30 2004 It's more like:
- Andy Friesen (27/55) Jun 30 2004 I very much doubt this. Associative arrays maintain an internal list of...
- Regan Heath (39/94) Jun 30 2004 I agree totally. I am not disputing how an associative array works, what...
- Andy Friesen (30/42) Jun 30 2004 This could never work anyway. Types for which null does not make sense
- Regan Heath (29/70) Jun 30 2004 I think.. I agree. :)
- Andy Friesen (9/21) Jul 01 2004 Right, but Cmp functions return 0 to indicate equality, which would be
- Arcane Jill (13/20) Jun 30 2004 And indeed that very situation is ALSO true with integer parameters. How...
- Regan Heath (14/42) Jun 30 2004 Yep. As another poster noted he had the same problem with integers,
- Kevin Bealer (19/64) Jul 07 2004 The D equivalent might be to return int[] or char[][] y. Test if the le...
- Farmer (17/101) Jul 09 2004 Disagree. Returning an array for a single value confuses a programmer th...
- Arcane Jill (23/29) Jun 29 2004 You'll get no arguments from me there. D got it right in not having a st...
- Derek Parnell (22/51) Jun 29 2004 Because that's not what is being meant. I'd like to differentiate betwee...
- Arcane Jill (12/17) Jun 29 2004 Why?
- Sam McCall (16/29) Jun 29 2004 The difference is in C++ it's common to use a pointer to a class (and I
- Arcane Jill (8/12) Jun 29 2004 I do, but less frequently than this one as it's a slow turnover list. I ...
- Matthias Becker (14/20) Jun 29 2004 Nope, wrong.
- Sam McCall (4/12) Jun 29 2004 To request default behaviour a la optional arguments, without
- Regan Heath (8/34) Jun 29 2004 pls read my post (2 prior to this one - sorted flat and by date, it is a...
- Derek (16/38) Jun 29 2004 I don't use C++, so I'm not aware of what std::vector does or does not
- Arcane Jill (15/30) Jun 29 2004 I'd use two functions for this:
- Sam McCall (6/25) Jun 29 2004 Sure, but it sucks if there's a lot of them, and is impossible if the
- Carlos Santander B. (63/63) Jun 29 2004 "Arcane Jill" wrote in message
- Regan Heath (8/31) Jun 29 2004 Pls read the reply I just made to Andy's post that started this branch i...
- Sam McCall (59/82) Jun 29 2004 I'm still getting there... I still don't see why toUpper("hello") is
- Arcane Jill (55/73) Jun 29 2004 Maybe not, but you still need something to store them in. Even if you le...
- Sam McCall (54/127) Jun 29 2004 Sure, but given that the "user" shouldn't be touching chars without
- Arcane Jill (24/52) Jun 29 2004 Yes it does. Java chars operate in UTF-16. If you want to store the char...
- Sam McCall (55/111) Jun 29 2004 Whoops. Having never had to deal with this case (and taken a series of
- Arcane Jill (38/76) Jun 30 2004 I'm led to believe there was a lot of debate about this. Some folk said ...
- Sam McCall (24/102) Jun 30 2004 Sorry, I meant "if java had originally been defined to have char being
- Arcane Jill (31/43) Jun 30 2004 You weren't mistaken. You were spot on.
- Sam McCall (8/24) Jun 30 2004
- Bent Rasmussen (8/14) Jun 29 2004 That's true. In Standard ML you could do
- Sam McCall (20/34) Jun 29 2004 McCall's Law the First:
- Regan Heath (5/41) Jun 29 2004 I think the current value-type-kinda nature of arrays is good, it just
- Matthias Becker (13/29) Jun 29 2004 Why do you need to add member-functions to a string class, but you don't...
- Bent Rasmussen (32/36) Jun 29 2004 Perhaps,
- Farmer (13/18) Jun 29 2004 Arcane Jill wrote in
- Sam McCall (9/20) Jun 29 2004 We don't have array literals, so we can't do this:
- Farmer (12/33) Jun 30 2004 What's messy here?
- Matthias Becker (5/39) Jun 29 2004 Could you please make some real world examples, where you need empty str...
- Regan Heath (10/58) Jun 29 2004 Thus why I dont use references either when I need the ability to say it'...
- Sean Kelly (7/9) Jun 29 2004 Why? It seems to me that this behavior would also require arrays to be
- Farmer (17/29) Jun 29 2004 The .length parameter would still work with null-arrays (as they current...
- Sean Kelly (21/33) Jun 29 2004 Consider the following:
- Farmer (18/61) Jun 30 2004 I agree with you that this feature is quite useful.
- Regan Heath (32/97) Jun 30 2004 Provably correct. :)
- Farmer (2/2) Jul 01 2004 Sorry, I've posted rubbish.
- Sean Kelly (5/11) Jun 30 2004 I read it that the assertion requires either the length to be zero or th...
- Regan Heath (6/22) Jun 30 2004 It may very well allow it (in this code, at this level), but how do you ...
- Farmer (9/28) Jul 01 2004 I blush for shame, this is too embarrassing. What a whimp I am, I can't ...
- Regan Heath (9/22) Jun 29 2004 Nope. It already works, except for 2 inconsistencies (see the original
- Farmer (14/16) Jun 29 2004 Andy Friesen wrote in
- Andy Friesen (12/23) Jun 29 2004 They don't? Do you have a source to back that up? As far as I've ever
- Regan Heath (11/31) Jun 29 2004 Sure.. can you show me how. I am having trouble doing it, it must be my ...
- Farmer (22/50) Jun 30 2004 Sorry, my statement was badly expressed. I meant it more like "And proba...
- Bent Rasmussen (5/5) Jun 30 2004 I hope you're not referring to the quick hack I posted. It was meant to
- Farmer (9/17) Jul 01 2004 You suggested none's for int's but you don't use the term naive in your
- Regan Heath (13/66) Jun 30 2004 Was it me.. these links don't work for me :(
- Farmer (41/45) Jul 03 2004 Regan Heath wrote in
- Farmer (13/34) Jun 30 2004 An expression like
- Arcane Jill (19/20) Jun 27 2004 Actually, I think that D has got it right here. At least mostly. I'm hap...
- Regan Heath (46/79) Jun 27 2004 This (now?) works.
- Derek (6/12) Jun 27 2004 Agreed. A non-existant array is not the same as an array with no element...
- Arcane Jill (19/23) Jun 28 2004 Indeed, I think it has always worked. It was just me misremembering the ...
- Sean Kelly (12/19) Jun 28 2004 Yes it is. But I think it's the syntax that's the problem in this case....
- Andy Friesen (10/16) Jun 28 2004 Something which just occurred to me that would resolve this issue would
- Sean Kelly (5/9) Jun 28 2004 This might be very handy. If so, I wouldn't mind seeing rbegin and rend
- Sam McCall (6/18) Jun 29 2004 Huh? They're pointers... wouldn't rbegin == end and rend == begin?
- Sean Kelly (7/15) Jun 29 2004 It does apply to associative arrays IMO. I iterate through the contents...
- Sam McCall (10/30) Jun 29 2004 We're talking about pointers for low level iteration, this doesn't apply...
- Sean Kelly (8/13) Jun 30 2004 This is easy enough to do with free functions anyway. Something like:
- Farmer (7/11) Jun 28 2004 The expression cast(elementtype*)a+n , does that.
- Regan Heath (20/52) Jun 28 2004 Interestingly..
- Norbert Nemec (9/22) Jun 29 2004 No, I disagree here. In general, that address would point to nothing.
- Arcane Jill (7/14) Jun 29 2004 Such a pointer is never used for reading OR writing. It /is/, however, u...
- Norbert Nemec (3/8) Jun 29 2004 Well - that's a workaround but not a clean solution.
- Farmer (8/16) Jun 29 2004 [snip]
- Farmer (9/16) Jun 28 2004 Regan Heath wrote in
- Farmer (12/34) Jun 27 2004 I'm a bit confused, since in my sample, the array 'empty2' is created fr...
- Farmer (6/19) Jun 28 2004
Why are there (almost) no complaints about D's support for empty arrays?

Just to get ex-BASIC programmers in touch with this aspect of D arrays, here's a (not so) small D sample that shows how to create
a) null arrays (named: null1, null2, null3)
b) empty arrays (named: empty1, empty2, empty3)
and also shows how they differ.

[D arrays have sooooo obvious semantics that D programmers should feel free to skip to the end of this post and read the conclusion.]

--------------------- array sample code ---------------------
// Prints the result of four tests on one array:
// .length == 0, is null, == null, and == "".
void printTraits(char[] array, char[] name)
{
    printf("\n%10.*s%-13.*s", name, ".length == 0");
    if (array.length == 0)
        printf("%10.*s","is true");
    else
        printf("%10.*s","is false");

    printf("%10.*s%-13.*s", name, " is null");
    if (array is null)
        printf("%10.*s","is true");
    else
        printf("%10.*s","is false");

    printf("\n%10.*s%-13.*s", name, " == null");
    if (array == null)
        printf("%10.*s","is true");
    else
        printf("%10.*s","is false");

    printf("%10.*s%-13.*s", name, " == \"\"");
    if (array == "")
        printf("%10.*s","is true");
    else
        printf("%10.*s","is false");
}

int main(char args[][])
{
    // empty arrays: zero length, but created from existing data
    char[] empty1=(new char[1])[0..0];
    char[] empty2="1"[1..1]; // empty2="1"[2..2] causes ArrayBoundsError
    char[] empty3="";

    // null arrays
    char[] null1;
    char[] null2=new char[0];
    char[] null3=empty1;
    null3.length=0;

    printTraits(null1, "null1");
    printTraits(null2, "null2");
    printTraits(null3, "null3");
    printf("\n");
    printTraits(empty1, "empty1");
    printTraits(empty2, "empty2");
    printTraits(empty3, "empty3");
    printf("\n\n");

    if (null1 == null)
        printf("%20.*s","null1 == null ");
    if (empty1 == null1)
        printf("%20.*s","empty1 == null1 ");
    if (empty1 != null)
        printf("%20.*s","but empty1 != null");
    printf("\n");
    return 0;
}

Built with DMD 0.93 (Windows), the output is:

     null1.length == 0    is true     null1 is null        is true
     null1 == null        is true     null1 == ""          is true
     null2.length == 0    is true     null2 is null        is true
     null2 == null        is true     null2 == ""          is true
     null3.length == 0    is true     null3 is null        is true
     null3 == null        is true     null3 == ""          is true

    empty1.length == 0    is true    empty1 is null       is false
    empty1 == null       is false    empty1 == ""          is true
    empty2.length == 0    is true    empty2 is null       is false
    empty2 == null       is false    empty2 == ""          is true
    empty3.length == 0    is true    empty3 is null       is false
    empty3 == null       is false    empty3 == ""          is true

      null1 == null     empty1 == null1   but empty1 != null
--------------------- end of array sample ---------------------

Conclusion:
D does have empty-arrays and null-arrays but the language tries to blur them.

This is unfortunate as

1) a clear separation of empty-arrays vs. null-arrays is useful for feature-rich but simple API interfaces:
Imagine a function that returns the value of attributes of an XML element

char[] getAttrValue(char[] name)

The attribute value could be nonexistent (the attribute doesn't exist), be empty, or have a non-empty value. If empty-arrays vs. null-arrays are blurred, the interface gets more bloated:

// additional parameter
char[] getAttrValue(char[] name, out bit isNull)
// additional function, potentially wasting a slot in the VTable
bit hasAttrValue(char[] name)
// additional indirection
Attribute getAttribute(char[] name)

2) Initialization bugs are not detected at runtime.
D has
-null-references for objects
-null for pointers
-NaNs for FP types
-invalid characters for Unicode characters
-guaranteed initialization of structs (constructors are coming soon!)
-and strong typedefs that empower the programmer to define application-specific 'not-initialized' values for integer types
to make a ubiquitous source of bugs easy to spot and fix. But if empty/null arrays are commonly treated as being the same thing, uninitialized arrays will cause subtle bugs here and there.

3) This aspect of array behaviour is not obvious!
OK, what's obvious is always a moot point. (If I knew what's obvious, I would write posts about bit vs. bool vs. strong bool types.) But I know that the array behaviour is definitely not obvious to all D/C/C++ programmers.

So, why doesn't anyone complain?

Farmer.
Jun 27 2004
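A minimal sketch of the getAttrValue interface from point 1), assuming a current D compiler rather than DMD 0.93. The `attrs` table and the use of `string` instead of `char[]` are assumptions made for illustration and are not part of the original post. It relies on the identity test `is null`, which checks the data pointer, because `==` compares contents and therefore cannot tell a missing value from an empty one.

```d
import std.stdio;

// Hypothetical attribute store; its name and layout are assumptions.
string[string] attrs;

// Returns null when the attribute is missing, and a non-null
// (possibly zero-length) string when it is present.
string getAttrValue(string name)
{
    if (auto p = name in attrs)
        return *p;
    return null;
}

void main()
{
    attrs["id"] = "x1";
    attrs["class"] = "";   // present but empty: a "" literal has a non-null pointer

    foreach (name; ["id", "class", "style"])
    {
        auto v = getAttrValue(name);
        if (v is null)
            writeln(name, ": missing");
        else if (v.length == 0)
            writeln(name, ": empty");
        else
            writeln(name, ": ", v);
    }
}
```

The three-way result depends on callers never comparing the return value with `==`, which is exactly the blurring the post complains about.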
In article <Xns9515C8A3CA1ACitsFarmer 63.105.9.61>, Farmer says...

> Conclusion:
> D does have empty-arrays and null-arrays but the language tries to blur them.

Not really. I'd rather argue that D tries to make both usable and reduce odd errors resulting from uninitialized arrays.

> This is unfortunate as
> 1) a clear separation of empty-arrays vs. null-arrays is useful for feature-rich but simple API interfaces:
> Imagine a function that returns the value of attributes of an XML element
> char[] getAttrValue(char[] name)
> The attribute value could be nonexistent (the attribute doesn't exist), be empty, or have a non-empty value.

I'd say this is an interface or documentation problem, not a language problem.

> 2) Initialization bugs are not detected at runtime.

This makes sense in this case. I don't like the idea of having to distinguish between an initialized array with no elements and an uninitialized array, as both are equivalent IMO. Further, setting the length property will cause a reallocation for both types of arrays.

> to make a ubiquitous source of bugs easy to spot and fix. But if empty/null arrays are commonly treated as being the same thing, uninitialized arrays will cause subtle bugs here and there.

I believe the opposite would be true.

Sean
Jun 27 2004
Sean Kelly <sean f4.ca> wrote in news:cbn29h$rpo$1 digitaldaemon.com:

> Not really. I'd rather argue that D tries to make both usable and reduce odd errors resulting from uninitialized arrays.

I think D tries to *hide* errors resulting from uninitialized arrays.

>> This is unfortunate as
>> 1) a clear separation of empty-arrays vs. null-arrays is useful for feature-rich but simple API interfaces:
>> Imagine a function that returns the value of attributes of an XML element
>> char[] getAttrValue(char[] name)
>> The attribute value could be nonexistent (the attribute doesn't exist), be empty, or have a non-empty value.
> I'd say this is an interface or documentation problem, not a language problem.

You misunderstood me; I meant that the function interface is a good one. I could document the function like this:

/* Function returns the value of the attribute with the given name.
   param name   name of the attribute
   return       returns null if the attribute doesn't exist
                returns value of the attribute otherwise
*/
char[] getAttrValue(char[] name)

But the other functions I mentioned would be a necessary workaround if you couldn't distinguish between null and empty arrays. And these functions are a waste of both CPU cycles and developer brain.

>> 2) Initialization bugs are not detected at runtime.
> This makes sense in this case. I don't like the idea of having to distinguish between an initialized array with no elements and an uninitialized array, as both are equivalent IMO. Further, setting the length property will cause a reallocation for both types of arrays.

Well, it's quite easy to distinguish between an empty and a null array:
An uninitialized array (null array) is a bug in either the programmer's code or in the code of a library.
An initialized array (empty array) is a perfectly legal thing.

Why is the idea of distinguishing between a bug and correct program behaviour such an unpleasant thing?

Reallocation occurs if the length is greater than the allocated size. I'm fine with that; the length 'property' is such an oddity that whatever it does, I would call it consistent. Reallocation is guaranteed not to happen if the new length is less than or equal to the allocated size (Walter said so). Well, except when the new length happens to be 0. Talk about consistency.
Jun 27 2004
On Sun, 27 Jun 2004 22:55:46 +0000 (UTC), Farmer wrote:

> Sean Kelly <sean f4.ca> wrote in news:cbn29h$rpo$1 digitaldaemon.com:
>> Not really. I'd rather argue that D tries to make both usable and reduce odd errors resulting from uninitialized arrays.
> I think D tries to *hide* errors resulting from uninitialized arrays.
>>> This is unfortunate as
>>> 1) a clear separation of empty-arrays vs. null-arrays is useful for feature-rich but simple API interfaces:
>>> Imagine a function that returns the value of attributes of an XML element
>>> char[] getAttrValue(char[] name)
>>> The attribute value could be nonexistent (the attribute doesn't exist), be empty, or have a non-empty value.
>> I'd say this is an interface or documentation problem, not a language problem.
> You misunderstood me; I meant that the function interface is a good one. I could document the function like this:
> /* Function returns the value of the attribute with the given name.
>    param name   name of the attribute
>    return       returns null if the attribute doesn't exist
>                 returns value of the attribute otherwise
> */
> char[] getAttrValue(char[] name)
> But the other functions I mentioned would be a necessary workaround if you couldn't distinguish between null and empty arrays. And these functions are a waste of both CPU cycles and developer brain.
>>> 2) Initialization bugs are not detected at runtime.
>> This makes sense in this case. I don't like the idea of having to distinguish between an initialized array with no elements and an uninitialized array, as both are equivalent IMO. Further, setting the length property will cause a reallocation for both types of arrays.
> Well, it's quite easy to distinguish between an empty and a null array:
> An uninitialized array (null array) is a bug in either the programmer's code or in the code of a library.
> An initialized array (empty array) is a perfectly legal thing.

Well... the *use* of an uninitialized array in a place where it is assumed to be initialized is a bug. The fact, or presence, of an uninitialized array is in itself not really a bug. Also, the use of an empty array may well be a bug in other circumstances, even though it is 'a legal thing'.

> Why is the idea of distinguishing between a bug and correct program behaviour such an unpleasant thing?

It's not, and no one said it was. We are talking about distinguishing between an array that has not been set to anything specific *yet*, and one that has been set explicitly, through assignment, to contain zero elements. There is a timing issue here. For example, it might be prudent in some situations to only initialize an array if it's actually going to be used. This is a run-time decision and not a compile-time decision.

--
Derek
Melbourne, Australia
28/Jun/04 10:44:13 AM
Jun 27 2004
"Derek Parnell" <derek psych.ward> wrote:

> There is a timing issue here. For example, it might be prudent in some situations to only initialize an array if it's actually going to be used. This is a run-time decision and not a compile-time decision.

What I do to handle such issues is to check the array length only. See, even if the array is unallocated the length is still valid (because arrays are a pointer/length pair). If the length is zero, you move on. If not, then the pointer *should* be valid. That is, a length-check can perform double duty. For example:

void foo (char[] bar)
{
    // a null array has length zero, so this one test covers both cases
    if (bar.length)
        // do something
        ;
}

void main ()
{
    foo (null);
}

- Kris
Jun 27 2004
On Sun, 27 Jun 2004 18:09:05 -0700, Kris <someidiot earthlink.dot.dot.dot.net> wrote:

> "Derek Parnell" <derek psych.ward> wrote:
>> There is a timing issue here. For example, it might be prudent in some situations to only initialize an array if it's actually going to be used. This is a run-time decision and not a compile-time decision.
> What I do to handle such issues is to check the array length only. See, even if the array is unallocated the length is still valid (because arrays are a pointer/length pair). If the length is zero, you move on. If not, then the pointer *should* be valid. That is, a length-check can perform double duty. For example:
> void foo (char[] bar)
> {
>     if (bar.length)
>         // do something
>         ;
> }
> main ()
> {
>     foo (null);
> }

I think Derek is thinking more of this other example he gave:

if (a === null)
{
    // Initialize it
}
else
{
    if (a.length == 0)
    {
        // Empty situation. I DO NOT WANT TO INITIALIZE IT HERE!
    }
    else
    {
        // Use the non-empty array
    }
}

The array above is initialized if it's null. Otherwise it is handled based on whether it has items in it.

We need to be able to tell the difference between empty and null, and it needs to be consistent. The inconsistencies as I see them are:

empty array == null        //true
empty array == null array  //true

whereas both should be false.

No change needs to be made to the way the length property works; as you say, it's useful if you do not need to handle them differently.

Regan.

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 27 2004
Farmer wrote:

> Why are there (almost) no complaints about D's support for empty arrays?
>
> Conclusion:
> D does have empty-arrays and null-arrays but the language tries to blur them.
>
> This is unfortunate ...
>
> So, why doesn't anyone complain?

I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.

They aren't. null arrays *are* empty arrays.

Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.

So! Rules of thumb:
1) think of arrays as though they are value types which can be cheaply copied.
2) use .dup if you need to mutate copies made in this way. (the Copy-on-Write principle)

-- andy
Jun 27 2004
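The two rules of thumb above can be sketched as follows, assuming a current D compiler. The function names `clobberFirst` and `withFirst` are hypothetical and chosen only for this illustration. The first function mutates memory shared with the caller; the second follows the copy-on-write habit of calling .dup before writing.

```d
import std.stdio;

// The slice (pointer + length) is copied into the parameter,
// but the characters it points to are shared with the caller.
void clobberFirst(char[] s)
{
    if (s.length)
        s[0] = 'J';
}

// Copy-on-write style: duplicate first, then mutate the private copy.
char[] withFirst(char[] s, char c)
{
    char[] result = s.dup;
    if (result.length)
        result[0] = c;
    return result;
}

void main()
{
    char[] text = "hello".dup;            // mutable copy of the literal
    char[] changed = withFirst(text, 'H');
    writeln(text, " ", changed);          // text is untouched by withFirst

    clobberFirst(text[0 .. 3]);           // writes through the slice into text
    writeln(text);
}
```

Whether a function should copy or write in place is a convention between caller and callee; nothing in the array type itself records which one applies.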
Andy Friesen <andy ikagames.com> wrote in news:cbn3js$tgq$1 digitaldaemon.com:

> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.

Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.

> They aren't. null arrays *are* empty arrays.

No, null arrays are not empty arrays, as my sample proves.

> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.

I think there's a lapsus: slices *always* point to the same memory as the array from which they were created.

Regards,
Farmer.
Jun 27 2004
Farmer wrote:

> Andy Friesen <andy ikagames.com> wrote in news:cbn3js$tgq$1 digitaldaemon.com:
>> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
> Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.

Sure, in the same sense that D allows 'empty' integers. :)

>> They aren't. null arrays *are* empty arrays.
> No, null arrays are not empty arrays, as my sample proves.

Conceptually they are. If the length is zero, then the data pointer is meaningless. Testing the data pointer in such a case can be likened to using the result of a division by zero. Doing things like mathematically 'proving' that 3==5 or that empty!==null is easy when you go into the twilight zone. :)

As an example:

import std.string;

char[] permute(char[] c)
{
    // mutate that to which the array refers
    c[0] = 'H';
    // mutate the array
    c.length = 4;
    return c;
}

int main()
{
    char[] c = "hello world!";
    printf("%s\n", toStringz(c));
    char[] d = permute(c);
    printf("Post-permute\n");
    printf("%s\n", toStringz(c));
    printf("%s\n", toStringz(d));
    return 0;
}

This program produces the output:

hello world!
Hello world!
Hell

The array is a value type. The data it points to is not.

>> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
> I think there's a lapsus: slices *always* point to the same memory as the array from which they were created.

In my experience, this is true, but I don't know if it *must* be, so I felt obligated to qualify my statement.

-- andy
Jun 27 2004
On Sun, 27 Jun 2004 17:02:27 -0700, Andy Friesen wrote:

> Farmer wrote:
>> Andy Friesen <andy ikagames.com> wrote in news:cbn3js$tgq$1 digitaldaemon.com:
>>> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
>> Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.
> Sure, in the same sense that D allows 'empty' integers. :)
>>> They aren't. null arrays *are* empty arrays.
>> No, null arrays are not empty arrays, as my sample proves.
> Conceptually they are. If the length is zero, then the data pointer is meaningless. Testing the data pointer in such a case can be likened to using the result of a division by zero. Doing things like mathematically 'proving' that 3==5 or that empty!==null is easy when you go into the twilight zone. :)

Huh? There are times when a zero-length array is valid and an uninitialized array is not valid. They are simply not the same thing.

if (a === null)
{
    // Initialize it
}
else
{
    if (a.length == 0)
    {
        // Empty situation. I DO NOT WANT TO INITIALIZE IT HERE!
    }
    else
    {
        // Use the non-empty array
    }
}

> As an example:
> import std.string;
> char[] permute(char[] c)
> {
>     // mutate that to which the array refers
>     c[0] = 'H';
>     // mutate the array
>     c.length = 4;
>     return c;
> }
> int main()
> {
>     char[] c = "hello world!";
>     printf("%s\n", toStringz(c));
>     char[] d = permute(c);
>     printf("Post-permute\n");
>     printf("%s\n", toStringz(c));
>     printf("%s\n", toStringz(d));
>     return 0;
> }
> This program produces the output:
> hello world!
> Hello world!
> Hell
> The array is a value type. The data it points to is not.
>>> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
>> I think there's a lapsus: slices *always* point to the same memory as the array from which they were created.
> In my experience, this is true, but I don't know if it *must* be, so I felt obligated to qualify my statement.

Yes, it could be an artifact of the D compiler rather than the D language.

--
Derek
Melbourne, Australia
28/Jun/04 10:51:51 AM
Jun 27 2004
On Sun, 27 Jun 2004 17:02:27 -0700, Andy Friesen <andy ikagames.com> wrote:

> Farmer wrote:
>> Andy Friesen <andy ikagames.com> wrote in news:cbn3js$tgq$1 digitaldaemon.com:
>>> I think the problem is that D arrays almost always behave like reference types, and therefore are almost always treated like reference types.
>> Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.
> Sure, in the same sense that D allows 'empty' integers. :)

D allows both empty arrays *and* null arrays. It does *not* allow both empty *and* null integers. They are different and not comparable.

>>> They aren't. null arrays *are* empty arrays.
>> No, null arrays are not empty arrays, as my sample proves.
> Conceptually they are. If the length is zero, then the data pointer is meaningless.

I disagree. Conceptually they aren't the same, as both my example and Farmer's have proven for the case of a char array. Even with other array types there is still a conceptual difference between an array that does not exist and one containing no elements. In a large number of real-world cases you would treat the two the same, but that does not make them the same, and is no reason to preclude the ability to treat them differently.

Even in D's implementation they aren't exactly the same. Consider:

0) char[] a;
1) char[] b = "regan";
2) b = "";
3) b = null;

at 0 a's data pointer is null and length is zero
at 1 b's data pointer is non-null and length is 5
at 2 b's data pointer is non-null and length is 0

I am not 100% certain what happens at 3, either:
at 3 b's data pointer is null and length is 0
or
at 3 b's data pointer is non-null and length is 0

In either case 'a' (the null array) is not the same as 'b' when it is an empty array, and may not be even when 'b' is a null array.

> Testing the data pointer in such a case can be likened to using the result of a division by zero. Doing things like mathematically 'proving' that 3==5 or that empty!==null is easy when you go into the twilight zone. :)
> As an example:
> import std.string;
> char[] permute(char[] c)
> {
>     // mutate that to which the array refers
>     c[0] = 'H';
>     // mutate the array
>     c.length = 4;
>     return c;
> }
> int main()
> {
>     char[] c = "hello world!";
>     printf("%s\n", toStringz(c));
>     char[] d = permute(c);
>     printf("Post-permute\n");
>     printf("%s\n", toStringz(c));
>     printf("%s\n", toStringz(d));
>     return 0;
> }
> This program produces the output:
> hello world!
> Hello world!
> Hell
> The array is a value type. The data it points to is not.
>>> Arrays are value types which consist of a length and a pointer to memory. Copying and slicing an array creates a brand new array whose data happens to (generally) be memory that is also pointed to by another array.
>> I think there's a lapsus: slices *always* point to the same memory as the array from which they were created.
> In my experience, this is true, but I don't know if it *must* be, so I felt obligated to qualify my statement.

The simple fact remains that we require both null strings (and possibly other arrays) and empty strings, and that conceptually they are different, or rather they can mean different things and/or demand different behaviour.

All I'm advocating is that a test for null should not compare true for an empty array, and thus that a null array and an empty array should not compare equal.

Regan.

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 27 2004
Regan Heath wrote:

>>> Yes, this is a problem. It is a necessary evil to achieve that outstanding performance. But it is not really related to the topic of null array vs. empty array, since empty arrays are possible with the D array layout.
>> Sure, in the same sense that D allows 'empty' integers. :)
> D allows both empty arrays *and* null arrays. It does *not* allow both empty *and* null integers. They are different and not comparable.

D arrays are implemented exactly so:

struct Array
{
    int length;
    void* data;
}

Array a; // value type
int i;   // value type

'i' will never be null, and 'a' never will either, because both types exist on the stack. 'a' can be *compared* to null because an implicit pointer conversion is performed. However, if 'a' does not contain any data, its pointer value is meaningless, so the result of such a comparison is undefined.

Either way, 'a' itself is *not* null any more than 'i' ever could be.

(I'm not saying that this is how it should be, I'm just saying that this is how it is)

>>>> They aren't. null arrays *are* empty arrays.
>>> No, null arrays are not empty arrays, as my sample proves.
>> Conceptually they are. If the length is zero, then the data pointer is meaningless.
> I disagree. Conceptually they aren't the same, as both my example and Farmer's have proven for the case of a char array. Even with other array types there is still a conceptual difference between an array that does not exist and one containing no elements. In a large number of real-world cases you would treat the two the same, but that does not make them the same, and is no reason to preclude the ability to treat them differently.

This goes back to D performing implicit pointer conversion. Comparing arrays with null is not a good idea.

> The simple fact remains that we require both null strings (and possibly other arrays) and empty strings, and that conceptually they are different, or rather they can mean different things and/or demand different behaviour.
> All I'm advocating is that a test for null should not compare true for an empty array, and thus that a null array and an empty array should not compare equal.

I'm still forming an opinion on whether this is the right thing to do or not. If comparing arrays with pointers were illegal, this issue would never arise.

As for testing existence against emptiness, I suggest you do the same thing you would for an integer (or any other value type) for which nil and zero/empty/T.init must be distinguishable.

-- andy
Jun 27 2004
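As a rough illustration of the null versus empty question, here is a sketch assuming a current D compiler (the thread's compiler may have behaved differently, as the posts themselves report). It shows that `==` compares contents, while `is` compares the (pointer, length) pair; `isNullArray` is a hypothetical helper name, not a library function.

```d
// Hypothetical helper: true only when the array has a null data pointer.
bool isNullArray(const(char)[] s)
{
    return s is null;
}

void main()
{
    string missing = null;   // no data pointer at all
    string empty = "";       // zero length, but points at literal data

    // Both have no elements, so content comparison treats them as equal.
    assert(missing.length == 0 && empty.length == 0);
    assert(missing == empty);

    // Identity comparison looks at the pointer as well.
    assert(missing is null);
    assert(empty !is null);
    assert(missing.ptr is null);
    assert(empty.ptr !is null);

    assert(isNullArray(missing));
    assert(!isNullArray(empty));
}
```

Whether code should rely on that distinction is exactly what the thread is arguing about; the sketch only shows which operator answers which question.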
On Sun, 27 Jun 2004 19:15:10 -0700, Andy Friesen <andy ikagames.com> wrote:

> 'a' can be *compared* to null because an implicit pointer conversion is performed. However, if 'a' does not contain any data, its pointer value is meaningless, so the result of such a comparison is undefined.

I see what you're saying: the internal data pointer for the array can be null or non-null. However, this is the difference between an uninitialized (or null) array and an empty one. I don't care how we do it, I just know we need to be able to tell the difference for 'strings'. Perhaps this applies to all arrays. Perhaps strings need to be a specialized form of array...

> This goes back to D performing implicit pointer conversion. Comparing arrays with null is not a good idea.

Perhaps not, but there is currently no other way to tell the difference between an empty string and a null string. This is very important.

> I'm still forming an opinion on whether this is the right thing to do or not. If comparing arrays with pointers was illegal, this issue would never arise.

True, but then you wouldn't be able to tell null strings from empty ones.

> As for testing existence against emptiness, I suggest you do the same thing you would for an integer (or any other value type) for which nil and zero/empty/T.init must be distinguishable.

I suspect an array's .init property *is* null, in which case

    uint[] c;
    if (c == c.init)

is equivalent to

    if (c == null)

I was just recently told by Walter not to use the init value of an array. I was trying to re-init the array, i.e.

    uint[4] c = [0,1,2,3];
    c = c.init
    c[] = c.init;
    c[] = c[].init;

None of those work. Walter's solution:

    static uint[4] cinit = [0,1,2,3];
    uint[4] c;
    c[] = cinit[];

Why can't .init do this implicitly? For my original example it would create one static array, and my array called 'c', then set c.init to the static array, so that

    c = c.init;

would work. For an array that is not initialized, c.init can stay null, as

    c = c.init;

would then be equivalent to

    c = null;

Regan.
Jun 27 2004
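The workaround attributed to Walter above can be sketched as follows, assuming a current D compiler. In today's D, `.init` refers to the type's default value, not to the initializer a particular variable was declared with, so a saved copy of the defaults is still needed; `resetTo` is a hypothetical helper introduced only for this illustration.

```d
// Hypothetical helper: copy saved defaults back over a fixed-size array.
void resetTo(ref uint[4] target, const uint[4] defaults)
{
    target[] = defaults[];   // element-wise copy
}

void main()
{
    uint[] dyn;                      // dynamic array, default-initialized
    assert(dyn is null);             // its default value is the null array
    assert(dyn.length == 0);

    immutable uint[4] cinit = [0, 1, 2, 3];
    uint[4] c = cinit;
    c[1] = 99;                       // modify the working copy
    resetTo(c, cinit);               // restore the declared defaults
    assert(c == cinit);

    c = (uint[4]).init;              // the type's .init is all zeros
    foreach (x; c)
        assert(x == 0);
}
```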
Regan Heath wrote:

> I see what you're saying... the internal data pointer for the array can be null or non-null however, this is the difference between an un-initialized (or null) array and an empty one. I dont care how we do it, I just know we need to be able to tell the difference for 'strings'. Perhaps this applies to all arrays. Perhaps strings need to be a specialized form of array...

You say that as though it is self-evident that strings must absolutely, unequivocally be, at all costs, reference types. Why?

C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.

>> This goes back to D performing implicit pointer conversion. Comparing arrays with null is not a good idea.
> Perhaps not, but, there is currently no other way to tell the difference between an empty string and a null string. This is very important.

A 'null array' is a completely arbitrary concept that has been extrapolated from undefined behaviour. :)

(check the documentation concerning arrays. Nowhere does the concept of a null array appear. The only place the keyword 'null' even occurs is a blip which says that arrays are initialized with their data pointer set to null)

>> I'm still forming an opinion on whether this is the right thing to do or not. If comparing arrays with pointers was illegal, this issue would never arise.
> True, but then you wouldn't be able to tell null strings from empty ones.

Because there is no such thing. As far as D is concerned, all arrays exist. Some contain elements, others don't. Whether its data pointer is null or not does not set it apart from any other empty array.

-- andy
Jun 28 2004
On Mon, 28 Jun 2004 12:50:08 -0700, Andy Friesen <andy ikagames.com> wrote:

> You say that as though it is self-evident that strings must absolutely, unequivocably be, at all costs, reference types. Why?

If it's not a reference type, then how can you signal non-existence (null)?

> C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.

I have not used C++ containers. I program in C for a living, and C++ for a hobby. Is there a C++ container for strings that cannot tell the difference between non-existent and empty?

> A 'null array' is a completely arbitrary concept that has been extrapolated from undefined behaviour. :)

It may be undefined, but I believe it is required.

> (check the documentation concerning arrays. Nowhere does the concept of a null array appear. The only place the keyword 'null' even occurs is a blip which says that arrays are initialized with their data pointer set to null)

So it's undefined, let's define it.

>> True, but then you wouldn't be able to tell null strings from empty ones.
> Because there is no such thing.

Yes there is. The concept exists, in C and in our examples.

> As far as D is concerned, all arrays exist. Some contain elements, others don't. Whether its data pointer is null or not does not set it apart from any other empty array.

Yes it does. This behaviour exists, it's just currently undefined (as you say) and inconsistent (as Farmer has pointed out).

The solution IMO is either to make the current behaviour official and consistent, or to change the behaviour, make that official and provide another way to tell null apart from an empty string.

Regan.
Jun 28 2004
Regan Heath wrote:

> If it's not a reference type, then how can you signal non-existance (null)?

You don't.

> I have not used C++ containers. I program in C for a living, and C++ for a hobby. Is there a C++ container for strings that cannot tell the difference between non-existant and empty?

Yeah, it's called std::string, and it's more or less the default.

>> A 'null array' is a completely arbitrary concept that has been extrapolated from undefined behaviour. :)
> It may be undefined, but I believe it is required.

Why? C++ gets along without them just fine, and every C derivative I know of gets along fine without allowing primitive type returns to signify nonexistence. Functions which return structs cannot return null either.

> The soln IMO is either to make the current behaviour official and consistent, or to change the behaviour, make that official and provide another way to tell null apart from an empty string.

Farmer's test reports pretty consistent results if you suppose that comparing arrays to null is ill-formed:

    empty1.length == 0 is true
    empty1 == "" is true
    empty2.length == 0 is true
    empty2 == "" is true
    empty3.length == 0 is true
    empty3 == "" is true

Don't compare arrays to null. Don't try to differentiate between empty and nonexistent. D arrays simply do not work that way.

-- andy
Jun 28 2004
On Mon, 28 Jun 2004 16:33:25 -0700, Andy Friesen wrote:

> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent. D arrays simply do not work that way.

Agreed, D doesn't seem to work that way, but isn't that the issue? Some people would like to distinguish between an uninitialized array and an initialized but empty array.

-- Derek Melbourne, Australia 29/Jun/04 10:44:05 AM
Jun 28 2004
> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent. D arrays simply do not work that way.

I must say, I kind of like that. I don't have to write a read/write property where the write property has an in/out contract to guard against internal/external code setting an array member field to null -- goodbye bloat!
Jun 28 2004
On Mon, 28 Jun 2004 16:33:25 -0700, Andy Friesen <andy ikagames.com> wrote:

>> If it's not a reference type, then how can you signal non-existance (null)?
> You don't.

Thought so..

>> Is there a C++ container for strings that cannot tell the difference between non-existant and empty?
> Yeah, it's called std::string, and it's more or less the default.

And it's crap. IMNSHO.

> Why? C++ gets along without them just fine, and every C derivant I know of gets along fine without allowing primitive type returns to signify nonexistence. Functions which returns structs cannot return null either.

Which is why just about no-one ever does this (in C). They all return a pointer to a struct.

> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.

Fine and dandy EXCEPT we *need* to differentiate between empty and non-existent strings.

> D arrays simply do not work that way.

In that case we need an array specialisation for strings, so I'll have to write my own. This defeats the purpose of char[] in the first place, which was to be a better, more consistent string handling method than is possible in C/C++.

Regan.
Jun 28 2004
Regan Heath wrote:

> ... we need an array specialisation for strings, so I'll have to write my own. This defeats the purpose of char[] in the first place, which was, to be a better more consistent string handling method than in possible in c/c++.

That would work, but it might be better to adjust your thinking to match the language instead of trying to shoehorn the way you're used to thinking onto an abstraction that clearly wasn't built for it.

Don't think in Java/C++/etc. Think in D. :)

-- andy
Jun 28 2004
On Mon, 28 Jun 2004 22:54:23 -0700, Andy Friesen <andy ikagames.com> wrote:

> That would work, but it might be better to adjust your thinking to match the language instead of trying to shoehorn the way you're used to thinking onto an abstraction that clearly wasn't built for it. Don't think in Java/C++/etc. Think in D. :)

You may be right, so in an effort to change my thinking, please consider this...

    struct Item {
        char[] label;
        char[] value;
    }

    class Post {
        Item[] items;

        char[] getValue(char[] label) {
            foreach (Item i; items) {
                if (i.label == label)
                    return i.value;
            }
            //return null; not allowed
            return "";
        }
    }

Web page...

    <form post.. >
      <input type="text" name="foo" value="">
      <input type="text" name="bar" value="">
    </form>

Code to do something with the post:

    char[] s;
    Post p;

    s = p.getValue("foo");
    if (s) ..
    s = p.getValue("bar");
    if (s) ..

Right... If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value. So I have to add a function, something like

    class Post {
        bool isPresent(char[] label) {
            foreach (Item i; items) {
                if (i.label == label)
                    return true;
            }
            return false;
        }
    }

and in my code..

    if (p.isPresent("foo")) {
        s = p.getValue("foo");
        ..
    }

looks more complex. In addition I am searching for the label/value twice, doing twice the work. To avoid that I can add a parameter to the getValue function, i.e.

    class Post {
        char[] getValue(char[] label, out bool isNull) {
            foreach (Item i; items) {
                if (i.label == label)
                    return i.value;
            }
            //return null; not allowed
            isNull = true;
            return "";
        }
    }

then my code looks like...

    char[] s;
    bool isn;

    s = p.getValue("foo", isn);
    if (!isn) {
    }

More complex code again, and less obvious. A 3rd option springs to mind: instead of returning a char[] from getValue I could return existence and fill a passed char[], i.e.

    class Post {
        bool getValue(char[] label, out char[] value) {
            foreach (Item i; items) {
                if (i.label == label) {
                    value = i.value;
                    return true;
                }
            }
            return false;
        }
    }

so my code now looks like...

    char[] s;
    if (p.getValue("foo", s)) {
    }

This is perhaps the best solution so far. But! Let's consider if this were extended to get 2 or more char[] values (this is perfectly reasonable/likely: say they are loaded from a file, why process the file twice when you can do so once and get both values).

    bool getValue(out char[] val1, out char[] val2) {
    }

What do we return if val1 exists but val2 does not? A set of flags? Yuck.

It just seems to me that all this is done to emulate a reference type.. so why not have a reference type? We already have one, all it would take to make it consistent is 2 minor changes.

If you have a solution to the above that is as simple, elegant and easy to code as being able to return null.. please educate me.

Regan.
Jun 29 2004
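The third option from the post above (return existence, fill an `out` parameter) can be written out as a small runnable sketch, assuming a current D compiler and `string` in place of the thread's `char[]`. The `find` method is a hypothetical addition, not something proposed in the thread: it returns a pointer to the stored value, so a null pointer means "not present" while an empty string stays a valid value.

```d
struct Item
{
    string label;
    string value;
}

class Post
{
    Item[] items;

    // Option 3 from the post: presence is the return value.
    bool getValue(string label, out string value)
    {
        foreach (item; items)
        {
            if (item.label == label)
            {
                value = item.value;
                return true;
            }
        }
        return false;
    }

    // Hypothetical variant: a pointer can be null, the array need not be.
    string* find(string label)
    {
        foreach (ref item; items)
        {
            if (item.label == label)
                return &item.value;
        }
        return null;
    }
}

void main()
{
    auto p = new Post;
    p.items = [Item("foo", ""), Item("baz", "qux")];

    string s;
    assert(p.getValue("foo", s) && s.length == 0);  // present but empty
    assert(!p.getValue("bar", s));                  // not present

    assert(p.find("bar") is null);                  // not present
    assert(p.find("foo") !is null);                 // present
    assert((*p.find("foo")).length == 0);           // and empty
}
```

Either form does a single search per lookup, which addresses the "searching twice" concern raised for `isPresent`.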
---
s = p.getValue("foo");
if (s.length)
---

What's wrong with this way?

Charlie

In article <opsadsu8f75a2sq9 digitalmars.com>, Regan Heath says...

> If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value.
Jun 29 2004
On Wed, 30 Jun 2004 00:52:17 +0000 (UTC), Charlie <Charlie_member pathlink.com> wrote:

> ---
> s = p.getValue("foo");
> if (s.length)
> ---
> Whats wrong with this way ?

An empty char[] has a length of 0, so the above would not see an empty value passed in a form.

Regan.
Jun 29 2004
Regan Heath wrote:

> ... I could return existance and fill a passed char[]... so my code now looks like... char[] s; if (getValue("foo",s))

I like this. It's simple and obvious.

> if this were extended to get 2 or more char[] values... bool getValue(out char[] val1, out char[] val2) {}

In this case, I would say that the best thing to do on failure is to throw an exception. Asking for a number of values all at once looks (to me, anyhow) to be implying that you expect them all to be present.

If you don't, you'll have to test them all individually at some point anyway, in which case the previous form allows you to test and retrieve in one step.

It may also be useful to return all the attributes as an associative array. They're easy to mutate and iterate through.

> It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?

You got me there, but it seems to me that things could get very weird if you need to express a non-null array of 0 length.

> If you have a solution to the above that is both as simple, elegant and easy to code as being able to return null.. pls educate me.

Exposing POST data as an associative array seems like a win to me; it's faster and can be iterated over conveniently. Also, as a language intrinsic, it's a bit more likely to plug into other APIs easily.

If you *really* need to, you could probably get away with doing something like:

    const char[] nadda = "nadda";
    if (s is not nadda) { ... }

-- andy
Jun 29 2004
On Tue, 29 Jun 2004 19:26:22 -0700, Andy Friesen <andy ikagames.com> wrote:

>> ... I could return existance and fill a passed char[]... so my code now looks like... char[] s; if (getValue("foo",s))
> I like this. It's simple and obvious.

I agree.

>> if this were extended to get 2 or more char[] values... bool getValue(out char[] val1, out char[] val2) {}
> In this case, I would say that the best thing to do on failure is to throw an exception. Asking for a number of values all at once looks (to me, anyhow) to be implying that you expect them all to be present.

Nope. This is taken from a real life example: I have a config file with 10 different settings, all optional. I want 3 of them at this point in the code, so I process the file once and load the 3 settings, which may or may not be present, and may or may not have zero-length values.

> If you don't, you'll have to test them all individually at some point anyway

Yes, at that point I need to be able to tell if the setting was present, present with a zero-length value, or not present at all.

> , in which case the previous form allows you to test and retrieve in one step.

Which previous form? Do you mean the one that takes only one parameter? If so, that would involve parsing the file 3 times, which is not acceptable.

> It may also be useful to return all the attributes as an associative array. They're easy to mutate and iterate through.

It's the same problem all over again. Say I have:

    char[][char[]] list;
    char[] s1, s2, s3;

    fn(list);
    s1 = list["setting1"];
    s2 = list["setting2"];
    s3 = list["setting3"];

s3 needs to be null for setting3, s2 empty for setting2 and s1 "foobar" for setting1. I believe this is currently the case, but, as Farmer has shown, if I then went

    if (s2 == s3)

this would evaluate to true, and that's a problem.

>> It just seems to me, that all this is done to emulate a reference type.. so why not have a reference type?
> You got me there, but it seems to me that things could get very weird if you need to express a non-null array of 0 length.

    char[] s = "";

s is a non-null array of 0 length.

> Exposing POST data as an associative array seems like a win to me;

I agree, it's a more D thing to do also :) I believe the same problem still applies (see above).

> it's faster and can can be iterated over conveniently. Also, as a language intrinsic, it's a bit more likely to plug into other APIs easily. If you *really* need to, you could probably get away with doing something like: const char[] nadda = "nadda"; if (s is not nadda) { ... }

True, but this is yucky, and what if a setting actually had a value of "nadda"?

Regan.
Jun 29 2004
Regan Heath wrote:

> This is taken from a real life example, I have a config file with 10 different settings, all optional, I want 3 or them at this point in the code, so I process the file once and load the 3 settings which may or may not be present, and may or may not have a zero length values.

I guess it's just a matter of preference. I don't have a problem with something like this:

    char[][char[]] attribs = ...;
    if ("a" in attribs && "b" in attribs && "c" in attribs) {

If nonexistence is an alias for some default, fill the array before parsing the file. Attributes that are present will override those which are not.

Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist. I use this a lot.

>> things could get very weird if you need to express a non-null array of 0 length.
> char[] s = "" s is a non-null array of 0 length.

What about non-char types?

>> If you *really* need to, you could probably get away with doing something like: const char[] nadda = "nadda"; if (s is not nadda) { ... }
> True, but this is yucky and what if a setting actually had a value of "nadda"?

That's why you use 'is' and not ==. 'is' performs a pointer comparison. The array has to point into that exact string literal for the comparison to be true.

The only catch is string pooling. It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.

Come to think of it, this is better:

    char[] nonString = new char[1]; // don't mutate me! Just compare with 'is'!

I'm officially out of ideas now. heh.

-- andy
Jun 29 2004
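The associative-array approach discussed above can be sketched like this, assuming a current D compiler, where `in` yields a pointer to the stored value (or null) and associative arrays offer a `get` with a default, much like the Python method mentioned. The `Presence` enum and `classify` function are hypothetical names used only for illustration.

```d
enum Presence { missing, empty, filled }

// Hypothetical helper: distinguishes the three cases from the config example.
Presence classify(const string[string] attribs, string key)
{
    auto p = key in attribs;            // pointer to the value, or null
    if (p is null)
        return Presence.missing;
    return (*p).length == 0 ? Presence.empty : Presence.filled;
}

void main()
{
    string[string] attribs;
    attribs["setting1"] = "foobar";
    attribs["setting2"] = "";           // present, zero-length value
    // "setting3" is deliberately never stored

    assert(classify(attribs, "setting1") == Presence.filled);
    assert(classify(attribs, "setting2") == Presence.empty);
    assert(classify(attribs, "setting3") == Presence.missing);

    // A default for absent keys, without inserting anything.
    assert(attribs.get("setting3", "fallback") == "fallback");
    assert(attribs.get("setting2", "fallback") == "");
}
```

Here presence is carried by the pointer returned from `in`, so the value itself never needs to be compared against null.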
On Tue, 29 Jun 2004 22:35:28 -0700, Andy Friesen <andy ikagames.com> wrote:

> I guess it's just a matter of preference. I don't have a problem with something like this: char[][char[]] attribs = ...; if ("a" in attribs && "b" in attribs && "c" in attribs) {

It's more like:

    if ("a" in attribs) {
    }
    if ("b" in attribs) {
    }
    if ("c" in attribs) {
    }

but you seem to have completely ignored the fact that, *if* we remove the ability to return null when an array type is expected (you suggested removing the ability to assign null to an array, it's the same thing), the above will cease to work altogether, as I imagine the above is simply going

    if (attribs["a"] != null)

which is the same as

    char[] s;
    s = attribs["a"];
    if (s != null)

which is impossible if you cannot use null with arrays.

> If nonexistence is an alias for some default, fill the array before parsing the file. Attributes that are present will override those which are not. Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist. I use this a lot.

But if there is no default, you're left doing the nadda thing below, which is simply an ugly hack (explanation below).

> That's why you use 'is' and not ==. 'is' performs a pointer comparison. The array has to point into that exact string literal for the comparison to be true. The only catch is string pooling. It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.

Ahh, gotcha, so basically you're creating null with another name. Why not just have null? :)

> Come to think of it, this is better: char[] nonString = new char[1]; // don't mutate me! Just compare with 'is'!

Another face for the same entity, null.

> I'm officially out of ideas now. heh.

Think of it from the other point of view: assume we make the minor adjustments to arrays that I suggested. What effect does it have on the people who cannot see themselves needing a null array? Hmm.. I think none. IMO it simply gives us more flexibility of expression at no cost.

Regan.
Jun 30 2004
Regan Heath wrote:
> if ("a" in attribs) { ... } ... you seem to have completely ignored the fact that, *if* we remove the ability to return null when an array type is expected (you suggested removing the ability to assign null to an array, it's the same thing), the above will cease to work altogether as I imagine the above is simply going
>     if (attribs["a"] != null)

I very much doubt this. Associative arrays maintain an internal list of keys and values. In all likelihood, the 'in' operator hashes the key ("a" in this case) and searches through the associative array's internal hash table for one that matches.

>> If nonexistence is an alias for some default, fill the array before parsing the file. Attributes that are present will override those which are not. Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist. I use this a lot.
> but if there is no default, you're left doing the nadda thing below which is simply an ugly hack (explanation below)

Right. I am an idiot. (below)

>> That's why you use 'is' and not ==. 'is' performs a pointer comparison. The array has to point into that exact string literal for the comparison to be true. The only catch is string pooling. It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.
> ahh, gotcha, so basically you're creating null with another name. Why not just have null. :)

I was thinking about this, and the conclusion that I came to is that I am a complete idiot for not noticing what looked to be a completely arbitrary distinction with respect to comparing against null and comparing against any other pointer. After a tiny bit of testing, I came to the conclusion that I am an even bigger idiot than I could have possibly imagined. D already gets things pretty much bang on:

    T[] a, b;
    a = b;     // 'a == b' and 'a is b' will both be true. (even if b is null)
    a = b.dup; // 'a == b' will be true. 'a is b' will be true iff b is
               // null. (null.dup is null, evidently. funny that)

With respect to 'a == null', my mind is quite blown. Farmer's tests reliably produce situations where zero-length strings compare false against null. My own tests show that empty arrays are equivalent to null but do not share identity. Don't test x==null, I guess. :)

Explicitly testing for an empty, non-null array requires that you write 'if (x !== null && x.length == 0)', which is probably okay: I can envision hordes of new programmers going postal because of 'name != ""' and 'name.length == 0' somehow both evaluating to true at the same time.

 -- andy
Jun 30 2004
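A small sketch of the equality-versus-identity distinction described in the post above. It assumes a current D2 compiler, where the `===` and `!==` operators used in this thread have been replaced by `is` and `!is`; results on the 2004 compiler may well have differed, as the tests in this thread suggest. The helper name `isEmptyNotNull` is hypothetical.

```d
// Assumption: modern D2, where '===' / '!==' are spelled 'is' / '!is'.

// The explicit "empty but not null" test from the post, as a helper.
bool isEmptyNotNull(const(char)[] s)
{
    return s !is null && s.length == 0;
}

void main()
{
    string e = "";   // zero length, but the pointer refers to a literal
    string n;        // default-initialised: null pointer, zero length

    // Identity looks at the pointer as well as the length.
    assert(n is null);
    assert(e !is null);
    assert(e !is n);

    // Equality looks at length and contents only.
    assert(e == n);
    assert(e.length == 0 && n.length == 0);

    assert(isEmptyNotNull(e));
    assert(!isEmptyNotNull(n));
}
```

The point of the helper is that the pointer test has to come from `is`, since `==` cannot see the difference between the two values.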
On Wed, 30 Jun 2004 19:02:22 -0700, Andy Friesen <andy ikagames.com> wrote:
> Regan Heath wrote:
>> if ("a" in attribs) { ... } ... you seem to have completely ignored the fact that, *if* we remove the ability to return null when an array type is expected (you suggested removing the ability to assign null to an array, it's the same thing), the above will cease to work altogether as I imagine the above is simply going
>>     if (attribs["a"] != null)
> I very much doubt this. Associative arrays maintain an internal list of keys and values. In all likelihood, the 'in' operator hashes the key ("a" in this case) and searches through the associative array's internal hash table for one that matches.

I agree totally. I am not disputing how an associative array works, what I am saying is, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array. What does:
    if ("a" in attribs)
actually evaluate to, if not:
    if (attribs["a"] != null)
?

>>> If nonexistence is an alias for some default, fill the array before parsing the file. Attributes that are present will override those which are not. Python offers a get() method which takes two arguments: a key, and a default value which is returned should the key not exist. I use this a lot.
>> but if there is no default, you're left doing the nadda thing below which is simply an ugly hack (explanation below)
> Right. I am an idiot. (below)
>>> That's why you use 'is' and not ==. 'is' performs a pointer comparison. The array has to point into that exact string literal for the comparison to be true. The only catch is string pooling. It'd be okay as long as the string literal "nadda" isn't declared anywhere in the source code.
>> ahh, gotcha, so basically you're creating null with another name. Why not just have null. :)
> I was thinking about this, and the conclusion that I came to is that I am a complete idiot for not noticing what looked to be a completely arbitrary distinction with respect to comparing against null and comparing against any other pointer. After a tiny bit of testing, I came to the conclusion that I am an even bigger idiot than I could have possibly imagined. D already gets things pretty much bang on:
>     T[] a, b;
>     a = b;     // 'a == b' and 'a is b' will both be true. (even if b is null)
>     a = b.dup; // 'a == b' will be true. 'a is b' will be true iff b is
>                // null. (null.dup is null, evidently. funny that)
> With respect to 'a == null', my mind is quite blown. Farmer's tests reliably produce situations where zero-length strings compare false against null. My own tests show that empty arrays are equivalent to null but do not share identity. Don't test x==null, I guess. :)
> Explicitly testing for an empty, non-null array requires that you write 'if (x !== null && x.length == 0)', which is probably okay:

My tests, given:

    char[] e = ""
    char[] n;

output:

    e is ""      (f)
    n is ""      (f)
    e is null    (f)
    n is null    (t)
    e is n       (f)

    e == ""      (t)
    n == ""      (t)   incorrect?
    e == null    (f)
    n == null    (t)
    e == n       (t)   incorrect?

    e === ""     (f)
    n === ""     (f)
    e === null   (f)
    n === null   (t)
    e === n      (f)

The != and !== tests were all the opposite of the above, so I have not included them.

== calls opEquals, perhaps it has a shortcut in it which says if the lengths are both 0 return true? This would explain the two cases above I have marked "incorrect?". I think these two cases are inconsistent.

To reliably test for nullness I can use '===' or '!==' or 'is'.

> I can envision hordes of new programmers going postal because of 'name != ""' and 'name.length == 0' somehow both evaluating to true at the same time.

Yeah.. to stop that name.length would have to have a NaN (null) value. Which 'int' or 'uint' does not have.

Regan.

-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 30 2004
Regan Heath wrote:
> I am not disputing how an associative array works, what I am saying is, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array. What does:
>     if ("a" in attribs)
> actually evaluate to, if not:
>     if (attribs["a"] != null)

This could never work anyway. Types for which null does not make sense obviously can't use null to indicate nonexistence. Types for which null does make sense can't do this either, as it makes perfect sense to store a null reference.

The fundamental idea is that you're trying to represent a "nonvalue", which is storable in the result variable, but not part of the variable's range. This obviously won't work, as it requires two contradictory ideas to be simultaneously true. Adding a 'special' value like null is sometimes close enough for specific application domains, but, in the end, all you're doing is making the range of allowable values bigger.

> == calls opEquals, perhaps it has a shortcut in it which says if the lengths are both 0 return true? this would explain the two cases above I have marked "incorrect?". I think these two cases are inconsistent.

Looking at internal/adi.d, it looks like it compares the lengths, then compares each element in succession:

    extern (C) int _adEq(Array a1, Array a2, TypeInfo ti)
    {
        if (a1.length != a2.length)
            return 0;               // not equal
        int sz = ti.tsize();
        //printf("sz = %d\n", sz);
        void *p1 = a1.ptr;
        void *p2 = a2.ptr;
        for (int i = 0; i < a1.length; i++)
        {
            if (!ti.equals(p1 + i * sz, p2 + i * sz))
                return 0;           // not equal
        }
        return 1;                   // equal
    }

How on Earth ""!=null ever comes about is beyond me.

 -- andy
Jun 30 2004
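To make the point about the quoted `_adEq` concrete, here is a hedged restatement of the same length-then-elements logic as an ordinary function, assuming a current D2 compiler. It is not the runtime's code; `sameContents` is a hypothetical name, and it is restricted to character arrays for brevity. Because the loop never runs when both lengths are zero, a routine of this shape cannot distinguish "" from null.

```d
// Hypothetical helper mirroring the logic described above:
// compare lengths first, then each element in turn.
bool sameContents(const(char)[] a, const(char)[] b)
{
    if (a.length != b.length)
        return false;               // different lengths: not equal
    foreach (i; 0 .. a.length)
    {
        if (a[i] != b[i])
            return false;           // first differing element: not equal
    }
    return true;                    // same length, same elements
}

void main()
{
    const(char)[] e = "";
    const(char)[] n = null;

    // Both lengths are 0, so the element loop is skipped entirely.
    assert(sameContents(e, n));

    // The pointers still differ, which only an identity test can see.
    assert(e !is n);

    assert(!sameContents("a", "b"));
    assert(!sameContents("a", ""));
}
```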
On Wed, 30 Jun 2004 22:40:28 -0700, Andy Friesen <andy ikagames.com> wrote:
> Regan Heath wrote:
>> I am not disputing how an associative array works, what I am saying is, without the ability to compare an array to null, you cannot express 'does not exist' in terms of an associative array. What does:
>>     if ("a" in attribs)
>> actually evaluate to, if not:
>>     if (attribs["a"] != null)
> This could never work anyway. Types for which null does not make sense obviously can't use null to indicate nonexistence. Types for which null does make sense can't do this either, as it makes perfect sense to store a null reference.

Yeah... you're right.

> The fundamental idea is that you're trying to represent a "nonvalue", which is storable in the result variable, but not part of the variable's range. This obviously won't work, as it requires two contradictory ideas to be simultaneously true. Adding a 'special' value like null is sometimes close enough for specific application domains, but, in the end, all you're doing is making the range of allowable values bigger.

I think.. I agree. :)

>> == calls opEquals, perhaps it has a shortcut in it which says if the lengths are both 0 return true? this would explain the two cases above I have marked "incorrect?". I think these two cases are inconsistent.
> Looking at internal/adi.d, it looks like it compares the lengths, then compares each element in succession:

I went looking for that (not hard enough obviously)..

>     extern (C) int _adEq(Array a1, Array a2, TypeInfo ti)
>     {
>         if (a1.length != a2.length)
>             return 0;               // not equal
>         int sz = ti.tsize();
>         //printf("sz = %d\n", sz);
>         void *p1 = a1.ptr;
>         void *p2 = a2.ptr;
>         for (int i = 0; i < a1.length; i++)
>         {
>             if (!ti.equals(p1 + i * sz, p2 + i * sz))
>                 return 0;           // not equal
>         }
>         return 1;                   // equal
>     }
> How on Earth ""!=null ever comes about is beyond me.

below _adEq is..

    extern (C) int _adCmp(Array a1, Array a2, TypeInfo ti)
    {
        int len;
        //printf("adCmp()\n");
        len = a1.length;
        if (a2.length < len)
            len = a2.length;
        int sz = ti.tsize();
        void *p1 = a1.ptr;
        void *p2 = a2.ptr;
        for (int i = 0; i < len; i++)
        {
            int c;
            c = ti.compare(p1 + i * sz, p2 + i * sz);
            if (c)
                return c;
        }
        return cast(int)a1.length - cast(int)a2.length;
    }

which would return 0 if both lengths were 0. "" and null both have a length of 0.

Regan.

-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 30 2004
Regan Heath wrote:
> On Wed, 30 Jun 2004 22:40:28 -0700, Andy Friesen <andy ikagames.com> wrote:
>> How on Earth ""!=null ever comes about is beyond me.
> _adEq is..
>     extern (C) int _adCmp(Array a1, Array a2, TypeInfo ti) { [....] }
> which would return 0 if both lengths were 0. "" and null both have a length of 0.

Right, but Cmp functions return 0 to indicate equality, which would be the right thing in this case.

My money says the cause is in that inline-assembly-optimized _adCmpChar. (line 360)

I freely admit that I blame it on the inline assembly because me and assembly have not been on speaking terms for some time now. (one too many hand-coded alpha-blits that lost to MSVC's optimizing compiler)

 -- andy
Jul 01 2004
In article <opsadsu8f75a2sq9 digitalmars.com>, Regan Heath says...
> s = p.getValue("foo");
> if (s) ..
> s = p.getValue("bar");
> if (s) ..
>
> Right... If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value.

And indeed that very situation is ALSO true with integer parameters. How can you tell the difference between an integer parameter being present and zero, and no integer parameter being present at all?

But of course, there are various solutions to this problem, many much simpler than you propose. For a start, you could return an int* instead of an int, or indeed a char[]* instead of a char[]. Then you could explicitly test for ===null in both cases.

In C++, I'd just return a std::pair<bool, T>. I'm sure that once we have a good supply of standard templates in D we'll be able to do much the same thing. (Even without templates, you could define a struct and return it).

Anything wrong with either of these approaches?

Arcane Jill
Jun 30 2004
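A rough sketch of the "define a struct and return it" idea from the post above, assuming a current D2 compiler. The names `Maybe` and `lookup` and the settings table are invented for illustration and do not come from any library. Current Phobos also ships `std.typecons.Nullable`, which serves a similar purpose.

```d
// Hypothetical pair of "is it there?" plus the value, in the spirit
// of std::pair<bool, T>.
struct Maybe(T)
{
    bool present;   // true if a value was found
    T value;        // only meaningful when present is true
}

// Hypothetical lookup that distinguishes "present and zero" from "absent".
Maybe!int lookup(int[string] settings, string key)
{
    if (auto p = key in settings)
        return Maybe!int(true, *p);
    return Maybe!int(false, 0);
}

void main()
{
    int[string] settings = ["foo": 0];

    auto foo = lookup(settings, "foo");
    auto bar = lookup(settings, "bar");

    assert(foo.present && foo.value == 0);   // passed, and zero
    assert(!bar.present);                    // not passed at all
}
```

The same wrapper works for any T, including char[], which is what makes it attractive for value types that have no spare "null" state.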
On Wed, 30 Jun 2004 07:27:33 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:In article <opsadsu8f75a2sq9 digitalmars.com>, Regan Heath says...Yep. As another poster noted he had the same problem with integers, resulting in him using a value of -1 to represent null. Yuck.s = p.getValue("foo"); if (s) .. s = p.getValue("bar"); if (s) .. Right... If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value.And indeed that very situation is ALSO true with integer parameters. How can tell the difference between an integer parameter being present and zero, and no integer parameter being present at all?But of course, there are various solutions to this problem, many much simpler than you propose. For a start, you could return an int* instead of an int, or indeed a char[]* instead of a char[]. Then you could explicitly test for ===null in both cases.This is the C solution. For int I cannot think of a good D solution. For char[] (or any array) we already have one, the array emulates/acts like a reference type, it's just inconsistent.In C++, I'd just return a std::pair<bool, T>. I'm sure that once we have a good supply of standard templates in D we'll be able to do much the same thing. (Even without templates, you could define a struct and return it).You're emulating a reference type, why not just have one. This may be the best soln for int and other strict value types.Anything wrong with either of these approaches?Yep. Neither is as simple, elegant or clean as a reference type, which we already have in D arrays albeit inconsistently. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 30 2004
In article <opsafg63rh5a2sq9 digitalmars.com>, Regan Heath says...On Wed, 30 Jun 2004 07:27:33 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:The D equivalent might be to return int[] or char[][] y. Test if the length is zero. If it's not, then the data is "present". Otherwise it is missing. For the HTML parsing example given in this thread, this may be even better because sometimes HTML has multiple values with the same tag. Another data point: I've also used the technique listed below (pair<bool, T>), albeit wrapped in a template class. The code is very readable. KevinIn article <opsadsu8f75a2sq9 digitalmars.com>, Regan Heath says...Yep. As another poster noted he had the same problem with integers, resulting in him using a value of -1 to represent null. Yuck.s = p.getValue("foo"); if (s) .. s = p.getValue("bar"); if (s) .. Right... If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value.And indeed that very situation is ALSO true with integer parameters. How can tell the difference between an integer parameter being present and zero, and no integer parameter being present at all?But of course, there are various solutions to this problem, many much simpler than you propose. For a start, you could return an int* instead of an int, or indeed a char[]* instead of a char[]. Then you could explicitly test for ===null in both cases.This is the C solution. For int I cannot think of a good D solution. For char[] (or any array) we already have one, the array emulates/acts like a reference type, it's just inconsistent.In C++, I'd just return a std::pair<bool, T>. I'm sure that once we have a good supply of standard templates in D we'll be able to do much the same thing. (Even without templates, you could define a struct and return it).You're emulating a reference type, why not just have one. 
This may be the best soln for int and other strict value types.Anything wrong with either of these approaches?Yep. Neither is as simple, elegant or clean as a reference type, which we already have in D arrays albeit inconsistently. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 07 2004
Kevin Bealer <Kevin_member pathlink.com> wrote in news:cci0tl$2dnl$1 digitaldaemon.com:In article <opsafg63rh5a2sq9 digitalmars.com>, Regan Heath says...Disagree. Returning an array for a single value confuses a programmer that didn't bother to fully read the function's documentation. (Don't blame the programmer, in most cases the documentation doesn't exist, anyway.)On Wed, 30 Jun 2004 07:27:33 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:The D equivalent might be to return int[] or char[][] y. Test if the length is zero. If it's not, then the data is "present". Otherwise it is missing.In article <opsadsu8f75a2sq9 digitalmars.com>, Regan Heath says...Yep. As another poster noted he had the same problem with integers, resulting in him using a value of -1 to represent null. Yuck.s = p.getValue("foo"); if (s) .. s = p.getValue("bar"); if (s) .. Right... If I cannot return null, then (using the code above) I cannot tell the difference between whether foo or bar was passed or had an empty value.And indeed that very situation is ALSO true with integer parameters. How can tell the difference between an integer parameter being present and zero, and no integer parameter being present at all?But of course, there are various solutions to this problem, many much simpler than you propose. For a start, you could return an int* instead of an int, or indeed a char[]* instead of a char[]. Then you could explicitly test for ===null in both cases.This is the C solution. For int I cannot think of a good D solution. For char[] (or any array) we already have one, the array emulates/acts like a reference type, it's just inconsistent.For the HTML parsing example given in this thread, this may be even better because sometimes HTML has multiple values with the same tag. Another data point: I've also used the technique listed below (pair<bool, T>), albeit wrapped in a template class. 
The code is very readable.Since the code is very readable, why do you argue that the D-way would be something different, then? [No need to answer this, I already know one good answer.] What do you mean by 'albeit wrapped in a template class'? Do you wrap 'pair<bool, T>' into your own templated class to provide an isNull() method? I like the pair<bool, T> solution best. It expresses the meaning of the returned value precisely and can be generically applied to all types. Still, in some cases using reference typec (e.g.null-arrays) is a simpler and faster solution. Farmer.KevinIn C++, I'd just return a std::pair<bool, T>. I'm sure that once we have a good supply of standard templates in D we'll be able to do much the same thing. (Even without templates, you could define a struct and return it).You're emulating a reference type, why not just have one. This may be the best soln for int and other strict value types.Anything wrong with either of these approaches?Yep. Neither is as simple, elegant or clean as a reference type, which we already have in D arrays albeit inconsistently. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 09 2004
In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...
>> Yeah, it's called std::string, and it's more or less the default.
> And it's crap. IMNSHO.

You'll get no arguments from me there. D got it right in not having a string class. I didn't think that at first, but I've come round to the D way of thinking. The problem with a string class is that you can't add new member functions to it. (Oh, you may be able to subclass String, if it's not final. Oh wait - it /is/ final in Java). With char[] arrays, you CAN add new functions.

Besides which, what else can a char[] array possibly represent, other than a string? (Given that a char[] array MUST contain UTF-8, I mean). It's not the same as a byte[] array, which could mean anything.

>> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.
> Fine and dandy EXCEPT we *need* to differentiate between empty and non-existent strings.

Why? Do we also need a way to differentiate between empty and non-existent ints? In D, there is no such thing as a non-existent int; there is no such thing as a non-existent struct; and there is no such thing as a non-existent string. Why not just start from the assumption that we DON'T need to differentiate between empty and non-existent strings, and take it from there?

Maybe the real solution would be to make it a compile error to assign an array with null, or to compare it with null. This would then force people to say what they mean, and all such problems would go away. (Anyway, you all KNOW my opinion that should be a compile-time error anyway, because char[] is not boolean. But that's another story).

Jill
Jun 29 2004
On Tue, 29 Jun 2004 07:18:20 +0000 (UTC), Arcane Jill wrote:
> In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...
>>> Yeah, it's called std::string, and it's more or less the default.
>> And it's crap. IMNSHO.
> You'll get no arguments from me there. D got it right in not having a string class. I didn't think that at first, but I've come round to the D way of thinking. The problem with a string class is that you can't add new member functions to it. (Oh, you may be able to subclass String, if it's not final. Oh wait - it /is/ final in Java). With char[] arrays, you CAN add new functions. Besides which, what else can a char[] array possibly represent, other than a string? (Given that a char[] array MUST contain UTF-8, I mean). It's not the same as a byte[] array, which could mean anything.
>>> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.
>> Fine and dandy EXCEPT we *need* to differentiate between empty and non-existent strings.
> Why? Do we also need a way to differentiate between empty and non-existent ints? In D, there is no such thing as a non-existent int; there is no such thing as a non-existent struct; and there is no such thing as a non-existent string. Why not just start from the assumption that we DON'T need to differentiate between empty and non-existent strings, and take it from there?

Because that's not what is being meant. I'd like to differentiate between INITIALIZED and UNINITIALIZED vectors. This non-existent thing is a red herring. 'empty' means initialized and length of zero. 'non-existent' means not initialized yet.

It's a workaround for the current (longer) way of handling this situation. It's no big deal but it would be 'nice to have'. Like a strict bool type would be nice to have.

-- Derek Melbourne, Australia 29/Jun/04 6:24:19 PM
Jun 29 2004
In article <cbr9e5$vai$1 digitaldaemon.com>, Derek Parnell says...
> Because that's not what is being meant. I'd like to differentiate between INITIALIZED and UNINITIALIZED vectors.

Why? D's dynamic arrays are the same thing as C++ std::vectors (as I'm sure you realize). In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?

> This non-existent thing is a red herring. 'empty' means initialized and length of zero. 'non-existent' means not initialized yet.

Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow uninitialized array handles (as opposed to array content) to exist in D. It makes no sense.

Please, can someone who is arguing in favor of allowing a distinction between initialized and uninitialized dynamic array handles, explain exactly why you want such a distinction to exist?

Arcane Jill
Jun 29 2004
Arcane Jill wrote:
> In article <cbr9e5$vai$1 digitaldaemon.com>, Derek Parnell says...
>> Because that's not what is being meant. I'd like to differentiate between INITIALIZED and UNINITIALIZED vectors.
> Why? D's dynamic arrays are the same thing as C++ std::vectors (as I'm sure you realize).

The difference is in C++ it's common to use a pointer to a class (and I presume, a vector). In D, an array is a struct, not a class, so to get reference semantics you have to use a struct pointer. In C++ this would be no big deal, but this doesn't seem like the D way. Reference semantics allow me to change the length of an array and have it reflected in the caller, and to store nulls.

> In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?

For the same reason you use null in other situations with reference types. I want accessing an uninitialised member array to give an error. I want to be able to use a null argument to a function to trigger special or default behaviour (optional arguments in any position).

Sam

PS: AJ, I'm not sure if you read the forums at dsource, I posted a couple of deimos bugs: http://dsource.org/forums/viewtopic.php?t=224
Jun 29 2004
In article <cbrgtm$19gj$1 digitaldaemon.com>, Sam McCall says...
> PS: AJ, I'm not sure if you read the forums at dsource,

I do, but less frequently than this one as it's a slow turnover list. I get notified when new posts are added to existing threads, but not when new threads are added.

> I posted a couple of deimos bugs: http://dsource.org/forums/viewtopic.php?t=224

Okay, I'm on it. I'll let you know when they're fixed. Maybe we could start a "Bugs" thread on Deimos. That way I'll always get notified when anyone adds to it.

Jill
Jun 29 2004
>> In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?
> For the same reason you use null in other situations with reference types. I want accessing an uninitialised member array to give an error. I want to be able to use a null argument to a function to trigger special or default behaviour (optional arguments in any position).

Nope, wrong. If you use reference types that are allowed to be NULL (in C++ references aren't; in Nice, for example, there are references that aren't, too) you want to show that there possibly is no object. At least in languages that allow you to use other kinds of references (e.g. C++ or Nice as mentioned above). In languages that don't have references that can't be null, you just can't express yourself in the code.

In C++ I never had the wish to pass a container/collection as a pointer. I always pass them as a C++ reference. So I'm sure there always is a collection and I don't have to check for this. If there are no values to pass in, I just pass an empty collection.

Could you please make some example where it makes sense not to pass a collection instead of passing an empty collection?

-- Matthias Becker
Jun 29 2004
Matthias Becker wrote:
> In C++ I never had the wish to pass a container/collection as a pointer. I always pass them as a C++ reference. So I'm sure there always is a collection and I don't have to check for this. If there are no values to pass in, I just pass an empty collection. Could you please make some example where it makes sense not to pass a collection instead of passing an empty collection?

To request default behaviour a la optional arguments, without restrictions on the number or position of the arguments.

Sam
Jun 29 2004
On Tue, 29 Jun 2004 15:58:29 +0000 (UTC), Matthias Becker <Matthias_member pathlink.com> wrote:pls read my post (2 prior to this one - sorted flat and by date, it is a response to Andy's post) it contains an example. I would like some feedback on how to achieve what I want to do... Regan.Nope, wrong. If you use reference-types that are allowed to be NULL (in C++ references aren't, e.g. in nice there are references, that aren't, too, ...) you want to show that there possibly is no object. At least in languages that allow you to use other kinds of references (e.g. C++ or nixe as mentiond above). In languages that don't have references that can't be null, you just can't express yourself in the code. In C++ I never had the wish to pass a container/collection as a pointer. I allways pass them as C++-reference. So I'm sure there allways is a collection and I don't have to check for this. If there are no values to pass in, I just pass an empty collection. Could you please make some example where it makes sense not to pass a collection instead of passing an empty collection?In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?For the same reason you use null in other situations with reference types. I want accessing an uninitialised member array to give an error. I want to be able to use a null argument to a function to trigger special or default behaviour (optional arguments in any position).-- Matthias Becker-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 29 2004
On Tue, 29 Jun 2004 09:50:35 +0000 (UTC), Arcane Jill wrote:
> In article <cbr9e5$vai$1 digitaldaemon.com>, Derek Parnell says...
>> Because that's not what is being meant. I'd like to differentiate between INITIALIZED and UNINITIALIZED vectors.
> Why? D's dynamic arrays are the same thing as C++ std::vectors (as I'm sure you realize). In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?

I don't use C++, so I'm not aware of what std::vector does or does not provide.

Ok, off the top of my head... I'm writing a library that will be used by other coders. It has a function that accepts a dynamic array. A zero-length array is a valid parameter. The caller however can pass an uninitialized parameter to tell my function that the user wishes to use the default values instead of supplying a value.

In short, an uninitialized variable contains information - namely the fact that it *is* uninitialized. And that information could be utilized by a coder - if they had the chance.

>> This non-existent thing is a red herring. 'empty' means initialized and length of zero. 'non-existent' means not initialized yet.
> Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow uninitialized array handles (as opposed to array content) to exist in D. It makes no sense.

Ok, but it does to me. Sorry I can't seem to be able to explain why.

> Please, can someone who is arguing in favor of allowing a distinction between initialized and uninitialized dynamic array handles, explain exactly why you want such a distinction to exist?

Apparently not; sorry.

-- Derek Melbourne, Australia
Jun 29 2004
In article <12vwf4nkzjzxa.17ai9mojp3dpz$.dlg 40tude.net>, Derek says...
> Ok, off the top of my head... I'm writing a library that will be used by other coders. It has a function that accepts a dynamic array. A zero-length array is a valid parameter. The caller however can pass an uninitialized parameter to tell my function that the user wishes to use the default values instead of supplying a value.

I'd use two functions for this:

..but only if an empty array was NOT the default. In many cases, I could probably get away with an empty array BEING the default, in which case, I could simply do:

> In short, an uninitialized variable contains information - namely the fact that it *is* uninitialized.

It's a nice argument, but it could be applied equally well to ANY types. If I were supremely in favor of the notion that uninitializedness carries information (which I'm not), I might argue as follows:

    I'm writing a library that will be used by other coders. It has a function that accepts a bit. Zero is a valid parameter. The caller however can pass an uninitialized parameter to tell my function that the user wishes to use the default values instead of supplying a value.

If I believed that, I'd be arguing for a distinction between an uninitialized bit, and a bit containing zero. I happen not to believe that, however.

>> Why would ANYONE want to allow uninitialized array handles (as opposed to array content) to exist in D. It makes no sense.
> Ok, but it does to me. Sorry I can't seem to be able to explain why.

Yeah, human language is a bummer. Someone ought to invent telepathy.

Jill
Jun 29 2004
Arcane Jill wrote:

> In article <12vwf4nkzjzxa.17ai9mojp3dpz$.dlg 40tude.net>, Derek says...
>> Ok, off the top of my head... I'm writing a library that will be used by other coders. It has a function that accepts a dynamic array. A zero-length array is a valid parameter. The caller however can pass an uninitialized parameter to tell my function that the user wishes to use the default values instead of supplying a value.
>
> I'd use two functions for this: ..but only if an empty array was NOT the default. In many cases, I could probably get away with an empty array BEING the default, in which case, I could simply do:

Sure, but it sucks if there's a lot of them, and is impossible if the function is variadic. The ability to pass null to a function is very useful, I've switched from structs to classes more than once for this reason.

Sam
Jun 29 2004
"Arcane Jill" <Arcane_member pathlink.com> wrote in message news:cbre1b$15j0$1 digitaldaemon.com
|
| ...
|
| Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow
| uninitialized array handles (as opposed to array content) to exist in D. It
| makes no sense.
|
| Please, can someone who is arguing in favor of allowing a distinction between
| initialized and uninitialized dynamic array handles, explain exactly why you want
| such a distinction to exist?
|
|
| Arcane Jill

Regan already said why:

"Regan Heath" <regan netwin.co.nz> wrote in message news:opr99w0st25a2sq9 digitalmars.com
|
| ...
|
| We *need* to have *both* null and empty arrays. The reason is pretty
| simple:
| - null means does not exist
| - empty means exists, but has no value (or empty value)
|
| This is important in situations like the original poster mentioned and in
| my experience for example... When reading POST input from a web page, you
| get a string like so:
|
| Setting1=Regan+Heath&Setting2=&&
|
| when requesting items you might have a function like:
|
| char[] getFormValue(char[] label);
|
| the code to get the values for the above form might go:
|
| char[] s;
|
| s = getFormValue("Setting1"); // s is "Regan Heath"
| s = getFormValue("Setting2"); // s is ""
| s = getFormValue("Setting3"); // s is null
|
| It is important the above code can tell that Setting3 was not passed in
| the form, so it can decide not to overwrite whatever current value that
| setting has, whereas it can tell Setting2 was passed and will overwrite
| the current value with a new blank one.
|
| ...
|

Personally, I would use an associative array to represent such a thing (instead of using a function), but it's an implementation difference, and the language should let Regan do it the way he wants. I've run into such cases before ("" !== null), I know that. I just can't remember any of them right now :D

Two more things: I don't think this should only be for strings, but for any array. And I'm 100% sure this has been raised before.

-----------------------
Carlos Santander Bernal
Jun 29 2004
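A rough sketch of the associative-array alternative Carlos mentions, in present-day D. `parseForm` is a hypothetical helper, not a library function, and it assumes the simplest possible decoding (only `+` is turned into a space; percent-escapes are ignored). The point is only that the `in` operator separates "key absent" from "key present with an empty value" without relying on null arrays.

```d
import std.algorithm.searching : findSplit;
import std.array : replace, split;

// Hypothetical helper: parses "a=b&c=d" into an associative array.
// Only '+' is decoded; percent-escapes are ignored in this sketch.
string[string] parseForm(string query)
{
    string[string] form;
    foreach (pair; query.split("&"))
    {
        if (pair.length == 0)
            continue;                 // "&&" yields empty pieces; skip them
        auto parts = pair.findSplit("=");
        form[parts[0]] = parts[2].replace("+", " ");
    }
    return form;
}

void main()
{
    auto form = parseForm("Setting1=Regan+Heath&Setting2=&&");

    assert(form["Setting1"] == "Regan Heath");
    assert(("Setting2" in form) !is null);   // sent, with an empty value
    assert(form["Setting2"].length == 0);
    assert(("Setting3" in form) is null);    // never sent
}
```

Whether this is nicer than returning null from `getFormValue` is a matter of taste; it simply moves the "does it exist?" question from the array to the container.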
On Tue, 29 Jun 2004 09:50:35 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:

> In article <cbr9e5$vai$1 digitaldaemon.com>, Derek Parnell says...
>> Because that's not what is being meant. I'd like to differentiate between INITIALIZED and UNINITIALIZED vectors.
>
> Why? D's dynamic arrays are the same thing as C++ std::vectors (as I'm sure you realize). In C++, there is no such thing as an uninitialized vector. Why on Earth would you want them in D?
>
>> This non-existent thing is a red herring. 'empty' means initialized and length of zero. 'non-existent' means not initialized yet.
>
> Yeah - but nobody has yet answered WHY? Why would ANYONE want to allow uninitialized array handles (as opposed to array content) to exist in D. It makes no sense. Please, can someone who is arguing in favor of allowing a distinction between initialized and uninitialized dynamic array handles, explain exactly why you want such a distinction to exist?

Pls read the reply I just made to Andy's post that started this branch in this thread, i.e. just go up a little bit in a threaded reader, or look for the post I made just prior to this one if viewing flat and sorting by date.

Regan.

--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 29 2004
Arcane Jill wrote:

> In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...
>>> Yeah, it's called std::string, and it's more or less the default.
>>
>> And it's crap. IMNSHO.
>
> You'll get no arguments from me there. D got it right in not having a string class. I didn't think that at first, but I've come round to the D way of thinking.

I'm still getting there... I still don't see why toUpper("hello") is better than "hello".toUpper(), under the assumption that the OO way has any merit. (If it doesn't, why do we have it?)

> The problem with a string class is that you can't add new member functions to it. (Oh, you may be able to subclass String, if it's not final. Oh wait - it /is/ final in Java). With char[] arrays, you CAN add new functions.

I'm confused: is there a way of adding functions to array types that can't be used with classes?

> Besides which, what else can a char[] array possibly represent, other than a string? (Given that a char[] array MUST contain UTF-8, I mean). It's not the same as a byte[] array, which could mean anything.

In theory you're right. The problem is when people assume "a char array is a list of characters", which is perfectly logical, given the name. In theory, you should only store a list of characters in a dchar[]. But it's not going to happen, see std.string.maketrans (char[] is a list) and translate (char[] is opaque).

[RANT]
IMO, D (language, not libraries) isn't _really_ trying to be fully-unicode at all. What is the purpose of a char/wchar variable? How often do you actually need to be directly manipulating UTF8/16 fragments? (Hint: in a unicode-based language with good libraries, almost never).

*IF* D is going to be fully-unicode, that does have performance impacts. A single character must _always_ go in a dchar variable. So what is the advantage in having strings being char[] arrays? ("knowing the encoding" doesn't count, the user shouldn't have to care).

IMO, strings NEED to:
* Have only one type, or one base type. I want to write a function that accepts a string. I don't want to write three functions, or use a template (that has to be manually instantiated).
* Expose character data as _characters_, not fragments. This means characters accessed must be dchars, indexing must be character, not fragment-based.
* Be efficient in the common case. At the moment, this probably means using UTF-8 internally. This could be changed in the future, or there could be multiple versions with the same base type, because all character data would be exposed at the character level.
* Be fully reference types. At the moment, if someone passes in a string, I can modify its data, which is shared, and its length, which is not. This makes sense if you understand the implementation, but why should foo~="bar" have the truly odd effects it does? Always passing strings inout is ugly and confusing in other cases.

Based on this, the solution to me looks like a String interface that exposes character data, and UTF8String as the default implementation, which stores its data in a ubyte[]; literal strings would create these. There could then be a UTF32String implementation which would be more efficient for various other languages.

The "char" type should be 32 bits wide. Anything else is confusing. (Hey, they did it with "int"...).
[/RANT]

Now flame on, I'm sure that's not going to be too popular ;-)

>>> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.
>>
>> Fine and dandy EXCEPT we *need* to differentiate between empty and non-existent strings.
>
> Why? Do we also need a way to differentiate between empty and non-existent ints?

Frankly, yes, I use -1 as a "magic value" all the time, and do all sorts of ugly things when negative numbers are perfectly valid. This is necessary for pragmatic reasons of efficiency, I'd love chips to treat 0x8000... as NaN like the NaN we have in IEEE floating point. (This'd also balance the range of integers). I'm not saying we can/should change the behaviour of ints, just that I don't think this argument has merit.

I think arrays should become fully reference types, for the same reason as strings above. Yes, this would probably mean double indirection, arrays would be a pointer to the (length,data pointer) struct that they currently are.

Sam
Jun 29 2004
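The "data is shared, length is not" behaviour Sam complains about can be seen in a few lines. This is a sketch in present-day D, where the `inout` parameter class mentioned in the post is spelled `ref`; the function names are made up for illustration.

```d
// Hypothetical function names, for illustration only.
void appendByValue(int[] a)
{
    a[0] = 99;   // writes through the shared data pointer: caller sees this
    a ~= 4;      // changes only the local (pointer, length) pair
}

void appendByRef(ref int[] a)
{
    a ~= 4;      // caller's array is updated
}

void main()
{
    int[] x = [1, 2, 3];

    appendByValue(x);
    assert(x[0] == 99);       // element change is visible
    assert(x.length == 3);    // the append is not

    appendByRef(x);
    assert(x == [99, 2, 3, 4]);
}
```

The array value passed to a function is a copy of the (length, pointer) pair, so element writes reach the caller's data while length changes stay local unless the parameter is `ref`.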
In article <cbrfn4$1805$1 digitaldaemon.com>, Sam McCall says...

> [RANT] IMO, D (language, not libraries) isn't _really_ trying to be fully-unicode at all. What is the purpose of a char/wchar variable? How often do you actually need to be directly manipulating UTF8/16 fragments? (Hint: in a unicode-based language with good libraries, almost never).

Maybe not, but you still need something to store them in. Even if you let a library do all your UTF-8 work for you (which you should), then you still need a type designed to contain such sequences. In D, a char array is that type. In other words, the type char exists in order that the type char[] might exist. I don't have a problem with that.

> *IF* D is going to be fully-unicode, that does have performance impacts. A single character must _always_ go in a dchar variable. So what is the advantage in having strings being char[] arrays?

Space.

> ("knowing the encoding" doesn't count, the user shouldn't have to care).

In a strongly typed language, that would be true, but D is not a strongly typed language. Walter is on record as stating that all char types including dchar can be freely used as integers. If that's going to be true, you MUST care about the encoding.

> IMO, strings NEED to:
> * Have only one type, or one base type.

And, to take that reasoning further, it should have other interesting properties too, like it should be IMPOSSIBLE IN ANY CIRCUMSTANCE to end up with a char containing a value outside the range U+000000 to U+10FFFF inclusive. However, I don't see this happening in D. The reason being that even a dchar is not a character in the Unicode sense. It is a UTF-32 encoding of a character. (The minor technical difference being that dchar values above 0x10FFFF exist, but are invalid, whereas Unicode characters beyond U+10FFFF do not even exist).

> * Expose character data as _characters_, not fragments. This means characters accessed must be dchars, indexing must be character, not fragment-based.

That depends on your point of view. Unicode may be viewed on many levels. I'm sure I could hold a reasonable argument in which I insisted that string data should be exposed as _glyphs_, not characters (characters are, after all, merely glyph fragments). Glyphs are what you see. If a string contains an e-acute glyph, should your application really /care/ which characters compose that glyph?

Somewhere along the line, you have to face the bottom level. That level is the level of character encoding. Language support is given to the encoding level. For anything above that, you use libraries. If such libraries don't exist yet, we can write them.

> The "char" type should be 32 bits wide. Anything else is confusing.

21 bits wide, and limited to the range 0-0x10FFFF. Anything else is confusing. But this is D, and D is practical.

> Now flame on, I'm sure that's not going to be too popular ;-)

Actually, I loved it, and I'm not flaming (and I hope nobody does). You've made some excellent observations. But it's way too late to shape D that way now. In the future, there may well be languages which handle characters as true, pure, Unicode characters, but the world isn't fully Unicode-aware yet.

To give an example of what I mean: Suppose you publish a web site containing a few musical symbols and a few exotic math symbols. (All valid Unicode). The sad fact is, such a website won't display properly on most people's browsers. To get them to display properly, it is currently the responsibility of VIEWERS (rather than publishers) of web sites, to "obtain", somehow, the relevant fonts to make it work. Usually, obtaining such fonts costs money, so who's going to bother? It'd be like buying a book and opening it to find half the characters looking like black blobs until you pay more money to a font-designer. And so, web site designers tend NOT to use such characters on their web sites, preferring gif images which everyone can view. It's a vicious circle.

In short, the world is not Unicode yet, and it's frustrating. Bits of it are still trying to catch up with other bits. Sometimes you just want to scream at the planet to get its act together right now. But we have to be realistic.

And realistically, things /are/ changing - but slowly. What D is doing is moving in the right direction. The shift to full Unicode support in all things is a long way off yet, and to get there, we must move in small steps.

Defining a char as a UTF-8 fragment may be a small step, but it is a very important and valuable one. At least we don't say "a char is a character in some unspecified encoding", like some other languages do.

Nice post, by the way. I enjoyed reading it.

Jill
Jun 29 2004
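To make the "a char is a UTF-8 fragment" point concrete, here is a small sketch in present-day D (where string literals are `string`, i.e. immutable char arrays). It relies only on the language's built-in decoding in `foreach`; the sample text is arbitrary.

```d
void main()
{
    // 'é' is U+00E9, which UTF-8 encodes as the two code units C3 A9.
    string s = "h\u00E9llo";

    assert(s.length == 6);              // length counts chars (code units)

    size_t characters = 0;
    foreach (dchar c; s)                // foreach decodes to code points
        ++characters;
    assert(characters == 5);

    assert(cast(ubyte) s[1] == 0xC3);   // indexing yields a fragment...
    assert(cast(ubyte) s[2] == 0xA9);   // ...not the character U+00E9
}
```

So `s.length` and `s[i]` work at the encoding level, while iterating with a `dchar` loop variable works at the character level, which is roughly the split between language and library that Jill describes.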
Arcane Jill wrote:

> In article <cbrfn4$1805$1 digitaldaemon.com>, Sam McCall says...
>> [RANT] IMO, D (language, not libraries) isn't _really_ trying to be fully-unicode at all. What is the purpose of a char/wchar variable? How often do you actually need to be directly manipulating UTF8/16 fragments? (Hint: in a unicode-based language with good libraries, almost never).
>
> Maybe not, but you still need something to store them in. Even if you let a library do all your UTF-8 work for you (which you should), then you still need a type designed to contain such sequences. In D, a char array is that type. In other words, the type char exists in order that the type char[] might exist. I don't have a problem with that.

Sure, but given that the "user" shouldn't be touching chars without realising that they're more complicated than in C, byte[] would do? Still, I'm not fussed about this.

>> *IF* D is going to be fully-unicode, that does have performance impacts. A single character must _always_ go in a dchar variable. So what is the advantage in having strings being char[] arrays?
>
> Space.

Sorry, I didn't mean char[] as opposed to dchar[], I meant char[] as opposed to something more opaque. The reasoning for not having a string class, IIRC, is "strings are lists of characters". Well, chars aren't characters.

>> ("knowing the encoding" doesn't count, the user shouldn't have to care).
>
> In a strongly typed language, that would be true, but D is not a strongly typed language. Walter is on record as stating that all char types including dchar can be freely used as integers. If that's going to be true, you MUST care about the encoding.

If you don't use them as integers, then you don't have to care. I'm not saying it shouldn't be well-defined, but Java doesn't require the user to understand the intricacies of unicode encodings to manipulate strings. (Yes, Java has efficiency problems with strings and presumably some problems with wide unicode characters due to a 16 bit char type, but I think that still makes sense).

>> IMO, strings NEED to:
>> * Have only one type, or one base type.
>
> And, to take that reasoning further, it should have other interesting properties too, like it should be IMPOSSIBLE IN ANY CIRCUMSTANCE to end up with a char containing a value outside the range U+000000 to U+10FFFF inclusive. However, I don't see this happening in D. The reason being that even a dchar is not a character in the Unicode sense. It is a UTF-32 encoding of a character. (The minor technical difference being that dchar values above 0x10FFFF exist, but are invalid, whereas Unicode characters beyond U+10FFFF do not even exist).

Okay, I didn't realise dchars were 21 bits wide... if there's a way of doing this that's efficient, that'd be cool, dchar (or "char") could be 21 bits. If it's going to be hopelessly slow, you have to trust the programmer to some extent, what about "any library operation involving an out-of-range dchar is undefined"?

> That depends on your point of view. Unicode may be viewed on many levels. I'm sure I could hold a reasonable argument in which I insisted that string data should be exposed as _glyphs_, not characters (characters are, after all, merely glyph fragments). Glyphs are what you see. If a string contains an e-acute glyph, should your application really /care/ which characters compose that glyph?

Probably not, although if reading an encoded string and then writing it again doesn't produce the same byte-output, I'm sure I could find a contrived example... copy-pasting text invalidating a digital signature? Either would be much better than what we've got now, and I think character is more likely (though still spectacularly unlikely), because it has an obvious, efficient representation (32 bit unsigned number). Am I right in assuming a glyph can be fairly complicated?

> Somewhere along the line, you have to face the bottom level. That level is the level of character encoding. Language support is given to the encoding level. For anything above that, you use libraries. If such libraries don't exist yet, we can write them.

Yeah. It's just a bit disappointing after hearing "Strings are character arrays and everything about them makes sense" to realise that you either have to grok UTF-N or treat these "characters" as opaque... the advantages over a class are gone, and a class has reference semantics and member functions.

>> The "char" type should be 32 bits wide. Anything else is confusing.
>
> 21 bits wide, and limited to the range 0-0x10FFFF. Anything else is confusing.

It clearly is, because I assumed a unicode character was 32 bits wide, on the basis that that's what D had taught me :-\

> But this is D, and D is practical.

If it's going to be horribly inefficient to make it 21 bits, have the spec say "it's at least 21 bits" and alias it to uint.

> Actually, I loved it, and I'm not flaming (and I hope nobody does). You've made some excellent observations. But it's way too late to shape D that way now. In the future, there may well be languages which handle characters as true, pure, Unicode characters, but the world isn't fully Unicode-aware yet.

Yeah, it's the partly-there that's frustrating... my selfish side would be happy with just ASCII ;-). It just seems sometimes that if it's not easy and consistent to make things unicode-friendly, it won't happen. Especially in places where ASCII works fine, that's certainly easy and consistent! The current way seems to suggest that officially it's all unicode and happy, but (don't tell anyone) feel free to use ascii and assume chars are characters if you want. The standard library even does this, in std.string no less.

> To give an example of what I mean: Suppose you publish a web site containing a few musical symbols and a few exotic math symbols. (All valid Unicode). The sad fact is, such a website won't display properly on most people's browsers. To get them to display properly, it is currently the responsibility of VIEWERS (rather than publishers) of web sites, to "obtain", somehow, the relevant fonts to make it work. Usually, obtaining such fonts costs money, so who's going to bother? It'd be like buying a book and opening it to find half the characters looking like black blobs until you pay more money to a font-designer. And so, web site designers tend NOT to use such characters on their web sites, preferring gif images which everyone can view. It's a vicious circle.

Yeah, fonts are a problem. My ideal world would have a (huge!) complete system default font (or one each for serif, sans, and mono) supplied with the OS, that would be the fallback for nonexistent characters.

> And realistically, things /are/ changing - but slowly. What D is doing is moving in the right direction. The shift to full Unicode support in all things is a long way off yet, and to get there, we must move in small steps.

Yes. What gets me is that in 5 years we'll (hopefully) be far enough down the unicode road that D's approach will seem backward, and I'll have to wait for someone to reinvent a similar language, with a more thorough unicode integration. Ah well, maybe we'll get a strong boolean next time <g>

> Defining a char as a UTF-8 fragment may be a small step, but it is a very important and valuable one. At least we don't say "a char is a character in some unspecified encoding", like some other languages do.

Yeah, definitely. I just wish it was easier to use and harder to ignore.

Sam
Jun 29 2004
In article <cbs0bj$1vhf$1 digitaldaemon.com>, Sam McCall says...

> I'm not saying it shouldn't be well-defined, but Java doesn't require the user to understand the intricacies of unicode encodings to manipulate strings.

Yes it does. Java chars operate in UTF-16. If you want to store the character U+012345 in a Java string, you need to worry about UTF-16.

> Probably not, although if reading an encoded string and then writing it again doesn't produce the same byte-output, I'm sure I could find a contrived example... copy-pasting text invalidating a digital signature?

That's what normalization is for. We'll have that soon in a forthcoming version of etc.unicode.

> Am I right in assuming a glyph can be fairly complicated?

Very much so. Especially if you're a font designer, since Unicode allows you to munge any two glyphs together into a bigger glyph (a ligature). In practice, fonts only provide a small subset of all possible ligatures (as you can imagine!).

> Yeah. It's just a bit disappointing after hearing "Strings are character arrays and everything about them makes sense" to realise that you either have to grok UTF-N or treat these "characters" as opaque... the advantages over a class are gone, and a class has reference semantics and member functions.

Not really. So long as you remember that characters <= 0x7F are OK in a char, and that characters <= 0xFFFF are fine in a wchar, you're sorted.

> Yeah, it's the partly-there that's frustrating... my selfish side would be happy with just ASCII ;-). It just seems sometimes that if it's not easy and consistent to make things unicode-friendly, it won't happen.

Right, but it's a question of where that support comes from. To demand it all of the language itself is asking /a lot/ from poor old Walter. If we can add it, piece by piece, in libraries, I'd say we're not doing too badly.

> Especially in places where ASCII works fine, that's certainly easy and consistent! The current way seems to suggest that officially it's all unicode and happy, but (don't tell anyone) feel free to use ascii

It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8 was designed that way.

> and assume chars are characters if you want. The standard library even does this, in std.string no less.

So long as they make no assumptions about characters > 0x7F, that's perfectly reasonable.

> Yeah, fonts are a problem. My ideal world would have a (huge!) complete system default font (or one each for serif, sans, and mono) supplied with the OS, that would be the fallback for nonexistent characters.

I absolutely agree. There are free fonts which do this, but they don't display well at small point-size because of something called "hinting", which apparently you can't do without paying someone royalties because of some stupid IP nonsense.

> Yes. What gets me is that in 5 years we'll (hopefully) be far enough down the unicode road that D's approach will seem backward, and I'll have to wait for someone to reinvent a similar language, with a more thorough unicode integration.

Yup. That's the way it goes. So what else shall we imagine for D++?

Jill
Jun 29 2004
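Jill's U+012345 example, and the remark that valid ASCII is also valid UTF-8, can both be checked directly. A minimal sketch in present-day D follows; the surrogate values are worked out by hand from the UTF-16 rules (subtract 0x10000, split into two 10-bit halves), not taken from the thread.

```d
void main()
{
    // The same single character, U+012345, in D's three string types.
    string  s8  = "\U00012345";     // UTF-8
    wstring s16 = "\U00012345"w;    // UTF-16
    dstring s32 = "\U00012345"d;    // UTF-32

    assert(s8.length == 4);         // four UTF-8 code units
    assert(s16.length == 2);        // a surrogate pair
    assert(s32.length == 1);        // one code unit per character

    // 0x12345 - 0x10000 = 0x2345; high = 0xD800 + (0x2345 >> 10),
    // low = 0xDC00 + (0x2345 & 0x3FF).
    assert(s16[0] == 0xD808 && s16[1] == 0xDF45);

    // Pure ASCII text: one char per character, every unit below 0x80.
    string ascii = "hello world";
    assert(ascii.length == 11);
    foreach (char c; ascii)
        assert(c <= 0x7F);
}
```

This is also why "characters <= 0xFFFF are fine in a wchar" stops short of the whole code space: anything above that needs two wchars, exactly as in Java.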
Arcane Jill wrote:

> In article <cbs0bj$1vhf$1 digitaldaemon.com>, Sam McCall says...
>> I'm not saying it shouldn't be well-defined, but Java doesn't require the user to understand the intricacies of unicode encodings to manipulate strings.
>
> Yes it does. Java chars operate in UTF-16. If you want to store the character U+012345 in a Java string, you need to worry about UTF-16.

Whoops. Having never had to deal with this case (and taken a series of CS courses where we've iterated over chars countless times and they never mentioned this once :-\) I hadn't thought about this. Okay, suppose Java had a 21- or 32-bit char type.

>> Probably not, although if reading an encoded string and then writing it again doesn't produce the same byte-output, I'm sure I could find a contrived example... copy-pasting text invalidating a digital signature?
>
> That's what normalization is for. We'll have that soon in a forthcoming version of etc.unicode.

Of course... so no, the program shouldn't care, but...

>> Am I right in assuming a glyph can be fairly complicated?
>
> Very much so. Especially if you're a font designer, since Unicode allows you to munge any two glyphs together into a bigger glyph (a ligature). In practice, fonts only provide a small subset of all possible ligatures (as you can imagine!).

Glyphs aren't really a practical option as the logical element type of strings if they can't be easily represented as a fixed-width number, I'd imagine.

>> Yeah. It's just a bit disappointing after hearing "Strings are character arrays and everything about them makes sense" to realise that you either have to grok UTF-N or treat these "characters" as opaque... the advantages over a class are gone, and a class has reference semantics and member functions.
>
> Not really. So long as you remember that characters <= 0x7F are OK in a char, and that characters <= 0xFFFF are fine in a wchar, you're sorted.

But you can't do obvious "list-of-characters" things like index by character or even slice at any offset.

>> Yeah, it's the partly-there that's frustrating... my selfish side would be happy with just ASCII ;-). It just seems sometimes that if it's not easy and consistent to make things unicode-friendly, it won't happen.
>
> Right, but it's a question of where that support comes from. To demand it all of the language itself is asking /a lot/ from poor old Walter. If we can add it, piece by piece, in libraries, I'd say we're not doing too badly.

A decent unicode string class could be almost entirely library based, and would only require a little magic language support (for string literals). I might have a play around with one, on the assumption that if people find it useful, the horribly inefficient/incorrect bits could be fixed by people who know what they're doing ;)

> It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8 was designed that way.

So this means a char[] has two purposes depending on the app? On the one hand, ASCII/Unicode being a per-app decision is fair enough. On the other hand, that's not what it looked like to me in the docs, and I still think unicode should be the "default". Also, if people are going to use char[] as ASCII, they may write libraries that assume char[] is ASCII or worse, "a character in some unknown encoding".

>> and assume chars are characters if you want. The standard library even does this, in std.string no less.
>
> So long as they make no assumptions about characters > 0x7F, that's perfectly reasonable.

If it were documented as only working for ASCII, sure, otherwise you might assume it was a UTF-8 encoded character list. And I'm still not sure it'd be reasonable unless a wchar/dchar version was provided, how good is a language's unicode support if string manipulation functions only work on ascii? Anyway:

    /************************************
     * Construct translation table for translate().
     */
    char[] maketrans(char[] from, char[] to)
    in
    {
        assert(from.length == to.length);
    }
    body
    {
        char[] t = new char[256];
        int i;
        for (i = 0; i < 256; i++)
            t[i] = cast(char)i;
        for (i = 0; i < from.length; i++)
            t[from[i]] = to[i];
        return t;
    }

>> Yeah, fonts are a problem. My ideal world would have a (huge!) complete system default font (or one each for serif, sans, and mono) supplied with the OS, that would be the fallback for nonexistent characters.
>
> I absolutely agree. There are free fonts which do this, but they don't display well at small point-size because of something called "hinting", which apparently you can't do without paying someone royalties because of some stupid IP nonsense.

Ew, does that apply to creating fonts too? I thought most free fonts weren't manually hinted because it'd take forever, especially for unicode... I know freetype doesn't interpret hints by default, but there's a #define somewhere: "set this to 1 if you have permission from Apple Legal, or live somewhere sane". On my distro of choice, this was set by default :-D

>> Yes. What gets me is that in 5 years we'll (hopefully) be far enough down the unicode road that D's approach will seem backward, and I'll have to wait for someone to reinvent a similar language, with a more thorough unicode integration.
>
> Yup. That's the way it goes. So what else shall we imagine for D++?

Fix C's broken precedence rules?

Sam
Jun 29 2004
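On the "can't slice at any offset" complaint: a slice of a char array is always allowed by the language, but it may cut a multi-byte sequence in half. The sketch below, in present-day D, assumes `std.utf.validate` (which throws `UTFException` on malformed input) and `std.exception.assertThrown` from the current standard library; neither is the 2004 API.

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

void main()
{
    string s = "h\u00E9llo";   // bytes: 68 C3 A9 6C 6C 6F

    // Slicing on a character boundary gives well-formed UTF-8.
    string good = s[0 .. 3];   // "h" plus both bytes of U+00E9
    validate(good);            // does not throw

    // Slicing in the middle of the two-byte sequence does not.
    string bad = s[0 .. 2];    // "h" plus a dangling lead byte 0xC3
    assertThrown!UTFException(validate(bad));
}
```

In other words, the slice operator works on code units, and keeping slices on character boundaries is left to the programmer or to library helpers.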
In article <cbsufo$a8u$1 digitaldaemon.com>, Sam McCall says...

> Okay, suppose java had a 21- or 32-bit char type.

I'm led to believe there was a lot of debate about this. Some folk said that Java's char could NOT be anything other than 16 bits wide because it was defined that way and changing it would break things. Other folk looked under the hood of the JVM and decided that actually it probably wouldn't break anything after all. I don't know the ins and outs of it, but I gather the first lot won. The way it's going to go is UTF-16 support, with functions like isLetter() taking an int rather than a char.

> Glyphs aren't really a practical option as the logical element type of strings if they can't be easily represented as a fixed-width number, I'd imagine.

Well, they can, with a bit of sneaky manipulation. The trick is to map only those ones you actually USE to the unused codepoints between 0x110000 and 0xFFFFFFFF. So long as such a mapping stays within the application (like, don't try to export it), you can indeed have one dchar per glyph. But it would be a temporary one - not one you could write to a file, for example. In general, you're right.

> But you can't do obvious "list-of-characters" things like index by character or even slice at any offset.

True.

>> It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8 was designed that way.
>
> So this means a char[] has two purposes depending on the app?

I'm not sure I follow that. If you say char[] a = "hello world"; then you will get a string containing eleven chars, and it will be both valid ASCII and valid UTF-8. It's not like you have to choose.

> On the one hand, ASCII/Unicode being a per-app decision is fair enough.

That isn't what I said. It's possible we may be misunderstanding each other somehow.

> Also, if people are going to use char[] as ASCII, they may write libraries that assume char[] is ASCII

Well, that would be a bug, of course. It's perfectly ok to choose only to store ASCII characters in chars, but NOT perfectly okay to assume that chars will only contain ASCII characters. Anyone writing a library containing such a bug should simply be press-ganged into fixing it.

> or worse, "a character in some unknown encoding".

Again, that would be a bug, and at odds with D's definition of what a char is.

> If it were documented as only working for ASCII, sure, otherwise you might assume it was a UTF-8 encoded character list. And I'm still not sure it'd be reasonable unless a wchar/dchar version was provided, how good is a language's unicode support if string manipulation functions only work on ascii?

I'm not completely clear what functions you're talking about, as I haven't read the source code for std.string. Am I correct in assuming that the quote below is an extract?

> Anyway:
>
>     /************************************
>      * Construct translation table for translate().
>      */
>     char[] maketrans(char[] from, char[] to)
>     in
>     {
>         assert(from.length == to.length);
>     }
>     body
>     {
>         char[] t = new char[256];
>         int i;
>         for (i = 0; i < 256; i++)
>             t[i] = cast(char)i;
>         for (i = 0; i < from.length; i++)
>             t[from[i]] = to[i];
>         return t;
>     }

This is a bug. ASCII stops at 0x7F. Characters above 0x7F are not ASCII. If this function is intended as an ASCII-only function then (a) it should be documented as such, and (b) it should leave all bytes >0x7F unmodified. Char values between 0x80 and 0xFF are reserved for the role they play in UTF-8. You CANNOT mess with them (unless you're a UTF-8 engine).

You're right. I'd prefer to see a dchar version of this routine. Of course, you wouldn't want a lookup table with 0x110000 entries in it, but an associative array should do the job.

Assuming this is from std.string, I guess one of us should report this as a bug.

Arcane Jill
Jun 30 2004
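The two fixes suggested in the post above can be sketched as follows. This is only an illustration in present-day D syntax, not the Phobos implementation: the names makeAsciiTrans, makeCodePointTable and translateCodePoints are hypothetical. The first function only accepts ASCII in its arguments, so bytes 0x80..0xFF keep their identity mapping. The second replaces the 256-entry table with an associative array keyed by dchar.

```d
import std.stdio : writeln;

// Hypothetical ASCII-only variant of maketrans(): bytes 0x80..0xFF always
// map to themselves, so UTF-8 lead and trail bytes pass through untouched.
char[] makeAsciiTrans(const(char)[] from, const(char)[] to)
{
    assert(from.length == to.length);
    auto table = new char[256];
    foreach (i, ref c; table)
        c = cast(char) i;                 // start from the identity mapping
    foreach (i, c; from)
    {
        assert(c < 0x80 && to[i] < 0x80, "ASCII only");
        table[c] = to[i];
    }
    return table;
}

// Hypothetical code-point variant: a sparse table instead of 0x110000 slots.
dchar[dchar] makeCodePointTable(const(dchar)[] from, const(dchar)[] to)
{
    assert(from.length == to.length);
    dchar[dchar] table;
    foreach (i, c; from)
        table[c] = to[i];
    return table;
}

// Apply the sparse table; code points without an entry are copied as-is.
dstring translateCodePoints(const(dchar)[] s, const dchar[dchar] table)
{
    dchar[] result;
    foreach (c; s)
    {
        if (auto p = c in table)
            result ~= *p;
        else
            result ~= c;
    }
    return result.idup;
}

void main()
{
    auto ascii = makeAsciiTrans("ab", "xy");
    assert(ascii['a'] == 'x' && ascii[0xC3] == 0xC3);

    auto table = makeCodePointTable("\u00E4\u00F6"d, "ae"d);
    assert(translateCodePoints("sch\u00F6n"d, table) == "schen"d);
    writeln("ok");
}
```

The associative array trades constant-time indexing for memory proportional to the number of mappings, which seems the reasonable choice when only a handful of code points are translated.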
Arcane Jill wrote:In article <cbsufo$a8u$1 digitaldaemon.com>, Sam McCall says...Sorry, I meant "if java had originally been defined to have char being 21 bits instead of 16, and storing a unicode codepoint instead of a UTF-16 fragment". All java's string manipulation stuff is char-based, and I was convinced there was a one-to-one correspondence between chars and characters (or possibly some too-big char values possible). Clearly I was mistaken, but if they had made chars 21 bits and kept the rest the same, it looks to me like it'd be just about perfect. (Well, I'm sure the APIs could be improved in minor ways, etc, but relatively speaking).Okay, suppose java had a 21- or 32-bit char type.I'm led to believe there was a lot of debate about this. Some folk said that Java's char could NOT be anything other that 16 bits wide because it was defined that way and changing it would break things. Other folk looked under the hood of the JVM and decided that actually it probably wouldn't break anything after all. I don't know the ins and outs of it, but I gather the first lot won. The way it's going to go is UTF-16 support, with functions like isLetter() taking an int rather than a char.Ooh, clever :) But I don't see this working in a situation where you have dynamic libraries, for example.Glyphs aren't really a practical option as the logical element type of strings if they can't be easily represented as a fixed-width number, I'd imagine.Well, they can, with a bit of sneaky manipulation. The trick is to map only those ones you actually USE to the unused codepoints between 0x110000 and 0xFFFFFFFF. So long as such a mapping stays within the application (like, don't try to export it), you can indeed have one dchar per glyph. But it would be a temporary one - not one you could write to a file, for example.Sorry, what I originally meant:I'm not sure I follow that. 
If you say char[] a = "hello world"; then you will get a string containing eleven chars, and it will be both valid ASCII and valid UTF-8. It's not like you have to choose.It /is/ okay to use ASCII. All valid ASCII also happens to be valid UTF-8. UTF-8 was designed that way.So this means a char[] has two purposes depending on the app?On the one hand, ASCII/Unicode being a per-app decision is fair enough.That isn't what I said. It's possible we may be misunderstanding each other somehow.Especially in places where ASCII works fine, that's certainly easy and consistent! The current way seems to suggest that officially it's all unicode and happy, but (don't tell anyone) feel free to use asciiWas that although unicode is the officially designated content of these types, char[] looks and feels (and the standard library uses it) like it's ASCII, and people won't bother to use unicode, because it's requires calling conversion functions and so on. Especially since if you assume the language will take care of unicode for you like java (almost) does, then you'll end up with code that only works properly for ASCII data. That's probably all a lot of people will test it with. We should get unicode by default.std.string.maketrans and std.string.translate.If it were documented as only working for ASCII, sure, otherwise you might assume it was a UTF-8 encoded character list. And I'm still not sure it'd be reasonable unless a wchar/dchar version was provided, how good is a language's unicode support if string manipulation functions only work on ascii?I'm not completely clear what functions you're talking about, as I haven't read the source code for std.string. Am I correct in assuming that the quote below is an extract?It's got a single-line explanation that doesn't mention encoding. I'll report it. SamAnyway: /************************************ * Construct translation table for translate(). 
*/ char[] maketrans(char[] from, char[] to) in { assert(from.length == to.length); } body { char[] t = new char[256]; int i; for (i = 0; i < 256; i++) t[i] = cast(char)i; for (i = 0; i < from.length; i++) t[from[i]] = to[i]; return t; }This is a bug. ASCII stops at 0x7F. Characters above 0x7F are not ASCII. If this function is intended as an ASCII-only function then (a) it should be documented as such, and (b) it should leave all bytes >0x7F unmodified. Char values between 0x80 and 0xFF are resevered for the role they play in UTF-8. You CANNOT mess with them (unless you're a UTF-8 engine).
Jun 30 2004
In article <cbts89$1poh$1 digitaldaemon.com>, Sam McCall says...

> Sorry, I meant "if java had originally been defined to have char being 21 bits instead of 16, and storing a unicode codepoint instead of a UTF-16 fragment". All java's string manipulation stuff is char-based, and I was convinced there was a one-to-one correspondence between chars and characters (or possibly some too-big char values possible). Clearly I was mistaken,

You weren't mistaken. You were spot on. When Java was invented, Unicode stood at version 2.0. Possibly even earlier. At that time, Unicode was touted as a 16-bit standard, and its maximum codepoint was U+FFFF. At that time, there was no such thing as UTF-16. A Unicode char was 16 bits wide, and that was that. The only relevant 16-bit encodings were UCS-2LE (which meant, emit the 16-bit codepoint low order byte first), and UCS-2BE (which meant, emit the codepoint high order byte first). Java simply took that on board and went with it.

But as time went by, the Unicode folk realized that sixty-five thousand characters wasn't actually ENOUGH for all the world's scripts (including historical ones that nobody ever uses any more), so they managed to find a way to squeeze even more characters into that 16-bit model. They called it UTF-16, and it extends the range from U+FFFF to U+10FFFF.

There has been some discussion on the Unicode public forum as to whether even THIS limit will ever be extended. The Unicode Consortium currently are stating flat out that there will never, ever, be Unicode characters with codepoints above U+10FFFF. So, you can choose to believe them, or you can regard this statement with as much credibility as the statements like "64K should be enough memory for anyone" which were touted in the ZX81 days.

Java got caught out by the changing of the times. D's chars should probably be wider than 21 bits, just in case.... (Not that I'm choosing to disbelieve the Unicode Consortium of course!) 32 bits seems safe enough, for the foreseeable future.

> but if they had made chars 21 bits and kept the rest the same, it looks to me like it'd be just about perfect.

Yes. I'll bet the Java folk thought that at the time.

> Was that although unicode is the officially designated content of these types, char[] looks and feels (and the standard library uses it) like it's ASCII, and people won't bother to use unicode, because it requires calling conversion functions and so on.

Well, of course UTF-8 was /designed/ to be compatible with ASCII, to ease transition. That's not such a bad thing. Bugs will happen, of course, just as they happen with any other encoding, but they can be found and fixed (and fixing them will be easier, the more library support there is). It's just one of those things which is going to get better with time.

Arcane Jill
Jun 30 2004
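To make the point about the range beyond U+FFFF concrete, here is a small sketch in present-day D. It takes one code point above U+FFFF (U+1D11E, chosen purely as an example) and counts the code units it occupies in each of D's three string types. A 16-bit unit can no longer hold it, so UTF-16 falls back to a surrogate pair.

```d
import std.stdio : writeln;

void main()
{
    // U+1D11E MUSICAL SYMBOL G CLEF lies above U+FFFF.
    string  utf8  = "\U0001D11E";    // char[]  view: UTF-8 code units
    wstring utf16 = "\U0001D11E"w;   // wchar[] view: UTF-16 code units
    dstring utf32 = "\U0001D11E"d;   // dchar[] view: one unit per code point

    assert(utf8.length  == 4);       // four bytes in UTF-8
    assert(utf16.length == 2);       // a surrogate pair in UTF-16
    assert(utf16[0] == 0xD834 && utf16[1] == 0xDD1E);
    assert(utf32.length == 1);       // a single 32-bit unit

    writeln(utf8.length, " ", utf16.length, " ", utf32.length); // prints "4 2 1"
}
```

Only the dchar form gives one array element per code point, which is why indexing and slicing by character are straightforward there and awkward in the other two.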
Arcane Jill wrote:

> In article <cbts89$1poh$1 digitaldaemon.com>, Sam McCall says...
>
> You weren't mistaken. You were spot on.
> <snip>

Wow, thanks for that explanation, I really appreciate it :-)

>> but if they had made chars 21 bits and kept the rest the same, it looks to me like it'd be just about perfect.
> Yes. I'll bet the Java folk thought that at the time.

> Well, of course UTF-8 was /designed/ to be compatible with ASCII, to ease transition. That's not such a bad thing. Bugs will happen, of course, just as they happen with any other encoding, but they can be found and fixed (and fixing them will be easier, the more library support there is). It's just one of those things which is going to get better with time.

Okay, we'll stick with 32 bits. If they reach that in my lifetime, someone is going to die... Anyway, by the time I work out how to efficiently character-index UTF-8 in mutable strings, I'm sure I'll think unicode is thoroughly overrated :-D

Sam
Jun 30 2004
> Frankly, yes, I use -1 as a "magic value" all the time, and do all sorts of ugly things when negative numbers are perfectly valid. This is

That's true. In Standard ML you could do

    val index : 'a -> int option

Then if 'a exists return SOME(x), if not, return NONE. If a function has an option type as a domain it has to deal with both cases. In D, you'd either use a magic value like -1 or encapsulate values in a class; then null is NONE and not null is SOME.

> I think arrays should become fully reference types, for the same reason as strings above. Yes, this would probably mean double indirection, arrays would be a pointer to the (length,data pointer) struct that they currently are.

But you can go ahead and create a class for lists, no problem at all. Neither Phobos nor DTL has fully hatched yet, so we'll see what happens.
Jun 29 2004
Bent Rasmussen wrote:McCall's Law the First: Every feature of a "traditional" language is a special case of a feature of every functional language. McCall's Law the Second: Every feature of every functional language is a special case of the only feature of Lisp.Frankly, yes, I use -1 as a "magic value" all the time, and do all sorts of ugly things when negative numbers are perfectly valid. This isThat's true. In Standard ML you could do val index : 'a -> int option Then if 'a exists return SOME(x), if not, return NONE. If a function has a an option type as a domain it has to deal with both cases.In D, you'd either use a magic value like -1 or encapsulate values in a class; then null is NONE and not null is SOME.But this isn't ML. I will get some weird looks, and nobody will touch my libraries ;-) Besides, that's exactly equivalent (AFAICS) to a reference type, assuming no pointer arithmetic and casting shenanigans. If this _is_ useful, is dereferencing one more pointer to access arrays really going to kill us? Or is there some case where the value-type-kinda nature of arrays is useful?But you can go ahead and create a class for lists, no problem at all. Neither Phobos nor DTL has fully hatched yet, so we'll see what happens.I'm beginning to think this is the only answer. But lists are such a fundamental type, using a non-standard list type would be a pain. I can't see room for another list type, so I guess I'll end up using DTL's list everywhere, and hope everyone does the same. But it does seem a waste of such powerful arrays in the language. Sam
Jun 29 2004
On Wed, 30 Jun 2004 03:20:54 +1200, Sam McCall <tunah.d tunah.net> wrote:Bent Rasmussen wrote:I think the current value-type-kinda nature of arrays is good, it just needs the 2 tweaks I mentioned to make it consistent.McCall's Law the First: Every feature of a "traditional" language is a special case of a feature of every functional language. McCall's Law the Second: Every feature of every functional language is a special case of the only feature of Lisp.Frankly, yes, I use -1 as a "magic value" all the time, and do all sorts of ugly things when negative numbers are perfectly valid. This isThat's true. In Standard ML you could do val index : 'a -> int option Then if 'a exists return SOME(x), if not, return NONE. If a function has a an option type as a domain it has to deal with both cases.In D, you'd either use a magic value like -1 or encapsulate values in a class; then null is NONE and not null is SOME.But this isn't ML. I will get some weird looks, and nobody will touch my libraries ;-) Besides, that's exactly equivalent (AFAICS) to a reference type, assuming no pointer arithmetic and casting shenanigans. If this _is_ useful, is dereferencing one more pointer to access arrays really going to kill us? Or is there some case where the value-type-kinda nature of arrays is useful?-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/But you can go ahead and create a class for lists, no problem at all. Neither Phobos nor DTL has fully hatched yet, so we'll see what happens.I'm beginning to think this is the only answer. But lists are such a fundamental type, using a non-standard list type would be a pain. I can't see room for another list type, so I guess I'll end up using DTL's list everywhere, and hope everyone does the same. But it does seem a waste of such powerful arrays in the language. Sam
Jun 29 2004
>>> Yeah, it's called std::string, and it's more or less the default.
>> And it's crap. IMNSHO.
> You'll get no arguments from me there. D got it right in not having a string class. I didn't think that at first, but I've come round to the D way of thinking. The problem with a string class is that you can't add new member functions to it. (Oh, you may be able to subclass String, if it's not final. Oh wait - it /is/ final in Java). With char[] arrays, you CAN add new functions.

Why do you need to add member functions to a string class, but not to char arrays? Why are global functions OK for char arrays, but not for a string class? This is kind of strange. Just because of a different notation? Does that really matter? There are languages where you can write object.function() and function(object), and they mean the same thing. I don't get your point.

>>> Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.
>> Fine and dandy EXCEPT we *need* to differentiate between empty and non-existent strings.
> Why? Do we also need a way to differentiate between empty and non-existent ints? In D, there is no such thing as a non-existent int; there is no such thing as a non-existent struct; and there is no such thing as a non-existent string.

Something like that would be cool, just like option in SML. I think I have to write something like this.

-- Matthias Becker
Jun 29 2004
>> In D, there is no such thing as a non-existent int; there is no such thing as a non-existent struct; and there is no such thing as a non-existent string.
> Something like that would be cool, just like option in SML. I think I have to write something like this.

Perhaps,

    class Option(VALUE)
    {
        VALUE Item;
    }

    // Wrap a value in a freshly allocated Option; a null reference plays NONE.
    template SOME(VALUE)
    {
        Option!(VALUE) SOME(VALUE x)
        {
            Option!(VALUE) e = new Option!(VALUE)();
            e.Item = x;
            return e;
        }
    }

    alias Option!(uint) INDEX;

    class Array(VALUE)
    {
        ...
        INDEX Index(VALUE x)
        {
            foreach (uint i, VALUE z; Items)
            {
                if (x == z)
                {
                    // The payload is the index, so instantiate with uint.
                    return SOME!(uint)(i);
                }
            }
            return null;
        }
    }

Somewhat non-ideal though.
Jun 29 2004
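As a variation on the class-based Option above, the same idea can be sketched as a small struct, which avoids both the heap allocation and the reliance on null. This is a hypothetical illustration in present-day D; the names Option, some, none, isSome, get and findIndex are invented for the example and are not from Phobos or DTL.

```d
import std.stdio : writeln;

// Hypothetical value-type option: either holds a T or holds nothing.
struct Option(T)
{
    private T value;
    private bool present;

    static Option some(T v) { return Option(v, true); }
    static Option none()    { return Option.init; }

    bool isSome() const { return present; }

    T get() const
    {
        assert(present, "no value");   // reading NONE is a programming error
        return value;
    }
}

// Returns the index of x in items, or "none" instead of a magic -1.
Option!size_t findIndex(T)(const(T)[] items, T x)
{
    foreach (i, z; items)
        if (z == x)
            return Option!size_t.some(i);
    return Option!size_t.none();
}

void main()
{
    auto hit  = findIndex([10, 20, 30], 10);
    auto miss = findIndex([10, 20, 30], 99);

    assert(hit.isSome && hit.get == 0);   // index 0 is a real result
    assert(!miss.isSome);                 // and is distinct from "not found"
    writeln("ok");
}
```

The caller has to ask isSome before calling get, which is roughly the "has to deal with both cases" property attributed to SML's option type earlier in the thread, although here it is checked at run time rather than by the compiler.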
Arcane Jill <Arcane_member pathlink.com> wrote in news:cbr53s$op8$1 digitaldaemon.com: [snip]Why? Do we also need a way to differentiate between empty and non-existent ints?Yes, we do. A slightly *naive* but definitely opinionated soul already suggested exactly this. Unfortunately, this is not implementable without unacceptable performance loss. So we cannot have this. [snip]Maybe the real solution would be to make it a compile error to assign an array with null, or to compare it with null. This would then force people to say what they mean, and all such problems would go away.I agree, that would help to avoid some confusion. Unfortunately, people would be forced to either say 'I mean empty' or to shut up completely and use sth. completely different. Farmer.
Jun 29 2004
Farmer wrote:

> Arcane Jill <Arcane_member pathlink.com> wrote in news:cbr53s$op8$1 digitaldaemon.com:
>> Maybe the real solution would be to make it a compile error to assign an array with null, or to compare it with null. This would then force people to say what they mean, and all such problems would go away.
> I agree, that would help to avoid some confusion. Unfortunately, people would be forced to either say 'I mean empty' or to shut up completely and use sth. completely different.

We don't have array literals, so we can't do this:

    foo( [] );

At the moment we can do this:

    foo( null );

If we outlawed using nulls as arrays, we'd be left with

    foo( new int[0] )

which is maybe a bit messy?

Sam
Jun 29 2004
Sam McCall <tunah.d tunah.net> wrote in news:cbsupg$anb$1 digitaldaemon.com:Farmer wrote:What's messy here? A bit more typing, that's it. One disadvantage of foo( null ); is, that there is no type information. If you had foo(int[]) foo(float[]) you would need a cast, because it gets ambiguous. Farmer.Arcane Jill <Arcane_member pathlink.com> wrote in news:cbr53s$op8$1 digitaldaemon.com:We don't have array literals, so we can't do this: foo( [] ); At the moment we can do this: foo( null ); If we outlawed using nulls as arrays, we'd be left with foo( new int[0] ) which is maybe a bit messy? SamMaybe the real solution would be to make it a compile error to assign an array with null, or to compare it with null. This would then force people to say what they mean, and all such problems would go away.I agree, that would help to avoid some confusion. Unfortunately, people would be forced to either say 'I mean empty' or to shut up completely and use sth. completely different.
Jun 30 2004
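The overloading point in the post above can be sketched like this, in present-day D. The two foo overloads are hypothetical stand-ins for the ones named in the post; they return a string only so the chosen overload is visible.

```d
import std.stdio : writeln;

// Hypothetical overloads, as in the post: foo(int[]) and foo(float[]).
string foo(int[] a)   { return "int[]"; }
string foo(float[] a) { return "float[]"; }

void main()
{
    // foo(null);   // should be rejected: null converts to either array type

    // A typed empty array carries enough information to pick an overload.
    assert(foo(new int[0]) == "int[]");

    // With null, a cast has to supply the missing type.
    assert(foo(cast(float[]) null) == "float[]");

    writeln("ok");
}
```

So the longer spelling costs some typing, but it also says which empty array is meant.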
Because copying a struct costs much more than just copying a pointer to it. In C++ you have references for things like this, which can't be NULL.Thus why just about no-one ever does this (in C). They all return a pointer to a struct.Why? C++ gets along without them just fine, and every C derivant I know of gets along fine without allowing primitive type returns to signify nonexistence. Functions which returns structs cannot return null either.A 'null array' is a completely arbitrary concept that has been extrapolated from undefined behaviour. :)It may be undefined, but I believe it is required.Could you please make some real world examples, where you need empty strings and null-strings? -- Matthias BeckerFine and dandy EXCEPT we *need* to differentiate between empty and non-existant strings.The soln IMO is either to make the current behaviour official and consistent, or to change the behaviour, make that official and provide another way to tell null apart from an empty string.Farmer's test reports pretty consistent results if you suppose that comparing arrays to null is ill-formed: empty1.length == 0 is true empty1 == "" is true empty2.length == 0 is true empty2 == "" is true empty3.length == 0 is true empty3 == "" is true Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.D arrays simply do not work that way.In that case we need an array specialisation for strings, so I'll have to write my own. This defeats the purpose of char[] in the first place, which was, to be a better more consistent string handling method than in possible in c/c++.
Jun 29 2004
On Tue, 29 Jun 2004 15:39:15 +0000 (UTC), Matthias Becker <Matthias_member pathlink.com> wrote:Thus why I dont use references either when I need the ability to say it's NULL.Because copying a struct costs much more than just copying a pointer to it. In C++ you have references for things like this, which can't be NULL.Thus why just about no-one ever does this (in C). They all return a pointer to a struct.Why? C++ gets along without them just fine, and every C derivant I know of gets along fine without allowing primitive type returns to signify nonexistence. Functions which returns structs cannot return null either.A 'null array' is a completely arbitrary concept that has been extrapolated from undefined behaviour. :)It may be undefined, but I believe it is required.Sure thing, pls see my reply to andy's post.. there has to be an easy way to direct you to a post but I dont know how.. I posted it 3 or 4 posts ago if you sort flat and by date. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/Could you please make some real world examples, where you need empty strings and null-strings?Fine and dandy EXCEPT we *need* to differentiate between empty and non-existant strings.The soln IMO is either to make the current behaviour official and consistent, or to change the behaviour, make that official and provide another way to tell null apart from an empty string.Farmer's test reports pretty consistent results if you suppose that comparing arrays to null is ill-formed: empty1.length == 0 is true empty1 == "" is true empty2.length == 0 is true empty2 == "" is true empty3.length == 0 is true empty3 == "" is true Don't compare arrays to null. Don't try to differentiate between empty and nonexistent.D arrays simply do not work that way.In that case we need an array specialisation for strings, so I'll have to write my own. 
This defeats the purpose of char[] in the first place, which was, to be a better more consistent string handling method than in possible in c/c++.
Jun 29 2004
In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...

> Fine and dandy EXCEPT we *need* to differentiate between empty and non-existant strings.

Why? It seems to me that this behavior would also require arrays to be initialized with new rather than resizing from zero using the .length parameter. And this would result in a ton of extra coding--either in clauses that errored on null arrays or initialization code to handle both cases. No thanks. If this happened I'd stil using built-in arrays and write a class for the purpose.

Sean
Jun 29 2004
Sean Kelly <sean f4.ca> wrote in news:cbs4ju$26aj$1 digitaldaemon.com:In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...The .length parameter would still work with null-arrays (as they currently do). But why would you want to initialize an array to null/empty and then resize it, instead of 'newing' it with the correct size in first place? My CPU gets hot enough, no need for extra heat-up cycles :-) Extra coding is not required if you don't need null-arrays: if some user passes a null-array, the user gets a nice access violation/array bounds exception and will quickly learn to not pass null-arrays to such functions. A quick check in the DbC section of your function would do the job, too. (But I suppose, the user might not adapt that fast that way :-) If your function should deal with both null-arrays and empty-arrays, no extra code is required, since the .length property can be accessed for both null- arrays and emtpy-arrays.Fine and dandy EXCEPT we *need* to differentiate between empty and non-existant strings.Why? It seems to me that this behavior would also require arrays to be initialized with new rather than resizing from zero using the .length parameter. And this would result in a ton of extra coding--either in clauses that errored on null arrays or initialization code to handle both cases. [...][...] No thanks. If this happened I'd stil using built-in arrays and write a class for the purpose.I came to the same conclusion, wrapping a build-in array in a class or struct to adapt its behaviour to the specific needs is one (if not the) way to go. Farmer.
Jun 29 2004
In article <Xns9517F3F654C29itsFarmer 63.105.9.61>, Farmer says...

> The .length parameter would still work with null-arrays (as they currently do). But why would you want to initialize an array to null/empty and then resize it, instead of 'newing' it with the correct size in first place?

Consider the following:

    char[] str = new char[100];
    str.length = 0;      // A
    str.length = 5;      // B
    str = new char[10];  // C

In A, AFAIK it's legal for the compiler to retain the memory and merely change the length parameter for the string. B then just changes the length parameter again, and no reallocation is performed. C forces a reallocation even if the array already has the (hidden) capacity in place. Lacking allocators, this is a feature I consider rather nice in D.

> Extra coding is not required if you don't need null-arrays: if some user passes a null-array, the user gets a nice access violation/array bounds exception and will quickly learn to not pass null-arrays to such functions. A quick check in the DbC section of your function would do the job, too. (But I suppose, the user might not adapt that fast that way :-)

I originally thought D worked the way you describe and added DBC clauses to all my functions to check for null array parameters. After some testing I realized I'd been mistaken and happily removed most of these clauses. The result IMO was tighter, cleaner code that was easier to understand. I suppose it's really a matter of opinion. I like that arrays work the same as the other primitive types.

> If your function should deal with both null-arrays and empty-arrays, no extra code is required, since the .length property can be accessed for both null-arrays and empty-arrays.

Could it? I suppose so, but the concept seems a tad odd. I kind of expect none of the parameters (besides sizeof, perhaps) to work for dynamic types that have not been initialized. Though perhaps that's the C way of thinking.

Sean
Jun 29 2004
Sean Kelly <sean f4.ca> wrote in news:cbsqnf$547$1 digitaldaemon.com:In article <Xns9517F3F654C29itsFarmer 63.105.9.61>, Farmer says...I agree with you that this feature is quite useful. The problem with (A) is, that DMD doesn't do that; the function 'arraysetlength' explicitly checks whether the new length is null, and if so destroys the data pointer. Furthermore it seems that it is not allowed to call the .length property for null-arrays. How do I know? Well the function in the phobos file internal\gc.d byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p) contains this assertion assert(!p.length || p.data); Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.The .length parameter would still work with null-arrays (as they currently do). But why would you want to initialize an array to null/empty and then resize it, instead of 'newing' it with the correct size in first place?Consider the following: char[] str = new char[100]; str.length = 0; // A str.length = 5; // B str = new char[10]; // C In A, AFAIK it's legal for the compiler to retain the memory and merely change the length parameter for the string. B then just changes the length parameter again, and no reallocation is performed. C forces a reallocation even if the array already has the (hidden) capacity in place. Lacking allocators, this is a feature I consider rather nice in D.I always love it when this happens. Code that isn't written, is bug-free, maintainable, and super-fast ;-)Extra coding is not required if you don't need null-arrays: if some user passes a null-array, the user gets a nice access violation/array bounds exception and will quickly learn to not pass null-arrays to such functions. A quick check in the DbC section of your function would do the job, too. 
(But I suppose, the user might not adapt that fast that way :-)I originally thought D worked the way you describe and added DBC clauses to all my functions to check for null array parameters. After some testing I realized I'd been mistaken and happily removed most of these clauses. The result IMO was tighter, cleaner code that was easier to understand. I suppose it's really a matter of opinion. I like that arrays work the same as the other primitive types.Yes, I think it is bit odd, too. For reading the length property it makes sense, but for resizing it is more questionable. But I am definetely thinking the C way here. Farmer.If your function should deal with both null-arrays and empty-arrays, no extra code is required, since the .length property can be accessed for both null- arrays and emtpy-arrays.Could it? I suppose so, but the concept seems a tad odd. I kind of expect none of the parameters (besides sizeof, perhaps) to work for dynamic types that have not been initialized. Though perhaps that's the C way of thinking.
Jun 30 2004
On Wed, 30 Jun 2004 22:57:02 +0000 (UTC), Farmer <itsFarmer. freenet.de> wrote:Sean Kelly <sean f4.ca> wrote in news:cbsqnf$547$1 digitaldaemon.com:Provably correct. :) --[test.d]-- struct array { int length; void *data; } void main() { char[] p = new char[100]; array *s = cast(array *)&p; printf("%d\n",s.length); printf("%08x\n",s.data); p.length = 0; printf("%d\n",s.length); printf("%08x\n",s.data); } prints 100 007d2f80 0 00000000In article <Xns9517F3F654C29itsFarmer 63.105.9.61>, Farmer says...I agree with you that this feature is quite useful. The problem with (A) is, that DMD doesn't do that; the function 'arraysetlength' explicitly checks whether the new length is null, and if so destroys the data pointer.The .length parameter would still work with null-arrays (as they currently do). But why would you want to initialize an array to null/empty and then resize it, instead of 'newing' it with the correct size in first place?Consider the following: char[] str = new char[100]; str.length = 0; // A str.length = 5; // B str = new char[10]; // C In A, AFAIK it's legal for the compiler to retain the memory and merely change the length parameter for the string. B then just changes the length parameter again, and no reallocation is performed. C forces a reallocation even if the array already has the (hidden) capacity in place. Lacking allocators, this is a feature I consider rather nice in D.Furthermore it seems that it is not allowed to call the .length property for null-arrays.I can go: p.length = 0; p.length = 0; p.length = 0; p.length = 0; no problem? is that what you mean't?How do I know? 
Well the function in the phobos file internal\gc.d byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p) contains this assertion assert(!p.length || p.data);perhaps this function is not called if (p.length == 0 && newlength == 0) one level higher?Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.which is technically impossible. Regan-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/I always love it when this happens. Code that isn't written, is bug-free, maintainable, and super-fast ;-)Extra coding is not required if you don't need null-arrays: if some user passes a null-array, the user gets a nice access violation/array bounds exception and will quickly learn to not pass null-arrays to such functions. A quick check in the DbC section of your function would do the job, too. (But I suppose, the user might not adapt that fast that way :-)I originally thought D worked the way you describe and added DBC clauses to all my functions to check for null array parameters. After some testing I realized I'd been mistaken and happily removed most of these clauses. The result IMO was tighter, cleaner code that was easier to understand. I suppose it's really a matter of opinion. I like that arrays work the same as the other primitive types.Yes, I think it is bit odd, too. For reading the length property it makes sense, but for resizing it is more questionable. But I am definetely thinking the C way here. Farmer.If your function should deal with both null-arrays and empty-arrays, no extra code is required, since the .length property can be accessed for both null- arrays and emtpy-arrays.Could it? I suppose so, but the concept seems a tad odd. I kind of expect none of the parameters (besides sizeof, perhaps) to work for dynamic types that have not been initialized. Though perhaps that's the C way of thinking.
Jun 30 2004
In article <Xns95199C928F73itsFarmer 63.105.9.61>, Farmer says...

> How do I know? Well the function in the phobos file internal\gc.d
>
>     byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p)
>
> contains this assertion
>
>     assert(!p.length || p.data);
>
> Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.

I read it that the assertion requires either the length to be zero or the length to be nonzero and the data to be non-null. This seems to correspond to my assumption that D allows for zero length arrays to retain allocated memory.

Sean
Jun 30 2004
On Thu, 1 Jul 2004 04:37:37 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:In article <Xns95199C928F73itsFarmer 63.105.9.61>, Farmer says...It may very well allow it (in this code, at this level), but how do you do it? Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/How do I know? Well the function in the phobos file internal\gc.d byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p) contains this assertion assert(!p.length || p.data); Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.I read it that the assertion requires either the length to be zero or the length to be nonzero and the data to be non-null. This seems to correspond to my assumption that D allows for zero length arrays to retain allocated memory.
Jun 30 2004
Sean Kelly <sean f4.ca> wrote in news:cc04eh$2l5e$1 digitaldaemon.com:
> In article <Xns95199C928F73itsFarmer 63.105.9.61>, Farmer says...
>> How do I know? Well the function in the phobos file internal\gc.d
>> byte[] _d_arraysetlength(uint newlength, uint sizeelem, Array *p)
>> contains this assertion
>> assert(!p.length || p.data);
>> Ironically, this assertion permits, that the data pointer is null, but the length is greater than 0.
Rubbish.
> I read it that the assertion requires either the length to be zero or the length to be nonzero and the data to be non-null. This seems to correspond to my assumption that D allows for zero length arrays to retain allocated memory. Sean
I blush for shame, this is too embarrassing. What a wimp I am, I can't do simple boolean algebra. What must years of Java(TM) programming have done to me? On the upside, it means that I was wrong. No assertion discourages null or empty arrays. Yes, memory for zero-length arrays is retained if the array is sliced.
Jul 01 2004
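Farmer's closing remark, that a zero-length array keeps its memory when it comes from a slice, can be seen directly by looking at the slice's pointer. This is a small sketch in present-day D (it uses the `.ptr` property and the `is` operator, which may not match the 2004 compiler discussed in the thread).

```d
unittest
{
    char[] text = "abc".dup;          // three elements on the GC heap
    char[] tail = text[3 .. 3];       // zero-length slice at the very end

    assert(tail.length == 0);
    assert(tail.ptr is text.ptr + 3); // still points into text's memory
    assert(tail !is null);            // so it is empty without being null

    char[] none = null;               // a null array, for comparison
    assert(none.length == 0 && none.ptr is null);
}
```

One way to run it is `dmd -unittest -main file.d` followed by executing the resulting binary.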
On Tue, 29 Jun 2004 16:15:58 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:
> In article <opsab6o5rl5a2sq9 digitalmars.com>, Regan Heath says...
>> Fine and dandy EXCEPT we *need* to differentiate between empty and nonexistent strings.
> Why? It seems to me that this behavior would also require arrays to be initialized with new rather than resizing from zero using the .length parameter.
Nope. It already works, except for 2 inconsistencies (see the original post)
> And this would result in a ton of extra coding--either in clauses that errored on null arrays or initialization code to handle both cases. No thanks.
Not true. You can/could still simply check the length vs 0 if you want to treat null and empty the same.
> If this happened I'd stil using built-in arrays and write a class for the purpose.
? 'stil' == 'stop' ?
Regan
Jun 29 2004
Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com: [snip]C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.[snip] And probably that is one reason why programmers don't use std::vector. <rant> If I wanted to use sth. like std::vector, I'd simply use them in D. But if I want to get to the *bare metal*, I want the *bare metal*. No less. I don't want sth. that is similar to std::vector (just better tuned for performance), tightly integrated (or coupled, in book) with the language, with some odd syntax and superfluous but still incomplete properties like array.sort. Even if that means that I have to code a bubble sort, all the time myself ;-) <end of rant> Farmer.
Jun 29 2004
Farmer wrote:
> Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com: [snip]
>> C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.
> [snip] And probably that is one reason why programmers don't use std::vector.
They don't? Do you have a source to back that up? As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.
The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here. Think about expressing the distinction a different way and move on.
I do apologize if I sound naive, (I'll assume that comment was directed at me :) ) but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance.
-- andy
Jun 29 2004
On Tue, 29 Jun 2004 18:16:25 -0700, Andy Friesen <andy ikagames.com> wrote:Farmer wrote:Sure.. can you show me how. I am having trouble doing it, it must be my C fixated brain. Pls use the example in the post I made to you earlier today..Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com: [snip]They don't? Do you have a source to back that up? As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*. The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here. Think about expressing the distinction a different way and move on.C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.[snip] And probably that is one reason why programmers don't use std::vector.I do apologize if I sound naive, (I'll assume that comment was directed at me :) )LOL.. I thought it was me..but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance.I think my example in my previous post does show a cost on either or both. Basically I think a reference type allows me to *express* more than a value type does. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 29 2004
Andy Friesen <andy ikagames.com> wrote in news:cbt41t$i1n$1 digitaldaemon.com:
> Farmer wrote:
>> Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com: [snip]
>>> C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.
>> [snip] And probably that is one reason why programmers don't use std::vector.
> They don't? Do you have a source to back that up? As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.
Sorry, my statement was badly expressed. I meant it more like "And probably that is another reason why programmers often refrain from using std::vector." Of course, programmers use std::vector, otherwise I'd have said that I am not a programmer ;-)
> The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here. Think about expressing the distinction a different way and move on.
I expect that this concern will rarely come up, and that's exactly why I brought it up. I would move on, but I see no compelling reason to express it in a different way.
> I do apologize if I sound naive, (I'll assume that comment was directed at me :) ) but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance. -- andy
I was naive in believing that it is obvious which posts I referred to. I was thinking e.g. of post http://www.digitalmars.com/drn-bin/wwwnews/23126 Btw, the author of this post happens to use the term naive, so he shouldn't take offense. But in fact, this post doesn't really advocate 'NaN' for ints, rather http://www.digitalmars.com/drn-bin/wwwnews/23100 does so.
Sorry, andy and sorry Regan. You didn't suggest 'NaN' for ints. So no f(l)ame(s) for you...
Farmer.
Jun 30 2004
I hope you're not referring to the quick hack I posted. It was meant to express the *conceptual* problem of returning a null value for a value type -- *not* a practical one. It was mentioned in the context of the ML option type. ps. Both links are broken.
Jun 30 2004
"Bent Rasmussen" <exo bent-rasmussen.info> wrote in news:cbvk1g$1r9b$1 digitaldaemon.com:I hope you're not referring to the quick hack I posted. It was meant to express the *conceptual* problem of returning a null value for a value type -- *not* a practical one. It was mentioned in the context of the ML option type. ps. Both links are broken.You suggested none's for int's but you don't use the term naive in your posts. So no f(l)ame(s) for you, either. Try these: http://www.digitalmars.com/drn-bin/wwwnews?D/29213 http://www.digitalmars.com/drn-bin/wwwnews?D/23120 Farmer.
Jul 01 2004
On Wed, 30 Jun 2004 22:57:04 +0000 (UTC), Farmer <itsFarmer. freenet.de> wrote:Andy Friesen <andy ikagames.com> wrote in news:cbt41t$i1n$1 digitaldaemon.com:Was it me.. these links don't work for me :(Farmer wrote:Sorry, my statement was badly expressed. I meant it more like "And probably that is another reason why programmers often refrain from using std:vector." Of course, programmers use std::vector, otherwise I'd said that I am not a programmer ;-)Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com: [snip]They don't? Do you have a source to back that up? As far as I've ever noticed, bigwig C++ people have always made it clear that std::vector is preferable over an array and that std::string is preferable to a char*.C++ containers cannot represent null either. D will (and does) get along just fine if its array type works the same way.[snip] And probably that is one reason why programmers don't use std::vector.The concern for distinguishing empty vs null has quite honestly never even occurred to me until it was mentioned here. Think about expressing the distinction a different way and move on.I expect that this concern will rarely come up, and that's exactly why I brought it up. I would move on, but I see no compelling reason to express it in a different way.I do apologize if I sound naive, (I'll assume that comment was directed at me :) ) but I honestly can't comprehend a situation in which the distinction is going to have any measurable cost on clarity, let alone performance. -- andyI was naive in believing that it is obvious what posts I referred to. I was thinking e.g. of post http://www.digitalmars.com/drn-bin/wwwnews/23126 Btw, the author of this post, happens to use the term naive, so he shouldn't take offense.But in fact, this post doesn't really advocate 'NaN' for ints, rather http://www.digitalmars.com/drn-bin/wwwnews/23100 does so.linky no worky :(Sorry, andy and sorry Regan. You didn't suggest 'NaN' for ints. 
So no f(l)ame(s) for you...
Aww.. AFAICS we either need a NaN value for all value types, OR we use reference types instead. Arrays in D act just like reference types (except for the inconsistencies you have shown) even though they technically aren't. What I want to know is: what effect will changes to those inconsistencies actually have on people who do not need to be able to tell a null array from an empty one?
Regan.
-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jun 30 2004
Regan Heath <regan netwin.co.nz> wrote in news:opsafmvd1m5a2sq9 digitalmars.com: [snip]
> Arrays in D act just like reference types (except for the inconsitencies you have shown) even tho they aren't technically, what I want to know is, what effect will changes to those inconsistencies actually have to people who do not need to be able to tell a null array from an empty one?
The impact for code that doesn't need to distinguish between null arrays and empty arrays depends on
1) the semantics of null arrays regarding the .length property and the opCat operator.
2) whether null arrays are disallowed by a function interface contract.
3) whether a function should treat null arrays and empty arrays in the same way.

Regarding item 1) I assume these semantics:
- Reading and writing of the length property is allowed. Write access to the length property always returns an array of the given size. So nullarray.length=0 turns the null array 'nullarray' into an empty array.
- The opCat operator allows null arrays for both arguments. So nullarray.opcat(nullarray2) creates an empty array.

Regarding item 2): Non-local arrays should be initialized to an empty array. Local arrays should be initialized to an empty array instead of a null array. Note that local arrays that are not explicitly initialized are not permitted, anyway (see section 'Local Variables' in function.htm of the D spec). (But as DMD doesn't enforce this yet, such illegal D code might be quite common.) As with all reference types that are passed to a function, putting in an assertion to check for the disallowed null case is a good idea. If the D language permits that different objects that are physically never changed are allocated only once, then there is almost no performance penalty for using empty arrays instead of null.

Regarding item 3): Code needs the same changes as described for item 2. Additionally, any array parameters must be checked against null and, where necessary, converted to empty arrays. E.g.
    if (array is null) array=new char[0];
Of course, a templated function could do that.

For many "low-level" functions null arrays can be treated as empty arrays without any additional checks, since the length property can still be accessed. But 'high-level' functions typically have to deal with null arrays explicitly, because they would depend on functions that disallow null arrays.
Farmer.
Jul 03 2004
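The null-to-empty conversion Farmer describes for item 3 could be wrapped in a small helper. The sketch below is written for present-day D, and `orEmpty` is an invented name. It deliberately avoids `new char[0]`: as far as I understand current D runtimes, a zero-length `new` may itself produce a null array, so the helper returns a zero-length slice of a static buffer instead, which is guaranteed to have a non-null pointer.

```d
// Hypothetical helper: map a null array to a non-null empty one and leave
// every other array untouched.
char[] orEmpty(char[] a)
{
    static char[1] anchor;                  // thread-local backing storage
    return a is null ? anchor[0 .. 0] : a;  // empty slice with a real address
}

unittest
{
    char[] none = null;
    char[] fixed = orEmpty(none);
    assert(fixed.length == 0);
    assert(fixed !is null);                 // empty, but no longer null

    char[] word = "hi".dup;
    assert(orEmpty(word) is word);          // non-null input passes through
}
```

A function whose contract disallows null arrays could call such a helper at its boundary rather than repeating the check in every caller.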
Andy Friesen <andy ikagames.com> wrote in news:cbpsi6$1u7d$1 digitaldaemon.com:
> Regan Heath wrote:
>> ... I could return existence and fill a passed char[]... so my code now looks like...
>> char[] s;
>> if (getValue("foo",s))
> I like this. It's simple and obvious.
An expression like if (getValue("foo",s) == true) doesn't tell much to the maintainer. An enumeration is needed to fully express the intent. [snip]
> Exposing POST data as an associative array seems like a win to me; it's faster and can be iterated over conveniently. Also, as a language intrinsic, it's a bit more likely to plug into other APIs easily. If you *really* need to, you could probably get away with doing something like:
> const char[] nadda = "nadda";
> if (s is not nadda) { ... }
> -- andy
I see one issue with associative arrays here. It would break the encapsulation of the class. The internal data would be revealed. If your internal data structure is different, you must convert the internal data to the associative array. At best, a call of .dup would be needed as a safety practice.
Farmer.
Jun 30 2004
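Farmer's point about returning an enumeration instead of a bare boolean might look like the following in present-day D. This is only a sketch: the `Lookup` enum is hypothetical, `getValue` borrows its name from Regan's example but its signature here is an assumption, and the form data is modelled as a plain associative array purely for illustration.

```d
// Hypothetical result type: says what happened instead of just true/false.
enum Lookup { missing, found }

// Assumed signature, loosely following the getValue("foo", s) call above.
Lookup getValue(string[string] form, string key, out string value)
{
    if (auto p = key in form)   // 'in' yields a pointer to the value, or null
    {
        value = *p;
        return Lookup.found;
    }
    return Lookup.missing;      // 'out' has already reset value to null
}

unittest
{
    string[string] form = ["foo": "bar", "blank": ""];
    string s;

    assert(getValue(form, "foo", s) == Lookup.found && s == "bar");
    assert(getValue(form, "blank", s) == Lookup.found && s.length == 0);
    assert(getValue(form, "nope", s) == Lookup.missing && s is null);
}
```

At the call site, `if (getValue(form, "foo", s) == Lookup.found)` reads closer to the intent than a comparison against `true`.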
In article <Xns9515C8A3CA1ACitsFarmer 63.105.9.61>, Farmer says...
> Why are there (almost) no complaints about D's support for empty arrays?
Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are:

(1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n > a.length

(2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero. I think, if we're going to have a model in which the statement a = null; will create an empty array, then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).

Arcane Jill
Jun 27 2004
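The two kinds of comparison Jill contrasts in (2) can be put side by side. The sketch below shows how present-day D behaves, where the identity operator `===` from this thread is spelled `is`; it says nothing about what the 2004 compiler did.

```d
unittest
{
    char[3] buf = "abc";
    char[] empty = buf[0 .. 0];   // length 0, pointer into buf
    char[] none  = null;          // length 0, null pointer

    // Equality compares contents, so any two zero-length arrays are equal.
    assert(empty == none);

    // Identity compares pointer and length, so these two are distinct.
    assert(empty !is none);
    assert(empty !is null);
    assert(none is null);
}
```

In other words, equality looks only at length and elements, as Jill asks for, while identity still lets code tell a null array from an empty one, as Regan asks for.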
On Sun, 27 Jun 2004 18:58:50 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:
> In article <Xns9515C8A3CA1ACitsFarmer 63.105.9.61>, Farmer says...
>> Why are there (almost) no complaints about D's support for empty arrays?
> Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are: (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n > a.length
This (now?) works.
    void main() {
        char[] a;
        a ~= "1";
        a ~= "2";
        a ~= "3";
        printf("%.*s\n",a[3..3]);
        printf("%.*s\n",a[2..3]);
        printf("%.*s\n",a[1..3]);
        printf("%.*s\n",a[0..3]);
    }
> (2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero.
I think this is correct.
> I think, if we're going to have a model in which the statement a = null; will create an empty array,
I think this is wrong. a = null should set the data to null and length to 0. It should *not* create an empty array.
> then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
We *need* to have *both* null and empty arrays. The reason is pretty simple:
- null means does not exist
- empty means exists, but has no value (or empty value)
This is important in situations like the original poster mentioned and in my experience for example...
When reading POST input from a web page, you get a string like so:
    Setting1=Regan+Heath&Setting2=&&
when requesting items you might have a function like:
    char[] getFormValue(char[] label);
the code to get the values for the above form might go:
    char[] s;
    s = getFormValue("Setting1"); // s is "Regan Heath"
    s = getFormValue("Setting2"); // s is ""
    s = getFormValue("Setting3"); // s is null
It is important the above code can tell that Setting3 was not passed in the form, so it can decide not to overwrite whatever current value that setting has, whereas it can tell Setting2 was passed and will overwrite the current value with a new blank one.
I think the problem with arrays is that a null array should not compare equal to an empty array. In other words the original post test(s)
    null1 == ""
    null1 == empty1
should be false.
Regan.
Jun 27 2004
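Regan's form example can be sketched so that the three cases stay distinguishable. The version below is an illustration in present-day D, not his code: it takes the query string as an extra (assumed) parameter, returns the raw value without decoding `+` into a space, and relies on the fact that a zero-length slice of the input still carries a non-null pointer.

```d
// Hypothetical variant of getFormValue: returns the raw value for `label`,
// a non-null empty string if the label is present but blank, and null if
// the label does not occur at all. No URL decoding is attempted.
string getFormValue(string query, string label)
{
    size_t start = 0;
    while (start <= query.length)
    {
        // Find the end of the current "name=value" pair.
        size_t end = start;
        while (end < query.length && query[end] != '&')
            ++end;
        string pair = query[start .. end];

        // Split the pair at the first '='.
        size_t eq = 0;
        while (eq < pair.length && pair[eq] != '=')
            ++eq;

        if (eq < pair.length && pair[0 .. eq] == label)
            return pair[eq + 1 .. $];   // may be empty, but points into query
        start = end + 1;
    }
    return null;                        // label was not submitted
}

unittest
{
    string post = "Setting1=Regan+Heath&Setting2=&&";

    assert(getFormValue(post, "Setting1") == "Regan+Heath");

    string blank = getFormValue(post, "Setting2");
    assert(blank.length == 0 && blank !is null);    // present but empty

    assert(getFormValue(post, "Setting3") is null); // not present
}
```

A caller that only cares about content can test `.length`, while one that must not overwrite an unsent setting can test `is null`.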
On Mon, 28 Jun 2004 10:06:18 +1200, Regan Heath wrote: [snip]
> We *need* to have *both* null and empty arrays. The reason is pretty simple:
> - null means does not exist
> - empty means exists, but has no value (or empty value)
Agreed. A nonexistent array is not the same as an array with no elements.
-- Derek Melbourne, Australia
Jun 27 2004
In article <opr99w0st25a2sq9 digitalmars.com>, Regan Heath says...
>> (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception,
> This (now?) works.
Indeed, I think it has always worked. It was just me misremembering the problem. I'll start again. What I MEANT was...

Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do. In particular, if a is an empty array, then &a[0] asserts, which means that code like this: intended to fill an array from a FILE*-type stream, will fall over if a is empty. And there's no reason why it should - fread is quite happy to be passed a length of zero. Same goes for functions like memset() and so on.

The fact of not being able to take &a[a.length] creates an awkwardness that we have to code around. The above example would have to be encased in an if test in order not to assert - and you might think: So what? This is no big deal. But having to make that explicit test time and time again can start to get annoying. It should not, in my opinion, be an error to evaluate &a[a.length];

Arcane Jill
Jun 28 2004
In article <cbpkes$1ip0$1 digitaldaemon.com>, Arcane Jill says...
> Indeed, I think it has always worked. It was just me misremembering the problem. I'll start again. What I MEANT was... Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do.
Yes it is. But I think it's the syntax that's the problem in this case. IIRC using the subscript operator (ie. [n]) dereferences the element. So what you're doing when you call &a[n] is calculating the address of the element at position n. Since no such element exists, the call fails. In C the correct thing to do would be to use (a+n) instead. Just to make sure I was right, I dug this quote out of the C++ standard (5.2.1): "The expression E1[E2] is identical (by definition) to *((E1)+(E2))."
> The fact of not being able to take &a[a.length] creates an akwardness that we have to code around.
A possibility would be to have the compiler treat &a[n] as a special case... since the address-of operator is present, it could treat this expression as equivalent to: "a+n" rather than "&*(a+n)"
Sean
Jun 28 2004
Arcane Jill wrote:
> The fact of not being able to take &a[a.length] creates an akwardness that we have to code around. The above example would have to be encased in an if test in order not to assert - and you might think: So what? This is no big deal. But having to make that explicit test time and time again can start to get annoying. It should not, in my opinion, be an error to evaluate &a[a.length];
Something which just occurred to me that would resolve this issue would be to add two properties to array types: begin and end. These properties would be pointer types which point to the beginning and end of the array's contents. (exactly like C++ iterators)
    T[] buffer = ...;
    // buffer.length makes more sense than end-begin in this case.
    // Bear with me: it's an example :)
    fread(buffer.begin, T.sizeof, buffer.end - buffer.begin, fileHandle);
-- andy
Jun 28 2004
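Andy's begin/end idea can be approximated today with two free functions, since present-day D lets a free function be called with property-like syntax. This is only a sketch under that assumption: `begin` and `end` are hypothetical helpers, not built-in array properties.

```d
// Hypothetical helpers; D arrays have no built-in begin/end properties.
T* begin(T)(T[] a) { return a.ptr; }
T* end(T)(T[] a)   { return a.ptr + a.length; } // one past the last element

unittest
{
    int[] buffer = [10, 20, 30];
    assert(buffer.end - buffer.begin == buffer.length);

    // Classic pointer-pair iteration; end is compared, never dereferenced.
    int total = 0;
    for (int* p = buffer.begin; p != buffer.end; ++p)
        total += *p;
    assert(total == 60);

    int[] none;                      // null array
    assert(none.begin is none.end);  // empty range, nothing to visit
}
```

Because `end` is computed from the pointer and length rather than by indexing, it involves no bounds check, which sidesteps the `&a[a.length]` problem from earlier in the thread.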
In article <cbprfd$1sq9$1 digitaldaemon.com>, Andy Friesen says...
> Something which just occurred to me that would resolve this issue would be to add two properties to array types: begin and end. These properties would be pointer types which point to the beginning and end of the array's contents. (exactly like C++ iterators)
This might be very handy. If so, I wouldn't mind seeing rbegin and rend parameters as well though. Plus, it raises the question of what they return for associative arrays.
Sean
Jun 28 2004
Sean Kelly wrote:
> In article <cbprfd$1sq9$1 digitaldaemon.com>, Andy Friesen says...
>> Something which just occurred to me that would resolve this issue would be to add two properties to array types: begin and end. These properties would be pointer types which point to the beginning and end of the array's contents. (exactly like C++ iterators)
> This might be very handy. If so, I wouldn't mind seeing rbegin and rend parameters as well though.
Huh? They're pointers... wouldn't rbegin == end and rend == begin? I think I missed the point...
> Plus, it raises the question of what they return for associative arrays.
The concept doesn't apply to associative arrays afaics, so they wouldn't exist.
Sam
Jun 29 2004
In article <cbrhd9$1a0o$1 digitaldaemon.com>, Sam McCall says...
>> This might be very handy. If so, I wouldn't mind seeing rbegin and rend parameters as well though.
> Huh? They're pointers... wouldn't rbegin == end and rend == begin? I think I missed the point...
Actually, rbegin == end-1 and rend == begin-1.
>> Plus, it raises the question of what they return for associative arrays.
> The concept doesn't apply to associative arrays afaics, so they wouldn't exist.
It does apply to associative arrays IMO. I iterate through the contents of such containers quite regularly in C++. I've done something similar with an iterator wrapper for associative arrays in D, but it would be nice to have this built-in if we move towards the iterator methodology.
Sean
Jun 29 2004
Sean Kelly wrote:
> In article <cbrhd9$1a0o$1 digitaldaemon.com>, Sam McCall says...
>>> This might be very handy. If so, I wouldn't mind seeing rbegin and rend parameters as well though.
>> Huh? They're pointers... wouldn't rbegin == end and rend == begin? I think I missed the point...
> Actually, rbegin == end-1 and rend == begin-1.
Oops. Yeah, this would be useful.
>>> Plus, it raises the question of what they return for associative arrays.
>> The concept doesn't apply to associative arrays afaics, so they wouldn't exist.
> It does apply to associative arrays IMO. I iterate through the contents of such containers quite regularly in C++. I've done something similar with an iterator wrapper for associative arrays in D, but it would be nice to have this built-in if we move towards the iterator methodology.
We're talking about pointers for low-level iteration; this doesn't apply to associative arrays, whose data structure is opaque. I don't think we're moving towards iterators, just talking about pointers. The fact that iterators pretend to be pointers in their syntax is neither here nor there ;) If you really want "official" iterators, there's always (or will always be) the DTL...
Sam
Jun 29 2004
In article <cbt5vu$kdb$1 digitaldaemon.com>, Sam McCall says...
> We're talking about pointers for low level iteration, this doesn't apply to associative arrays, who's data structure's opaque. I don't think we're moving towards iterators, just talking about pointers. The fact that iterators pretend to be pointers in their syntax is neither here nor threre ;)
This is easy enough to do with free functions anyway. Something like:
    alias char[][char[]] StrMap;
    StrMap map;
    Iterator!(Pair!(char[],char[])) i = begin!(StrMap)( map );
I'm sure the syntax could be improved but you get the idea. I've already experimented with such iterators for associative arrays and they work just fine.
Sean
Jun 30 2004
Arcane Jill <Arcane_member pathlink.com> wrote in news:cbpkes$1ip0$1 digitaldaemon.com:
> Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do.
The expression cast(elementtype*)a+n does that. E.g. to get rid of annoying bounds-checking you could write:
    // given ubyte[] a;
    fread(cast(ubyte*)a+0, ubyte.size, a.length, fp);
Farmer.
Jun 28 2004
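Farmer's workaround amounts to doing the arithmetic on a pointer instead of indexing the array. In present-day D the same thing is usually written with the array's `.ptr` property rather than a cast; the sketch below assumes that spelling, and `sum` is an invented example function that uses the one-past-the-end pointer only for comparison.

```d
// Sums an array by walking a pointer up to, but not including, the
// one-past-the-end address. The end pointer is compared, never dereferenced.
int sum(const(int)[] a)
{
    const(int)* p    = a.ptr;
    const(int)* stop = a.ptr + a.length; // no bounds check: plain pointer math

    int total = 0;
    for (; p != stop; ++p)
        total += *p;
    return total;
}

unittest
{
    assert(sum([1, 2, 3]) == 6);

    int[] empty = [7, 8][2 .. 2];  // empty, non-null
    assert(sum(empty) == 0);

    assert(sum(null) == 0);        // null array: p and stop are both null
}
```

No special case is needed for empty or null arrays, which is the awkwardness Jill was trying to avoid.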
On Mon, 28 Jun 2004 17:27:56 +0000 (UTC), Arcane Jill <Arcane_member pathlink.com> wrote:
> In article <opr99w0st25a2sq9 digitalmars.com>, Regan Heath says...
>>> (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception,
>> This (now?) works.
> Indeed, I think it has always worked. It was just me misremembering the problem. I'll start again. What I MEANT was... Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do. In particular, if a is an empty array, then &a[0] asserts, which means that code like this: intended to fill an array from a FILE*-type stream, will fall over if a is empty. And there's no reason why it should - fread is quite happy to be passed a length of zero. Same goes for functions like memset() and so on. The fact of not being able to take &a[a.length] creates an akwardness that we have to code around. The above example would have to be encased in an if test in order not to assert - and you might think: So what? This is no big deal. But having to make that explicit test time and time again can start to get annoying. It should not, in my opinion, be an error to evaluate &a[a.length];
Interestingly..
    void main() {
        char[] p,s;
        s.length = 10;
        printf("%08x\n",&s[0]);
        printf("%08x\n",&p[0]);
    }

    D:\D\src\build\temp>dmd arr.d
    d:\D\dmd\bin\..\..\dm\bin\link.exe arr,,,user32+kernel32/noi;
    D:\D\src\build\temp>arr
    007d0fd0
    Error: ArrayBoundsError arr.d(6)
So it seems Sean is indeed correct about what p[0] is doing (de-referencing the element)
Regan
Jun 28 2004
Arcane Jill wrote:
> In article <opr99w0st25a2sq9 digitalmars.com>, Regan Heath says...
>>> (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception,
>> This (now?) works.
> Indeed, I think it has always worked. It was just me misremembering the problem. I'll start again. What I MEANT was... Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do.
No, I disagree here. In general, that address would point to nothing. Reading there is pointless, writing is dangerous. If you want to append to a string by doing a low-level write to memory, then increment length first and write then. The way you could phrase it: In some cases it would be convenient if it were not an error to take that address, if it is then not used afterward. But still, I don't see that coding around that "limitation" is that much of an effort. It gives you a few if-clauses around expressions, so what?
Jun 29 2004
In article <cbr57k$p0m$1 digitaldaemon.com>, Norbert Nemec says...
>> Given that a is an array of length n, the expression &a[n] gives an array bounds exception. And I don't believe it should. Taking the address of the first byte beyond the end of an array can be a very useful thing to do.
> No, I disagree here. In general, that address would point to nothing. Reading there is pointless, writing is dangerous.
Such a pointer is never used for reading OR writing. It /is/, however, used in pointer comparison expressions, and in such context, is perfectly meaningful, and safe. But anyway, Farmer tells me I can write cast(elementtype*)a+n, so I'm happy.
> If you want to append to a string by doing a low-level write to memory,
I never said I wanted to do any such thing.
Arcane Jill
Jun 29 2004
Arcane Jill wrote:
> Such a pointer is never used for reading OR writing. It /is/, however, used in pointer comparison expressions, and in such context, is perfectly meaningful, and safe.
True, you have a point there - I really don't know what to think about it.
> But anyway, Farmer tells me I can write cast(elementtype*)a+n, so I'm happy.
Well - that's a workaround but not a clean solution.
Jun 29 2004
Norbert Nemec <Norbert.Nemec gmx.de> wrote in news:cbrogr$1jp7$1 digitaldaemon.com:
> Arcane Jill wrote: [snip]
>> But anyway, Farmer tells me I can write cast(elementtype*)a+n, so I'm happy.
> Well - that's a workaround but not a clean solution.
In Jill's example, a *C* function expects a pointer to anything, not a D-array. So, I think, it makes perfect sense to convert the D array to the pointer type first, and then do pointer arithmetic as in C. (If you need the behaviour of a pointer, use one.)
Farmer.
Jun 29 2004
Regan Heath <regan netwin.co.nz> wrote in news:opr99w0st25a2sq9 digitalmars.com: [snip]
> I think the problem with arrays is that a null array should not compare equal to an empty array. In other words the original post test(s) null1 == "" null1 == empty1 should be false.
Exactly, otherwise the equals() method would not be transitive. (Of course, we could also make (empty1 == null) evaluate to true by completely banning empty-arrays from the D sphere.)
Regards, Farmer.
Jun 28 2004
Arcane Jill <Arcane_member pathlink.com> wrote in news:cbn5da$vu1$1 digitaldaemon.com:
> In article <Xns9515C8A3CA1ACitsFarmer 63.105.9.61>, Farmer says...
>> Why are there (almost) no complaints about D's support for empty arrays?
> Actually, I think that D has got it right here. At least mostly. I'm happy with the fact that null counts as an empty array. But I do have SOME gripes. These are: (1) given that a is an array of length n, the expression a[n..n] gives an array bounds exception, and I don't believe it should. I would prefer that it simply evaluated to an empty string. I've lost count of the number of times I've had to put a special test for this case in various bits of code. It's a fairly normal thing to do, to have a pointer (or index in this case) to the first element BEYOND the last one in which you're interested, and to slice against it. Currently you get the assert if n == a.length. I don't believe it should assert unless n >= a.length
I'm a bit confused, since in my sample the array 'empty2' is created from a slice that points behind the array, and it didn't cause an array bounds exception. Or did you need empty slices that point at arbitrary memory locations?
> (2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero. I think, if we're going to have a model in which the statement a = null; will create an empty array, then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
I guess the rule here is simple: For value types (as the array handle is one) ==/equals() is exactly the same as ===/is. But why should we model arrays in a way that makes arrays less powerful and requires *additional* code to make the model work correctly?
Regards, Farmer.
Jun 27 2004
Farmer <itsFarmer. freenet.de> wrote in news:Xns951699362221itsFarmer 63.105.9.61:
> Arcane Jill <Arcane_member pathlink.com> wrote in news:cbn5da$vu1$1 digitaldaemon.com:
>> In article <Xns9515C8A3CA1ACitsFarmer 63.105.9.61>, Farmer says...
>> (2) I think it is wrong that the test (a == null) will return true if and only if BOTH the length AND the address are zero. I think, if we're going to have a model in which the statement a = null; will create an empty array, then (a == null) should return true if a /is/ an empty array. That is, only the length should be tested, not the address. (If you want to test both parts, well there's always a === null).
> I guess the rule here is simple: For value types (as the array handle is one) ==/equals() is exactly the same as ===/is.
My mistake, forget about this sentence, it is utter rubbish: For primitive value types, ===/is behaves like ==/equals(), rather than the other way round. Furthermore, array handles aren't primitive types.
Jun 28 2004