digitalmars.D - Walter - Should we use arrays as Null?
- AJG (36/36) Jul 28 2005 Hi Walter,
- Derek Parnell (34/76) Jul 28 2005 I don't understand your concern. Both these are allowed and both work as
- Regan Heath (4/8) Jul 29 2005 I hope not. Someone once mentioned it as a goal Walter had. I've not hea...
- Regan Heath (40/45) Jul 29 2005 Why? I mean, I know what you mean: What's the point in having a
- AJG (6/17) Jul 29 2005 Something _is_ wrong, IMHO.
- Ben Hinkle (11/14) Jul 29 2005 I don't remember what Walter said but I hope he thinks about the options...
- AJG (55/122) Jul 29 2005 Sure, they both "work" to a certain extent. But if you try to access a p...
- Derek Parnell (59/167) Jul 29 2005 Yes ... but so what? Is that behavior hurting anyone?
- AJG (71/146) Jul 29 2005 Well, if we look at it that way, then everything becomes a lot easier, d...
- Niko Korhonen (9/14) Jul 29 2005 Do you want all operations on a null array, such as:
- AJG (6/17) Jul 29 2005 For starters, yes. Why should objects be different than arrays when they...
- Ben Hinkle (7/28) Jul 29 2005 I think you'll have a hard time getting lots of support for that. I much...
- AJG (10/26) Jul 29 2005 I agree that there won't be much support for this. I don't suppose it wi...
- Ben Hinkle (6/38) Jul 29 2005 no
- AJG (14/14) Jul 29 2005 Hi Ben,
- Ben Hinkle (24/40) Jul 29 2005 no - an array is two pieces of information: (1) a pointer to the data an...
- Shammah Chancellor (14/57) Jul 29 2005 I've been following this, but have as of yet been unable to express my p...
- AJG (23/35) Jul 29 2005 If this is so, it is unfortunate. I'm asking Walter to clarify this, tha...
- AJG (20/43) Jul 29 2005 Hi,
- Ben Hinkle (24/82) Jul 29 2005 Yes - they "have reference semantics" in the sense that they act on the ...
- AJG (34/74) Jul 29 2005 Just to make sure I understand:
- Ben Hinkle (26/81) Jul 29 2005 yes - aside from the fact that you should dup the "123" before trying to...
- AJG (27/86) Jul 30 2005 So then .length is related to slicing? How does the semantics of .length...
- Ben Hinkle (15/74) Jul 30 2005 I recommend you pursue some of your ideas where length is manipulated by
- AJG (14/66) Jul 30 2005 Would an example do? I may not be an expert regarding slicing, but I cou...
- Shammah Chancellor (34/38) Jul 30 2005 Because sometimes it needs to reallocate memory. Why don't you look at ...
- Ben Hinkle (14/26) Jul 31 2005 Let me step through some choices that I was hoping you would do. Let's s...
- AJG (19/31) Jul 31 2005 I don't think this change in the way arrays operate internally would be
- Derek Parnell (20/36) Jul 30 2005 You are wrong here because 'B.someProperty' operates on B not A.
- AJG (12/33) Jul 30 2005 Um... I said "except .length" for a reason. That's my very point. That ....
- Shammah Chancellor (34/56) Jul 30 2005 No, All others do _NOT_ operate on A. They happen to operate on the sam...
- AJG (37/57) Jul 30 2005 You are simply splitting hairs here. You are arguing language semantics....
- Shammah Chancellor (80/137) Jul 31 2005 I am not splitting hairs. I gave you a very valid reason why a and b ar...
- Carlos Santander (19/29) Jul 30 2005 First of all, I don't agree with AJG: I think D arrays are very well the...
- Derek Parnell (30/50) Jul 29 2005 This is where I think we separate. I don't think that D arrays are
- AJG (31/69) Jul 29 2005 Well, my "in theory" is actually pretty down-to-earth. I mean reference
- Mike Parker (6/20) Jul 30 2005 Wasn't it you who posted elsewhere in this thread that change is good? ;...
- Derek Parnell (7/26) Jul 30 2005 I think I have the solution. Rename them. Don't call them arrays. Call t...
- Niko Korhonen (22/27) Jul 31 2005 Indeed. I think the array semantics where you can't access a property of...
- Derek Parnell (16/33) Aug 01 2005 Agreed. The way I look at it is that a D array variable *contains* a
- J Thomas (7/7) Jul 30 2005 so wait, you basically want an array to be a pointer to data containing
- AJG (22/30) Jul 30 2005 No. I would like it to be that way, but I know there wouldn't be support...
- Derek Parnell (9/18) Jul 30 2005 There might have been be an argument that .reverse and .sort should foll...
- Ben Hinkle (6/22) Aug 01 2005 Besides those reasons writing "B.reverse" to me indicates you want to af...
- Shammah Chancellor (12/37) Aug 01 2005 Utterly confusing! reserve(b) and B.reverse have nothing in their name ...
- Ben Hinkle (4/53) Aug 01 2005 You've lost me. Are you proposing a change to any existing behavior or
- Shammah Chancellor (18/54) Aug 01 2005 I wasn't proposing a change at all. I was disagreing with Derek. I thi...
- AJG (10/21) Aug 01 2005 IMHO, and for consistency, it should never do COW. If a user wants to do...
- Shammah Chancellor (16/36) Aug 01 2005 While I agree with you that it could be annoying, the problem is that ar...
- Ben Hinkle (12/77) Aug 01 2005 I didn't read Derek's post as proposing reverse use COW. He was pointing...
- Shammah Chancellor (18/102) Aug 01 2005 You're right, he didn't. I was contesting that tolower(b) and b.tolower...
- Ben Hinkle (13/134) Aug 01 2005 That is what I'm implying - and that's what many std.string functions do...
- Shammah Chancellor (20/159) Aug 01 2005 Exactly. Quite often when I want to replace one thing, I want to replac...
- Ben Hinkle (24/56) Aug 01 2005 I don't know if you followed the recent COW/const/inplace performance
- Shammah Chancellor (19/78) Aug 01 2005 I think this would be a bad choice. It might be wise with respect to
- Ben Hinkle (2/16) Aug 02 2005 There's a link at the bottom of the phobos page for the wiki. I don't kn...
- Derek Parnell (16/30) Aug 01 2005 Hi Shammah,
- Shammah Chancellor (26/52) Aug 01 2005 No,no I understood that. I'm just being argumentative. I don't agree w...
Hi Walter,

This is something that's confused me quite a bit and I think you are the only one that can settle it for good. The question is whether we should be using null as a special array value. Maybe it can be broken down to pieces:

1) Why can objects be null but arrays can't (given that _both_ are by-ref)?

IMHO this is inconsistent. The former makes sense, the latter is weird. Another way of looking at it is: why dumb-down arrays but not objects?

2) Is it a technical limitation (for now)?

3) Is support for "proper" null arrays planned? I, for one, would _like_ to see support for both null arrays and continued support for null objects.

As Regan has argued (and now I'm a believer), the null special value is very useful, and we should keep this distinction (vs. empty).

Perhaps you can clarify whether this is going to happen properly or not. In my view, proper array nulls do _not_ exist. What we have right now is very confusing because sometimes we can use the null value and sometimes we can't. It is also fickle because the null value is tied to the pointer.

Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).

Some of the problems (not technically "bugs"):
- array.length = 0 sets the pointer to null.
- static int[0] is not null, but new int[0] is.
- .dup of an empty string (static or not) also sets the pointer to null.
- static arrays can't have null pointers.

4) What exactly does [if (array)] mean (or theoretically should mean)?
- if (array.ptr)
- if (array.length)
- if (array == null)
- if (array is null)
- or some combination thereof?

==============

In short, I think it would be dangerous to use this feature if you are planning on subtly phasing it out. Could you please shed some light on the situation?

Thanks!
--AJG.
Jul 28 2005
On Fri, 29 Jul 2005 05:30:52 +0000 (UTC), AJG wrote:

> Hi Walter,

I know I'm not the big W., but here's my take on this anyhow ;-)

> This is something that's confused me quite a bit and I think you are the only one that can settle it for good. The question is whether we should be using null as a special array value. Maybe it can be broken down to pieces:
> 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?

I don't understand your concern. Both these are allowed and both work as expected; that is to say in both cases, after the assignment, the variable does not reference anything. What more than this are you expecting?

> IMHO this is inconsistent. The former makes sense, the latter is weird. Another way of looking at it is: why dumb-down arrays but not objects?

Huh? I still can't see what you are worried about.

  Object obj = null;  // Makes sense.
  int[] array = null; // Also makes sense (to me anyhow).

> 2) Is it a technical limitation (for now)?

Is *what* a limitation?

> 3) Is support for "proper" null arrays planned? I, for one, would _like_ to see support for both null arrays and continued support for null objects.

Who says that this support is going away?

> As Regan has argued (and now I'm a believer), the null special value is very useful, and we should keep this distinction (vs. empty).

Absolutely! Both are good and distinct concepts.

> Perhaps you can clarify whether this is going to happen properly or not. In my view, proper array nulls do _not_ exist.

But they do. If array.ptr is null, then array is a null array.

> What we have right now is very confusing because sometimes we can use the null value and sometimes we can't.

Huh? When can't you use it?

> It is also fickle because the null value is tied to the pointer.

Huh? Of course it is. What else could it be?

> Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).

I don't think that Walter is planning on this.

> Some of the problems (not technically "bugs"):
> - array.length = 0 sets the pointer to null.

This is a bug and Walter has said so. He will fix this.

> - static int[0] is not null, but new int[0] is.

Static arrays cannot be null (not reference anything) by their very nature. A static array must always reference some RAM somewhere. What do you think that 'new int[0]' should return?

> - .dup of an empty string (static or not) also sets the pointer to null.

This is because of the bug.

> - static arrays can't have null pointers.

Of course not. The 'static' attribute means that they occupy RAM that is allocated at compile time.

> 4) What exactly does [if (array)] mean (or theoretically should mean)?
> - if (array.ptr)
> - if (array.length)
> - if (array == null)
> - if (array is null)
> - or some combination thereof?

It actually means ...

  if (array.ptr !is null || array.length != 0)

which is a bit redundant because we can never have the situation where the ptr is null and the length is > 0.

> ==============
> In short, I think it would be dangerous to use this feature if you are planning on subtly phasing it out. Could you please shed some light on the situation?

Once Walter fixes the bug in which setting the length to zero also clears the ptr, I think we will have what you want.

Hope I've helped.

--
Derek
Melbourne, Australia
29/07/2005 4:17:38 PM
Jul 28 2005
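The null-versus-empty distinction discussed in the post above can be made concrete with a small sketch. It assumes a current D compiler rather than the 2005 release the thread is about, and `describe` is a hypothetical helper name, not a library function. It deliberately uses explicit `.ptr` and `.length` checks instead of `if (array)`, since the exact meaning of that shorthand is what the thread is questioning.

```d
import std.stdio;

// Classify an array the way the thread separates "null" from "empty".
// `describe` is a hypothetical helper, not part of the language or Phobos.
string describe(const int[] arr)
{
    if (arr.ptr is null)
        return "null";      // no data pointer at all
    if (arr.length == 0)
        return "empty";     // points at storage, but holds no elements
    return "non-empty";
}

void main()
{
    int[] nothing = null;
    int[] backing = [1, 2, 3];
    int[] empty = backing[0 .. 0]; // zero-length slice of existing data

    writeln(describe(nothing));
    writeln(describe(empty));
    writeln(describe(backing));

    // `==` compares contents, so a null array and an empty slice compare equal;
    // `is` compares the (ptr, length) pair itself, so they are not identical.
    assert(empty == nothing);
    assert(empty !is nothing);
    assert(nothing is null);
}
```

The point of the sketch is only that content equality (`==`) cannot tell the two cases apart, while identity (`is`) and `.ptr` can.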
On Fri, 29 Jul 2005 16:39:54 +1000, Derek Parnell <derek psych.ward> wrote:

>> Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).
>
> I don't think that Walter is planning on this.

I hope not. Someone once mentioned it as a goal Walter had. I've not heard from Big W himself.

Regan
Jul 29 2005
On Fri, 29 Jul 2005 16:39:54 +1000, Derek Parnell <derek psych.ward> wrote:

>> - static int[0] is not null, but new int[0] is.
>
> static arrays cannot be null (not reference anything) by their very nature. A static array must always reference some RAM somewhere.

Why? I mean, I know what you mean: what's the point in having a non-existent static array? A static array always exists, therefore cannot be null. But doesn't that then make:

  int[0] a;

illegal? I thought about this for a sec and decided that no, to make it illegal would likely annoy the heck out of a template programmer some time in the future.

But, it can be null, can't it? I mean the data pointer, not the array 'reference'. I'm not sure an 'array reference' even exists for static arrays. My impression is that a static array is simply implemented as a pointer; the length property, which is static, is 'macro replaced' at compile time. In which case, the data pointer could be null, right? Statements like a.length would be fine, it's macro replaced after all. Statements like a[0] = 'a'; would crash, or give array bounds errors, just like any other array would.

Maybe I'm missing some secret of their implementation.

> What do you think that 'new int[0]' should return?

Well, at first glance 'null'. You're asking for (0 * int.sizeof) memory which is 0 bytes. But, have you tried it in C/C++? The MSDN documentation states:

  "If size is 0, malloc allocates a zero-length item in the heap and returns a valid pointer to that item"

There is nothing in the docs for "new" but a quick experiment showed the same behaviour as malloc for the statement "new int[0]". Ilya wondered immediately how it was possible to have a "zero-length item in the heap" so he tried DMC and Cygwin-GCC and found: "both returned at least 8 bytes."

If you step back and just look at it at a conceptual level you'd expect the statements:

  int[0] a;
  int[] a = new int[0];

to result in the same thing, surely? I mean 'a' is an instance of an 'int[0]' in both cases (whatever that is decided to be).

Currently they don't and there appear to be 3 choices:
- leave it as is, nothing is wrong.
- make "int[0] a" null.
- make "new int[0]" non-null.

Regan
Jul 29 2005
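The two zero-length declarations compared in the post above can be inspected directly. This is a minimal sketch, assuming a current D compiler; whether either data pointer comes out null is exactly the point under debate and may differ between compiler versions, so the program prints the result instead of asserting it.

```d
import std.stdio;

void main()
{
    int[0] fixedZero;               // zero-length fixed-size array
    int[] dynamicZero = new int[0]; // zero-length allocation request

    // Both are guaranteed to hold no elements.
    assert(fixedZero.length == 0);
    assert(dynamicZero.length == 0);

    // Nullness of the data pointers is implementation behaviour:
    // report it rather than assume an answer.
    writeln("int[0] ptr is null:     ", fixedZero.ptr is null);
    writeln("new int[0] ptr is null: ", dynamicZero.ptr is null);
}
```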
Hi,

> If you step back and just look at it at a conceptual level you'd expect the statements:
>
>   int[0] a;
>   int[] a = new int[0];
>
> to result in the same thing, surely? I mean 'a' is an instance of an 'int[0]' in both cases (whatever that is decided to be).

Exactly.

> Currently they don't and there appear to be 3 choices:
> - leave it as is, nothing is wrong.

Something _is_ wrong, IMHO.

> - make "int[0] a" null.

Two wrongs don't a right make. ;)

> - make "new int[0]" non-null.

Bingo!

> Regan

--AJG.
Jul 29 2005
>> Some of the problems (not technically "bugs"):
>> - array.length = 0 sets the pointer to null.
>
> This is a bug and Walter has said so. He will fix this.

I don't remember what Walter said but I hope he thinks about the options. There are three factors involved (that I can see):
1) setting length to/from 0
2) slicing to a 0 length array and appending to a 0 length array
3) the +1 that gets added to every array allocation which makes powers-of-2 allocations the most inefficient (takes 2x the memory of what you asked for)

Currently item 3 is added because one can slice off the end of an array and then ask to grow that. Should 2 behave like 1 or should 1 behave like 2? I could imagine a solution where appending to a zero-length array reallocs like setting the length from 0 reallocs. In any case there isn't a pain-free solution.
Jul 29 2005
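Factor 2 in the post above, slicing down to zero length and then appending, can be sketched as follows. This assumes a current D runtime, where appending to a slice that does not end at the end of its allocation reallocates; the 2005 implementation discussed in the thread may have behaved differently. The function name is hypothetical.

```d
// Returns true if appending to a zero-length slice left the original data untouched.
// Hypothetical helper for illustration only.
bool appendLeavesOriginalIntact()
{
    int[] original = [1, 2, 3];
    int[] head = original[0 .. 0]; // zero-length slice that still points at the data
    assert(head.ptr is original.ptr);

    head ~= 99;                    // append to the empty slice

    // The append must not overwrite original[0].
    return original == [1, 2, 3] && head == [99];
}

void main()
{
    assert(appendLeavesOriginalIntact());
}
```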
Hi Derek,

> I know I'm not the big W., but here's my take on this anyhow ;-)

>> This is something that's confused me quite a bit and I think you are the only one that can settle it for good. The question is whether we should be using null as a special array value. Maybe it can be broken down to pieces:
>> 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
>
> I don't understand your concern. Both these are allowed and both work as expected; that is to say in both cases, after the assignment, the variable does not reference anything. What more than this are you expecting?

Sure, they both "work" to a certain extent. But if you try to access a property on a null object, that's illegal. On a null array, it's not. That's not a null array. That's a pseudo-null reference that never goes away.

>> IMHO this is inconsistent. The former makes sense, the latter is weird. Another way of looking at it is: why dumb-down arrays but not objects?
>
> Huh? I still can't see what you are worried about.

Arrays are dumbed-down so that you can do things like:

  foreach (char c; null)
  {
      // do something
  }

NULL, in my opinion, is _not_ the same as empty. BUT, the above operation makes it so. Instead, it should throw an exception at the very least, or even better, it could be detected at compile-time.

>> 2) Is it a technical limitation (for now)?
>
> Is *what* a limitation?

The distinction I made before between the nullness of objects (which is complete), and that of arrays (which is incomplete). I asked this because perhaps it was due to the way D properties worked (more like functions), or somesuch, meaning, it was not _intended_ to be that way.

>> 3) Is support for "proper" null arrays planned? I, for one, would _like_ to see support for both null arrays and continued support for null objects.
>
> Who says that this support is going away?

If Walter decides emptiness and existence should be one. This already happens in the language. Maybe it's due to bugs, maybe not. That's why I asked Walter for his "vision," if you will, regarding arrays and nulls. If array null disappears, it's likely object null will also disappear. Both of these worry me. But if indeed they are just bugs, then why doesn't Walter say so?

>> As Regan has argued (and now I'm a believer), the null special value is very useful, and we should keep this distinction (vs. empty).
>
> Absolutely! Both are good and distinct concepts.

In theory, yes. In D, not entirely. Please see below:

>> Perhaps you can clarify whether this is going to happen properly or not. In my view, proper array nulls do _not_ exist.
>
> But they do. If array.ptr is null, then array is a null array.

Why should we have to resort to array.ptr for nullness? Why can't the _array itself_ be null? An object _can_ be null by itself, no need to check an "object.ptr". In fact, on a null object .ptr would throw. You have to acknowledge this is a significant difference.

>> What we have right now is very confusing because sometimes we can use the null value and sometimes we can't.
>
> Huh? When can't you use it?

I can't use it when the "bugs" get in my way. And they just so happen to get in the way a lot. I'm working with databases right now, and essentially there's no way to have a string represent a DBNULL, because when I dup an empty string, it _too_ becomes NULL.

>> It is also fickle because the null value is tied to the pointer.
>
> Huh? Of course it is. What else could it be?

It could be, say, a simple boolean. Or, it could be, say, like objects. Objects don't rely on object.ptr, why should arrays?

>> Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).
>
> I don't think that Walter is planning on this.

I certainly hope not, but how can we be sure? This is why I asked.

>> Some of the problems (not technically "bugs"):
>> - array.length = 0 sets the pointer to null.
>
> This is a bug and Walter has said so. He will fix this.

Just out of curiosity, is there a post that I could read regarding this? I'd really like to see what he said.

>> - static int[0] is not null, but new int[0] is.
>
> static arrays cannot be null (not reference anything) by their very nature.

Their very nature says nothing of nullness. It just means allocate in a different area of memory.

> A static array must always reference some RAM somewhere.

Why? Why can't it reference null? Conceptually, I don't see a problem. But maybe this is one of the "technical limitations" I was talking about.

> What do you think that 'new int[0]' should return?

It should return a NON-null empty array. In current terminology:

  int[] arr = new int[0];
  if (arr)        // this should be TRUE.
  if (arr.length) // this should be FALSE.

>> - .dup of an empty string (static or not) also sets the pointer to null.
>
> This is because of the bug.

Well, this subtle bug renders DB-null impossible because as it happens .dups are fairly common. This is what I said about it being "fickle."

  char[] s = "";    // here you have it.
  char[] p = s.dup; // now you don't. very fickle.

>> - static arrays can't have null pointers.
>
> Of course not. The 'static' attribute means that they occupy RAM that is allocated at compile time.

This is a technical limitation. Once again, conceptually, it should be able to point to static null just as well. Perhaps allocate 0-bytes? In other words, this is a problem with the implementation. The language itself shouldn't be limited because of this.

>> 4) What exactly does [if (array)] mean (or theoretically should mean)?
>> - if (array.ptr)
>> - if (array.length)
>> - if (array == null)
>> - if (array is null)
>> - or some combination thereof?
>
> It actually means ...
>
>   if (array.ptr !is null || array.length != 0)
>
> which is a bit redundant because we can never have the situation where the ptr is null and the length is > 0.

IIRC this was deduced from a single disassembly, wasn't it? Is it _always_ the same thing (static/dynamic/associative)?

>> ==============
>> In short, I think it would be dangerous to use this feature if you are planning on subtly phasing it out. Could you please shed some light on the situation?
>
> Once Walter fixes the bug in which setting the length to zero also clears the ptr, I think we will have what you want. Hope I've helped.

And the very important duping "bug" (for DBs). And the inconsistency with static arrays. And I'm sure I could find some more problems. But first I need to know whether they are problems in the first place. Only Walter knows...

Cheers,
--AJG.
Jul 29 2005
On Fri, 29 Jul 2005 14:19:06 +0000 (UTC), AJG wrote:

> Hi Derek,
>
>> I know I'm not the big W., but here's my take on this anyhow ;-)
>
>>> This is something that's confused me quite a bit and I think you are the only one that can settle it for good. The question is whether we should be using null as a special array value. Maybe it can be broken down to pieces:
>>> 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
>>
>> I don't understand your concern. Both these are allowed and both work as expected; that is to say in both cases, after the assignment, the variable does not reference anything. What more than this are you expecting?
>
> Sure, they both "work" to a certain extent. But if you try to access a property on a null object, that's illegal. On a null array, it's not. That's not a null array. That's a pseudo-null reference that never goes away.

Yes ... but so what? Is that behavior hurting anyone?

Object properties are user-defined and all take 'this' as an automatic argument, thus they require an instance. Array properties are built-in to D. The 'array' is the instance. Don't confuse its elements as being instances of the array. Thus 'int[] array;' creates an instance of the array even though it has no data yet. And because the instance exists, you can use its properties.

There is no inconsistency here. I think you have merged object nullness and array nullness into the same meaning. But they are not the same thing. A null object is a placeholder into which you can later store a reference to an object instance. A null array is an instance of an array that has no data.

>>> IMHO this is inconsistent. The former makes sense, the latter is weird. Another way of looking at it is: why dumb-down arrays but not objects?
>>
>> Huh? I still can't see what you are worried about.
>
> Arrays are dumbed-down so that you can do things like:
>
>   foreach (char c; null)
>   {
>       // do something
>   }
>
> NULL, in my opinion, is _not_ the same as empty. BUT, the above operation makes it so. Instead, it should throw an exception at the very least, or even better, it could be detected at compile-time.

You use the phrase 'dumbed-down' whereas I see this as a smart thing. And just because the coder does some weirdo foreach() statement, doesn't mean the language is wrong.

And by the way, your code example does produce a compiler error - "foreach: void* is not an aggregate type". To get it to compile you have to use 'foreach(char c; cast(char[])null) {};' which shows to me that somebody with a big stick needs to chat with the coder.

>>> 2) Is it a technical limitation (for now)?
>>
>> Is *what* a limitation?
>
> The distinction I made before between the nullness of objects (which is complete), and that of arrays (which is incomplete). I asked this because perhaps it was due to the way D properties worked (more like functions), or somesuch, meaning, it was not _intended_ to be that way.

But arrays are not classes or objects. So what if they are both reference types. They are still not the same beasties.

>>> 3) Is support for "proper" null arrays planned? I, for one, would _like_ to see support for both null arrays and continued support for null objects.
>>
>> Who says that this support is going away?
>
> If Walter decides emptiness and existence should be one. This already happens in the language. Maybe it's due to bugs, maybe not. That's why I asked Walter for his "vision," if you will, regarding arrays and nulls. If array null disappears, it's likely object null will also disappear. Both of these worry me. But if indeed they are just bugs, then why doesn't Walter say so?

That could be said about lots of things ;-)

>>> As Regan has argued (and now I'm a believer), the null special value is very useful, and we should keep this distinction (vs. empty).
>>
>> Absolutely! Both are good and distinct concepts.
>
> In theory, yes. In D, not entirely. Please see below:

I distinctly remember reading a note from Walter saying that he was surprised that setting the length to zero also nulled the pointer. He has code in Phobos that assumes that this is not the right behavior.

>>> Perhaps you can clarify whether this is going to happen properly or not. In my view, proper array nulls do _not_ exist.
>>
>> But they do. If array.ptr is null, then array is a null array.
>
> Why should we have to resort to array.ptr for nullness? Why can't the _array itself_ be null?

Because it's like saying, why can't an object instance be null. The array IS an instance. A null array means something different to a null object.

> An object _can_ be null by itself, no need to check an "object.ptr". In fact, on a null object .ptr would throw. You have to acknowledge this is a significant difference.

Yes it is a difference. So what? Learn it and move on. This is D, not C/C++.

>>> What we have right now is very confusing because sometimes we can use the null value and sometimes we can't.
>>
>> Huh? When can't you use it?
>
> I can't use it when the "bugs" get in my way. And they just so happen to get in the way a lot. I'm working with databases right now, and essentially there's no way to have a string represent a DBNULL, because when I dup an empty string, it _too_ becomes NULL.

Yep. Been there, done that. I just wish he'd fix this bug. It's very easy to fix.

>>> It is also fickle because the null value is tied to the pointer.
>>
>> Huh? Of course it is. What else could it be?
>
> It could be, say, a simple boolean. Or, it could be, say, like objects. Objects don't rely on object.ptr, why should arrays?

Because they are arrays and not class instances.

>>> Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).
>>
>> I don't think that Walter is planning on this.
>
> I certainly hope not, but how can we be sure? This is why I asked.
>
>>> Some of the problems (not technically "bugs"):
>>> - array.length = 0 sets the pointer to null.
>>
>> This is a bug and Walter has said so. He will fix this.
>
> Just out of curiosity, is there a post that I could read regarding this? I'd really like to see what he said.

Yes, but I don't know how to search for it.

>>> - static int[0] is not null, but new int[0] is.
>>
>> static arrays cannot be null (not reference anything) by their very nature.
>
> Their very nature says nothing of nullness. It just means allocate in a different area of memory.

By 'static' are you meaning non-dynamic arrays or single-instance arrays? For example, which of these lines are static to you?

  void func()
  {
      int[] a;
      int[1] b;
      static int[1] c;
  }

To me, I only call array 'c' a static array. The array 'a' is a dynamic(-length) array and array 'b' is a fixed-length array. But array 'a' and 'b' are not single-instance arrays. After checking with the usage in D itself, it seems that D uses static ambiguously when it comes to arrays.

>> A static array must always reference some RAM somewhere.
>
> Why? Why can't it reference null? Conceptually, I don't see a problem. But maybe this is one of the "technical limitations" I was talking about.

Because static arrays are allocated RAM at compile time and they reference themselves. Because they exist they can't be null. Given ...

  static int[1] x;

You will find that ...

  x.ptr == &x

And because &x will always return a non-null, then x.ptr is always non-null.

--
Derek Parnell
Melbourne, Australia
30/07/2005 2:09:05 AM
Jul 29 2005
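The three declarations from the post above, and the claim that a fixed-length array's `.ptr` is simply the address of the array itself, can be checked with a short sketch. It assumes a current D compiler; the cast is needed because `&b` has type `int[1]*` rather than `int*`, and the function names are hypothetical.

```d
// A dynamic array starts out as a null (ptr, length) pair.
bool dynamicStartsNull()
{
    int[] a;
    return a.ptr is null && a.length == 0;
}

// A fixed-length local array owns its storage, so .ptr is its own address.
bool fixedPointsAtItself()
{
    int[1] b;
    return b.ptr is cast(int*) &b && b.ptr !is null;
}

// The same holds for a fixed-length array with static storage.
bool staticPointsAtItself()
{
    static int[1] c;
    return c.ptr is cast(int*) &c && c.ptr !is null;
}

void main()
{
    assert(dynamicStartsNull());
    assert(fixedPointsAtItself());
    assert(staticPointsAtItself());
}
```

Only the dynamic array has a separate pointer that can be null; the other two are their own storage, which is the sense in which they "reference themselves".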
Hi Derek,

>> Sure, they both "work" to a certain extent. But if you try to access a property on a null object, that's illegal. On a null array, it's not. That's not a null array. That's a pseudo-null reference that never goes away.
>
> Yes ... but so what? Is that behavior hurting anyone?

Well, if we look at it that way, then everything becomes a lot easier, doesn't it? Whether it hurts anyone or not is not the way to build a language. There are things that are correct, and those that are not. Arrays, as it stands, _break_ reference semantics. I don't know whether this hurts anyone or not, but it is certainly inconsistent, and it is, in my view, simply incorrect.

> Object properties are user-defined and all take 'this' as an automatic argument, thus they require an instance. Array properties are built-in to D. The 'array' is the instance.

Technically speaking, this is half-right. Semantically speaking, I think this is wrong:

  int[] arr        // This is the reference.
      = new int[0] // _This_ is the array.

> Don't confuse its elements as being instances of the array.

I never did. Elements are fine the way they are. But by your logic, perhaps we should be able to do this:

Or wouldn't you say this is wrong? Does it "hurt" anyone? Nah. In fact, it will help by preventing those annoying ArrayOutOfBounds thingamajjigs.

> Thus 'int[] array;' creates an instance of the array even though it has no data yet. And because the instance exists, you can use its properties.

This just doesn't make sense. int[] array creates a _reference_. That's the very definition of a reference. That's why arrays are reference-types. It's essentially a nicer version of a pointer.

> There is no inconsistency here. I think you have merged object nullness and array nullness into the same meaning. But they are not the same thing. A null object is a placeholder into which you can later store a reference to an object instance. A null array is an instance of an array that has no data.

If this is so, then arrays can't be called reference types. That's not what references do. Frankly, I wouldn't know what the heck to call arrays if these are the semantics we're supposed to follow.

> You use the phrase 'dumbed-down' whereas I see this as a smart thing. And just because the coder does some weirdo foreach() statement, doesn't mean the language is wrong.

It's not whether it's a smart thing or a dumb thing. I see it as being a dumbing down, but that's not the point. The point is that it muddies the distinction between _empty_ and _non-existent_.

Conceptually, you can't iterate thru a non-existent array. It doesn't exist. It should be a bug. Conceptually, you _can_ iterate thru an empty array. It exists and has no elements, thus no iteration would happen, but the construct is valid.

With this "smart" feature, the two are fused into one. Empty and non-existent can _both_ be iterated thru. They both produce the same result: 0 iterations. This I think is incorrect.

  0 iterations == 0 elements // correct
  0 iterations == null       // incorrect

> And by the way, your code example does produce a compiler error - "foreach: void* is not an aggregate type". To get it to compile you have to use 'foreach(char c; cast(char[])null) {};' which shows to me that somebody with a big stick needs to chat with the coder.

Sorry, my bad. I don't have access to DMD. But you know what I meant:

  char[] nullArray = null;
  foreach (char c; nullArray) { /* do something */ }

>>>> 2) Is it a technical limitation (for now)?
>>>
>>> Is *what* a limitation?
>>
>> The distinction I made before between the nullness of objects (which is complete), and that of arrays (which is incomplete). I asked this because perhaps it was due to the way D properties worked (more like functions), or somesuch, meaning, it was not _intended_ to be that way.
>
> But arrays are not classes or objects. So what if they are both reference types. They are still not the same beasties.

Certainly they are not the same. But they both have the same "nature" as you put it: references. As it is, arrays are breaking reference behaviour too, as my example above showed. Or do you not agree that arrays are reference types either?

> I distinctly remember reading a note from Walter saying that he was surprised that setting the length to zero also nulled the pointer. He has code in Phobos that assumes that this is not the right behavior.

This is all very circumstantial, but oh well...

>>>> Perhaps you can clarify whether this is going to happen properly or not. In my view, proper array nulls do _not_ exist.
>>>
>>> But they do. If array.ptr is null, then array is a null array.
>>
>> Why should we have to resort to array.ptr for nullness? Why can't the _array itself_ be null?
>
> Because it's like saying, why can't an object instance be null. The array IS an instance. A null array means something different to a null object.

Once again, this view is incorrect. An array can be both a reference and an instance.

  int[] arr        // This is the reference.
      = new int[0] // _This_ is the array.

>> An object _can_ be null by itself, no need to check an "object.ptr". In fact, on a null object .ptr would throw. You have to acknowledge this is a significant difference.
>
> Yes it is a difference. So what? Learn it and move on. This is D, not C/C++.

Why learn and move on from something that is clearly wrong? I'd rather fix it, thank you very much ;)

> Yep. Been there, done that. I just wish he'd fix this bug. It's very easy to fix.

It's very frustrating. I would sue this bug if I could. :p

>>>> It is also fickle because the null value is tied to the pointer.
>>>
>>> Huh? Of course it is. What else could it be?
>>
>> It could be, say, a simple boolean. Or, it could be, say, like objects. Objects don't rely on object.ptr, why should arrays?
>
> Because they are arrays and not class instances.

Assuming the _wrong_ semantics stay in place, why couldn't we do something like array.isNull or array.exists as a simple boolean check, instead of the more complicated array.ptr that is riddled with "bugs?" That way we separate the implementation details (how the array.ptr is handled internally) from the semantics (whether the array exists or not).

>>>> Regan thinks that you are planning on merging emptiness and existence into one (a bad thing).
>>>
>>> I don't think that Walter is planning on this.
>>
>> I certainly hope not, but how can we be sure? This is why I asked.
>>
>>>> Some of the problems (not technically "bugs"):
>>>> - array.length = 0 sets the pointer to null.
>>>
>>> This is a bug and Walter has said so. He will fix this.
>>
>> Just out of curiosity, is there a post that I could read regarding this? I'd really like to see what he said.
>
> Yes, but I don't know how to search for it.

Well, perhaps he can clarify his position now.

----
Re: Static

<snip>
> By 'static' are you meaning non-dynamic arrays or single-instance arrays.
<snip>

It doesn't matter. I don't understand why you bring technical implementation details to the discussion, when I am talking solely about the concept. Conceptually, whether an array is static or not has no effect on whether the array can exist or not. Static changes allocation semantics, _not_ existence semantics. That's my point.

Now, if you tell me: we can't have that because of a technical limitation, then I would understand. However, the point in your memory argument can be fixed the way Regan and others have mentioned.

Cheers,
--AJG.
Jul 29 2005
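A minimal sketch of the behaviour argued over in the post above, assuming a current D2 compiler (where string literals are immutable, so the empty case is declared as `string`). The helper name `countIterations` is hypothetical and only serves the illustration. It shows that `foreach` over a null array and over an empty array both compile and both run zero iterations, which is exactly the fusion AJG objects to.

```d
import std.stdio : writeln;

// Hypothetical helper: counts how many times a foreach body runs.
size_t countIterations(const(char)[] arr)
{
    size_t n = 0;
    foreach (c; arr)
        ++n;
    return n;
}

void main()
{
    char[] nullArray = null;  // no data pointer at all
    string emptyArray = "";   // zero length
    string twoChars = "hi";

    writeln(countIterations(nullArray));  // zero iterations
    writeln(countIterations(emptyArray)); // zero iterations as well
    writeln(countIterations(twoChars));
}
```

Whether the first two cases should behave identically is the design question of the thread; the sketch only records what the language does.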
AJG wrote:
> 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?

Do you want all operations on a null array, such as: to segfault (to throw a NullPointerException in managed environments parlance), like they do on a null object reference?

-- Niko Korhonen SW Developer
Jul 29 2005
Hi,

In article <dcclvb$v0i$1 digitaldaemon.com>, Niko Korhonen says...
> AJG wrote:
>> 1) Why can objects be null but arrays can't (given that _both_ are by-ref)?
> Do you want all operations on a null array, such as: to segfault (to throw a NullPointerException in managed environments parlance), like they do on a null object reference?

For starters, yes. Why should objects be different than arrays when they are both reference types? This is inconsistent IMHO.

Thanks, --AJG.
Jul 29 2005
"AJG" <AJG_member pathlink.com> wrote in message news:dcde0p$1kop$1 digitaldaemon.com...Hi, In article <dcclvb$v0i$1 digitaldaemon.com>, Niko Korhonen says...I think you'll have a hard time getting lots of support for that. I much prefer the current behavior and I bet there is lots of existing D code that assumes one can test the length of an array at any time. Since an array is not an object I see no problem with the "inconistency" - an array is an array.AJG wrote:For starters, yes. Why should objects be different than arrays when they are both reference types? This is inconsistent IMHO.1) Why can objects be null but arrays can't (given that _both_ are by-ref)?Do you want all operations on a null array, such as: to segfault (to throw a NullPointerException in managed environments parlance), like they do on a null object reference?
Jul 29 2005
Hi Ben,

>>> Do you want all operations on a null array, such as: to segfault (to throw a NullPointerException in managed environments parlance), like they do on a null object reference?
>> For starters, yes. Why should objects be different than arrays when they are both reference types? This is inconsistent IMHO.
> I think you'll have a hard time getting lots of support for that. I much prefer the current behavior and I bet there is lots of existing D code that assumes one can test the length of an array at any time. Since an array is not an object I see no problem with the "inconsistency" - an array is an array.

I agree that there won't be much support for this. I don't suppose it will change. But ideally that's what the behaviour should be. Say you had no D code written at the moment, would you support the change?

On the other hand, would you support access to object properties that don't require an instance from a null reference? It's the same thing, isn't it? Yet aren't those illegal at the moment? (don't have DMD at hand).

Cheers, --AJG.
"What is popular is not always right; what is right is not always popular."
Jul 29 2005
"AJG" <AJG_member pathlink.com> wrote in message news:dcdqig$1us6$1 digitaldaemon.com...Hi Ben,noI agree that there won't be much support for this. I don't suppose it will change. But ideally that's what the behaviour should be. Say you had no D code written at the moment, would you support the change?I think you'll have a hard time getting lots of support for that. I much prefer the current behavior and I bet there is lots of existing D code that assumes one can test the length of an array at any time. Since an array is not an object I see no problem with the "inconistency" - an array is an array.Do you want all operations on a null array, such as: to segfault (to throw a NullPointerException in managed environments parlance), like they do on a null object reference?For starters, yes. Why should objects be different than arrays when they are both reference types? This is inconsistent IMHO.On the other hand, would you support access to object properties that don't require an instance from a null reference?no (assuming you aren't referring to static class properties)It's the same thing, isn't it?noYet aren't those illegal at the moment? (don't have DMD at hand).yesCheers, --AJG. "What is popular is not always right; what is right is not always popular."
Jul 29 2005
Hi Ben,

Ok, I don't think I said exactly what I meant before. Let's look at this piece by piece:

1) Arrays are ("in theory") reference types.
2) Objects are reference types.
3) Arrays are not objects.
4) So, even though Arrays and Objects are different, they share (or should) reference semantics.

I believe most of us can agree up to here. My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics. Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?

Thanks, --AJG.
Jul 29 2005
"AJG" <AJG_member pathlink.com> wrote in message news:dcdtq5$21q6$1 digitaldaemon.com...Hi Ben, Ok, I don't think I said exactly what I meant before. Let's look at this piece by piece: 1) Arrays are ("in theory") reference types.no - an array is two pieces of information: (1) a pointer to the data and (2) a length. The pointer can be considered a reference but the length information is definitely not manipulated by reference. For example int[] a,b; a.length = 10; b = a; b.length = 100; assert( a.length == 10 ); If arrays had "pure" reference semantics in the same way objects do then one would expect a.length == 100. In casual conversations one often says arrays have reference semantics but the unspoken assumption is that one is talking about the data pointer. This can confuse people who aren't used to D array semantics.2) Objects are reference types. 3) Arrays are not objects.these I agree with.4) So, even though Arrays and Objects are different, they share (or should) reference semantics.naturally I disagree given 1).I believe most of us can agree up to here. My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics. Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?The length information is not manipulated with reference semantics. I think this is a good design choice that shouldn't be changed. I agree it is different than object behavior but that's well worth the benefits of the current system. If there are statements in the D doc that say "arrays have reference sematnics" I think they should be changed to be more accurate and say something like "the array data has reference semantics". It's common to ignore the length field when you are casually talking about arrays.
Jul 29 2005
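Ben's five-line example in the post above can be run as written. The sketch below wraps it in a hypothetical helper, `lengthsAfterResize` (not from the thread), so the two lengths can be inspected after only the copy has been resized. It assumes nothing beyond the core language.

```d
import std.stdio : writeln;

// Hypothetical helper: resizes only the copy and reports both lengths.
size_t[2] lengthsAfterResize(size_t original, size_t resized)
{
    int[] a, b;
    a.length = original;
    b = a;               // copies the (pointer, length) pair, not the elements
    b.length = resized;  // updates b's own length field only
    return [a.length, b.length];
}

void main()
{
    auto r = lengthsAfterResize(10, 100);
    writeln(r[0]); // a.length, untouched by the change made through b
    writeln(r[1]); // b.length
}
```

If arrays were pure references in the way objects are, both numbers would be equal; they are not, which is the core of Ben's "pointer plus length" description.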
I've been following this, but have as of yet been unable to express my problem with this whole issue. My feelings line up with yours, Ben. Arrays are not pointers, nor are they reference types. In C, pointers happen to be able to be dereferenced with the array index operator, but that's a side effect of implementation. If something is a true array, I think there is a reasonable expectation that the array always points to some data.

    array != null; //should always be true;

Especially in the case of D where the array is really a structure with a ptr and a length. However, if for some reason you need the reference semantics, those are not denied to you. You're free to do this:

    int* array = new int[100];

My 2 cents -Sha

In article <dcdur3$232d$1 digitaldaemon.com>, Ben Hinkle says...
> "AJG" <AJG_member pathlink.com> wrote in message news:dcdtq5$21q6$1 digitaldaemon.com...
>> Hi Ben, Ok, I don't think I said exactly what I meant before. Let's look at this piece by piece:
>> 1) Arrays are ("in theory") reference types.
> no - an array is two pieces of information: (1) a pointer to the data and (2) a length. The pointer can be considered a reference but the length information is definitely not manipulated by reference. For example
>     int[] a,b;
>     a.length = 10;
>     b = a;
>     b.length = 100;
>     assert( a.length == 10 );
> If arrays had "pure" reference semantics in the same way objects do then one would expect a.length == 100. In casual conversations one often says arrays have reference semantics but the unspoken assumption is that one is talking about the data pointer. This can confuse people who aren't used to D array semantics.
>> 2) Objects are reference types.
>> 3) Arrays are not objects.
> these I agree with.
>> 4) So, even though Arrays and Objects are different, they share (or should) reference semantics.
> naturally I disagree given 1).
>> I believe most of us can agree up to here. My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics. Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?
> The length information is not manipulated with reference semantics. I think this is a good design choice that shouldn't be changed. I agree it is different than object behavior but that's well worth the benefits of the current system. If there are statements in the D doc that say "arrays have reference semantics" I think they should be changed to be more accurate and say something like "the array data has reference semantics". It's common to ignore the length field when you are casually talking about arrays.
Jul 29 2005
Hi,

> I've been following this, but have as of yet been unable to express my problem with this whole issue. My feelings line up with yours, Ben. Arrays [...] are not reference types.

If this is so, it is unfortunate. I'm asking Walter to clarify this, that is all.

> In C, pointers happen to be able to be dereferenced with the array index operator, but that's a side effect of implementation. If something is a true array, I think there is a reasonable expectation that the array always points to some data.

This is not a reasonable expectation. We are talking about two things here:
a) Existence.
b) Emptiness.
Even in C, you can express both. I'm asking whether Walter thinks we should do that in D or not. Some examples (of the 3 possible cases):

    char[] string = "hi"; // non-null non-empty array.
    char[] empty = "";    // non-null empty array.
    char[] cnull = null;  // null array.

> array != null; //should always be true; Especially in the case of D where the array is really a structure with a ptr and a length.

This is not the case in D. array != null is sometimes false, because it's comparing the pointer. This is the very thing that allows an array to be non-existent (a true, NULL array). Thus, that was my original question to Walter, whether we should rely on this behaviour or if he's planning on phasing it out.

> However, if for some reason you need the reference semantics, those are not denied to you. You're free to do this: int* array = new int[100];

Yes, there's nothing like regressing a couple of decades ;) I think one of D's design goals was to make pointer use unnecessary. Using a pointer you lose safety, lose info (.length, etc.) and lose functionality. This is not a valid solution, IMHO.

Cheers, --AJG.
Jul 29 2005
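The three cases AJG lists in the post above can be told apart in current D with the identity operator `is`, which for arrays compares pointer and length rather than contents. This is a sketch under assumptions: it targets a D2 compiler, it uses `string` because literals are immutable there, it renames AJG's variable `string` (now a type name), and `describe` is a hypothetical helper.

```d
import std.stdio : writeln;

// Hypothetical helper: classifies an array as null, empty, or non-empty.
string describe(const(char)[] s)
{
    if (s is null)          // identity: pointer and length both match null
        return "null";
    if (s.length == 0)
        return "empty";
    return "non-empty";
}

void main()
{
    string text = "hi";   // non-null, non-empty
    string empty = "";    // non-null, empty
    string cnull = null;  // null

    writeln(describe(text));
    writeln(describe(empty));
    writeln(describe(cnull));

    // Equality compares contents, so the empty and null arrays compare equal.
    writeln(empty == cnull);
}
```

The last line is why the thread keeps circling: `==` cannot separate "empty" from "null", while `is` (or a check on `.ptr`) can.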
Hi, Well, this is certainly an interesting development. So, to recap, arrays in D are not reference types. I was always under the impression that they were. This is very saddening to me. Is this correct? Walter, could you clarify this?What about .dup, .sort, .reverse, .sizeof? Do those have reference semantics or not?1) Arrays are ("in theory") reference types.no - an array is two pieces of information: (1) a pointer to the data and (2) a length. The pointer can be considered a reference but the length information is definitely not manipulated by reference. For exampleIf arrays had "pure" reference semantics in the same way objects do then one would expect a.length == 100. In casual conversations one often says arrays have reference semantics but the unspoken assumption is that one is talking about the data pointer. This can confuse people who aren't used to D array semantics.Yes, arrays semantics are definitely weird. I was hoping they were references and that .length was simply buggy, but perhaps it's by design. In addition, IMO this "unspoken assumption" is not mentioned anywhere in the docs.Why is it a good design choice? Forget about legacy for a second. Wouldn't it be much simpler, more consistent and less confusing to make arrays pure reference types? It would eliminate a lot of the various special cases that we have to deal with given the current convoluted semantics. It would also align their behaviour to that of objects, much like a struct's behaviour is aligned to that of a primitive.My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics. Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?The length information is not manipulated with reference semantics. I think this is a good design choice that shouldn't be changed.I agree it is different than object behavior but that's well worth the benefits of the current system.Like what? 
Which benefits?If there are statements in the D doc that say "arrays have reference sematnics" I think they should be changed to be more accurate and say something like "the array data has reference semantics". It's common to ignore the length field when you are casually talking about arrays.Or perhaps the arrays themselves could be changed to reference types? ;) Cheers, --AJG
Jul 29 2005
"AJG" <AJG_member pathlink.com> wrote in message news:dce24h$272t$1 digitaldaemon.com...Hi, Well, this is certainly an interesting development. So, to recap, arrays in D are not reference types. I was always under the impression that they were. This is very saddening to me. Is this correct? Walter, could you clarify this?Yes - they "have reference semantics" in the sense that they act on the data (though in the case of .dup and .sizeof the reference/value semantics is irrelevant).What about .dup, .sort, .reverse, .sizeof? Do those have reference semantics or not?1) Arrays are ("in theory") reference types.no - an array is two pieces of information: (1) a pointer to the data and (2) a length. The pointer can be considered a reference but the length information is definitely not manipulated by reference. For exampleThe first sentance of http://www.digitalmars.com/d/arrays.html section Dynamic Arrays says "Dynamic arrays consist of a length and a pointer to the array data." I agree, though, that the doc needs to emphasize this more. I added some feedback to the Wiki about arrays asking for examples illustrating how array assignment works.If arrays had "pure" reference semantics in the same way objects do then one would expect a.length == 100. In casual conversations one often says arrays have reference semantics but the unspoken assumption is that one is talking about the data pointer. This can confuse people who aren't used to D array semantics.Yes, arrays semantics are definitely weird. I was hoping they were references and that .length was simply buggy, but perhaps it's by design. In addition, IMO this "unspoken assumption" is not mentioned anywhere in the docs.It would be very annoying to have to check for null before asking if an array length is zero. Plus the whole design of slicing would need to be redone and probably would lose much of the efficiency it has today. 
I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.Why is it a good design choice? Forget about legacy for a second. Wouldn't it be much simpler, more consistent and less confusing to make arrays pure reference types? It would eliminate a lot of the various special cases that we have to deal with given the current convoluted semantics. It would also align their behaviour to that of objects, much like a struct's behaviour is aligned to that of a primitive.My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics. Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?The length information is not manipulated with reference semantics. I think this is a good design choice that shouldn't be changed.see above - checking all the time for null would be very annoying. Almost all the time with arrays one cares if the length is zero and making people check for null before asking that question is error-prone. See Java for examples of making people check for null before asking for the length.I agree it is different than object behavior but that's well worth the benefits of the current system.Like what? Which benefits?Sure - one can change anything in D if the tradeoffs are worth it. I happen to believe D's dynamic array semantics are an excellent balance of tradeoffs.If there are statements in the D doc that say "arrays have reference sematnics" I think they should be changed to be more accurate and say something like "the array data has reference semantics". It's common to ignore the length field when you are casually talking about arrays.Or perhaps the arrays themselves could be changed to reference types? ;)
Jul 29 2005
Hi,

>> What about .dup, .sort, .reverse, .sizeof? Do those have reference semantics or not?
> Yes - they "have reference semantics" in the sense that they act on the data (though in the case of .dup and .sizeof the reference/value semantics is irrelevant).

Just to make sure I understand:

    char[] A = "123";
    char[] B = A;
    B.reverse;
    // B will be 321
    // A will be 321 also.
    // correct?

BUT:

    char[] A = "123";
    char[] B = A;
    B.length = 2;
    // B will be 12
    // A will remain 123.
    // correct?

If this is true, then it seems rather arbitrary to me that .length should break reference semantics. Why not keep it in line with how the rest work? (Especially since it's not related to the benefits you talked about before).

> The first sentence of http://www.digitalmars.com/d/arrays.html section Dynamic Arrays says "Dynamic arrays consist of a length and a pointer to the array data." I agree, though, that the doc needs to emphasize this more. I added some feedback to the Wiki about arrays asking for examples illustrating how array assignment works.

Ok. This would be an improvement.

>> Why is it a good design choice?
> It would be very annoying to have to check for null before asking if an array length is zero. Plus the whole design of slicing would need to be redone and probably would lose much of the efficiency it has today.

Ok. This is a valid point. However, that's not to say the problem is insurmountable. Solutions do exist. In fact, I have thought of a couple of possible solutions, but I'm afraid it'll scare everybody so for now this would be "something to think about." I just want to say that change, even if it breaks things, can be very good. It shouldn't be automatically ruled out in fear.

> I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.

Except that _all_ properties other than .length operate via reference semantics. Structs wouldn't do that. Objects would.

>>> I agree it is different than object behavior but that's well worth the benefits of the current system.
>> Like what? Which benefits?
> see above - checking all the time for null would be very annoying. Almost all the time with arrays one cares if the length is zero and making people check for null before asking that question is error-prone.

It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and that's fine, but there wouldn't be unnoticed errors. But in general I agree with you, slicing would lose its "magic" having to check for nulls.

> See Java for examples of making people check for null before asking for the length.

You can also learn from their mistakes and avoid them.

>>> If there are statements in the D doc that say "arrays have reference semantics" I think they should be changed to be more accurate and say something like "the array data has reference semantics". It's common to ignore the length field when you are casually talking about arrays.
>> Or perhaps the arrays themselves could be changed to reference types? ;)
> Sure - one can change anything in D if the tradeoffs are worth it. I happen to believe D's dynamic array semantics are an excellent balance of tradeoffs.

I think the semantics could use a little rethinking and especially a bit of clarification.

Cheers, --AJG.
Jul 29 2005
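AJG's two snippets from the post above, combined into one runnable sketch. Two things differ from the thread and are assumptions of this sketch: the `.reverse` array property no longer exists in current D, so `std.algorithm.reverse` stands in for it, and `int[]` is used instead of `char[]` to sidestep the read-only string literal. `Outcome` and `reverseThenShrink` are hypothetical names.

```d
import std.algorithm : reverse;
import std.stdio : writeln;

// Hypothetical result type: what each array looks like afterwards.
struct Outcome
{
    int[] a;
    int[] b;
}

Outcome reverseThenShrink()
{
    int[] a = [1, 2, 3];
    int[] b = a;    // b refers to the same three elements as a

    reverse(b);     // works in place on the shared data, so a changes too
    b.length = 2;   // changes b's length field only; a keeps length 3

    return Outcome(a, b);
}

void main()
{
    auto r = reverseThenShrink();
    writeln(r.a); // all three elements, in reversed order
    writeln(r.b); // the first two of those elements
}
```

So the answer to both of AJG's "correct?" questions is yes: the element data is shared, the length is not.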
In article <dce4up$2cbc$1 digitaldaemon.com>, AJG says...Hi,yes - aside from the fact that you should dup the "123" before trying to modify it since "123" is put in read-only memory. Reverse acts in-place because it is a method of the array type - like sorting is in-place.Just to make sure I understand: char[] A = "123"; char[] B = A; B.reverse; // B will be 321 // A will be 321 also. // correct?What about .dup, .sort, .reverse, .sizeof? Do those have reference semantics or not?Yes - they "have reference semantics" in the sense that they act on the data (though in the case of .dup and .sizeof the reference/value semantics is irrelevant).BUT: char[] A = "123"; char[] B = A; B.length = 2; // B will be 12 // A will be remain 123. // correct?yesIf this is true, then it seems rather arbitrary to me that .length should break reference semantics. Why not keep it in line to how the rest work? (Specially since it's not related to the benefits you talked about before).It is not arbitrary. There are advantages to the current design. I don't see why you say it is not related since it would be silly to have length do something different if there weren't benefits to making length special.Where is the reaction in fear? I only see people trying to explain the current design and its advantages. I said I doubt a solution exists that would have the benefits of the current design while having reference semantics (if even reference semantics for length would be desirable). If you want to present some ideas that would be great - do whatever you want and enjoy (remember we're all doing this for fun).Why is it a good design choice?It would be very annoying to have to check for null before asking if an array length is zero. Plus the whole design of slicing would need to be redone and probably would lose much of the efficiency it has today.Ok. This is a valid point. However, that's not to say the problem is insurmoutable. Solutions do exist. 
In fact, I have thought of a couple of possible solutions, but I'm afraid it'll scare everybody so for now this would be "something to think about." I just want to say that change, even if it breaks things, can be very good. It shouldn't be automatically ruled out in fear.uhh - the struct has a pointer to the data. The pointer part has reference semantics and the length part doesn't. A struct can easily have methods that derefence the pointer and modify shared state. I do it all the time with the MinTL containers and pretty much any struct that stores a pointer.I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.Except that _all_ properties other than .length operate via reference semantics. Structs wouldn't do that. Objects would.By error-prone I mean the programmer will introduce bugs into the code by forgetting to check for null every time they want to know if an array has any content (meaning non-zero length).It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and that's fine, but there wouldn't be unnoticed errors. But in general I agree with you, slicing would lose its "magic" having to check for nulls.see above - checking all the time for null would be very annoying. Almost all the time with arrays one cares if the length is zero and making people check for null before asking that question is error-prone.I agree it is different than object behavior but that's well worth the benefits of the current system.Like what? Which benefits?That's what D has now - it is avoiding the mistakes of Java by not requiring all those annoying null checks. Plus slicing is fast by not requiring memory allocations. 
Note in Java the length of an array is read-only so the whole question about length having value/reference semantics doesn't apply.See Java for examples of making people check for null before asking for the length.You can also learn from their mistakes and avoid them.
Jul 29 2005
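Ben's remark about structs that store a pointer can be made concrete with a hand-written stand-in for a dynamic array. This is only an illustrative model; `IntSlice` and `setFirst` are hypothetical names, not the compiler's actual representation or anything from MinTL.

```d
import std.stdio : writeln;

// Hypothetical model of a dynamic array: a pointer field plus a length field.
struct IntSlice
{
    int* ptr;
    size_t length;

    // Writes through the pointer, so every copy of the struct sees the change.
    void setFirst(int value)
    {
        ptr[0] = value;
    }
}

void main()
{
    auto data = new int[5];
    auto a = IntSlice(data.ptr, data.length);
    auto b = a;          // plain struct copy: both fields are copied by value

    b.setFirst(7);       // goes through the shared pointer
    b.length = 2;        // touches only b's own field

    writeln(a.ptr[0]);   // the write made through b is visible here
    writeln(a.length);   // still 5
    writeln(b.length);   // 2
}
```

A value-type struct can therefore show "reference-like" behaviour for anything reached through its pointer while keeping value behaviour for its own fields, which is the pattern Ben describes for `.length`.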
Hi Ben,So then .length is related to slicing? How does the semantics of .length affect slicing? Or perhaps you meant other benefits?If this is true, then it seems rather arbitrary to me that .length should break reference semantics. Why not keep it in line to how the rest work? (Specially since it's not related to the benefits you talked about before).It is not arbitrary. There are advantages to the current design. I don't see why you say it is not related since it would be silly to have length do something different if there weren't benefits to making length special.The general impression I get is that as soon as something creates the possibility of breaking existing code, then there is backlash. This would be fine for the embedded C language that runs medical heart devices. But for a language that isn't even out the door, it's disheartening (haha, no pun intended ;). Just my 2 cents.Where is the reaction in fear? I only see people trying to explain the current design and its advantages. I said I doubt a solution exists that would have the benefits of the current design while having reference semantics (if even reference semantics for length would be desirable). If you want to present some ideas that would be great - do whatever you want and enjoy (remember we're all doing this for fun).Why is it a good design choice?It would be very annoying to have to check for null before asking if an array length is zero. Plus the whole design of slicing would need to be redone and probably would lose much of the efficiency it has today.Ok. This is a valid point. However, that's not to say the problem is insurmoutable. Solutions do exist. In fact, I have thought of a couple of possible solutions, but I'm afraid it'll scare everybody so for now this would be "something to think about." I just want to say that change, even if it breaks things, can be very good. 
It shouldn't be automatically ruled out in fear.SomeObject A = new SomeObject; SomeObject B = A; B.SomeProperty; // Operates on A. SomeStruct A; SomeStruct B = A; B.SomeProperty; // Operates on B. int[] A = new int[5]; int[] B = A; B.SomeProperty; // Operates on A; // _Except_ if it's .length. This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.uhh - the struct has a pointer to the data. The pointer part has reference semantics and the length part doesn't. A struct can easily have methods that derefence the pointer and modify shared state. I do it all the time with the MinTL containers and pretty much any struct that stores a pointer.I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.Except that _all_ properties other than .length operate via reference semantics. Structs wouldn't do that. Objects would.Ok.By error-prone I mean the programmer will introduce bugs into the code by forgetting to check for null every time they want to know if an array has any content (meaning non-zero length).It wouldn't be error prone. Perhaps you mean exceptions would be thrown, and that's fine, but there wouldn't be unnoticed errors. But in general I agree with you, slicing would lose its "magic" having to check for nulls.see above - checking all the time for null would be very annoying. Almost all the time with arrays one cares if the length is zero and making people check for null before asking that question is error-prone.I agree it is different than object behavior but that's well worth the benefits of the current system.Like what? Which benefits?I'm not suggesting making .length read-only. I'm suggesting making it operate on the same data it has a pointer to. Just like .sort or .reverse would. 
The way I see it, if you explicitly want to make a copy of the data, that's why there is dup. Why should .length secretely call .dup sometimes, and sometimes not? Cheers, --AJG.That's what D has now - it is avoiding the mistakes of Java by not requiring all those annoying null checks. Plus slicing is fast by not requiring memory allocations. Note in Java the length of an array is read-only so the whole question about length having value/reference semantics doesn't apply.See Java for examples of making people check for null before asking for the length.You can also learn from their mistakes and avoid them.
Jul 30 2005
In article <dcgc3q$13i9$1 digitaldaemon.com>, AJG says...Hi Ben,I recommend you pursue some of your ideas where length is manipulated by reference and follow the dependencies to see how different dynamic arrays (and, yes, slicing) would be. In particular I recommend you learn more about slicing. I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't really gotten experience with D arrays as they exist now.So then .length is related to slicing? How does the semantics of .length affect slicing? Or perhaps you meant other benefits?If this is true, then it seems rather arbitrary to me that .length should break reference semantics. Why not keep it in line to how the rest work? (Specially since it's not related to the benefits you talked about before).It is not arbitrary. There are advantages to the current design. I don't see why you say it is not related since it would be silly to have length do something different if there weren't benefits to making length special.For my case when I said essentially "much code will break" it wasn't meant as a backlash - just as a fact you would have to address. A proposed change that breaks lots of code is harder to push through than one that doesn't as a simple practical matter more than any emotional attachment to old code.The general impression I get is that as soon as something creates the possibility of breaking existing code, then there is backlash. This would be fine for the embedded C language that runs medical heart devices. But for a language that isn't even out the door, it's disheartening (haha, no pun intended ;). Just my 2 cents.Where is the reaction in fear? I only see people trying to explain the current design and its advantages. I said I doubt a solution exists that would have the benefits of the current design while having reference semantics (if even reference semantics for length would be desirable). 
If you want to present some ideas that would be great - do whatever you want and enjoy (remember we're all doing this for fun).Why is it a good design choice?It would be very annoying to have to check for null before asking if an array length is zero. Plus the whole design of slicing would need to be redone and probably would lose much of the efficiency it has today.Ok. This is a valid point. However, that's not to say the problem is insurmoutable. Solutions do exist. In fact, I have thought of a couple of possible solutions, but I'm afraid it'll scare everybody so for now this would be "something to think about." I just want to say that change, even if it breaks things, can be very good. It shouldn't be automatically ruled out in fear.Please think about structs that contain pointers. [snip]SomeObject A = new SomeObject; SomeObject B = A; B.SomeProperty; // Operates on A. SomeStruct A; SomeStruct B = A; B.SomeProperty; // Operates on B. int[] A = new int[5]; int[] B = A; B.SomeProperty; // Operates on A; // _Except_ if it's .length. This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.uhh - the struct has a pointer to the data. The pointer part has reference semantics and the length part doesn't. A struct can easily have methods that derefence the pointer and modify shared state. I do it all the time with the MinTL containers and pretty much any struct that stores a pointer.I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.Except that _all_ properties other than .length operate via reference semantics. Structs wouldn't do that. 
Objects would.Why should .length secretely call .dup sometimes, and sometimes not?Here I agree that the documentation should be more explicit in describing when setting the length reallocated and when it doesn't. If it is compiler-dependent the doc should say so.
Jul 30 2005
Hi Ben,Would an example do? I may not be an expert regarding slicing, but I could see a discrete problem if you point it out.So then .length is related to slicing? How does the semantics of .length affect slicing? Or perhaps you meant other benefits?I recommend you pursue some of your ideas where length is manipulated by reference and follow the dependencies to see how different dynamic arrays (and, yes, slicing) would be. In particular I recommend you learn more about slicing. I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't really gotten experience with D arrays as they exist now.This kind of thinking only works ceteris paribus. But if a solution that breaks less code is not as good, then the language loses. I think at this point the language can afford such changes before it becomes like C, where a header file was needed to introduce mere booleans.The general impression I get is that as soon as something creates the possibility of breaking existing code, then there is backlash. This would be fine for the embedded C language that runs medical heart devices. But for a language that isn't even out the door, it's disheartening (haha, no pun intended ;). Just my 2 cents.For my case when I said essentially "much code will break" it wasn't meant as a backlash - just as a fact you would have to address. A proposed change that breaks lots of code is harder to push through than one that doesn't as a simple practical matter more than any emotional attachment to old code.Even if we see arrays as structs (which I don't, but for the sake of the argument), it doesn't explain why .length should break the other properties' semantics. If there's an obvious reason I'm blind to, could you point it out? I'm a little dense sometimes.Please think about structs that contain pointers.SomeObject A = new SomeObject; SomeObject B = A; B.SomeProperty; // Operates on A. SomeStruct A; SomeStruct B = A; B.SomeProperty; // Operates on B. 
int[] A = new int[5]; int[] B = A; B.SomeProperty; // Operates on A; // _Except_ if it's .length. This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.uhh - the struct has a pointer to the data. The pointer part has reference semantics and the length part doesn't. A struct can easily have methods that derefence the pointer and modify shared state. I do it all the time with the MinTL containers and pretty much any struct that stores a pointer.I view an array as much closer to a struct than an object: an array is just like a struct with a pointer field and a length field. That's the simplest description of what an array is. Comparing them to objects is the wrong analogy.Except that _all_ properties other than .length operate via reference semantics. Structs wouldn't do that. Objects would.[snip]Ok. Cheers, --AJG.Why should .length secretely call .dup sometimes, and sometimes not?Here I agree that the documentation should be more explicit in describing when setting the length reallocated and when it doesn't. If it is compiler-dependent the doc should say so.
Jul 30 2005
In article <dcgkt5$1b4i$1 digitaldaemon.com>, AJG says...
Even if we see arrays as structs (which I don't, but for the sake of the argument), it doesn't explain why .length should break the other properties' semantics. If there's an obvious reason I'm blind to, could you point it out? I'm a little dense sometimes.

Because sometimes it needs to reallocate memory. Why don't you look at `man realloc`:

The realloc() function tries to change the size of the allocation pointed to by ptr to size, and return ptr. If there is not enough room to enlarge the memory allocation pointed to by ptr, realloc() creates a new allocation, copies as much of the old data pointed to by ptr as will fit to the new allocation, frees the old allocation, and returns a pointer to the allocated memory. realloc() returns a NULL pointer if there is an error, and the allocation pointed to by ptr is still valid.

The difference is that D cannot let it free the original, because if it did then other references to the data would break. So it dups the data if a realloc is going to allocate memory in a different area. I'm not sure of the exact implementation details in D, but that's my basic understanding.

So to recap: If length increases, and there is not enough space available to grow the array in place, it allocates another block of memory and copies the data. It leaves the original pointer intact and lets the garbage collector decide if anybody else still has references to it. This may seem confusing, but it's about array slicing being fast. If you don't want these mixed semantics, always dup your data.

(P.S. You mention C++ reference semantics when you're talking about these arrays. But this isn't even legal in C++: int foo[10]; foo = null; You really can't compare the two languages in this aspect. I think D arrays are a big step forward when compared to C arrays, which literally couldn't find their ass with both hands.)
-Sha
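The recap above can be tried out directly. What follows is a minimal sketch in present-day D, which is an assumption: the compiler discussed in this thread may differ in detail. The helper name `growCopy` is made up for illustration. Whether the grown slice still shares memory with the original depends on the runtime's allocation decision, so the sketch prints that fact instead of assuming it.

```d
import std.stdio;

// Hypothetical helper: returns a view of `src` whose length has been grown to `n`.
// `src` is passed by value, so only the local pointer/length pair is changed.
int[] growCopy(int[] src, size_t n)
{
    src.length = n;   // may extend in place or move to a new, larger block
    return src;
}

void main()
{
    int[] a = new int[5];
    a[0] = 42;

    int[] b = growCopy(a, 1000);

    assert(a.length == 5);      // a's own length field never changes
    assert(b.length == 1000);
    assert(b[0] == 42);         // old contents are preserved either way
    assert(a[0] == 42);         // the original block is left intact for a

    // Not guaranteed either way: it depends on whether the block could grow in place.
    writefln("still sharing memory: %s", a.ptr is b.ptr);
}
```

If the two slices must never share data after a resize, calling `.dup` first makes that explicit, which is the advice given at the end of the post.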
Jul 30 2005
In article <dcgkt5$1b4i$1 digitaldaemon.com>, AJG says...
Hi Ben,

Let me step through some choices that I was hoping you would do. Let's start by thinking about what an array with reference-based length would look like. It would either be a pointer to today's dynamic array (a ptr and a length) or it would be a pointer to one memory block with the length stored either at the front or end of the array data. How would slicing work for those two implementations? For the first slicing would have to allocate memory to store the new ptr and new length. For the second slicing would have to be a different type since it is impossible to store the length for the slice in the middle of the original source array. So that's why I suggested you think through your initial suggestion and work out the impact on slicing and arrays in general. But to be honest I would still prefer the current behavior where the length information is always available without having to check for null first - even if you could somehow make the rest of D remain the same as today.

Would an example do? I may not be an expert regarding slicing, but I could see a discrete problem if you point it out.

So then .length is related to slicing? How does the semantics of .length affect slicing? Or perhaps you meant other benefits?

I recommend you pursue some of your ideas where length is manipulated by reference and follow the dependencies to see how different dynamic arrays (and, yes, slicing) would be. In particular I recommend you learn more about slicing. I'm sorry if that sounds harsh but I've gotten the opinion now that you haven't really gotten experience with D arrays as they exist now.
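The slicing argument above is easier to follow with the current representation in front of you. This is a small sketch in present-day D (an assumption about the compiler), and the helper `middle` is a made-up name. It shows that a slice is only a new pointer/length pair over existing memory, which is why no allocation is needed and why the length cannot live inside the data block.

```d
// Hypothetical helper: drops the first and last element without copying anything.
// Assumes the input has at least two elements.
int[] middle(int[] a)
{
    return a[1 .. a.length - 1];   // a new (pointer, length) pair over the same memory
}

void main()
{
    int[] a = [10, 20, 30, 40, 50];
    int[] s = middle(a);

    assert(s.length == 3);
    assert(s.ptr is a.ptr + 1);    // the slice points into the middle of a's data
    assert(a.length == 5);         // a's length is stored separately and is unaffected

    s[0] = 99;                     // writes go through to the shared data
    assert(a[1] == 99);
}
```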
Jul 31 2005
Hi Ben,

Let me step through some choices that I was hoping you would do. Let's start by thinking about what an array with reference-based length would look like. It would either be a pointer to today's dynamic array (a ptr and a length) or it would be a pointer to one memory block with the length stored either at the front or end of the array data. How would slicing work for those two implementations? For the first slicing would have to allocate memory to store the new ptr and new length. For the second slicing would have to be a different type since it is impossible to store the length for the slice in the middle of the original source array. So that's why I suggested you think through your initial suggestion and work out the impact on slicing and arrays in general.

I don't think this change in the way arrays operate internally would be necessary. What about simply using the current data pointer as it is to implement reference semantics? A null pointer means the reference is null; and vice-versa. The problem I keep hearing comes when trying to re-size (specifically, enlarge) an array by reference. So then what it all comes down to re: .length is the inability of realloc() to guarantee that the pointer it returns is the same one it receives. Is this correct?

But to be honest I would still prefer the current behavior where the length information is always available without having to check for null first - even if you could somehow make the rest of D remain the same as today.

I understand this concern, and it is a valid one. However, at this point D is trying to have its cake and eat it too: It wants to have null arrays, but not have to go thru null checks. The result is a bit confusing, IMHO. Moreover, it is buggy. Worst of all, it is not well documented. This combination of factors leads me to think something should be done. Frankly, from the docs I can't make out what the semantics of arrays are supposed to be. That was why I asked the original question: should we or shouldn't we treat arrays as null? I guess maybe not even Walter knows ;) ? Cheers, --AJG.
Jul 31 2005
On Sat, 30 Jul 2005 17:07:06 +0000 (UTC), AJG wrote:
[snip]

    SomeObject A = new SomeObject;
    SomeObject B = A;
    B.SomeProperty; // Operates on A.

    SomeStruct A;
    SomeStruct B = A;
    B.SomeProperty; // Operates on B.

    int[] A = new int[5];
    int[] B = A;
    B.SomeProperty; // Operates on A;
    // _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. A simple proof is this ...

    int[] A = new int[5];
    int[] B = A;
    A.length = 4;
    writefln("%d", B.length); // displays 5.

In your example, it *appears* to operate on A (the 8-byte array structure) because B and A have the same values. That is A.ptr == B.ptr and A.length == B.length. We just have to admit that arrays in D are not the classical array definition and are really a different type of thing altogether. Then get to learn the rules of D 'arrays'. If you want arrays to behave like objects, then maybe you can write an array class.
--
Derek Parnell
Melbourne, Australia
31/07/2005 8:26:46 AM
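The proof in this post can be packaged as a runnable check. The sketch below assumes present-day D, and the function name `lengthAfterShrink` is invented for the example. It also shows the half of the picture both sides agree on: element writes are still shared after the lengths have diverged.

```d
// Hypothetical helper: returns B's length after A alone has been shrunk.
size_t lengthAfterShrink()
{
    int[] a = new int[5];
    int[] b = a;              // copies the pointer and the length, not the elements

    a.length = 4;             // changes only a's copy of the length
    assert(a.ptr is b.ptr);   // shrinking does not move the data

    b[0] = 1;
    assert(a[0] == 1);        // element writes remain visible through both

    return b.length;
}

void main()
{
    assert(lengthAfterShrink() == 5);
}
```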
Jul 30 2005
Hi Derek,

Um... I said "except .length" for a reason. That's my very point. That .length is the exception. All others operate on A.

    int[] A = new int[5];
    int[] B = A;
    B.SomeProperty; // Operates on A;
    // _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. A simple proof is this ...

    int[] A = new int[5];
    int[] B = A;
    A.length = 4;
    writefln("%d", B.length); // displays 5.

In your example, it *appears* to operate on A (the 8-byte array structure) because B and A have the same values. That is A.ptr == B.ptr and A.length == B.length.

We just have to admit that arrays in D are not the classical array definition and are really a different type of thing altogether. Then get to learn the rules of D 'arrays'. If you want arrays to behave like objects, then maybe you can write an array class.

First of all, this would throw efficiency out the window. Second, let me quote you a little of the D manifesto:

[Taken from "The D Programming Language" written by Walter Bright] [Arrays Section]
"Arrays are enhanced from being little more than an alternative syntax for a pointer into first class objects."

That's, ahem, "First Class Objects," for those that missed it. Cheers, --AJG.
Jul 30 2005
In article <dch28c$1nrj$1 digitaldaemon.com>, AJG says...
Hi Derek,

No, all others do _NOT_ operate on A. They happen to operate on the same data that A points to. A is a struct with an int and a ptr; obviously, changing B's ptr or B's length does not affect A. You're thinking about D arrays all wrong. That's what Derek was getting at. A and B are two separate objects which happen to be able to have references to the same data. For efficiency's sake both the length and the ptr are assigned by value. Think of it this way in C, if you have this structure:

    struct Array { int length; void* ptr; } a, b;
    a.ptr = new char[100];
    b = a;

What does this do? This is the semantics of D arrays. A and B are distinct structures, and if you allocate more memory for b then it's not going to change A. As you can see this is not the same as reference semantics at all, otherwise A's ptr would change as well. If you want reference semantics you are free to use an array handle. But the way D arrays are handled is not mystical or inconsistent. They're perfectly consistent with themselves, and if you understand how they operate (which is not hard) then you won't make mistakes.

As for your other issue, where array nullness and length == 0 are converged, I do not think this is an issue. length == 0 is the definition of a null set (arrays in CS seem to be more in line with sets, dunno why they're named as they are). But if you want to be consistent with terminology, technically a null array is an array with all elements set to null. Can you show me an example where it matters if length == 0 and arr.ptr == null does not denote the same thing?
-Sha

Um... I said "except .length" for a reason. That's my very point. That .length is the exception. All others operate on A.

    int[] A = new int[5];
    int[] B = A;
    B.SomeProperty; // Operates on A;
    // _Except_ if it's .length.

This behaviour seems much more in line with Objects than with Structs, to me. That's why I don't see how .length should break the current semantics.

You are wrong here because 'B.someProperty' operates on B not A. A simple proof is this ...

    int[] A = new int[5];
    int[] B = A;
    A.length = 4;
    writefln("%d", B.length); // displays 5.

In your example, it *appears* to operate on A (the 8-byte array structure) because B and A have the same values. That is A.ptr == B.ptr and A.length == B.length.
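The struct analogy from this post can be written out in D itself. This is only a model: `IntArray` and `makeArray` are hypothetical names, and the real dynamic array is built into the language instead of being declared like this. The sketch shows that copying the struct copies both fields, so giving one copy a new block leaves the other untouched.

```d
// Hypothetical model of a dynamic array of int: a length and a pointer, copied by value.
struct IntArray
{
    size_t length;
    int* ptr;
}

// Hypothetical helper: allocates n zeroed ints from the GC and wraps them.
IntArray makeArray(size_t n)
{
    int[] data = new int[n];
    return IntArray(data.length, data.ptr);
}

void main()
{
    IntArray a = makeArray(100);
    IntArray b = a;              // both fields are copied; the 100 ints are not

    b.ptr[0] = 5;
    assert(a.ptr[0] == 5);       // same data, reached through two separate structs

    b = makeArray(1000);         // "reallocating" b replaces only b's fields
    assert(a.length == 100);
    assert(a.ptr !is b.ptr);
}
```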
Jul 30 2005
Hi,

You are simply splitting hairs here. You are arguing language semantics. The fact of the matter is that for all practical purposes, EXCEPT for .length, arrays in D are by reference. This means that for all practical purposes, EXCEPT for .length, B operates on A. It doesn't matter if it's because of the pointer (an implementation, system-dependent, gory detail) or because of any other reason. If assigning an array _immediately_ copied the data, then what you said would be true. But it doesn't, because (a) that would be inefficient, and (b) that would remove _all_ reference semantics. Therefore, as it is, reference semantics are broken when it comes to .length.

<snip>

Um... I said "except .length" for a reason. That's my very point. That .length is the exception. All others operate on A.

No, All others do _NOT_ operate on A. They happen to operate on the same data that A points to.

RE: Arrays as structs.

This is where _you_ are wrong. Arrays are not structs. Arrays do not share the semantics of structs. Arrays share _implementation details_ with structs, and that's _it_. Didn't you see the quote from the D language doc? It clearly says "First-Class Objects." Not structs. Not primitives. Not pointers. If you, however, equate that with structs, that's fine. But I certainly do not.

They're perfectly consistent with themselves,

This means absolutely nothing. A bug can be perfectly consistent with itself and it is still a bug. To be meaningful, they would have to be consistent with the rest of the language. Or perhaps, consistent with another part of the language, like, say, Objects.

and if you understand how they operate (which is not hard) then you won't make mistakes.

It's not about making mistakes. Sure, I can just as well avoid a function in a library that is buggy, and I'll avoid a mistake. That's not the point. If something is broken, then it needs to be fixed. If Walter could perhaps clarify the semantics of arrays, then we would get somewhere.

As for your other issue, where array nullness and length == 0 being converged, I do not think this is an issue. length == 0 is the definition of a null set

So? What I would like to express is _No Set_.

(arrays in CS seem to be more in line with sets, dunno why they're named as they are). But if you want to be consistent with terminology, technically a null array is an array with all elements set to null. Can you show me an example where it matters if length == 0 and arr.ptr == null does not denote the same thing?

When you are returning fields from a database, for instance. If you've ever dealt with a DB, you would know fields can be NULL, meaning no value. This is different than "", which means explicitly the empty string. It is very difficult to do this because of certain bugs which meld .length == 0 and .ptr == null. They are not the same thing. Not semantically. Not technically, at the moment, except for the "bugs." That's why I'm asking Walter whether he _plans_ on merging the two into one. If that's his vision, which would be unfortunate, then those things aren't "bugs" at all, but rather the intended design. Cheers, --AJG.
Jul 30 2005
In article <dchgkl$23v5$1 digitaldaemon.com>, AJG says...Hi,I am not splitting hairs. I gave you a very valid reason why a and b are not references, not even theoretically. They happen to have a reference member that in some cases, will point to the same data. YOU are in full control over when that happens. If that's not what you intended, then you should be using references to the ARRAY. Rather than using multiple arrays with have references to the same data. I might ask you this: What MAGIC would you like to happen with arrays? What you want is not possible without some kind of magic. Try this example on for size, from classic C: int* a = malloc(100 * sizeof(int)); int* b = a; b = realloc( b, 1000 * sizeof(int) ); Guess what, a is most likely now a bad reference. Is this what you would like D to do? Probably not, you probably want 'a' to point to the new array of length 1000. Do you want the compiler to magically handle this for you? Would you like length to be read only? Forcing us to call b = new int[], and then manually code up the data copy to resize the array? Starting to sound like C.... What a pain arrays were. And a still didn't change automatically to where b is pointing now.You are simply splitting hairs here. You are arguing language semantics. The fact of the matter is that for all practical purposes, EXCEPT for .length, arrays in D are by reference. This means that for all practical purposes, EXCEPT for .length, B operates on A. It doesn't matter if it's because of the pointer (an implementation, system-dependent, gory detail) or because of any other reason.Um... I said "except .length" for a reason. That's my very point. That .length is the exception. All others operate on A.No, All others do _NOT_ operate on A. They happen to operate on the same data that A points to.If assiging an array _immediately_ copied the data, then what you said is true. 
But it doesn't, because (a) that would be inefficient, and (b) that would remove _all_ reference semantics. Therefore, as it is, reference semantics are broken when it comes to .length.There are no reference semantics when it comes to arrays. Maybe what you want is D to automagically do a Copy-on-Write. Any time an array that is set to a reference of another array the flag could get turned on, and when you use it as an lvalue and that is on, it could dup the array. But that's silly since b = new int[100]; is perfectly legal in D, and would result in a double memory access if you ever tried to assign to the array. Wonder what kind of magic would have to be done to fix this case. IMHO, Better to let the programmer specify when he wants a and b to point a the same data.<snip>You can't use a language to it's fully potential if you don't know implementation details. There will always be ambiguities of when references are by value, by ref, or whatever else. As the saying goes: the language is in the details. Here's a good example for you, from a VB.NET project i just inherited: If arr.Length - arr.Replace(",", "").Length <> 17 Then 'error out What's the big deal? It's only one line of code, must be just as good as counting the number of commas in the array....RE: Arrays as structs.This is were _you_ are wrong. Arrays are not structs. Arrays do not share the semantics of structs. Arrays share _implementation details_ with structs, and that's _it_. Didn't you see the quote from the D language doc? It clearly says "First-Class Objects." Not structs. Not primitives. Not pointers. If you, however, equate that with structs, that's fine. But I certainly do not.Not Set?They're perfectly consistent with themselves,This means absolutely nothing. A bug can be perfectly consistent with itself and it is still a bug. To be meaningful, they would have to be consistent with the rest of the language. 
Or perhaps, consistent with another part of the language, like, say, Objects.and if you understand how they operate (which is not hard) then you won't make mistakes.It's not about making mistakes. Sure, I can just as well avoid a function in a library that is buggy, and I'll avoid a mistake. That's not the point. If something is broken, then it need to be fixed. If Walter could perhaps clarify the semantics of arrays, then we would get somewhere.As for your other issue, where array nullness and length == 0 being converged do not think this is an issue. length == 0 is the definition of a null setSo? What I would like to express is _No Set_.I see your point, but any kind of attempt to do that would be abusing the array. There are laws against array abuse in most countries these days. </sarcasm> Most every single database api in existence deals with that by having special objects. so you have this: static char[0] DBNull; in your database module; then char[] foo; foo = dbCommand.executeScalar( ); if( foo is DBNull ) // I'm not sure if the .ptr prop is needed here. Last I heard if you just use the array name it defaults to the ptr . oh noes, the field was null! else . oh good ..(arrays in CS seem to be more in line with sets, dunno why they're named as they are). But if you want to be consitent with terminology, techincally a null array is a an array with all elements set to null. Can you show me an example it matters if length == 0 and arr.ptr == null does not denote the same thing?When you are returning fields from a database, for instance. If you've ever dealt with a DB, you would know fields can be NULL, meaning no value. This is different than "", which means explicitly the empty string. It is very difficult to do this because of certain bugs which meld .length == 0 and .ptr == null.They are not the same thing. Not semantically. Not technically, at the moment, except for the "bugs." 
That's why I'm asking Walter whether he _plans_ on merging the two into one.They should never be the same thing. But there's a gotcha, if .ptr is null, then length should always be 0. Other way around is not necessarily true. Just because length == 0 the ptr isn't necesisarily null. This should be the case when the array was at one point allocated, and then length was reduced. It should be that way for efficiency. That however is not useful for your example of DBNulls. It would be silly to allocate some space and then just not use it and say that's when somebody entered something, and it was nothing.If that's his vision, which would be unfortunate, then those things aren't "bugs" at all, but rather the intended design.What 'things'? Are you talking about the .ptr value being the same for two arrays?
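The distinction argued over here, a null pointer versus a zero length, can be checked in code. The sketch below assumes present-day D, where `is null` looks at the array reference itself and `==` compares contents; the compiler of this thread's era reportedly blurred the two in places. `isDbNull` is a made-up helper for the database case and not part of any real database API.

```d
// Hypothetical helper: treats only a null array as "no value", not an empty one.
bool isDbNull(const(char)[] field)
{
    return field is null;
}

void main()
{
    char[] missing = null;      // no value at all
    string empty = "";          // an explicit empty string

    assert(missing.length == 0);
    assert(empty.length == 0);  // both have zero length ...
    assert(missing == empty);   // ... and compare equal by content

    assert(missing is null);    // but only one of them is null
    assert(empty !is null);

    assert(isDbNull(missing));
    assert(!isDbNull(empty));
}
```

Because `==` cannot tell the two apart, code that cares about the difference has to test with `is null` (or use a separate sentinel, as the post suggests).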
Jul 31 2005
AJG wrote:
I'm not suggesting making .length read-only. I'm suggesting making it operate on the same data it has a pointer to. Just like .sort or .reverse would. The way I see it, if you explicitly want to make a copy of the data, that's why there is dup. Why should .length secretly call .dup sometimes, and sometimes not? Cheers, --AJG.

First of all, I don't agree with AJG: I think D arrays are fine the way they are now. There's something, though, and correct me if I'm wrong, but I think array.length doesn't go hand in hand with COW.

    char [] a;
    a.length = 3;
    foo(a);

    void foo(char [] b)
    {
        b[0] = 'f'; // 1
        b.length = 5; // 2
    }

COW says that to do 1, you have to dup first, because you don't own the array, but when you do 2, b is automatically dupped. So, my point is that to be consistent, maybe resizing should also require dupping. Am I right? Does it make sense?
--
Carlos Santander Bernal
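Here is the example from this post as a complete program, assuming present-day D. It shows what the caller observes afterwards: the element write made in step 1 is visible, while the resize in step 2 only affects the callee's local copy of the pointer/length pair.

```d
void foo(char[] b)
{
    b[0] = 'f';     // 1: writes into the caller's data, no copy is made
    b.length = 5;   // 2: changes only the local b; the caller's length stays as it was
}

void main()
{
    char[] a;
    a.length = 3;
    foo(a);

    assert(a[0] == 'f');    // the write from step 1 is visible to the caller
    assert(a.length == 3);  // the resize from step 2 is not
}
```

Declaring the parameter as `ref char[] b` would make the resize visible to the caller as well; that is a choice left to the programmer.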
Jul 30 2005
On Fri, 29 Jul 2005 18:50:45 +0000 (UTC), AJG wrote:
Hi Ben, Ok, I don't think I said exactly what I meant before. Let's look at this piece by piece:

1) Arrays are ("in theory") reference types.

This is where I think we separate. I don't think that D arrays are reference types in the same manner as objects. I think they are value types in that they always have two fields; a pointer and a length. D arrays are more like a predefined struct. Your phrase "in theory", depends on whose theory you are talking about.

2) Objects are reference types.

Okay.

3) Arrays are not objects.

True.

4) So, even though Arrays and Objects are different, they share (or should) reference semantics.

I assume at this point that you are talking about arrays as defined in some computer science book rather than how they are implemented in D.

I believe most of us can agree up to here.

Apparently not ;-)

My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics.

"Promise"? Where is that written down?

Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?

I suppose so. But it doesn't worry me because it is a pragmatic implementation that makes coding clearer (IMO) and improves performance. I'm not sorry that D doesn't have text-book arrays, in that case.

In your previous example ...

Semantically speaking, I think this is wrong.

I've adjusted my thinking when using D. To me, after the assignment 'b = a', I see that 'a' and 'b' are distinct arrays that happen to share the same data. This may be seen as twisting words or playing with semantics, but it works for me. And by the way, the 'b.length = 2' statement does not cause 'b' to become another instance. It still shares the same data as 'a'. You only get a new instance when the length increases. If D has not implemented text-book arrays, what are we losing? I can't see that we have lost anything, in fact we have gained.
--
Derek Parnell
Melbourne, Australia
30/07/2005 11:19:20 AM
Jul 29 2005
Hi Derek,Well, my "in theory" is actually pretty down-to-earth. I mean reference references. This is not an ivory tower concept. It means essentially a nicer, fancier version of a pointer. When using the languages I mentioned, if you assign a reference, it will not become its own instance spontaineously in certain cases.Ok, I don't think I said exactly what I meant before. Let's look at this piece by piece: 1) Arrays are ("in theory") reference types.This is where I think we separate. I don't think that D arrays are reference types in the same manner as objects. I think they are value types in that they always have two fields; a pointer and a length. D arrays are more like a predefined struct. Your phrase "in theory", depends on whose theory you are talking about.Guilty as charged re: being a computer scientist ;). However, once again, this is not a high-brow idea. Reference semantics are very basic and are implemented Javascript). D breaks reference semantics when it comes to arrays. This leads me to believe arrays are _not_ reference types, which is not the impression I got from their description. Walter has remained conspicously silent about the matter, and has not answered the question. Are arrays reference types or not? If yes, then they are broken.4) So, even though Arrays and Objects are different, they share (or should) reference semantics.I assume at this point that you are talking about arrays as defined in some computer science book rather than how they are implemented in D.Indeed. The final word can only come from the Big W., I'm afraid.I believe most of us can agree up to here.Apparently not ;-)It was a figure of speech :p. The promise "would" be written down if D agrees to implement array reference semantics and then doesn't. This is what I'm not sure about.My overall point is that D is not keeping its promise regarding Arrays obeying reference semantics."Promise"? 
Where is that written down?Once more, these "text-book" arrays are fairly common across modern languages, and D's semantics are certainly a twisted variation. Also, I don't follow how that improves performance. If anything, it _decreases_ performance by spawning deep copies of array instances in certain special cases.Whether this is good or not is debatable, but at least it should be noted. Do you agree that D's arrays break reference semantics?I suppose so. But it doesn't worry me because it is a pragmatic implementation that makes coding clearer (IMO) and improves performance. I'm not sorry that D doesn't have text-book arrays, in that case.In your previous example ...Well, then that's not a reference. Sharing just the same data is some weird variation of array that I hadn't encountered. This is not a reference.Semantically speaking, I think this is wrong.I've adjusted my thinking when using D. To me, after the assignment 'b = a', I see that 'a' and 'b' are distinct arrays that happen to share the same data. This may be seen as twisting words or playing with semantics, but it works for me.And by the way, the 'b.length = 2' statement does not cause 'b' to become another instance. It still shares the same data as 'a'. You only get a new instance when the length increases.Great, yet another exception. Thanks for pointing it out.If D has not implemented text-book arrays, what are we losing? I can't see that we have lost anything, in fact we have gained.Well, so what if we lost object reference semantics? Would that also be another "gain?" Less is more! Rations will be increased -33%. It's doubleplusgood! ;) Cheers, --AJG.
Jul 29 2005
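The behaviour argued over in the post above can be sketched in a few lines. This is a minimal illustration written against a current D2 compiler, not the 2005 compiler the thread used; the function name `lengthIsPerVariable` is made up for the example.

```d
import std.stdio : writeln;

// Hypothetical helper: returns true if assignment shares the data
// while each variable keeps its own length field.
bool lengthIsPerVariable()
{
    int[] a = [1, 2, 3];
    int[] b = a;                  // copies the (pointer, length) pair, not the elements

    b[0] = 99;                    // element writes go through the shared data
    bool sharesData = (a[0] == 99);

    b.length = 2;                 // shrinking only changes b's own length field
    bool shrinkKeepsData = (b.ptr is a.ptr) && (a.length == 3);

    b.length = 10;                // growing may reallocate; a is untouched either way
    bool growLeavesA = (a.length == 3) && (b.length == 10);

    return sharesData && shrinkKeepsData && growLeavesA;
}

void main()
{
    writeln(lengthIsPerVariable());
}
```

Whether growing `b` copies the data or extends it in place is up to the runtime, so the sketch only checks what holds in both cases: `a.length` never changes.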
AJG wrote:
Well, then that's not a reference. Sharing just the same data is some weird variation of array that I hadn't encountered. This is not a reference.
[...]
Great, yet another exception. Thanks for pointing it out.
[...]
Well, so what if we lost object reference semantics? Would that also be another "gain?" Less is more! Rations will be increased -33%. It's doubleplusgood!

Wasn't it you who posted elsewhere in this thread that change is good? ;) D has changed the way we think about arrays. From my perspective, it's a good change and your desire to revert to the 'array as a reference' paradigm is not. Maybe it would help if you think of the D array as a wrapper/facade to the actual reference?
Jul 30 2005
On Sat, 30 Jul 2005 02:30:17 +0000 (UTC), AJG wrote:
Hi Derek, Well, my "in theory" is actually pretty down-to-earth. I mean reference references. This is not an ivory tower concept. It means essentially a nicer, fancier version of a pointer. When using the languages I mentioned, if you assign a reference, it will not spontaneously become its own instance in certain cases.

I think I have the solution. Rename them. Don't call them arrays. Call them something else. Then your problem goes away ;-)

-- Derek Parnell Melbourne, Australia 30/07/2005 10:49:59 PM
Jul 30 2005
Ben Hinkle wrote:
I think you'll have a hard time getting lots of support for that. I much prefer the current behavior and I bet there is lots of existing D code that assumes one can test the length of an array at any time. Since an array is not an object I see no problem with the "inconsistency" - an array is an array.

Indeed. I think array semantics where you can't access a property of the array without the Fear of the NullPointerException are the most annoying thing in the world, or at least in the field of programming. I will happily agree to this difference in semantics because the benefits far outweigh the slight inconsistency.

Besides, in a way there is no inconsistency. An array reference is a value type consisting of two 4-byte integers (in 32-bit environments). This is different from an object reference. The first integer is the length of the array and the second is a pointer to the first item of the array. Whenever an array reference is created, a pointer to the data exists. The .length property is just a shortcut to access the length field of the array. The .sort property is a function called on the array reference. These always work even if the array reference points to an empty array. Trying to access the elements of an empty array will segfault in the usual way. Object references stored in an array have the usual semantics.

IMO nothing forces a language to treat arrays as templated instances of a class Array with regular object semantics. D's way is just better.

-- Niko Korhonen SW Developer
Jul 31 2005
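The point about `.length` being safe on a null array can be shown with a short sketch. It assumes a current D2 compiler; `countPositive` is a made-up helper, and the size check only states that a dynamic array variable occupies two machine words.

```d
import std.stdio : writeln;

// Hypothetical helper: works on any array, including a null one,
// because iterating just reads the length field first.
size_t countPositive(int[] values)
{
    size_t n = 0;
    foreach (v; values)
        if (v > 0)
            ++n;
    return n;
}

void main()
{
    int[] empty = null;

    assert(empty.length == 0);                  // no null check needed
    assert(empty.ptr is null);                  // the pointer field is simply null
    assert(empty.sizeof == 2 * size_t.sizeof);  // length field + pointer field

    writeln(countPositive(empty), " ", countPositive([3, -1, 4]));
}
```

With an object reference the equivalent call on `null` would fail, which is the difference in semantics the post describes.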
On Mon, 01 Aug 2005 09:56:57 +0300, Niko Korhonen wrote:
Indeed. I think the array semantics where you can't access a property of the array without the Fear of the NullPointerException is the most annoying thing in the world, or at least in the field of programming. I will happily agree to this difference in semantics because the benefits far outweigh the slight inconsistency. Besides, in a way there is no inconsistency. An array reference is a value type consisting of two 4-byte integers (in 32-bit environments). This is different from an object reference.

Agreed. The way I look at it is that a D array variable *contains* a reference to the array elements but is, in itself, not the reference.

When it comes to implementation, dynamic-length arrays always have an 8-byte structure allocated to themselves, and may have more RAM allocated if there are any elements in the array. The address of the array variable is not the address of the first element; the length property is fetched at runtime from the array variable.

However, fixed-length arrays always have a minimum of 8 bytes allocated regardless of the number of elements declared, and the address of the array variable is also the address of its first element; the length property is 'hard-coded' by the compiler in any expressions that use it.

-- Derek Melbourne, Australia 1/08/2005 5:01:43 PM
Aug 01 2005
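The address and length claims in the post above can be checked with a small sketch. It assumes a current D2 compiler and deliberately says nothing about allocation sizes; the function names are invented for the example.

```d
import std.stdio : writeln;

// Hypothetical check: a fixed-length array variable starts at its first element.
bool staticVariableIsItsData()
{
    int[4] fixed;
    return cast(void*) &fixed == cast(void*) &fixed[0];
}

// Hypothetical check: a dynamic array variable only holds (length, pointer);
// the elements live elsewhere.
bool dynamicVariableIsNotItsData()
{
    int[] dyn = new int[4];
    return cast(void*) &dyn != cast(void*) &dyn[0] && dyn.ptr is &dyn[0];
}

void main()
{
    int[4] fixed;
    // The length of a fixed-length array is part of its type, known at compile time.
    static assert(typeof(fixed).length == 4);

    writeln(staticVariableIsItsData(), " ", dynamicVariableIsNotItsData());
}
```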
so wait, you basically want an array to be a pointer to data containing a length and a pointer? i have been following this thread somewhat but I can hardly find the benefit here. it seems to me you want to take something very straightforward and close to the metal and turn it into a referenced object, for some bizarre reason regarding reference semantics. why don't you just put your arrays in objects if you are having problems?
Jul 30 2005
Hi,

[previous poster] wrote:
so wait, you basically want an array to be a pointer to data containing a length and a pointer? i have been following this thread somewhat but I can hardly find the benefit here.

No. I would like it to be that way, but I know there wouldn't be support for this. What I'd like is for all array properties to follow reference semantics.

[previous poster] wrote:
it seems to me you want to take something very straightforward and close to the metal and turn it into a referenced object, for some bizarre reason regarding reference semantics.

What is bizarre is the current array semantics, be it due to "close to the metal" requirements, or whatever. If you don't think arrays at the moment follow at least _partial_ reference semantics, then why does: [code snippet not preserved] Reverse _also_ the contents of A? Those are reference semantics.

According to Derek, the array reference itself is implemented on the stack in 8-byte chunks. That's fine. I'm not talking about making the array itself a pointer.

Now, my point is that .length breaks reference semantics in special cases, because: [code snippet not preserved] A.length did not change. If it were consistent with .reverse and .sort, then A's length too would have changed.

Cheers, --AJG.

[previous poster] wrote:
why don't you just put your arrays in objects if you are having problems?
Jul 30 2005
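The two snippets in the post above did not survive, so the following is only a guess at the kind of code being discussed. It assumes a current D2 compiler, where the built-in `.reverse` property is, as far as I know, no longer available, so `std.algorithm.mutation.reverse` stands in for it. The function names are invented.

```d
import std.algorithm.mutation : reverse;
import std.stdio : writeln;

// Reversing through B also reverses what A sees: both share the elements.
int[] reverseViaAlias()
{
    int[] a = [1, 2, 3];
    int[] b = a;
    reverse(b);        // in-place, acts on the shared data
    return a;
}

// Resizing through B does not resize A: each variable has its own length.
size_t resizeViaAlias()
{
    int[] a = [1, 2, 3];
    int[] b = a;
    b.length = 5;
    return a.length;
}

void main()
{
    writeln(reverseViaAlias(), " ", resizeViaAlias());
}
```

This is the asymmetry the post objects to: element operations behave like operations on a reference, while `.length` behaves like a field of a value.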
On Sat, 30 Jul 2005 22:12:31 +0000 (UTC), AJG wrote: [snip]
What is bizarre is the current array semantics, be it due to "close to the metal" requirements, or whatever. If you don't think arrays at the moment follow at least _partial_ reference semantics, then why does: [code snippet not preserved] Reverse _also_ the contents of A?

There might have been an argument that .reverse and .sort should follow Walter's Copy-on-Write rules of engagement, but the current behavior is documented and relied upon in current code.

-- Derek Parnell Melbourne, Australia 31/07/2005 8:53:41 AM
Jul 30 2005
"Derek Parnell" <derek psych.ward> wrote in message news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...
There might have been an argument that .reverse and .sort should follow Walter's Copy-on-Write rules of engagement, but the current behavior is documented and relied upon in current code.

Besides those reasons, writing "B.reverse" to me indicates you want to affect B, hence no COW, while "reverse(B)" says you want a reversed B, hence COW. That's one reason why I don't really like the current syntax hack of being able to write B.tolower() to mean tolower(B).
Aug 01 2005
In article <dclba9$2pif$1 digitaldaemon.com>, Ben Hinkle says...
Besides those reasons writing "B.reverse" to me indicates you want to affect B hence no COW while "reverse(B)" says you want a reversed B hence COW. That's one reason why I don't really like the current syntax hack of being able to write B.tolower() to mean tolower(B).

Utterly confusing! reverse(b) and B.reverse have nothing in their name to imply that either one copies the data. By default COW should not happen. Believe me, look at .NET where everything is COW. New memory allocations all over the place. IMHO .dup is there for a reason, and nothing is preventing you from doing: foo.dup.reverse

If somebody else comes along, they will know you are copying the array. It's only 4 more characters of typing. Plus no confusion as to what does COW and what doesn't. I can copy the thing first with .dup if I want. This isn't C where it's 5 lines of code every time you need to copy an array!

-Sha
Aug 01 2005
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message news:dcleqr$2ti5$1 digitaldaemon.com...
Utterly confusing! reverse(b) and B.reverse have nothing in their name to imply that either one copies the data. By default COW should not happen. [...] I can copy the thing first with .dup if I want. This isn't C where it's 5 lines of code every time you need to copy an array! -Sha

You've lost me. Are you proposing a change to any existing behavior or coding practice (ie COW)?
Aug 01 2005
In article <dclfvs$2usj$1 digitaldaemon.com>, Ben Hinkle says...
You've lost me. Are you proposing a change to any existing behavior or coding practice (ie COW)?

I wasn't proposing a change at all. I was disagreeing with Derek. I think COW is a bad thing for API functions to be doing mysteriously. It leads to crap like this:

foo = foo.Replace("Hello","");
dateFoo = dateFoo.AddDays(1);

If I want a duplicate of something, in D, it's as easy as saying: [code snippet not preserved] (Not that replace is a valid property for char[]s, but you get my gist)

This leads to effective memory use, and no confusion about: reverse(b), or b.reverse. Which one does c-o-w? The name certainly doesn't say. Maybe by somebody's reasoning it might make sense that one does COW and one doesn't. But certainly not mine, from the information given.

Also, you might say for consistency, always use COW. But COW is not always what you want. Since there's no way to manually un-cowify it, it would make logical sense to NEVER do COW, and let the programmer call dup first.

-Sha
Aug 01 2005
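The "call dup first" style argued for above might look like the sketch below. It assumes a current D2 compiler and uses `std.algorithm.mutation.reverse` in place of the old built-in property; `reversedCopy` is a made-up name.

```d
import std.algorithm.mutation : reverse;
import std.stdio : writeln;

// Hypothetical helper: the copy is explicit, so the caller can see the allocation.
int[] reversedCopy(int[] src)
{
    int[] copy = src.dup;   // one visible allocation
    reverse(copy);          // then mutate the copy in place
    return copy;
}

void main()
{
    int[] original = [1, 2, 3];
    int[] flipped = reversedCopy(original);
    writeln(original, " ", flipped);
}
```

The design choice is that every library routine mutates in place and the caller decides when a copy is worth paying for, at the cost of having to remember the `.dup`.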
Hi,

Shammah Chancellor wrote:
If I want a duplicate of something, in D, it's as easy as saying: [code snippet not preserved] (Not that replace is a valid property for char[]s, but you get my gist)

Exactly.

Shammah Chancellor wrote:
This leads to effective memory use, and no confusion about: reverse(b), or b.reverse. Which one does c-o-w? The name certainly doesn't say. Maybe by somebody's reasoning it might make sense that one does COW and one doesn't. But certainly not mine, from the information given.

IMHO, and for consistency, it should never do COW. If a user wants to do COW, let the user do it. That's exactly what I mean by reference semantics, so it seems we are in agreement here.

Shammah Chancellor wrote:
Also, you might say for consistency, always use COW. But COW is not always what you want. Since there's no way to manually un-cowify it, it would make logical sense to NEVER do COW, and let the programmer call dup first.

Interestingly enough (and one of my points), .length does COW about half of the time, and there's no way to un-cowify it.

That's a great word, btw, un-cowify. It had me chuckling.

Cheers, --AJG.
Aug 01 2005
In article <dclls6$37o$1 digitaldaemon.com>, AJG says...
Interestingly enough (and one of my points), .length does COW about half of the time, and there's no way to un-cowify it.

While I agree with you that it could be annoying, the problem is that arrays are really stack variables which have a reference member. (As you well know by now.) So, in order to un-cowify .length we would have to make all arrays true references which contain references. Also, that still doesn't fix array slices. We would ALWAYS need to dup when an array slice is made. :(

However, there's an easy way to handle the first problem already:

    char[] a = "Hello";
    char[]* b = &a; // (I hope anyways, & shouldn't return a.ptr... I haven't checked this.)
    (*b).length = 10;
    writef("%i", a.length);

Array slices, though, won't be fixed without a special array slice type, one that would know the start of the array and resize that.

AJG wrote:
That's a great word, btw, un-cowify. It had me chuckling.

Thanks :)

-Sha
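The pointer-to-array idea in the post above can be checked with a short sketch. It assumes a current D2 compiler, where a string literal is immutable and has to be copied with `.dup` before it can be stored in a `char[]`; the function name is invented.

```d
import std.stdio : writeln;

// Hypothetical check: resizing through a pointer to the array variable
// changes that variable itself, so there is only one length to disagree about.
size_t resizeThroughPointer()
{
    char[] a = "Hello".dup;   // mutable copy of the literal
    char[]* b = &a;           // address of the array variable, not of the characters
    (*b).length = 10;
    return a.length;
}

void main()
{
    writeln(resizeThroughPointer());
}
```

Here `&a` is the address of the (length, pointer) pair rather than `a.ptr`, which is what the post hoped for.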
Aug 01 2005
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message news:dclk4p$1o0$1 digitaldaemon.com...
I wasn't proposing a change at all. I was disagreeing with Derek. I think COW is a bad thing for API functions to be doing mysteriously. It leads to crap like this: foo = foo.Replace("Hello",""); dateFoo = dateFoo.AddDays(1);

I didn't read Derek's post as proposing reverse use COW. He was pointing out that it doesn't. It's too bad you see COW as mysterious.

Shammah Chancellor wrote:
If I want a duplicate of something, in D, it's as easy as saying: [code snippet not preserved] (Not that replace is a valid property for char[]s, but you get my gist) This leads to effective memory use, and no confusion about: reverse(b), or b.reverse. Which one does c-o-w? The name certainly doesn't say. Maybe by somebody's reasoning it might make sense that one does COW and one doesn't. But certainly not mine, from the information given.

The statement about effective memory use is only true when the operation is guaranteed to change the string. If foo in the example didn't contain any Hellos then the dup would be wasteful. Plus I'm surprised you don't see any difference between reverse(b) and b.reverse since it's common in OOP to interpret b.foo as acting on b while foo(b) is just some function of b.

Shammah Chancellor wrote:
Also, you might say for consistency, always use COW. But COW is not always what you want. Since there's no way to manually un-cowify it, it would make logical sense to NEVER do COW, and let the programmer call dup first.

That would be a big change in D style since many times you do not know if a dup will be needed or not (eg most of the functions in std.string might just return the original string).
Aug 01 2005
In article <dclmn7$42s$1 digitaldaemon.com>, Ben Hinkle says...
I didn't read Derek's post as proposing reverse use COW. He was pointing out that it doesn't.

You're right, he didn't. I was contesting that tolower(b) and b.tolower should do different things.

Ben Hinkle wrote:
It's too bad you see COW as mysterious.

I don't find anything mysterious about it. It's just not been useful almost every time I've had any dealings with COW functions. If I want COW, I can dup the object first.

Ben Hinkle wrote:
The statement about effective memory use only is true when the operation is guaranteed to change the string. If foo in the example didn't contain any Hellos then the dup would be wasteful.

I hope you're not implying that replace should only return a new instance if something was actually changed. That is absurd. I would then need to check to see if it's given me back a reference to a new array before I could use it?

Ben Hinkle wrote:
Plus I'm surprised you don't see any difference between reverse(b) and b.reverse since it's common in OOP to interpret b.foo as acting on b while foo(b) is just some function of b.

Why don't you tell Microsoft that. Many of the examples I listed were from VB.NET, and do COW from member functions. Also, just because it is common doesn't make it logical, consistent, or obvious to somebody not familiar with these __unwritten__ agreements.

Ben Hinkle wrote:
That would be a big change in D style since many times you do not know if a dup will be needed or not (eg most of the functions in std.string might just return the original string).

If I'm understanding what you just said, let me say this: As I said above, I think it's silly to have non-deterministic behavior from those functions. When I say deterministic, I mean that I should be able to expect it to always return a duplicate string, or not.

-Sha
Aug 01 2005
"Shammah Chancellor" <Shammah_member pathlink.com> wrote in message news:dcm4an$grn$1 digitaldaemon.com...
I hope you're not implying that replace should only return a new instance if something was actually changed.

That is what I'm implying - and that's what many std.string functions do.

Shammah Chancellor wrote:
That is absurd. I would then need to check to see if it's given me back a reference to a new array before I could use it?

why? The only time you would care is if you start modifying the array in-place.

Shammah Chancellor wrote:
Why don't you tell Microsoft that. Many of the examples I listed were from VB.NET, and do COW from member functions.

Strings in VB.NET are immutable so I'm not surprised that methods return new strings - that's the definition of immutable. Mutable objects would interpret b.reverse as acting on b.

Shammah Chancellor wrote:
Also, just because it is common doesn't make it logical, consistent, or obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I would like even more documentation about it since it appears people don't know about it).

Shammah Chancellor wrote:
If I'm understanding what you just said, let me say this: As I said above, I think it's silly to have non-deterministic behavior from those functions. When I say deterministic, I mean that I should be able to expect it to always return a duplicate string, or not.

ok - everyone is entitled to their opinions. To me it's simpler to obey COW. Changing an array in-place is rare enough that special care is ok with me.
Aug 01 2005
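The copy-on-write convention being defended above, where a function returns its argument untouched when there is nothing to change and allocates only on the first real change, could be sketched as follows. `lowerCow` is a hypothetical function written for this illustration and is not the std.string implementation; it handles ASCII only and assumes a current D2 compiler.

```d
import std.ascii : isUpper, toLower;
import std.stdio : writeln;

// Hypothetical COW-style lowercase: never writes into the caller's array.
char[] lowerCow(char[] s)
{
    foreach (i, c; s)
    {
        if (isUpper(c))
        {
            // First character that needs changing: copy once, then finish on the copy.
            char[] copy = s.dup;
            foreach (ref ch; copy[i .. $])
                ch = cast(char) toLower(ch);
            return copy;
        }
    }
    return s;   // nothing to change, so no allocation and the same array comes back
}

void main()
{
    char[] plain = "hello".dup;
    char[] mixed = "HeLLo".dup;
    writeln(lowerCow(plain) is plain, " ", lowerCow(mixed));
}
```

A caller who only reads the result never needs to know which case occurred. The distinction matters only to a caller who goes on to modify the result in place, which is the case the two posters disagree about.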
In article <dcmaak$l5m$1 digitaldaemon.com>, Ben Hinkle says...
That is what I'm implying - and that's what many std.string functions do.

Bah

Ben Hinkle wrote:
why? The only time you would care is if you start modifying the array in-place.

Exactly. Quite often when I want to replace one thing, I want to replace A LOT of things. (Or take any other example.) If each replace allocates a new string, that's inefficient. Maybe I only want to copy it once, and then modify it in place. When .dup is only 4 extra characters per instance of this, it does not justify having two copies of every array function, one for COW and one for in-place.

Ben Hinkle wrote:
Strings in VB.NET are immutable so I'm not surprised that methods return new strings - that's the definition of immutable. Mutable objects would interpret b.reverse as acting on b.

True. However, for mutable objects, would you like to duplicate every function for COW and non-COW? I find it less confusing to explicitly dup. (It also clutters my namespace less!)

Ben Hinkle wrote:
Unwritten in what sense? COW is documented in several places in D (though I would like even more documentation about it since it appears people don't know about it).

There's barely any documentation for the API as it is. And a footnote about tolower( string ) on a man page is not enough for me.

Ben Hinkle wrote:
ok - everyone is entitled to their opinions. To me it's simpler to obey COW. Changing an array in-place is rare enough that special care is ok with me.

Rare in what? Rare in what you're writing? I think you'll find that many projects have a lot of use for it.

-Sha
Aug 01 2005
I don't know if you followed the recent COW/const/inplace performance discussion but my own $0.02 is that one should use COW as a general rule and, after profiling the performance, target a (presumably) small set of routines that need more careful memory management and possibly inplace manipulations. In a "worst case" one can use one of the many other memory management techniques listed in the D docs. In any case you might want to look over those recent COW threads for more (and more and more) discussion on the topic. On a side note, remember that operations like "replace" might increase the length of the string (if the replacement is longer than the pattern) in which case modifying it inplace becomes tricky. A general rule like COW can take the place of lots of individual rules for each function. But you can code your app however you like or write a phobos lib that does everything inplace - there's nothing technically preventing that and it's perfectly ok if that's what you want to do.

Exactly. Quite often when I want to replace one thing, I want to replace A LOT of things. (Or take any other example.) If each replace allocates a new string, that's inefficient. Maybe I only want to copy it once, and then modify it in place. When .dup is only 4 extra characters per instance of this, it does not justify having two copies of every array function, one for cow and one for in place.

That is absurd. I would then need to check to see if it's given me back a reference to a new array before I could use it?

why? The only time you would care is if you start modifying the array in-place.

I'm not sure what man page you are referring to since D doesn't have man pages (or footnotes from what I can tell). Maybe you are speaking figuratively, in which case I recommend that if you have concrete suggestions for improving the doc that you add comments to the doc wiki. On a slight OT I wonder if/how the doc wiki is being used. Are comments removed as they are fixed in the doc? What's the process for using the wiki? I add my comments where I think they should go but I notice there's stuff ranging all over the map and to be honest I have no clue if Walter ever looks at it, how often, and what happens when he does look at it.

There's barely any documentation for the API as it is. And a footnote about tolower( string ) on a man page is not enough for me.

Also, just because it is common doesn't make it logical, consistent, or obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I would like even more documentation about it since it appears people don't know about it).
Aug 01 2005
In article <dcmpne$10pp$1 digitaldaemon.com>, Ben Hinkle says...

I think this would be a bad choice. It might be wise with respect to performance, but having different methods randomly be cow or not cow depending on how much more time they take is a bit confusing to say the least.

I don't know if you followed the recent COW/const/inplace performance discussion but my own $0.02 is that one should use COW as a general rule and, after profiling the performance, target a (presumably) small set of routines that need more careful memory management and possibly inplace manipulations. In a "worst case" one can use one of the many other memory management techniques listed in the D docs. In any case you might want to look over those recent COW threads for more (and more and more) discussion on the topic.

Exactly. Quite often when I want to replace one thing, I want to replace A LOT of things. (Or take any other example.) If each replace allocates a new string, that's inefficient. Maybe I only want to copy it once, and then modify it in place. When .dup is only 4 extra characters per instance of this, it does not justify having two copies of every array function, one for cow and one for in place.

That is absurd. I would then need to check to see if it's given me back a reference to a new array before I could use it?

why? The only time you would care is if you start modifying the array in-place.

On a side note, remember that operations like "replace" might increase the length of the string (if the replacement is longer than the pattern) in which case modifying it inplace becomes tricky.

Inplace may not be possible, but it could still follow the normal rule of modifying the ptr of your array to point to the new value. That way a dup only happens when it is required, and the calling function does not care. This would be ideal IMHO.

A general rule like COW can take the place of lots of individual rules for each function. But you can code your app however you like or write a phobos lib that does everything inplace - there's nothing technically preventing that and it's perfectly ok if that's what you want to do.

That's true, but it would be nice not to be including my own runtime in every little application I write. I suppose I could force installation of a shared library. Ugh.

Which I have been doing when I see them. However, most of the doc that you can post on the Wiki (I haven't looked a lot at it) seems to be for the language specification. Can the phobos docs be modified?

I'm not sure what man page you are referring to since D doesn't have man pages (or footnotes from what I can tell). Maybe you are speaking figuratively, in which case I recommend that if you have concrete suggestions for improving the doc that you add comments to the doc wiki.

There's barely any documentation for the API as it is. And a footnote about tolower( string ) on a man page is not enough for me.

Also, just because it is common doesn't make it logical, consistent, or obvious to somebody not familiar with these __unwritten__ agreements.

Unwritten in what sense? COW is documented in several places in D (though I would like even more documentation about it since it appears people don't know about it).

On a slight OT I wonder if/how the doc wiki is being used. Are comments removed as they are fixed in the doc? What's the process for using the wiki? I add my comments where I think they should go but I notice there's stuff ranging all over the map and to be honest I have no clue if Walter ever looks at it, how often, and what happens when he does look at it.
Aug 01 2005
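The suggestion above (let the function repoint the caller's array when it cannot work in place, and allocate nothing when there is nothing to do) can be sketched roughly as follows. This is only an illustration written for a current D compiler; `replaceAll` is a hypothetical helper, not a Phobos function, and the real std.string routines may behave differently.

```d
// Hypothetical helper (not part of Phobos): replaces every `from` in `s`
// with `to`. The caller's slice is rebound only when something changed.
void replaceAll(ref char[] s, const(char)[] from, const(char)[] to)
{
    if (from.length == 0 || s.length < from.length)
        return;

    char[] result;
    size_t copied = 0;                 // start of the part of s not yet copied
    size_t i = 0;
    while (i + from.length <= s.length)
    {
        if (s[i .. i + from.length] == from)
        {
            result ~= s[copied .. i];  // text before the match
            result ~= to;              // the replacement, possibly longer
            i += from.length;
            copied = i;
        }
        else
            ++i;
    }
    if (copied == 0)
        return;                        // no match: no allocation, s untouched
    result ~= s[copied .. $];          // tail after the last match
    s = result;                        // point the caller's slice at the result
}

void main()
{
    char[] text = "Hello, Hello!".dup;
    char[] view = text;                // second slice over the same data
    replaceAll(text, "Hello", "Goodbye");
    assert(text == "Goodbye, Goodbye!");
    assert(view == "Hello, Hello!");   // the old data is left as it was

    char[] same = "abc".dup;
    auto before = same.ptr;
    replaceAll(same, "xyz", "q");
    assert(same.ptr is before);        // nothing matched, nothing allocated
}
```

Note that any other slice still referring to the old data (`view` here) keeps seeing the unmodified contents, which is part of what makes the in-place versus copy question contentious in this thread.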
There's a link at the bottom of the phobos page for the wiki. I don't know if the modules with stand-alone pages have the link, though.

Which I have been doing when I see them. However, most of the doc that you can post on the Wiki (I haven't looked a lot at it) seems to be for the language specification. Can the phobos docs be modified?

There's barely any documentation for the API as it is. And a footnote about tolower( string ) on a man page is not enough for me.

I'm not sure what man page you are referring to since D doesn't have man pages (or footnotes from what I can tell). Maybe you are speaking figuratively, in which case I recommend that if you have concrete suggestions for improving the doc that you add comments to the doc wiki.
Aug 02 2005
On Mon, 1 Aug 2005 16:54:49 +0000 (UTC), Shammah Chancellor wrote:

"Derek Parnell" <derek psych.ward> wrote in message news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...

[snip]

Besides those reasons writing "B.reverse" to me indicates you want to affect B hence no COW while "reverse(B)" says you want a reversed B hence COW. That's one reason why I don't really like the current syntax hack of being able to write B.tolower() to mean tolower(B).

I was disagreeing with Derek. I think COW is a bad thing for API functions to be doing mysteriously. It leads to crap like this:

    foo = foo.Replace("Hello","");
    dateFoo = dateFoo.AddDays(1);

Hi Shammah,

I wasn't actually saying that .reverse must use CoW. I was saying that it didn't, and that fact seems to go counter to Walter's general principle (as I understand it) about when to use CoW or not. I thought that one should use CoW if the code is actually changing the data *and* the data might be accessible to the calling routine. Thus as the .reverse will change the data for lengths > 1, and the data is probably accessible to the code using .reverse, one could have expected it to CoW. Of course, I might be misunderstanding that 'general principle' ;-) As the current behaviour is documented, we can cope with this seeming exception.

--
Derek Parnell
Melbourne, Australia
2/08/2005 7:21:43 AM
Aug 01 2005
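For readers following the CoW versus in-place argument, here is a minimal sketch of the two styles under discussion, assuming a current D compiler. The names `lowerCow` and `lowerInPlace` are hypothetical and only handle ASCII; they are not the Phobos tolower implementation.

```d
// Hypothetical CoW-style helper: returns the original slice when nothing
// needs to change, and a modified copy otherwise.
char[] lowerCow(char[] s)
{
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            char[] r = s.dup;          // copy only once a write is needed
            foreach (ref ch; r[i .. $])
                if (ch >= 'A' && ch <= 'Z')
                    ch = cast(char)(ch + ('a' - 'A'));
            return r;
        }
    }
    return s;                          // unchanged: same data, no allocation
}

// Hypothetical in-place helper: always writes through the caller's slice.
void lowerInPlace(char[] s)
{
    foreach (ref ch; s)
        if (ch >= 'A' && ch <= 'Z')
            ch = cast(char)(ch + ('a' - 'A'));
}

void main()
{
    char[] a = "Hello".dup;
    char[] b = lowerCow(a);
    assert(a == "Hello" && b == "hello");   // caller's data is untouched
    assert(b.ptr !is a.ptr);                // a copy was made

    char[] c = "hello".dup;
    assert(lowerCow(c).ptr is c.ptr);       // nothing to change, no copy

    char[] d = a.dup;                       // explicit copy, then mutate it
    lowerInPlace(d);
    assert(a == "Hello" && d == "hello");
}
```

The CoW version is what makes the result "sometimes a copy, sometimes not", which is the behaviour one side of the thread calls non-deterministic; the `dup` followed by `lowerInPlace` line is the explicit alternative the other side prefers.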
In article <1as80g46qpg5w$.1dfr6mqon4u1t$.dlg 40tude.net>, Derek Parnell says...

On Mon, 1 Aug 2005 16:54:49 +0000 (UTC), Shammah Chancellor wrote:

No, no, I understood that. I'm just being argumentative. I don't agree with you that tolower(b) and b.tolower should do different things. I don't agree that tolower(b) should even exist in the face of b.tolower. It clutters up my namespace. (Aside from the fact that user properties in D can't be added to a special char[] namespace =/ )

It just happened my example from VB was using class methods. For example, in .NET, in order to round a date up from seconds to 5 minutes, you need to allocate like 3 or 4 datetimes. Of course you don't SEE this, but .AddDays, .AddSeconds etc. all allocate a new datetime. For example, in .NET in order to get tomorrow's date:

    Dim tomorrow as String = DateTime.Now.Date.AddDays(1).ToLongDateString()

That required allocations of 3 dateTimes and a String. I could completely be abusing the .NET Framework, but I searched far and wide and couldn't find an alternative that worked on the original. This kind of crud is why I'm very opposed to COW, in class methods or global functions.

If .NET had D style dupes and I really wanted to operate on a new object:

    Dim tomorrow as String = DateTime.Now.Duplicate.Date.AddDays(1).ToLongDateString()

One less allocation since AddDays didn't need/get its own copy of the memory.

You might still cite tolower(b) instead of b.tolower as not being as ridiculous as what .NET wants. But I ask you this: If somebody doesn't know your COW conventions, would they know the difference in what happens? In any case arr.dup.tolower would fit the same purpose just fine, and it's more explicit.

-Sha

"Derek Parnell" <derek psych.ward> wrote in message news:a118xxgyuee7.t1828b9vk5du$.dlg 40tude.net...

[snip]

Besides those reasons writing "B.reverse" to me indicates you want to affect B hence no COW while "reverse(B)" says you want a reversed B hence COW. That's one reason why I don't really like the current syntax hack of being able to write B.tolower() to mean tolower(B).

I was disagreeing with Derek. I think COW is a bad thing for API functions to be doing mysteriously. It leads to crap like this:

    foo = foo.Replace("Hello","");
    dateFoo = dateFoo.AddDays(1);

Hi Shammah,

I wasn't actually saying that .reverse must use CoW. I was saying that it didn't, and that fact seems to go counter to Walter's general principle (as I understand it) about when to use CoW or not. I thought that one should use CoW if the code is actually changing the data *and* the data might be accessible to the calling routine. Thus as the .reverse will change the data for lengths > 1, and the data is probably accessible to the code using .reverse, one could have expected it to CoW. Of course, I might be misunderstanding that 'general principle' ;-) As the current behaviour is documented, we can cope with this seeming exception.
Aug 01 2005