digitalmars.D - Passing dynamic arrays
- Jens Mueller (24/24) Nov 08 2010 Hi,
- Denis Koroskin (5/29) Nov 08 2010 Yes, you understood it correctly. The changes to array structure (i.e.
- bearophile (6/10) Nov 08 2010 They are passed by "fat reference" :-)
- Jonathan M Davis (26/31) Nov 08 2010 I don't know. I find it to be pretty straightforward, though I can under...
- Daniel Gibson (11/17) Nov 08 2010 If you pass a dynamic array to a function and change its size within th...
- Steven Schveighoffer (17/32) Nov 08 2010 Not exactly. If you happen to change its size *and* change the original...
- Ali Çehreli (8/16) Nov 08 2010 Let's also note that appending to the array qualifies as "change its
- Steven Schveighoffer (14/27) Nov 08 2010 No, it doesn't. If you are appending to data that was passed in, you are...
- Ali Çehreli (5/7) Nov 08 2010 to it.
- Steven Schveighoffer (6/13) Nov 09 2010 Before the array append changes were introduced (somewhere around 2.040 ...
- Pillsy (5/13) Nov 09 2010 Ah! This is a lot of what was confusing me about arrays; I still thought...
- Steven Schveighoffer (18/34) Nov 09 2010 As I said before, this rarely affects code. The common cases I've seen:
- Pillsy (19/40) Nov 09 2010
- spir (44/58) Nov 09 2010 m with not appending in place (and arrays not having the possibility of ...
- Steven Schveighoffer (37/103) Nov 10 2010 Care to name names? I want to understand this dislike of D arrays,
- Jonathan M Davis (27/40) Nov 08 2010 Yes you can. It _never_ alters the array which was passed in. Sure, it _...
- Jonathan M Davis (23/39) Nov 08 2010 Implementation-defined behavior would be more accurate. Undefined would ...
- Daniel Gibson (18/59) Nov 08 2010 Ok, undefined may have been the wrong word.. non-deterministic may be be...
- Jonathan M Davis (7/41) Nov 08 2010 That would be a devastating change. It would cause bugs all over the pla...
- bearophile (5/10) Nov 08 2010 That's a solution that I have proposed lot of time ago, I did receive no...
- spir (14/21) Nov 08 2010 h are
- Jonathan M Davis (17/32) Nov 08 2010 The reference semantics for Objects and dynamic arrays are identical. Th...
- Vladimir Panteleev (16/20) Nov 08 2010 Compare to this C-ish program:
- Daniel Gibson (15/38) Nov 08 2010 This might technically be the same thing, but D's nice syntax hides this...
- Walter Bright (3/5) Nov 08 2010 It makes things like vectors (i.e. float[3]) natural to manipulate. It a...
- Daniel Gibson (2/9) Nov 08 2010 Why can't that be done when the static arrays are passed by reference?
- Daniel Gibson (12/22) Nov 08 2010 Ah I guess you mean something like "alias float[3] vec3", so one may exp...
- Daniel Gibson (3/28) Nov 08 2010 Hrm forgot the link:
- Walter Bright (3/32) Nov 08 2010 You don't write an add function by using references to ints. The vector
- Daniel Gibson (2/36) Nov 08 2010 Ok, thanks for explaining :-)
- spir (14/17) Nov 08 2010 nce,
- Kagamin (2/7) Nov 08 2010 It may take some effort to explain it. Does it help, if you think about ...
- Jens Mueller (6/15) Nov 08 2010 Yes. This behavior is well explained when one thinks about passing a
- Jesse Phillips (16/38) Nov 08 2010 But they are passed by reference. You can modify the data all you want, bu...
- spir (30/31) Nov 08 2010 cannot reassign the reference itself.
- Jesse Phillips (29/32) Nov 08 2010 void main() {
- Daniel Gibson (17/19) Nov 08 2010 Unlike in C, a D array is more than a reference.
- Jesse Phillips (9/35) Nov 08 2010 If you are using the definition of reference which would include C point...
- Steven Schveighoffer (16/25) Nov 09 2010 It depends on your definition of reference type. I agree with Daniel. ...
- bearophile (22/26) Nov 09 2010 Let's play some more :-)
- Jesse Phillips (5/15) Nov 09 2010 Yet this still isn't complete because if you resize the array the data m...
- spir (9/13) Nov 09 2010 e
- Bruno Medeiros (40/61) Nov 26 2010 What do you mean "depends on your definition of reference type" ? I
- spir (35/106) Nov 26 2010
- Bruno Medeiros (13/18) Nov 26 2010 ARRGHH, actually this is not entirely true, I forgot about something:
- Steven Schveighoffer (31/57) Nov 29 2010 A class is a reference type. Every operation on a class instance operat...
- Bruno Medeiros (13/68) Nov 29 2010 Hum, I see your point, yeah, I guess I do agree somewhat.
- spir (18/54) Nov 30 2010 s
- spir (35/58) Nov 09 2010 ans both the internal pointer and length must be the same. Just because ...
- Jens Mueller (9/51) Nov 08 2010 I like that explanation. Jonathan is saying the same, I think. I'll
- Jesse Phillips (10/17) Nov 08 2010 Don't know too much about C++ references, but as mentioned somewhere you...
- Jens Mueller (10/26) Nov 09 2010 I do see why a = new A() is useful. But it makes only sense if I passed
- Jonathan M Davis (15/46) Nov 09 2010 Why wouldn't you be able to change it? You can change any parameter as l...
- Jens Mueller (4/45) Nov 09 2010 I see your point. You argue that the behavior is consistent. My point is
- Jesse Phillips (7/12) Nov 09 2010 Well, in the case of classes, I don't think it would be very common.
- Jens Mueller (11/26) Nov 10 2010 With dynamic arrays I totally agree. Slicing is very useful. I just want
- so (5/13) Nov 08 2010 Yes in C++, you can't redirect a reference after initialization.
- Jonathan M Davis (7/20) Nov 08 2010 D references are more like Java references. They're really pointers, but...
- Rainer Deyke (5/6) Nov 12 2010 That's true for class references. D also supports pass-by-reference
- Jonathan M Davis (7/12) Nov 12 2010 True. But generally when talking about references, class references are ...
- Jonathan M Davis (87/95) Nov 08 2010 As Jesse says, they _are_ passed by reference. The struct itself _is_ th...
- spir (90/92) Nov 09 2010
- Pelle Månsson (3/48) Nov 09 2010 ...wait!
- so (8/8) Nov 08 2010 D arrays very powerful but you first need to understand what is going on...
- Andrei Alexandrescu (4/6) Nov 08 2010 Or a mildly outdated but accurate preview of the relevant chapter:
- Jonathan M Davis (10/16) Nov 08 2010 It's perfectly defined, just not knowable at compile time. You can even ...
- so (4/4) Nov 08 2010 I didn't mean that one, check page 112 on
- so (3/3) Nov 08 2010 Oh yeh you are right, i said reallocation. Should have said assignment.
- Bruno Medeiros (25/41) Nov 26 2010 Making the array reallocate _every_ time that it's resized (to a greater...
- Pelle Månsson (4/10) Nov 26 2010 What about when you don't know the length before, or working with
- Bruno Medeiros (30/37) Nov 26 2010 I must recognize I made a huge blunder with that, my reasoning was
- Andrei Alexandrescu (9/59) Nov 26 2010 It would be difficult to challenge the assumption that appends in a loop...
- Bruno Medeiros (8/21) Nov 26 2010 You could still do exponential capacity growth by manipulating the
- spir (11/15) Nov 26 2010
- Bruno Medeiros (5/16) Nov 26 2010 But D does exactly that, there is a capacity field (internal to the GC),...
- Bruno Medeiros (12/30) Nov 26 2010 Well, there was actually no assumption yet, I wanted first of all to
- spir (14/42) Nov 26 2010
- Kagamin (2/5) Nov 26 2010 Challenge: make D slower than C#.
- Bruno Medeiros (4/9) Nov 26 2010 Huh?
Hi,

I do not understand what's going on behind the scenes with this code. Or, better said, I have some idea, but maybe I do not see the whole point.

    void foo(int[] array) {
        array.length += 1000; // may copy the array
        array[0] = 1;
    }

    auto a = new int[1];
    foo(a);
    assert(a[0] == 1); // fails if a needs to be copied inside foo

I do understand that array.length += 1000 may copy the array. Page 98 of The D Programming Language and http://www.digitalmars.com/d/2.0/arrays.html#resize shed some light on this matter. Passing a to foo is achieved by copying, let's say, a begin and an end pointer. Now, due to array.length += 1000, new memory might be needed, and that's why the begin and end pointers change and array[0] now works on different data. That's why the assert fails. Right?

I find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. But I wonder whether these semantics are documented well enough. I think I should use ref int[] in the example above, shouldn't I?

Jens
Nov 08 2010
On Mon, 08 Nov 2010 20:30:03 +0300, Jens Mueller <jens.k.mueller gmx.de> wrote:
> [...]
> Passing a to foo is achieved by copying let's say a begin and an end pointer. Now due to array.length += 1000 new memory might be needed and that's why the begin and end pointer change and array[0] works now on different data. That's why the assert fails. Right?
> [...]
> I think I should use ref int[] in the example above, shouldn't I?

Yes, you understood it correctly. Changes to the array structure (i.e. its size and the pointer to its contents) aren't visible to the outer scope, but changes to the contents are. int[] is merely a tuple of a T* ptr and a size_t length (with T being int here).
Nov 08 2010
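The difference between passing the slice by value and by ref can be sketched as follows. This is a minimal illustration assuming a current D2 compiler; the function names growByValue and growByRef are hypothetical and do not appear in the thread.

```d
// Hypothetical helper: receives a copy of the (ptr, length) pair.
void growByValue(int[] array)
{
    array.length = array.length + 1000; // may move the data; only the local copy is updated
    array[0] = 1;                       // writes to whichever block the local copy now points at
}

// Hypothetical helper: receives the caller's slice itself.
void growByRef(ref int[] array)
{
    array.length = array.length + 1000; // the caller's ptr and length are updated
    array[0] = 1;
}

void main()
{
    auto a = new int[1];
    growByValue(a);
    assert(a.length == 1); // the caller's length is never changed
    // a[0] is 0 if the runtime had to move the data, 1 if it extended in place

    auto b = new int[1];
    growByRef(b);
    assert(b.length == 1001);
    assert(b[0] == 1);
}
```

Only the ref version gives a result that does not depend on whether the runtime had to move the data.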
Jens Mueller:
> I find this behavior rather strange.

I don't know if it's strange, but it surely is a slightly bug-prone corner of D. I have had two or three bugs in my code because of that.

> Arrays are neither passed by value (copying the whole array) nor by reference.

They are passed by "fat reference" :-)

> I think I should use ref int[] in the example above, shouldn't I?

Right.

Bye,
bearophile
Nov 08 2010
On Monday, November 08, 2010 09:40:47 bearophile wrote:
> Jens Mueller:
>> I find this behavior rather strange.
>
> I don't know if it's strange, but surely it is a little bug-prone corner of D. I have had two or three bugs in my code because of that.

I don't know. I find it to be pretty straightforward, though I can understand why people would find it confusing at first. As long as you don't alter a dynamic array's size in any way, it and any references to it or any part of it will continue to point to the same data. If you dup an array, the result is guaranteed to point to different data (albeit a copy of the original data). If you alter the size of an array but don't explicitly dup it, then it _might_ point to the same data and it might not (hence the potential confusion).

So, if you want to guarantee that an array continues to point to the same data, just don't alter its size. If you want to guarantee that it's copied and points to different data, dup or idup it. If you don't care whether it continues to point to the same data or not, feel free to resize it by setting its length or appending to it.

Granted, it's easier to understand what's going on when you understand that a dynamic array is essentially

    struct array(T)
    {
        T* ptr;
        size_t length;
    }

but you don't really need to. Truth be told, I don't think that I've _ever_ had a bug due to how arrays reallocate. I think that as long as you understand that resizing could mean reallocation, it's quite easy to avoid bugs with it. That doesn't mean that they'll never happen, but I don't find this corner of the language to be particularly bug-prone.

- Jonathan M Davis
Nov 08 2010
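The two guarantees described above (no resize means shared data, dup means separate data) can be checked directly. This is a small sketch; sharesData is a hypothetical helper name, while ptr and dup are the built-in array properties.

```d
// Hypothetical helper: true when both slices start at the same address.
bool sharesData(const(int)[] x, const(int)[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a = [1, 2, 3];

    int[] b = a;                // copies only ptr and length
    assert(sharesData(a, b));   // both slices view the same elements
    b[0] = 42;
    assert(a[0] == 42);         // the write is visible through a

    int[] c = a.dup;            // fresh allocation with copied elements
    assert(!sharesData(a, c));
    c[0] = 7;
    assert(a[0] == 42);         // a is unaffected by writes to the copy
}
```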
bearophile wrote:
> Jens Mueller:
>> I find this behavior rather strange.
>
> I don't know if it's strange, but surely it is a little bug-prone corner of D. I have had two or three bugs in my code because of that.

If you pass a dynamic array to a function and change its size within the function, you have undefined behaviour - you never know whether it will affect the original array (from the calling function) or not. So IMHO a compiler warning would be appropriate in that case.

(It would be even better to have more consistent array handling throughout the different kinds of arrays, as I wrote in another branch of this thread, but if that is not an option, for example because it contradicts TDPL, a compiler warning is a good compromise.)

Cheers,
- Daniel
Nov 08 2010
On Mon, 08 Nov 2010 13:35:38 -0500, Daniel Gibson <metalcaedes gmail.com> wrote:
> bearophile wrote:
>> Jens Mueller:
>>> I find this behavior rather strange.
>>
>> I don't know if it's strange, but surely it is a little bug-prone corner of D. I have had two or three bugs in my code because of that.
>
> If you pass a dynamic array to a function and change its size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.

Not exactly. If you happen to change its size *and* change the original data afterwards, then it's somewhat undefined (I'd call it confusing, since the behavior is perfectly defined, just hard to describe). Such cases are very rare. You are usually changing data in the array in place, or appending to the array, but not usually both.

> So IMHO a compiler warning would be appropriate in that case. (It would be even better to have more consistent array handling throughout the different kinds of arrays, as I wrote in another branch of this thread, but if that is no option, for example because it contradicts TDPL, a compiler warning is a good compromise)

First, D doesn't have compiler warnings. Either something is an error, or it is not. You can use the -w switch to turn on extra checks that become errors, but that's it.

Second, if you made that a compiler warning, then 90% of D functions would exhibit a warning.

This may appear to be a surprising issue the one time it happens to you, but when it does, you learn how arrays work and move on. In practice, it's not a big enough problem to warrant making it a compiler error or warning.

-Steve
Nov 08 2010
Steven Schveighoffer wrote:
> On Mon, 08 Nov 2010 13:35:38 -0500, Daniel Gibson
>> If you pass a dynamic array to a function and change its size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.
>
> Not exactly. If you happen to change its size *and* change the original data afterwards, then it's somewhat undefined

Let's also note that appending to the array qualifies as "change its size *and* change the original data afterwards." We cannot be sure whether appending affects the passed-in array.

> (I'd call it confusing, since the behavior is perfectly defined, just hard to describe).

I like the term "discretionary sharing semantics", where any slice can leave the sharing contract at its discretion, regardless of whether it has modified the shared elements so far.

Ali
Nov 08 2010
On Mon, 08 Nov 2010 14:22:36 -0500, Ali Çehreli <acehreli yahoo.com> wrote:
> Steven Schveighoffer wrote:
>> On Mon, 08 Nov 2010 13:35:38 -0500, Daniel Gibson
>>> If you pass a dynamic array to a function and change its size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.
>>
>> Not exactly. If you happen to change its size *and* change the original data afterwards, then it's somewhat undefined
>
> Let's also note that appending to the array qualifies as "change its size *and* change the original data afterwards." We cannot be sure whether appending affects the passed-in array.

No, it doesn't. If you are appending to data that was passed in, you are not changing the *original data* passed in. You are only appending to it. For example:

    char[] s = "foo".dup;
    s ~= "bar";

does not change the first 3 characters at all. So any aliases to s would not be affected. However, any data aliased to the original s may or may not be aliased to the new s. Once you start changing that original data (either via s or via an alias to the original s), this is where the confusing behavior occurs.

In my experience, this does not cause a problem in the vast majority of cases.

-Steve
Nov 08 2010
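A short sketch of the point made above: the append leaves the aliased characters untouched, and only a later write through s makes the outcome depend on whether a reallocation happened. The helper name appendBar is hypothetical.

```d
// Hypothetical helper: appends to its local copy of the slice and returns it.
char[] appendBar(char[] s)
{
    s ~= "bar";
    return s;
}

void main()
{
    char[] s = "foo".dup;
    char[] t = s;          // alias of the original three characters

    s = appendBar(s);
    assert(s == "foobar");
    assert(t == "foo");    // appending does not rewrite existing elements

    s[0] = 'g';            // the write that makes the result depend on reallocation
    assert(s == "goobar");
    // t is "goo" if the append extended in place, "foo" if it reallocated
    assert(t == "foo" || t == "goo");
}
```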
Steven Schveighoffer wrote:
> No, it doesn't. If you are appending to data that was passed in, you are not changing the *original data* passed in. You are only appending to it.

I must be remembering an old behavior. I think appending could affect the original if it had enough capacity.

Ali
Nov 08 2010
On Mon, 08 Nov 2010 18:29:27 -0500, Ali Çehreli <acehreli yahoo.com> wrote:
> Steven Schveighoffer wrote:
>> No, it doesn't. If you are appending to data that was passed in, you are not changing the *original data* passed in. You are only appending to it.
>
> I must be remembering an old behavior. I think appending could affect the original if it had enough capacity.

Before the array append changes were introduced (somewhere around 2.040, I think?), appending to a slice that started at the beginning of the memory block could affect the other data in the array. But that was a memory corruption issue, somewhat different from what we are talking about.

-Steve
Nov 09 2010
Steven Schveighoffer Wrote:
> On Mon, 08 Nov 2010 18:29:27 -0500, Ali Çehreli <acehreli yahoo.com> wrote:
> [...]
>> I must be remembering an old behavior. I think appending could affect the original if it had enough capacity.
>
> Before the array append changes were introduced (somewhere around 2.040 I think?), appending to a slice that started at the beginning of the memory block could affect the other data in the array. But that was a memory corruption issue, somewhat different than what we are talking about.

Ah! This is a lot of what was confusing me about arrays; I still thought they had this behavior. The fact that they don't makes me a good deal more comfortable with them, though I still don't like the non-deterministic way that they may copy their elements or may share structure after you append stuff to them.

Cheers,
Pillsy
Nov 09 2010
On Tue, 09 Nov 2010 08:14:40 -0500, Pillsy <pillsbury gmail.com> wrote:
> Steven Schveighoffer Wrote:
>> [...]
>> Before the array append changes were introduced (somewhere around 2.040 I think?), appending to a slice that started at the beginning of the memory block could affect the other data in the array. But that was a memory corruption issue, somewhat different than what we are talking about.
>
> Ah! This is a lot of what was confusing me about arrays; I still thought they had this behavior. The fact that they don't makes me a good deal more comfortable with them, though I still don't like the non-deterministic way that they may copy their elements or they may share structure after you append stuff to them.

As I said before, this rarely affects code. The common cases I've seen:

1. You append to an array and return it.
2. You modify data in the array.
3. You use a passed-in array as a buffer, which means you overwrite the array, and then start appending when it runs out of space.

I don't ever remember seeing: you append to an array, then go back and modify the first few bytes of the array.

Let's assume this is a very common thing and absolutely needs to be addressed. What would you like the behavior to be? How can anything different from the current behavior be reasonable?

IMO, the benefits of just being able to append to an array any time you want, without having to set up some special type, far outweigh this little quirk that almost nobody encounters. You can append to *any* array, no matter where the data is located, or whether the data is a slice, and it just works. I can't see how anyone would prefer another solution!

-Steve
Nov 09 2010
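The first common case in the list above (append to an array and return it) sidesteps the question of reallocation entirely, because the caller takes the returned slice instead of relying on its own copy being updated. A minimal sketch, with withSentinel as a hypothetical function name:

```d
// Hypothetical helper: appends a terminator value and hands the result back.
int[] withSentinel(int[] values)
{
    values ~= -1;   // may or may not reallocate; the caller does not need to care
    return values;  // the returned slice is the one that includes the new element
}

void main()
{
    int[] a = [1, 2, 3];
    a = withSentinel(a);          // reassign, so a always refers to the appended data
    assert(a == [1, 2, 3, -1]);
}
```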
Steven Schveighoffer Wrote:
> On Tue, 09 Nov 2010 08:14:40 -0500, Pillsy <pillsbury gmail.com> wrote:
> [...]
>> Ah! This is a lot of what was confusing me about arrays; I still thought they had this behavior. The fact that they don't makes me a good deal more comfortable with them, though I still don't like the non-deterministic way that they may copy their elements or they may share structure after you append stuff to them.
>
> As I said before, this rarely affects code. The common cases I've seen:
> 1. You append to an array and return it.
> 2. You modify data in the array.
> 3. You use a passed in array as a buffer, which means you overwrite the array, and then start appending when it runs out of space.
>
> I don't ever remember seeing:
> You append to an array, then go back and modify the first few bytes of the array.

I've certainly encountered situations in at least one other language where standard library functions will return mutable arrays which may or may not share structure with their inputs. This has been such a frequent source of pain when using that language that I tend to react very negatively to the possibility in any context.

> Let's assume this is a very common thing and absolutely needs to be addressed. What would you like the behavior to be?

Using a different, library type for a buffer you can append to. I think of "a buffer or abstract list you can cheaply append to" as a different sort of type from a fixed-size buffer anyway, since it so often is a different type.

Arrays/slices are a very basic type in D, and I'm generally thinking that giving your basic types simpler, easier-to-understand semantics is worth paying a modest cost.

[...]
> IMO, the benefits of just being able to append to an array any time you want without having to set up some special type far outweighs this little quirk that almost nobody encounters. You can append to *any* array, no matter where the data is located, or whether the data is a slice, and it just works. I can't see how anyone would prefer another solution!

There's a difference between appending and appending in place. The problem with not appending in place (and arrays not having the possibility of a reserve that's larger than the actual amount, of course) is one of efficiency. Having

    auto s = "foo";
    s ~= "bar";

result in a new array being allocated that is of length 6 and contains "foobar", and assigning that array to `s`, is obviously useful and desirable behavior. If the expansion can happen in place, that's a perfectly reasonable performance optimization to have in the case of strings or other immutable arrays. Indeed, one of the reasons that functional programming and GC go together like peanut butter and jelly is that together they let you get all sorts of wins in terms of efficiency from shared structure.

However, I've found, working with languages that mix a lot of imperative and functional constructs (Lisp is one, but not the only one), that if you're going to do this, it's really very important that there not be any doubt about when mutable state is shared and when it isn't. D is trying to be that same kind of multi-paradigm language.

This means that, for mutable arrays, having

    int[] x = [1, 2, 3];
    x ~= [4, 5, 6];

maybe reallocate and maybe not seems like it's only really there to protect people from doing inefficient things by accident when they append onto the back of an array repeatedly (or to make that admittedly common case more convenient). This really doesn't strike me as worth the trouble. Like I said elsewhere, the uncertainty gives me the screaming willies.

Cheers,
Pillsy
Nov 09 2010
On Tue, 09 Nov 2010 15:13:55 -0500 Pillsy <pillsbury gmail.com> wrote:
> There's a difference between appending and appending in place. The problem with not appending in place (and arrays not having the possibility of a reserve that's larger than the actual amount, of course) is one of efficiency. Having
>
>     auto s = "foo";
>     s ~= "bar";
>
> result in a new array being allocated that is of length 6 and contains "foobar", and assigning that array to `s`, is obviously useful and desirable behavior. If the expansion can happen in place, that's a perfectly reasonable performance optimization to have in the case of strings or other immutable arrays. Indeed, one of the reasons that functional programming and GC go together like peanut butter and jelly is that together they let you get all sorts of wins in terms of efficiency from shared structure.
>
> However, I've found working with languages that mix a lot of imperative and functional constructs (Lisp is one, but not the only one) that if you're going to do this, it's really very important that there not be any doubt about when mutable state is shared and when it isn't. D is trying to be that same kind of multi-paradigm language.

+++

> This means that, for mutable arrays, having
>
>     int[] x = [1, 2, 3];
>     x ~= [4, 5, 6];
>
> maybe reallocate and maybe not seems like it's only really there to protect people from doing inefficient things by accident when they append onto the back of an array repeatedly (or to make that admittedly common case more convenient). This really doesn't strike me as worth the trouble. Like I said elsewhere, the uncertainty gives me the screaming willies.

There is some trouble in there, but it's hard to point to it clearly. After

    int[] ints1;
    ...
    ints2 = ints1;
    ...

depending on what happens to each array, especially when passed to funcs that could manipulate them, the relation initially established between the variables may or may not be maintained. Also, it may be broken only in some cases, depending on what operations are performed during a given run.

Possibly, in practice, things are much easier to do right than it seems at first sight. But my impression is that I will be bitten more than once, and badly. (*)

Denis

(*) Reminds me of Python's famous gotcha, which instead _establishes_ an unexpected relation:

    def f(i, l=[]):
        l.append(i)
        return l

-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Nov 09 2010
On Tue, 09 Nov 2010 15:13:55 -0500, Pillsy <pillsbury gmail.com> wrote:
> Steven Schveighoffer Wrote:
>> [...]
>> I don't ever remember seeing:
>> You append to an array, then go back and modify the first few bytes of the array.
>
> I've certainly encountered situations in at least one other language where standard library functions will return mutable arrays which may or may not share structure with their inputs. This has been such a frequent source of pain when using that language that I tend to react very negatively to the possibility in any context.

Care to name names? I want to understand this dislike of D arrays, because out of all the languages I've ever used, D arrays are by far the easiest and most intuitive to use. I don't expect to be convinced, but at least we can have some debate on this, and maybe we can avoid mistakes made by other languages.

>> Let's assume this is a very common thing and absolutely needs to be addressed. What would you like the behavior to be?
>
> Using a different, library type for a buffer you can append to. I think of "a buffer or abstract list you can cheaply append to" as a different sort of type from a fixed size buffer anyway, since it so often is a different type. Arrays/slices are a very basic type in D, and I'm generally thinking that giving your basic types simpler, easier to understand semantics is worth paying a modest cost.

There was a time when the T[new] idea was expected to be part of the language. Both Andrei and Walter were behind it, and seldom does something not make it into the language when that happens. It turned out that, after all the academic and theoretical discussions were finished and it came time to implement it, it was a clunky and confusing feature. Andrei said that for TDPL he had a whole table dedicated to what type to use in which cases (T[] or T[new]), and he didn't even know how to fill out the table.

The beauty of D's arrays is that the slice and the array are both the same type, so you only need to define one function to handle both, and appending "just works". I feel like this is simply a case of 'not well enough understood.'

BTW, you can allocate a fixed buffer by doing:

    T[BUFSIZE] buffer;

This cannot be appended to. It is still difficult to allocate one of these on the heap, which is a language shortcoming, but it can be fixed.

> [...]
> There's a difference between appending and appending in place. [...] However, I've found working with languages that mix a lot of imperative and functional constructs (Lisp is one, but not the only one) that if you're going to do this, it's really very important that there not be any doubt about when mutable state is shared and when it isn't. D is trying to be that same kind of multi-paradigm language. This means that, for mutable arrays, having
>
>     int[] x = [1, 2, 3];
>     x ~= [4, 5, 6];

To leave no doubt about whether this reallocates or not, try:

    bool willReallocate = x.length + 3 > x.capacity;

But I still don't understand this concept. If you find out it's not going to reallocate, what are you going to do? I mean, you have three cases here:

1. You *don't* want it to reallocate -- well, you can't enforce this, but you can use ref to ensure the original is always affected.
2. You *want* it to reallocate -- use dup or ~.
3. You don't care -- just use the array directly.

I don't see how these three options aren't enough.

> maybe reallocate and maybe not seems like it's only really there to protect people from doing inefficient things by accident when they append onto the back of an array repeatedly (or to make that admittedly common case more convenient). This really doesn't strike me as worth the trouble. Like I said elsewhere, the uncertainty gives me the screaming willies.

I hear you, but at the same time, we are talking about common and uncommon cases here. D (at least in my mind) tries to be a practical language -- make the common things easy as long as they are safe. And the cases where D's arrays may surprise you are pretty uncommon IMO.

-Steve
Nov 10 2010
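The capacity check mentioned above can be combined with the built-in reserve function to make an append predictably in place. This is a sketch under the assumption of a D2 runtime that provides the capacity property and reserve in the object module; willReallocate is written here as a hypothetical helper function rather than the local variable from the post.

```d
// Hypothetical helper mirroring the check from the post.
bool willReallocate(int[] x, size_t extra)
{
    return x.length + extra > x.capacity;
}

void main()
{
    int[] x = [1, 2, 3];
    x.reserve(16);                   // ask the runtime for room for at least 16 elements
    assert(x.capacity >= 16);
    assert(!willReallocate(x, 3));

    auto before = x.ptr;             // taken after reserve, which may itself move the data
    x ~= [4, 5, 6];
    assert(x.ptr is before);         // the append fit within the reserved capacity
    assert(x == [1, 2, 3, 4, 5, 6]);
}
```

Reserving only controls where the append lands; other slices of the same data still see the usual sharing rules.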
On Monday, November 08, 2010 11:22:36 Ali Çehreli wrote:
> Steven Schveighoffer wrote:
>> On Mon, 08 Nov 2010 13:35:38 -0500, Daniel Gibson
>>> If you pass a dynamic array to a function and change its size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.
>>
>> Not exactly. If you happen to change its size *and* change the original data afterwards, then it's somewhat undefined
>
> Let's also note that appending to the array qualifies as "change its size *and* change the original data afterwards." We cannot be sure whether appending affects the passed-in array.

Yes you can. It _never_ alters the array which was passed in. Sure, it _could_ alter the memory just off the end of the passed-in array if no arrays refer to that memory, but that doesn't cause any problems. It doesn't alter the original array at all. It just means that when you resize that array, it may have to reallocate, whereas before it might have been able to resize in place. And if the array in the called function reallocates instead of resizing in place, then the original array would either have been forced to reallocate anyway or it may be able to resize in place, depending on how much you try to resize it and whether any other arrays refer to the memory past its end.

In _no_ case does appending or concatenating to an array alter any other arrays, even if they point to the same memory. They may or may not end up pointing to the same memory afterwards (depending on whether a reallocation takes place), but you never alter any other arrays.

- Jonathan M Davis
Nov 08 2010
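The claim above can be checked with a minimal sketch. It assumes a current D2 compiler and druntime; the function name `appendOne` is hypothetical and does not come from the thread. Whether the append extends the block in place or reallocates, the caller's slice keeps its length and its elements.

```d
// Appends to the callee's own copy of the slice (pointer + length).
void appendOne(int[] arr)
{
    arr ~= 42;                // may extend in place or reallocate
    assert(arr.length == 4);  // only the local slice grew
}

void main()
{
    int[] a = [1, 2, 3];
    appendOne(a);

    // The caller's slice is untouched in either case.
    assert(a.length == 3);
    assert(a == [1, 2, 3]);
}
```

The value 42 may or may not have been written to the memory just past the end of `a`, but `a` itself never refers to that element.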
On Monday, November 08, 2010 10:35:38 Daniel Gibson wrote:bearophile schrieb:Implementation-defined behavior would be more accurate. Undefined would mean that it's something dangerous which is not defined by the language, and you shouldn't be doing. It's perfectly defined by the language in this case. It's just that how much extra memory is allocated for a dynamic array is implementation-defined, so in cases where a reallocation would be necessary because the array ran out of memory, it's implementation-dependent as to whether or not a reallocation will be necessary or not. In every other case, it's completely language-defined and deterministic. The algorithm for additional capacity is the only part that isn't.Jens Mueller:If you pass a dynamic array to a function and chance it's size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.I find this behavior rather strange.I don't know if it's strange, but surely it is a little bug-prone corner of D. I have had two or three bugs in my code because of that.So IMHO a compiler warning would be appropriate in that case. (It would be even better to have more consistent array handling throughout the different kinds of arrays, as I wrote in another branch of this thread, but if that is no option, for example because it contradicts TDPL, a compiler warning is a good compromise)Honestly, if you want an array that you're passing in to be altered by the function that you're passing it to, it really should be passed by ref anyway. And if you want to guarantee that it isn't going to be altered, make it const, dup it, or have its individual elements be const or immutable. Problem solved. I really don't see this as an issue. 
I can understand why there might be some confusion - particularly since the online docs aren't very clear on the matter, but it really isn't complicated, and I'd hate to have to dup an array just to be able to append to it, which is what the warning that you're suggesting would require. A compiler warning is as good as an error really, since you're going to have to fix it anyway, so I'd definitely be against making this either an error or a warning. I see no problem with being allowed to resize arrays that are passed to functions. - Jonathan M Davis
Nov 08 2010
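The three options mentioned above (pass by ref, make it const, or dup it) can be sketched as follows. This is only an illustration under the assumption of a current D2 compiler; the function names `growRef`, `sum` and `doubled` are hypothetical.

```d
// The caller should see the resize: take the slice by ref.
void growRef(ref int[] arr)
{
    arr ~= 4;
}

// The callee promises not to modify the elements: const.
int sum(const int[] arr)
{
    int total = 0;
    foreach (x; arr)
        total += x;
    return total;
}

// The callee wants private storage: dup first, then modify freely.
int[] doubled(const int[] arr)
{
    auto copy = arr.dup;      // independent, mutable copy
    foreach (ref x; copy)
        x *= 2;
    return copy;
}

void main()
{
    int[] a = [1, 2, 3];

    growRef(a);
    assert(a == [1, 2, 3, 4]);      // resize is visible to the caller

    assert(sum(a) == 10);

    auto b = doubled(a);
    assert(b == [2, 4, 6, 8]);
    assert(a == [1, 2, 3, 4]);      // original elements untouched
}
```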
Jonathan M Davis schrieb:On Monday, November 08, 2010 10:35:38 Daniel Gibson wrote:Ok, undefined may have been the wrong word.. non-deterministic may be better. Anyway, you can't know if there's space left behind the array that can be obtained by realloc or if increasing the length will cause the array to be copied to a new block of memory, so the array in the calling function points to other memory than the array in the called function.bearophile schrieb:Implementation-defined behavior would be more accurate. Undefined would mean that it's something dangerous which is not defined by the language, and you shouldn't be doing. It's perfectly defined by the language in this case. It's just that how much extra memory is allocated for a dynamic array is implementation-defined, so in cases where a reallocation would be necessary because the array ran out of memory, it's implementation-dependent as to whether or not a reallocation will be necessary or not. In every other case, it's completely language-defined and deterministic. The algorithm for additional capacity is the only part that isn't.Jens Mueller:If you pass a dynamic array to a function and chance it's size within the function, you have undefined behaviour - you never know if it will affect the original array (from the calling function) or not.I find this behavior rather strange.I don't know if it's strange, but surely it is a little bug-prone corner of D. I have had two or three bugs in my code because of that.The documentation[1] says: "For dynamic array and object parameters, which are passed by reference, in/out/ref apply only to the reference and not the contents." So, by reading the documentation one would assume that dynamic arrays are passed by reference - *real* reference, implying that any change to the array within the called function will be visible outside of the function. 
The truth however is that dynamic arrays are not passed by reference and any changes to the length will be lost (even if the arrays data won't be copied).So IMHO a compiler warning would be appropriate in that case. (It would be even better to have more consistent array handling throughout the different kinds of arrays, as I wrote in another branch of this thread, but if that is no option, for example because it contradicts TDPL, a compiler warning is a good compromise)Honestly, if you want an array that you're passing in to be altered by the function that you're passing it to, it really should be passed by ref anyway.And if you want to guarantee that it isn't going to be altered, make it const, dup it, or have its individual elements be const or immutable. Problem solved. I really don't see this as an issue. I can understand why there might be some confusion - particularly since the online docs aren't very clear on the matter, but it really isn't complicated, and I'd hate to have to dup an array just to be able to append to it, which is what the warning that you're suggesting would require. A compiler warning is as good as an error really, since you're going to have to fix it anyway, so I'd definitely be against making this either an error or a warning. I see no problem with being allowed to resize arrays that are passed to functions.So maybe yet another solution would be to *really* pass dynamic arrays by reference (like the doc pretends it's already done)?- Jonathan M DavisCheers, - Daniel [1] http://www.digitalmars.com/d/2.0/function.html
Nov 08 2010
On Monday, November 08, 2010 11:30:33 Daniel Gibson wrote:Then the documentation should be updated to be more clear.The documentation[1] says: "For dynamic array and object parameters, which are passed by reference, in/out/ref apply only to the reference and not the contents." So, by reading the documentation one would assume that dynamic arrays are passed by reference - *real* reference, implying that any change to the array within the called function will be visible outside of the function. The truth however is that dynamic arrays are not passed by reference and any changes to the length will be lost (even if the arrays data won't be copied).So IMHO a compiler warning would be appropriate in that case. (It would be even better to have more consistent array handling throughout the different kinds of arrays, as I wrote in another branch of this thread, but if that is no option, for example because it contradicts TDPL, a compiler warning is a good compromise)Honestly, if you want an array that you're passing in to be altered by the function that you're passing it to, it really should be passed by ref anyway.That would be a devastating change. It would cause bugs all over the place - especially when dealing with std.algorithm. Remember that an array is a range. Really, when an array is passed to a function, you're passing a slice of that array which happen to slice the whole array. - Jonathan M DavisAnd if you want to guarantee that it isn't going to be altered, make it const, dup it, or have its individual elements be const or immutable. Problem solved. I really don't see this as an issue. I can understand why there might be some confusion - particularly since the online docs aren't very clear on the matter, but it really isn't complicated, and I'd hate to have to dup an array just to be able to append to it, which is what the warning that you're suggesting would require. 
A compiler warning is as good as an error really, since you're going to have to fix it anyway, so I'd definitely be against making this either an error or a warning. I see no problem with being allowed to resize arrays that are passed to functions.So maybe yet another solution would be to *really* pass dynamic arrays by reference (like the doc pretends it's already done)?
Nov 08 2010
Jonathan M Davis: Daniel Gibson: So maybe yet another solution would be to *really* pass dynamic arrays by reference (like the doc pretends it's already done)? That would be a devastating change. That's a solution that I proposed a long time ago; I received no answers :-) A disadvantage of passing dynamic arrays by reference by default is a decrease in performance. Bye, bearophile
Nov 08 2010
On Mon, 08 Nov 2010 20:30:33 +0100 Daniel Gibson <metalcaedes gmail.com> wrote: The documentation[1] says: "For dynamic array and object parameters, which are passed by reference, in/out/ref apply only to the reference and not the contents." So, by reading the documentation one would assume that dynamic arrays are passed by reference - *real* reference, implying that any change to the array within the called function will be visible outside of the function. The truth however is that dynamic arrays are not passed by reference and any changes to the length will be lost (even if the arrays data won't be copied). Exactly. Pass (and assign) by reference mean different semantics from what D does with dyn arrays: that changes are shared. Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 08 2010
On Monday 08 November 2010 12:07:39 spir wrote:On Mon, 08 Nov 2010 20:30:33 +0100 Daniel Gibson <metalcaedes gmail.com> wrote:The reference semantics for Objects and dynamic arrays are identical. They are passed by reference, but their reference is passed by value. So, any changes to the contents will affect the object or array which was passed in. However, changes to the reference (such as assigning a new Object or array or doing any operation on an array which would result in reallocation) do not change the object or array which was referred to by that reference. "For dynamic array and object parameters, which are passed by reference, in/out/ref apply only to the reference and not the contents." is perfectly correct. The problem is whether you read the "which..." as applying to both dynamic arrays and object parameters or just object parameters. It's perfectly correct as is but apparently ambiguous enough to cause confusion. The docs should be updated to be clearer, but the quoted documentation, at least, is correct. And there's nothing funny about how arrays work in comparison to objects except that arrays happen to have operations which can cause reallocation (in addition to new) whereas objects don't. - Jonathan M DavisThe documentation[1] says: "For dynamic array and object parameters, which are passed by reference, in/out/ref apply only to the reference and not the contents." So, by reading the documentation one would assume that dynamic arrays are passed by reference - *real* reference, implying that any change to the array within the called function will be visible outside of the function. The truth however is that dynamic arrays are not passed by reference and any changes to the length will be lost (even if the arrays data won't be copied).Exactly. Pass (and assign) by reference mean different semantics from what D does with dyn arrays: that changes are shared.
Nov 08 2010
On Mon, 08 Nov 2010 19:30:03 +0200, Jens Mueller <jens.k.mueller gmx.de> wrote: I find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. Compare to this C-ish program: void foo(int *array, int length) { array = (int*)realloc(array, (length += 1000) * sizeof(int)); // may copy the array array[0] = 1; } int* a = new int[1]; foo(a, 1); assert(a[0] == 1); // fails if a needs to be copied inside foo C's realloc *may* copy the memory area being reallocated. Just like when using realloc, it's something you must be aware of when using D arrays. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Nov 08 2010
Vladimir Panteleev schrieb:On Mon, 08 Nov 2010 19:30:03 +0200, Jens Mueller <jens.k.mueller gmx.de> wrote:This might technically be the same thing, but D's nice syntax hides this. IMHO passing arrays to functions are really inconsistent in D2 anyway: static arrays are passed by value but dynamic arrays are passed by reference, but then again, as this thread shows, not really.. And what about associative arrays? (I don't know, haven't tried yet, afaik it isn't documented). Certainly D's behaviour regarding the dynamic arrays has technical reasons and makes sense when you know how they're implemented (a fat pointer that is passed by value), but it *feels* inconsistent. IMHO either all kinds of arrays should *either* be passed by reference (real reference, not coincidental reference like dynamic arrays are now) *or* by "logical" value, i.e. dynamic arrays would be dup'ed and associative arrays would be cloned. BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)? Cheers, - DanielI find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside.Compare to this C-ish program: void foo(int *array, int length) { array = (int*)realloc(array, (length += 1000) * sizeof(int)); // may copy the array array[0] = 1; } int* a = new int[1]; foo(a, 1); assert(a[0] == 1); // fails if a needs to copied inside foo C's realloc *may* copy the memory area being reallocated. Just like when using realloc, it's something you must be aware of when using D arrays.
Nov 08 2010
Daniel Gibson wrote:BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
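A small sketch of what value semantics for static arrays looks like in D2, assuming a current compiler. The alias `Vec3` and the function `scaled` are hypothetical names chosen for illustration; the alias uses the newer `alias Name = Type;` spelling.

```d
alias Vec3 = float[3];

// v is a copy of the caller's three floats, so it can be modified freely.
Vec3 scaled(Vec3 v, float k)
{
    foreach (ref x; v)
        x *= k;
    return v;
}

void main()
{
    Vec3 a = [1.0f, 2.0f, 3.0f];
    Vec3 b = scaled(a, 2.0f);

    assert(a == [1.0f, 2.0f, 3.0f]);   // caller's vector is unchanged
    assert(b == [2.0f, 4.0f, 6.0f]);
}
```

With this behaviour a `float[3]` acts like a small struct of three floats, which is the sense in which it is "natural to manipulate" as a vector.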
Walter Bright schrieb:Daniel Gibson wrote:Why can't that be done when the static arrays are passed by reference?BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
Daniel Gibson schrieb:Walter Bright schrieb:Ah I guess you mean something like "alias float[3] vec3", so one may expect vec3 to behave like a value type (like when you define it in a struct) so it's passed by value. That does make sense, even though I'm not sure what's more important: consistency between different kinds of arrays or expectations towards typed defined from static arrays. ;-) I still don't get the part with the CPU's vector instructions though. I don't have any assembly knowledge an no experience with directly using CPU's vector instructions, but the example of a C function wrapping SSE instructions from the wikipedia article[1], which multiplies two arrays of floats, loads the arrays by reference and even stores the result in one of them.Daniel Gibson wrote:Why can't that be done when the static arrays are passed by reference?BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
Daniel Gibson schrieb:Daniel Gibson schrieb:Hrm forgot the link: [1] http://en.wikipedia.org/wiki/Vector_processorWalter Bright schrieb:Ah I guess you mean something like "alias float[3] vec3", so one may expect vec3 to behave like a value type (like when you define it in a struct) so it's passed by value. That does make sense, even though I'm not sure what's more important: consistency between different kinds of arrays or expectations towards typed defined from static arrays. ;-) I still don't get the part with the CPU's vector instructions though. I don't have any assembly knowledge an no experience with directly using CPU's vector instructions, but the example of a C function wrapping SSE instructions from the wikipedia article[1], which multiplies two arrays of floats, loads the arrays by reference and even stores the result in one of them.Daniel Gibson wrote:Why can't that be done when the static arrays are passed by reference?BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
Daniel Gibson wrote:Daniel Gibson schrieb:You don't write an add function by using references to ints. The vector instructions treat them like values, so a value type should correspond to it.Daniel Gibson schrieb:Hrm forgot the link: [1] http://en.wikipedia.org/wiki/Vector_processorWalter Bright schrieb:Ah I guess you mean something like "alias float[3] vec3", so one may expect vec3 to behave like a value type (like when you define it in a struct) so it's passed by value. That does make sense, even though I'm not sure what's more important: consistency between different kinds of arrays or expectations towards typed defined from static arrays. ;-) I still don't get the part with the CPU's vector instructions though. I don't have any assembly knowledge an no experience with directly using CPU's vector instructions, but the example of a C function wrapping SSE instructions from the wikipedia article[1], which multiplies two arrays of floats, loads the arrays by reference and even stores the result in one of them.Daniel Gibson wrote:Why can't that be done when the static arrays are passed by reference?BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
Walter Bright schrieb:Daniel Gibson wrote:Ok, thanks for explaining :-)Daniel Gibson schrieb:You don't write an add function by using references to ints. The vector instructions treat them like values, so a value type should correspond to it.Daniel Gibson schrieb:Hrm forgot the link: [1] http://en.wikipedia.org/wiki/Vector_processorWalter Bright schrieb:Ah I guess you mean something like "alias float[3] vec3", so one may expect vec3 to behave like a value type (like when you define it in a struct) so it's passed by value. That does make sense, even though I'm not sure what's more important: consistency between different kinds of arrays or expectations towards typed defined from static arrays. ;-) I still don't get the part with the CPU's vector instructions though. I don't have any assembly knowledge an no experience with directly using CPU's vector instructions, but the example of a C function wrapping SSE instructions from the wikipedia article[1], which multiplies two arrays of floats, loads the arrays by reference and even stores the result in one of them.Daniel Gibson wrote:Why can't that be done when the static arrays are passed by reference?BTW: What were the reasons to pass static arrays by value in D2 (while in D1 they're passed by reference)?It makes things like vectors (i.e. float[3]) natural to manipulate. It also segues nicely into hopeful future support for the CPU's vector instructions.
Nov 08 2010
On Mon, 08 Nov 2010 19:04:40 +0100 Daniel Gibson <metalcaedes gmail.com> wrote: IMHO passing arrays to functions are really inconsistent in D2 anyway: static arrays are passed by value but dynamic arrays are passed by reference, but then again, as this thread shows, not really.. It may be better to have 2 kinds of "sequences", one having true value semantics (assignment & parameter passing also protect the content), the other true reference semantics (say, an array-list). Static arrays may just be considered as an additional hint to the compiler for possible optimization. Or, have arrays implement true value semantics, but pass them as ref when needed. But then, we may sometimes need assignment not to copy... What do you think? Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 08 2010
Jens Mueller Wrote:I find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. But I wonder whether these semantics are well enough documented?It may take some effort to explain it. Does it help, if you think about T[] as not an array but a slice?
Nov 08 2010
Kagamin wrote:Jens Mueller Wrote:Yes. This behavior is well explained when one thinks about passing a slice of data instead of an dynamic array. With a dynamic array I think about the std::vector. But passing a reference to a std::vector is not the same as passing a slice. JensI find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. But I wonder whether these semantics are well enough documented?It may take some effort to explain it. Does it help, if you think about T[] as not an array but a slice?
Nov 08 2010
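The "think of T[] as a slice" view can be illustrated with a minimal sketch, assuming a current D2 compiler; the function name `dropFirst` is hypothetical. Re-slicing inside the function only rebinds the local (pointer, length) pair, while element writes still go through to the shared memory.

```d
void dropFirst(int[] arr)
{
    arr = arr[1 .. $];   // rebinds the local slice only
    arr[0] = 99;         // writes to the shared element (caller's index 1)
}

void main()
{
    int[] a = [1, 2, 3];
    dropFirst(a);

    assert(a.length == 3);       // caller still sees the whole array
    assert(a == [1, 99, 3]);     // but the element write is visible
}
```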
Jens Mueller Wrote: Hi, I do not understand what's going on behind the scene with this code. Or better said I have some idea but maybe I do not see the whole point. void foo(int[] array) { array.length += 1000; // may copy the array array[0] = 1; } auto a = new int[1]; foo(a); assert(a[0] == 1); // fails if a needs to be copied inside foo ... I find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. But I wonder whether these semantics are well enough documented? I think I should use ref int[] in the example above, shouldn't I? Jens But they are passed by reference. You can modify the data all you want, but cannot reassign the reference itself. Resizing the array may cause a reassignment of that reference. It is not different from the following code except resizing does not guarantee a reference change. import std.stdio; void assignValue(A a) { a = new A(); a.value = 6; } class A { int value; } void main() { A a = new A(); assignValue(a); writeln(a.value); }
Nov 08 2010
On Mon, 08 Nov 2010 15:32:56 -0500 Jesse Phillips <jessekphillips+D gmail.com> wrote: But they are passed by reference. You can modify the data all you want, but cannot reassign the reference itself. No, they are _not_ passed by reference. Stop saying that, this is precisely what causes confusion. This is reference semantics: class C { int* ints; } void main () { auto c = new C(); auto d = c; auto ints = [1,2,3]; c.ints = &(ints[0]); assert(c.ints == d.ints); assert(*(c.ints) == *(d.ints)); } Whatever change one performs on object fields never breaks the relation with other vars pointing to the same object; because the object itself is referenced. D arrays do not work that way: the kind of struct is *copied*, not referenced; arrays themselves are *values*. So that changes to the content that requires reallocation breaks the relation. (Additional confusion is brought by the fact that, if a is an array, after "b=a" (b is a) yields true, for any reason, but this is also wrong. I guess 'is' is overloaded for arrays to compare the addresses of the contents instead of the ones of the array-structs, but this is misguided.) Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 08 2010
spir Wrote:No, they are _not_ passed by reference. Stop saying that, this is precisely what causes confusion. This is reference semantics:void main() { auto a = new int[1]; auto b = a; a[0] = 5; assert(a == b); assert(a[0] == b[0]); assert(a is b); writefln("%x == %x", &a[0], &b[0]); moreExample(a, cast(uint) &a[0]); } void moreExample(int[] a, uint aAddr) { assert(cast(uint)&a[0] == aAddr); writefln("%x == %x", &a[0], aAddr); }D arrays do not work that way: the kind of struct is *copied*, not referenced; arrays themselves are *values*. So that changes to the content that requires reallocation breaks the relation.The struct you are referring to _is_ the reference. There is no such thing as "copy by reference," everything is copied by value. However, some values are just a reference to another location.(Additional confusion is brought by the fact that, if a is an array, after "b=a" (b is a) yields true, for any reason, but this is also wrong. I guess 'is' is overloaded for arrays to compare the adresses of the contents instead of the ones of the array-structs, but this is misguided.)The array-struct is the reference, so it is what gets compared. That means both the internal pointer and length must be the same. Just because the reference is more than an address does not make it any less a reference. void main() { auto a = new int[2]; a[0] = 5; a[1] = 10; auto b = a.dup; auto c = a[0..1]; assert(a == b); assert(a[0] == b[0]); assert(a !is b); assert(a !is c); } The distinction is that an Array can have its reference changed by resizing (which is not an option for other reference types).
Nov 08 2010
On Tue, Nov 9, 2010 at 1:24 AM, Jesse Phillips <jessekphillips+D gmail.com> wrote:The array-struct is the reference, so it is what gets compared. That means both the internal pointer and length must be the same. Just because the reference is more than an address does not make it any less a reference.Unlike in C, a D array is more than a reference. An array is the data (or a reference to its data) + metadata (its length) - the metadata belongs to the array (and not to the reference). This means that, when you say "the array is passed by reference" one expects that also the arrays length is passed by reference - because the length belongs to the array.The distinction is that an Array can have its reference changed by resizing (which is not an option for other reference types).If that is not an option for any other reference type, why should it be an option for arrays? That is just inconsistent and doesn't make any sense. No, Arrays should not be considered reference types when passed to a function. As someone else said before: Logically you don't pass the array but a slice that contains the whole array. Cheers, - Daniel
Nov 08 2010
Daniel Gibson Wrote:On Tue, Nov 9, 2010 at 1:24 AM, Jesse Phillips <jessekphillips+D gmail.com> wrote:If you are using the definition of reference which would include C pointers (i.e. not C++ references), then there is nothing in the definition of reference that precludes it from having meta data. "In distributed computing, the reference may contain more than an address or identifier; it may also include an embedded specification of the network protocols used to locate and access the referenced object, the way information is encoded or serialized." http://en.wikipedia.org/wiki/Reference_(computer_science)The array-struct is the reference, so it is what gets compared. That means both the internal pointer and length must be the same. Just because the reference is more than an address does not make it any less a reference.Unlike in C, a D array is more than a reference. An array is the data (or a reference to its data) + metadata (its length) - the metadata belongs to the array (and not to the reference).This means that, when you say "the array is passed by reference" one expects that also the arrays length is passed by reference - because the length belongs to the array.This is a good point. Though if the length did stay with the array, it is still reasonable that a reallocation of the array would cause the reference to no longer point to the same array.Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, the have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are in dead a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that arrays in C is just a pointer into a memory location. But in reality that is not true. 
http://www.lysator.liu.se/c/c-faq/c-2.htmlThe distinction is that an Array can have its reference changed by resizing (which is not an option for other reference types).If that is not an option for any other reference type, why should it be an option for arrays? That is just inconsistent and doesn't make any sense.No, Arrays should not be considered reference types when passed to a function. As someone else said before: Logically you don't pass the array but a slice that contains the whole array. Cheers, - Daniel
Nov 08 2010
On Mon, 08 Nov 2010 21:29:42 -0500, Jesse Phillips <jessekphillips+D gmail.com> wrote: Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, they have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are indeed a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that arrays in C is just a pointer into a memory location. But in reality that is not true. It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science. I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This means changing the length only affects the local copy, but changing the data affects all arrays that point to that data. I think it also helps to think of arrays as slices (that's what they are anyways). There is no more distinction between slices or arrays, they are one and the same. An array does not own its data, and that is a major point of confusion. An array always just references data, it never owns it. -Steve
Nov 09 2010
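The "hybrid" description can be made concrete with a variant of the example from the start of the thread. This is a sketch under the assumption of a current D2 compiler; the function name `grow` and the `moved` out-parameter are hypothetical additions used to observe whether the resize reallocated.

```d
// Grows the local slice and reports whether the data had to move.
void grow(int[] arr, out bool moved)
{
    auto before = arr.ptr;
    arr.length += 1000;            // length change is local to this copy
    moved = arr.ptr !is before;    // true if the runtime reallocated
    arr[0] = 1;                    // lands in shared memory only if !moved
}

void main()
{
    auto a = new int[1];           // a[0] starts as 0
    bool moved;
    grow(a, moved);

    assert(a.length == 1);         // the caller's length never changes
    // The element write is visible only if the data stayed in place.
    assert(a[0] == (moved ? 0 : 1));
}
```

Which branch is taken depends on how much spare capacity the runtime allocated, which is the implementation-defined part discussed earlier in the thread.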
Steven Schveighoffer:I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This mean changing the length only affects the local copy, but changing the data affects all arrays that point to that data.Let's play some more :-) The data is passed by pointer value, the length is passed by value: void foo1(ref int[2] arr) { int[2] a = [10, 20]; arr = a; } void foo2(int[] arr) { int[2] a = [10, 20]; arr = a; } void main() { int[2] arr; arr = [1, 2]; foo1(arr); assert(arr == [10, 20]); arr = [1, 2]; foo2(arr); assert(arr == [1, 2]); } Bye, bearophile
Nov 09 2010
Steven Schveighoffer Wrote:It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science.I think viewing a reference as a pointer is wrong. But I did already agree that since it is claimed to be passed by reference it is odd that the length is not part of it.I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This mean changing the length only affects the local copy, but changing the data affects all arrays that point to that data.Yet this still isn't complete because if you resize the array the data may not point to that same data anymore. I do not see this as an issue, but if you understand this, the fact that the length is passed by value isn't important, because you won't be relying on whether it points to the same data or not. D arrays are done differently then other types/languages. I don't have an issue with changing the wording to clarify, but if we are going to do that suggestions should be made. But saying it isn't a reference does nothing.There is no more distinction between slices or arrays, they are one and the same.Yeah, I'm glad about this. Arrays are much nice for this.
Nov 09 2010
On Tue, 09 Nov 2010 07:42:13 -0500 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:
> I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This means changing the length only affects the local copy, but changing the data affects all arrays that point to that data.

The pointer is passed by value as well. Thinking of an array as a fat pointer, a (pointer,length) tuple, well, the whole tuple is a plain value.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
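A minimal sketch of the "whole tuple is a plain value" reading, for anyone who wants to try it. The helper name `mutate` is made up for illustration; only built-in slice operations are used.

```d
import std.stdio;

// Hypothetical helper: receives its own copy of the (pointer, length) pair.
int[] mutate(int[] arr)
{
    arr[0] = 99;        // writes through the copied pointer; the caller sees this
    arr = arr[1 .. $];  // rebinds only the local pair; the caller does not see this
    return arr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] local = mutate(a);

    assert(a.length == 3);           // the caller's length is untouched
    assert(a[0] == 99);              // the shared element was changed
    assert(local.length == 2);       // the callee's copy had been re-sliced
    assert(local.ptr == a.ptr + 1);  // and it still points into the same block
    writeln(a, " ", local);
}
```

Both the pointer and the length are copied on the call; only the memory they describe is shared.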
Nov 09 2010
On 09/11/2010 12:42, Steven Schveighoffer wrote:
> On Mon, 08 Nov 2010 21:29:42 -0500, Jesse Phillips <jessekphillips+D gmail.com> wrote:
>> Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, they have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are indeed a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that an array in C is just a pointer into a memory location. But in reality that is not true.
>
> It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science.

What do you mean "depends on your definition of reference type"? I think that what a reference type is, is generally understood fairly well by the majority of developers, even if understood implicitly. I think our confusion here has more to do with what we consider an "array" to be, rather than what we consider a "reference type" to be. See below.

> I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This means changing the length only affects the local copy, but changing the data affects all arrays that point to that data.

Well, saying _dynamic arrays_ are a hybrid, like you mentioned, is perhaps the best way to describe them, with less misunderstanding. However, if I had to say whether dynamic arrays are value types or reference types, I would agree with Jesse and call them reference types, and I would not feel this is inaccurate.

Let's try a test everyone, look at this code:

void test() {
    int[] a = [1, 2, 3];
    int[] b = a;
    int[] c = a;
}

and tell us, what would you reply if asked "how many arrays are created during test's scope"? I would say "1", and not feel it is inaccurate. However, if asked "how many dynamic arrays are created during test's scope?", I would likely say "3".

This is because what I consider an array to *be*, is its (contiguous & homogeneous) elements. With that definition, D's static arrays are value types, because when you assign a static array value to a static array variable, the underlying _array_ (ie, the contents) gets copied. Conversely, D's dynamic arrays are reference types, because when you assign a dynamic array value to a dynamic array variable, the underlying "array" (ie, the contents) is not copied; instead you get two references to exactly the same "array" (ie, same data). This viewpoint gets a bit murkier because an "array" can be contained inside another "array", but that doesn't fundamentally change it (the viewpoint, that is).

But ultimately neither view is right or wrong, it just depends on how we define our terms, how we conceptualize things. However, it's probably best to stick to what the spec/TDPL says about it, if it says anything specific. (I don't think it does though :/ ) Or to avoid ambiguous designations.

... Oh man, I need to clear my head.

-- 
Bruno Medeiros - Software Engineer
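The "1 array, 3 dynamic arrays" reading can be checked directly. This is only a sketch; `sharesStorage` is a made-up helper name, and the static-array half is added for contrast.

```d
// Hypothetical helper: true when two slices start at the same memory address.
bool sharesStorage(int[] x, int[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    // Dynamic arrays: assignment copies the (pointer, length) pair only.
    int[] a = [1, 2, 3];
    int[] b = a;
    int[] c = a;
    assert(sharesStorage(a, b) && sharesStorage(a, c));
    b[0] = 10;
    assert(a[0] == 10 && c[0] == 10);   // one set of elements, three views

    // Static arrays: assignment copies the elements themselves.
    int[3] s = [1, 2, 3];
    int[3] t = s;
    t[0] = 10;
    assert(s[0] == 1);                  // s kept its own contents
}
```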
Nov 26 2010
On Fri, 26 Nov 2010 19:50:27 +0000 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
> On 09/11/2010 12:42, Steven Schveighoffer wrote:
>> On Mon, 08 Nov 2010 21:29:42 -0500, Jesse Phillips <jessekphillips+D gmail.com> wrote:
>>> Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, they have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are indeed a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that an array in C is just a pointer into a memory location. But in reality that is not true.
>>
>> It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science.
>
> What do you mean "depends on your definition of reference type"? I think that what a reference type is, is generally understood fairly well by the majority of developers, even if understood implicitly. I think our confusion here has more to do with what we consider an "array" to be, rather than what we consider a "reference type" to be. See below.
>
>> I think of an array as a hybrid between a reference and a value type. The data is passed by reference, the length is passed by value. This means changing the length only affects the local copy, but changing the data affects all arrays that point to that data.
>
> Well, saying _dynamic arrays_ are a hybrid, like you mentioned, is perhaps the best way to describe them, with less misunderstanding.
>
> However, if I had to say whether dynamic arrays are value types or reference types, I would agree with Jesse and call them reference types, and I would not feel this is inaccurate. Let's try a test everyone, look at this code:
>
> void test() {
>     int[] a = [1, 2, 3];
>     int[] b = a;
>     int[] c = a;
> }
>
> and tell us, what would you reply if asked "how many arrays are created during test's scope"? I would say "1", and not feel it is inaccurate. However, if asked "how many dynamic arrays are created during test's scope?", I would likely say "3".
>
> This is because what I consider an array to *be*, is its (contiguous & homogeneous) elements. With that definition, D's static arrays are value types, because when you assign a static array value to a static array variable, the underlying _array_ (ie, the contents) gets copied. Conversely, D's dynamic arrays are reference types, because when you assign a dynamic array value to a dynamic array variable, the underlying "array" (ie, the contents) is not copied; instead you get two references to exactly the same "array" (ie, same data). This viewpoint gets a bit murkier because an "array" can be contained inside another "array", but that doesn't fundamentally change it (the viewpoint, that is).

I think your explanation helps clarify why there are misunderstandings (I mean between people in the D community). Someone used to playing with _static_ arrays in a low-level language is used to identifying an array with a memory area. But at a slightly higher level, an array is seen by other people as an element in a program like an int, a struct, a set or whatever; this applies to any kind of array, not only static.
The thin interface implemented by D [the (pointer,length) tuple] creates a minimal abstraction that causes these 2 points of view to diverge in practice. The structure, the array element, is not referenced, but the memory area is, well, "pointed".

> But ultimately neither view is right or wrong, it just depends on how we define our terms, how we conceptualize things. However, it's probably best to stick to what the spec/TDPL says about it, if it says anything specific. (I don't think it does though :/ ) Or to avoid ambiguous designations.

In many languages, there are data structures very similar to D's dynamic arrays, but where the element is referenced as well. So when newcomers read that D arrays are referenced, they can only think "yes, of course", and fall into the trap.
Note that the "pointing" here has no semantic value, unlike what is usually meant by the notion of reference type; it is instead plain internal mechanics, a necessity to implement a value of variable length.
Either copying the area on assignment (real value type), or on the contrary referencing the element (real ref type), would result in array types in which these internal mechanics are opaque to the user. Note that I don't advocate for this; I have finally understood the efficiency of D's choice. But it's hard to get.

> ... Oh man, I need to clear my head.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Nov 26 2010
On 26/11/2010 19:50, Bruno Medeiros wrote:
> Well, saying _dynamic arrays_ are a hybrid, like you mentioned, is perhaps the best way to describe them, with less misunderstanding. However, if I had to say whether dynamic arrays are value types or reference types, I would agree with Jesse and call them reference types, and I would not feel this is inaccurate.

ARRGHH, actually this is not entirely true, I forgot about something: null vs. empty arrays. Indeed on assignment (and parameter passing) dynamic arrays work just like reference types. The reallocation operations don't go against that either. However, in D's dynamic arrays, null is the same as an empty array! Which in practice means there is no proper null "value" for dynamic arrays, like there is for other reference types, and I think that is kind of an essential characteristic for a reference type. So actually I would not call D's dynamic arrays "reference types" and feel this is really accurate.

-- 
Bruno Medeiros - Software Engineer
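A small sketch of the null vs. empty point, using only built-in operators. As far as I can tell, `==` treats a null array and an empty slice as equal (both have no elements), while `is` still tells them apart because it compares the pointer and the length. `isNullArray` is a made-up helper name.

```d
// Hypothetical helper: identity comparison against null (pointer and length).
bool isNullArray(int[] a)
{
    return a is null;
}

void main()
{
    int[] n = null;           // null array
    int[] data = [1, 2, 3];
    int[] e = data[0 .. 0];   // empty slice over real storage

    assert(n.length == 0 && e.length == 0);
    assert(n == e);             // == compares contents: both are empty
    assert(isNullArray(n));     // null pointer, zero length
    assert(!isNullArray(e));    // zero length, but a non-null pointer
    assert(e.ptr is data.ptr);
}
```

So code that only looks at the length or uses `==` cannot distinguish the two, which matches the complaint above.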
Nov 26 2010
On Fri, 26 Nov 2010 14:50:27 -0500, Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
> On 09/11/2010 12:42, Steven Schveighoffer wrote:
>> On Mon, 08 Nov 2010 21:29:42 -0500, Jesse Phillips <jessekphillips+D gmail.com> wrote:
>>> Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, they have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are indeed a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that an array in C is just a pointer into a memory location. But in reality that is not true.
>>
>> It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science.
>
> What do you mean "depends on your definition of reference type"? I think that what a reference type is, is generally understood fairly well by the majority of developers, even if understood implicitly. I think our confusion here has more to do with what we consider an "array" to be, rather than what we consider a "reference type" to be. See below.

A class is a reference type. Every operation on a class instance operates on all aliases to that class; only assignment to another class instance of the same type decouples it.

Consider these two statements:

"a class is a reference type"

class C
{
    int length;
}

void foo(C c)
{
    c.length = 5; // because a class is a reference type, this affects the original, right?
}

"an array is a reference type"

void foo(int[] arr)
{
    arr.length = 5; // because an array is a reference type, this affects the original, right? No?!!!
}

This is the major confusion that I think people see. I would say people assume a "reference type" means that the reference's members are all shared between all aliases. For an array, it is one level removed, and that confuses the hell out of people. But the power gained by doing it D's way is worth way waaaay more than confusing some noobs.

Java where arrays actually *are* class instances, and therefore full reference types.

-Steve
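The two cases above can be put side by side in a runnable form. This is only a sketch: the function names `setLength`, `grow` and `growRef` are made up, and the `ref` variant is added to show what it takes to get class-like behaviour for the length.

```d
class C
{
    int length;
}

void setLength(C c)          { c.length = 5; }    // shared object: caller sees 5
void grow(int[] arr)         { arr.length = 5; }  // local (ptr, length) copy only
void growRef(ref int[] arr)  { arr.length = 5; }  // caller's own slice is resized

void main()
{
    auto c = new C;
    setLength(c);
    assert(c.length == 5);

    int[] a = [1, 2];
    grow(a);
    assert(a.length == 2);   // unchanged: the length lives in the slice, not behind it
    growRef(a);
    assert(a.length == 5);   // changed: the slice itself was passed by reference
    assert(a[0 .. 2] == [1, 2]);
}
```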
Nov 29 2010
On 29/11/2010 14:13, Steven Schveighoffer wrote:On Fri, 26 Nov 2010 14:50:27 -0500, Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:Hum, I see your point, yeah, I guess I do agree somewhat. I mean, it goes back to the issue of what one thinks the "array" is. There is no misunderstanding in the code above if one considers the array to be just it's elements... However, as you point out, it would not be uncommon that a newbie would consider the "members" of the array to be part of it (and thus shared across all aliases). It's not even unreasonable to think that, in fact. But then with that mental model it would be quite inaccurate to say "an array is a reference type", yeah. (even disregarding the issues with empty vs null arrays, a different beast altogether) -- Bruno Medeiros - Software EngineerOn 09/11/2010 12:42, Steven Schveighoffer wrote:A class is a reference type. Every operation on a class instance operates on all aliases to that class, only assignment to another class instance of the same type decouples it. Consider these two statements: "a class is a reference type" class C { int length; } foo(C c) { c.length = 5; // because a class is a reference type, this affects the original, right? } "an array is a reference type" foo(int[] arr) { arr.length = 5; // because an array is a reference type, this affects the original, right? No?!!! } This is the major confusion that I think people see. I would say people assume a "reference type" means that the reference's members are all shared between all aliases. For an array, it is one level removed, and that confuses the hell out of people. But the power gained by doing D's way is worth way waaaay more than confusing some noobs.On Mon, 08 Nov 2010 21:29:42 -0500, Jesse Phillips <jessekphillips+D gmail.com> wrote:What do you mean "depends on your definition of reference type" ? I think that what a reference type is, is generally understood fairly well by the majority of developers, even if understood implicitly. 
I think our confusion here has more to do to what we consider an "array", rather than what we consider to "reference type to be. See below.Well, if you can come up with a good definition for what "increasing the size of a class" would do, then maybe it should be added. It really doesn't matter. Arrays are their own type, the have their own semantics. It does help to think of them in terms of slices (which is what TDPL refers to them as), yet that does not remove the fact that they are in dead a reference type. Many times familiar terms are used so that a behavior is quickly understood. For example it is common to say that arrays in C is just a pointer into a memory location. But in reality that is not true.It depends on your definition of reference type. I agree with Daniel. If you want to get into academic definitions, yes, 'technically' an array is a reference, but it's confusing to someone who's not interested in exploring the theoretical parts of computer science.
Nov 29 2010
On Mon, 29 Nov 2010 09:13:17 -0500 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:
> A class is a reference type. Every operation on a class instance operates on all aliases to that class; only assignment to another class instance of the same type decouples it.
>
> Consider these two statements:
>
> "a class is a reference type"
>
> class C
> {
>     int length;
> }
>
> foo(C c)
> {
>     c.length = 5; // because a class is a reference type, this affects the original, right?
> }
>
> "an array is a reference type"
>
> foo(int[] arr)
> {
>     arr.length = 5; // because an array is a reference type, this affects the original, right? No?!!!
> }
>
> This is the major confusion that I think people see. I would say people assume a "reference type" means that the reference's members are all shared between all aliases. For an array, it is one level removed, and that confuses the hell out of people. But the power gained by doing it D's way is worth way waaaay more than confusing some noobs.
>
> Java where arrays actually *are* class instances, and therefore full reference types.

Very good explanation of the mental issue for newcomers. Note that this [...] also from most, if not all, dynamic languages (including ones that are not "officially" OO, like Lua).

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
Nov 30 2010
On Tue, 9 Nov 2010 02:14:17 +0100 Daniel Gibson <metalcaedes gmail.com> wrote:
> On Tue, Nov 9, 2010 at 1:24 AM, Jesse Phillips <jessekphillips+D gmail.com> wrote:
>> The array-struct is the reference, so it is what gets compared. That means both the internal pointer and length must be the same. Just because the reference is more than an address does not make it any less a reference.
>
> Unlike in C, a D array is more than a reference. An array is the data (or a reference to its data) + metadata (its length) - the metadata belongs to the array (and not to the reference). This means that, when you say "the array is passed by reference" one expects that also the array's length is passed by reference - because the length belongs to the array.

Exactly.
When a D object (class instance) holds value or reference fields, changing any of them affects other variables denoting the same object, right? This is not true for D dyn arrays. D objects implement reference semantics, while D arrays implement value semantics. Or rather, they do it superficially (shallow copy). To have true value semantics, one would need a kind of this(this) copy constructor that also copies the target memory area addressed by the array's internal pointer.

>> The distinction is that an Array can have its reference changed by resizing (which is not an option for other reference types).
>
> If that is not an option for any other reference type, why should it be an option for arrays? That is just inconsistent and doesn't make any sense.
>
> No, Arrays should not be considered reference types when passed to a function.
> As someone else said before: Logically you don't pass the array but a slice that contains the whole array.

Exactly, again. The comparison with slices makes sense. If I don't mess up everything after all those discussions, "b = a;" has the same semantics as "b = a[0..$];". A new array struct is built with equal pointer & length as the original one. Meaning both arrays _initially_ address the same memory area.
If D arrays were referenced instead, then no copy would happen at all; instead a reference to the same struct would be created, so that later changes would be shared -- however they internally happen, including reallocation. Like for D objects.

For people used to manually implementing variable-size data structures (eg in plain C), saying that a D dyn array is a kind of (pointer,length) tuple (a fat pointer) defining an internal memory area (static array) is also useful. They can imagine the internal mechanics and from there infer actual behaviour.
But saying that D dyn arrays are referenced can only bring confusion. And stating a parallel with D objects even more: they do _not_ behave the same way.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣
spir.wikidot.com
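The "b = a" versus deep copy distinction can be sketched with the built-in `.dup` property, which allocates a fresh block and copies the elements. This is an illustration of the point above, not a proposal for how assignment should work.

```d
void main()
{
    int[] a = [1, 2, 3];

    // Plain assignment and a full slice build the same (pointer, length) pair.
    int[] b = a;
    int[] s = a[0 .. $];
    assert(b is a && s is a);   // same pointer, same length

    b[0] = 5;
    assert(a[0] == 5);          // shallow copy: the elements are shared

    // An explicit deep copy gives value-like behaviour.
    int[] c = a.dup;
    assert(c == a);             // equal contents
    assert(c !is a);            // but a different memory block
    c[0] = 7;
    assert(a[0] == 5);          // the original is unaffected
}
```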
Nov 09 2010
Jesse Phillips wrote:
> Jens Mueller Wrote:
>> Hi,
>>
>> I do not understand what's going on behind the scene with this code. Or better said I have some idea but maybe I do not see the whole point.
>>
>> void foo(int[] array) {
>>     array.length += 1000; // may copy the array
>>     array[0] = 1;
>> }
>>
>> auto a = new int[1];
>> foo(a);
>> assert(a[0] == 1); // fails if a needs to be copied inside foo
>> ...
>> I find this behavior rather strange. Arrays are neither passed by value (copying the whole array) nor by reference. I see reasons for doing it like this, e.g. doing array = array[1..$] inside should not affect the outside. But I wonder whether these semantics are well enough documented? I think I should use ref int[] in the example above, shouldn't I?
>>
>> Jens
>
> But they are passed by reference. You can modify the data all you want, but cannot reassign the reference itself. Resizing the array may cause a reassignment of that reference. It is not different from the following code except resizing does not guarantee a reference change.
>
> import std.stdio;
>
> void assignValue(A a) {
>     a = new A();
>     a.value = 6;
> }
>
> class A {
>     int value;
> }
>
> void main() {
>     A a = new A();
>     assignValue(a);
>     writeln(a.value);
> }

I like that explanation. Jonathan is saying the same, I think. I guess my misunderstanding is mainly caused by not figuring out that a reassignment is happening and that a reassignment to a reference changes the reference. In C++ you cannot change a reference (I hope I'm right here.). When using a std::vector one does not need to think about this.
What's the general use of a = new A() in the above code? Where is it useful?

Jens
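For reference, a sketch of the `ref int[]` variant asked about in the quoted original post. The name `fooRef` is made up; the length is written as a plain assignment rather than `+=`. With `ref`, the caller's slice is updated even when the resize moves the data, so the assertion no longer depends on whether a reallocation happened.

```d
// Hypothetical ref version of the quoted foo().
void fooRef(ref int[] array)
{
    array.length = array.length + 1000; // may reallocate; ref makes the caller follow
    array[0] = 1;
}

void main()
{
    auto a = new int[1];
    fooRef(a);
    assert(a.length == 1001);
    assert(a[0] == 1);   // holds whether or not the block was moved
}
```

Without `ref`, whether the caller sees the write depends on whether the resize could extend the block in place, so the original assertion is not reliable either way.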
Nov 08 2010
Jens Mueller Wrote:
> I like that explanation. Jonathan is saying the same, I think.

Yes, same thing.

> In C++ you cannot change a reference (I hope I'm right here.). When using a std::vector one does not need to think about this.

Don't know too much about C++ references, but as mentioned somewhere you can use the Array container, which won't have this issue.

> What's the general use of a = new A() in the above code? Where is it useful?
>
> Jens

I don't really have any good use-case examples. Maybe an initialization function? Say you developed your own number object (big int) and, thinking of it as a reference, you expected a = a + BigInt(7); to result in a being reassigned in the calling function.

Or maybe just a function that swaps two class references:

void swap(T)(T a, T b) { // Correct: void swap(T)(ref T a, ref T b) {
    auto tmp = a; a = b; b = tmp;
}

Actually that turned out to be a pretty good one.
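The swap case is easy to try. This is a sketch with made-up names (`swapCopies`, `swapRefs`, and a small class `A` with a constructor added for brevity); it contrasts the signature without `ref` against the corrected one.

```d
class A
{
    int value;
    this(int v) { value = v; }
}

// Swaps only the function's local copies of the two references.
void swapCopies(T)(T a, T b) { auto tmp = a; a = b; b = tmp; }

// Swaps the caller's variables, because they are passed by ref.
void swapRefs(T)(ref T a, ref T b) { auto tmp = a; a = b; b = tmp; }

void main()
{
    auto x = new A(1);
    auto y = new A(2);

    swapCopies(x, y);
    assert(x.value == 1 && y.value == 2);   // nothing changed for the caller

    swapRefs(x, y);
    assert(x.value == 2 && y.value == 1);   // the references were exchanged
}
```

The version without `ref` compiles without complaint, which is exactly the kind of silent mistake being discussed.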
Nov 08 2010
I do see why a = new A() is useful. But it makes only sense if I passed a/b as ref a/b. Basically I wonder why I do not get a warning when changing the reference in that situation. So my question is more why am I allowed to change the reference even though I didn't pass it as ref a. I was looking for a use of that. Assuming there is good use then there is no reason to forbid it. But if there is no good use I'd like to be warned when compiling the above swap. Because it's an error. For dynamic arrays slicing is a good example to allow it. But why should it be allowed for objects? JensWhat's the general use of a = new A() in the above code? Where is it useful? JensI don't really have any good use-case examples. Maybe an initialization function? Developed your own number object (big int) and were thinking in terms of it being a refrence you thought a = a + BigInt(7); would result in a being resigned in the calling function. Or maybe just a function that swaps two class references: void swap(T)(T a, T b) { // Correct void swap(T)(ref T a, ref T b) { auto tmp = a; a = b; b = tmp; } Actually that turned out to be a pretty good one.
Nov 09 2010
On Tuesday 09 November 2010 02:43:57 Jens Mueller wrote:
>>> What's the general use of a = new A() in the above code? Where is it useful?
>>>
>>> Jens
>>
>> I don't really have any good use-case examples. Maybe an initialization function? Say you developed your own number object (big int) and, thinking of it as a reference, you expected a = a + BigInt(7); to result in a being reassigned in the calling function.
>>
>> Or maybe just a function that swaps two class references:
>>
>> void swap(T)(T a, T b) { // Correct: void swap(T)(ref T a, ref T b) {
>>     auto tmp = a; a = b; b = tmp;
>> }
>>
>> Actually that turned out to be a pretty good one.
>
> I do see why a = new A() is useful. But it only makes sense if I passed a/b as ref a/b. Basically I wonder why I do not get a warning when changing the reference in that situation. So my question is more: why am I allowed to change the reference even though I didn't pass it as ref a? I was looking for a use of that. Assuming there is a good use, then there is no reason to forbid it. But if there is no good use I'd like to be warned when compiling the above swap. Because it's an error. For dynamic arrays slicing is a good example to allow it. But why should it be allowed for objects?

Why wouldn't you be able to change it? You can change any parameter as long as it's not const or immutable (or in, which implies const). The fact that it's a reference is irrelevant. I can assign whatever I want to references and pointers in a function whether they were passed in or not. Sure, if you want to alter the original pointer or reference, that's not going to work unless it was passed as ref, but that's the same as any other parameter.

Why would you expect altering a reference to alter the original? Sure, altering what it _refers to_ should alter what the original refers to because they refer to the same thing, but altering the reference itself shouldn't alter the original because they're two different references.

There is no reason why parameters should have to stay the same as what they were passed in as - regardless of whether they're value types or reference types. If you want that behavior, use const, in, or immutable.

- Jonathan M Davis
Nov 09 2010
I see your point. You argue that the behavior is consistent. My point is that this consistency can lead to bugs. I may forget the ref. But I'll keep in mind to never forget the ref if it is needed. JensWhy wouldn't you be able to change it? You can change any parameter as long as it's not const or immutable (or in, which implies const). The fact that it's a referenc is irrelevant. I can assign whatever I want to references and pointers in a function whether they were passed in or not. Sure, if you want to alter the original pointer or reference, that's not going to work unless it was passed as ref, but that's the same as any other parameter. Why would you expect altering a reference to alter the original? Sure, altering what it _refers to_ should alter what the original refers to because they refer to the same thing, but altering the reference itself shouldn't alter the original because they're two different references. There is no reason why parameters should have to stay the same as what they were passed in as - regardless of whether they're value types or reference types. If you want that behavior, use const, in, or immutable.I don't really have any good use-case examples. Maybe an initialization function? Developed your own number object (big int) and were thinking in terms of it being a refrence you thought a = a + BigInt(7); would result in a being resigned in the calling function. Or maybe just a function that swaps two class references: void swap(T)(T a, T b) { // Correct void swap(T)(ref T a, ref T b) { auto tmp = a; a = b; b = tmp; } Actually that turned out to be a pretty good one.I do see why a = new A() is useful. But it makes only sense if I passed a/b as ref a/b. Basically I wonder why I do not get a warning when changing the reference in that situation. So my question is more why am I allowed to change the reference even though I didn't pass it as ref a. I was looking for a use of that. 
Assuming there is good use then there is no reason to forbid it. But if there is no good use I'd like to be warned when compiling the above swap. Because it's an error. For dynamic arrays slicing is a good example to allow it. But why should it be allowed for objects?
Nov 09 2010
Jens Mueller Wrote:
> I see your point. You argue that the behavior is consistent. My point is that this consistency can lead to bugs. I may forget the ref. But I'll keep in mind to never forget the ref if it is needed.
>
> Jens

Well, in the case of classes, I don't think it would be very common. For arrays it can be nice since you can assign back a slice of the array that you want to work with.

void main(string[] args) {
    args = args[1..$];
}

Otherwise I suggest you start labeling all parameters with in. Then you are prevented from modifying the reference, and can decide if it should be a ref parameter. Who knows, maybe you'll find a reason to leave it off.
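A sketch of what labeling a parameter with `in` buys you, assuming `in` behaves as const here (as stated elsewhere in the thread). The function `sumTail` is a made-up example. Note that `in` forbids writing to the elements as well as rebinding the slice, so it is stricter than "only protect the reference".

```d
// Hypothetical example: sums everything after the first element.
int sumTail(in int[] args)
{
    // args = args[1 .. $];    // would not compile: the parameter is const
    int total = 0;
    foreach (x; args[1 .. $])  // taking a new slice without rebinding is fine
        total += x;
    return total;
}

// Compile-time checks that both kinds of modification are rejected.
static assert(!__traits(compiles, (in int[] a) { a = a[1 .. $]; }));
static assert(!__traits(compiles, (in int[] a) { a[0] = 1; }));

void main()
{
    assert(sumTail([10, 1, 2]) == 3);
}
```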
Nov 09 2010
With dynamic arrays I totally agree. Slicing is very useful. I just want a safe rule to work with it. Maybe that's not needed anymore because by now I spend enough time on this that probably I'll never forget. With in I also disallow changing the object itself, right? I want to only forbid the changing of the reference of an argument inside the function. With in/const I disallow every change. I may want to change data of an array. Jens PS The more we talk about it, the more I come to the conclusion that this is just something you need to know. I can live with that.I see your point. You argue that the behavior is consistent. My point is that this consistency can lead to bugs. I may forget the ref. But I'll keep in mind to never forget the ref if it is needed. JensWell, in the case of classes, I don't think it would be very common. For arrays it can be nice since you can assign back a slice of the array that you want to work with. void main(string args) { args = args[1..$]; } Otherwise I suggest you start labeling all parameters with in. Then you are prevented from modifying the reference, and can decide if it should be a ref parameter. Who knows, maybe you'll find a reason to leave it off.
Nov 10 2010
> I like that explanation. Jonathan is saying the same, I think. I guess my misunderstanding is mainly caused by not figuring out that a reassignment is happening and that a reassignment to a reference changes the reference. In C++ you cannot change a reference (I hope I'm right here.). When using a std::vector one does not need to think about this.
> What's the general use of a = new A() in the above code? Where is it useful?
>
> Jens

Yes, in C++ you can't redirect a reference after initialization. You also can't have a std::vector of references, mainly for this reason.

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 08 2010
On Monday, November 08, 2010 16:11:46 so wrote:
>> I like that explanation. Jonathan is saying the same, I think. I'll guess my misunderstanding is mainly caused by figuring out that a reassign is happening and that a reassign to a reference changes the reference. In C++ you cannot change a reference (I hope I'm right here.). When using a std::vector one does not need to think about this.
>> What's the general use of a = new A() in the above code? Where is it useful?
>>
>> Jens
>
> Yes in C++, you can't redirect a reference after initialization. And also can't have a std::vector of references which is mainly by this reason.

D references are more like Java references. They're really pointers, but you don't have to dereference them, so you can reassign them to something else. You can't do that in C++, because references are treated more like name aliases of variables than pointers. So, if you think in terms of C++ pointers vs D references, then C++ and D function the same in this respect.

- Jonathan M Davis
Nov 08 2010
On 11/8/2010 17:43, Jonathan M Davis wrote:
> D references are more like Java references.

That's true for class references. D also supports pass-by-reference through the 'ref' keyword, which works like C++ references.

-- 
Rainer Deyke - rainerd eldwood.com
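The two meanings of "reference" can be seen in one small sketch. All names here (`Box`, `rebind`, `rebindRef`, `bump`) are made up for illustration.

```d
class Box
{
    int value;
    this(int v) { value = v; }
}

// A class reference passed by value: rebinding affects the local copy only.
void rebind(Box b) { b = new Box(42); }

// The same reference passed with ref: the caller's variable is rebound.
void rebindRef(ref Box b) { b = new Box(42); }

// ref also works for value types, much like a C++ reference parameter.
void bump(ref int n) { n += 1; }

void main()
{
    auto box = new Box(1);
    rebind(box);
    assert(box.value == 1);

    rebindRef(box);
    assert(box.value == 42);

    int n = 0;
    bump(n);
    assert(n == 1);
}
```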
Nov 12 2010
On Friday 12 November 2010 17:55:31 Rainer Deyke wrote:
> On 11/8/2010 17:43, Jonathan M Davis wrote:
>> D references are more like Java references.
>
> That's true for class references. D also supports pass-by-reference through the 'ref' keyword, which works like C++ references.

True. But generally when talking about references, class references are what is being referred to. Personally, I would say that in the case of a ref parameter, the argument is being passed _by_ reference but not that it _is_ a reference. I suppose that the whole issue could get pretty confusing though if we're not clear on what we're referring to.

- Jonathan M Davis
Nov 12 2010
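A minimal sketch of the distinction drawn in the exchange above, assuming a hypothetical class Box and hypothetical function names (none of them come from the thread). It contrasts a class reference passed by value with the same reference passed through 'ref':

```d
import std.stdio;

// Hypothetical class used only for illustration.
class Box
{
    int value;
    this(int v) { value = v; }
}

// The class reference is copied: rebinding the copy is invisible to the caller.
void rebindCopy(Box b) { b = new Box(7); }

// Mutating through the copied reference reaches the caller's object.
void mutate(Box b) { b.value = 12; }

// With ref, the parameter aliases the caller's variable itself.
void rebindRef(ref Box b) { b = new Box(3); }

void main()
{
    auto a = new Box(5);
    rebindCopy(a);
    writeln(a.value); // 5
    mutate(a);
    writeln(a.value); // 12
    rebindRef(a);
    writeln(a.value); // 3
}
```

Only rebindRef changes which object the caller's variable refers to, which seems to match Jonathan's wording that a ref argument is passed _by_ reference.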
On Monday, November 08, 2010 13:54:28 spir wrote:
> On Mon, 08 Nov 2010 15:32:56 -0500 Jesse Phillips <jessekphillips+D gmail.com> wrote:
>> But they are passed by reference. You can modify the data all you want, but cannot reassign the reference itself.
> No, they are _not_ passed by reference. Stop saying that, this is precisely what causes confusion. This is reference semantics:

As Jesse says, they _are_ passed by reference. The struct itself _is_ the reference. What makes arrays odd is that anything that resizes it returns a _new_ reference or alters the current one. So, using ~= or length can result in a reference with a new ptr value and a new length value, and the reference then refers to a different block of memory. In the case of an object reference, you could implement ~= to return a new reference in exactly the same manner and get behavior similar to an array, but no one does that normally.

_Everything_ which is not passed as ref or out is passed by value in D. And technically, ref and out probably wrap the value being passed in a pointer and that pointer is passed by value. Really, this is easier to understand when dealing with pointers than references, since references hide the dereferencing process. Take this program for instance:

import std.stdio;

void func1(int* b)
{
    *b = 12;
}

void func2(int* c)
{
    c = new int;
    *c = 7;
}

void func3(int** d)
{
    *d = new int;
    **d = 3;
}

void main()
{
    int* i = new int;
    int* j = i;
    *i = 5;
    writefln("%s %s", *i, *j);
    func1(i);
    writefln("%s %s", *i, *j);
    func2(i);
    writefln("%s %s", *i, *j);
    func3(&i);
    writefln("%s %s", *i, *j);
}

It prints out

5 5
12 12
12 12
3 12

i and j both point to the same memory, so when that memory is altered, the value that you get when you dereference them is altered. When func1() is called, the value of i is passed to func1() - that is the address that i points to - so b points to the same memory that i and j do, so altering the value in that memory alters the memory for all three pointers.

When calling func2(), c is a copy of i and holds the same address. However, setting c to a new address means that it no longer points to the same memory, and any changes made to the memory that it points to do not alter the memory that i and j point to. So, they still print out the same value when dereferenced.

func3() takes the address of i. That means that d then holds the address of i and can alter i itself rather than just the memory that it points to. d points to the memory that holds i itself. So, when assigning a new address to the dereferenced d, i itself receives a new address. Then when altering the memory that's pointed to by the address that *d holds, you alter the memory that i points to and change that value. However, while j held the same address as i originally and altering the memory at that address therefore altered what both i and j pointed to, once i was given a new address through d, i and j held different addresses and altering the memory that one pointed to did not alter the memory that the other pointed to.

I expect that you've gone over this sort of thing before, but you need to realize that references are _exactly_ the same. The only difference is the lack of a dereferencing step when accessing what it points to. So, whether you're passing an object reference or an array reference, you're passing that reference by value, and any changes to the copy won't change the original. Changes to what the reference refers to _will_ change what the original reference points to as well since they point to the same thing, but changes to the copied reference won't change the original. So, if you make the copied reference point to something else, then it won't change what the original reference pointed to anymore.

Having an array that you use ~= on is like doing objRef = new Foo(); on a reference that was passed into the function. It now points to something else, so it's not going to affect the original. What gets somewhat weird about it is that ~= doesn't necessarily return a new reference, and if it doesn't then the length is different, so it still hasn't affected the original reference (if it was going to, it would have resulted in a reallocation). The value of the copied reference has changed.

I grant you that if you're expecting ~= to return the same reference every time and thereby somehow alter the original reference, then you're going to be surprised. But it's completely consistent with how pointers and references work. You can alter the memory that they point to, but since the pointer or reference itself was passed by value, altering it will not alter the original. - Jonathan M Davis
Nov 08 2010
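The pointer program above has a direct analogue with dynamic arrays. The following is a sketch only; the function names setFirst, appendCopy and appendRef are made up for illustration. It shows that element writes through the copied (ptr, length) pair reach the caller, while an append inside the callee does not change the caller's slice unless the parameter is declared ref:

```d
import std.stdio;

// Writes through the copied (ptr, length) pair reach the caller's elements.
void setFirst(int[] arr) { arr[0] = 99; }

// Appending changes only the local copy's length (and perhaps its ptr).
void appendCopy(int[] arr) { arr ~= 4; }

// With ref, the caller's slice itself is updated.
void appendRef(ref int[] arr) { arr ~= 4; }

void main()
{
    int[] a = [1, 2, 3];
    setFirst(a);
    writeln(a);        // [99, 2, 3]
    appendCopy(a);
    writeln(a.length); // 3
    appendRef(a);
    writeln(a);        // [99, 2, 3, 4]
}
```

Whether appendCopy happens to reallocate or not, the caller's length stays at 3, which is the point made above about the copied reference.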
On Mon, 8 Nov 2010 17:08:32 -0800 Jonathan M Davis <jmdavisProg gmx.com> wrote:
> As Jesse says, they _are_ passed by reference. The struct itself _is_ the reference.

(Well, that is a sensible redefinition of "reference"; but it is simply _not_ what the word means in any other context.)

It is true that the inner, hidden, memory area (static array) containing the elements is indeed referenced, actually "pointed", from the dynamic array struct:

struct ValueArray(Element)
{
    Element* elements;
    uint length;
}

(Well, actually, this may not be a struct, but it's easier to imagine it so.)

But: the dyn array itself, meaning the struct, is not referenced: "a2 = a1" copies it, as well as parameter passing. And the fact that the internal memory is referenced is an implementation detail that should *not* affect semantics. The inner pointer is there because we need some kind of indirection to implement variable-size thingies, and the means for this is pointers. This is precisely where & why people get bitten: implementation leaks out into semantics.

Actually, one could conceptually replace the (pointer,length) pair by a single field of type MemoryArea -- which would be a plain value. Then, there would be no more (visible) pointer in the dyn array, right? (Actually, it would just be hidden deeper inside the MemoryArea field... but that is again implementation detail!)

We should not mess up pointers used for implementation mechanics, like in the case of dyn arrays, or more generally variable size data structure, with pointers used as true references carrying semantics, like in the case of the difference between struct and class.

And precisely, replacing array struct by a class, or explicitly referencing the struct, would make a *reference* dyn array type. See below an example of a primitive sort of such an array type (you can only put new elements in it ;-), implemented as class. After "a2 = a1", every change to one of the vars affects the other var; whether the change requires reallocation is irrelevant; this detail belongs to implementation, not to semantics.

Now, replace class with struct and you have a type for *value* dyn arrays. Which works exactly like D ones. The assertion will fail; and output should be interesting ;-)

Hope it's clear, because I cannot do better. I do not mean that D arrays are bad in any way. They work perfectly and are very efficient. Enforcing a true interface between implementation and semantics would certainly have a relevant cost in terms of space & time. But please, stop stating D arrays are referenced if you want newcomers to have a chance & understand the actual behaviour, to use them without being constantly bitten, and to stop & complain.

Denis

import std.stdio;
import core.stdc.stdlib : malloc, realloc;

class RefArray(Element)
{
    Element* elements;
    uint length;
    private uint capacity;

    this ()
    {
        this.elements = cast(Element*) malloc(Element.sizeof);
        this.capacity = 1;
        this.length = 0;
    }

    void reAlloc()
    {
        writeln("realloc");
        this.capacity *= 2;
        size_t memSize = this.capacity * Element.sizeof;
        // realloc may move the block, so its result must be stored
        this.elements = cast(Element*) realloc(this.elements, memSize);
    }

    void put(Element element)
    {
        if (this.length >= this.capacity)
            this.reAlloc();
        this.elements[this.length] = element;
        ++ this.length;
    }

    void opBinary(string op) (Element element) if (op == "+")
    {
        this.put(element);
    }
}

void main ()
{
    auto a1 = new RefArray!int();
    auto a2 = a1;
    foreach (int i ; 1..8) {a2.put(i);}
    assert(a1.length == a2.length);
    foreach (int i ; 0 .. a2.length)
        writef("%s=%s ", a1.elements[i], a2.elements[i]);
    writeln();
    a1 + 8;
    a1 + 9;
    foreach (int i ; 0 .. a2.length)
        writef("%s=%s ", a1.elements[i], a2.elements[i]);
    writeln();
}

-- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 09 2010
On 11/09/2010 09:36 AM, spir wrote:
> void opBinary(string op) (Element element) if (op == "+") {

...wait! Did you just overload binary operator + to mean append?
Nov 09 2010
D arrays are very powerful, but you first need to understand what is going on. You should check the book. One inconsistency is that static arrays are copied on assignment, but it is a necessary one. One thing I don't like about D arrays is an undefined case in dynamic array reallocation.
Nov 08 2010
On 11/8/10 4:50 PM, so wrote:
> D arrays very powerful but you first need to understand what is going on. You should check the book.

Or a mildly outdated but accurate preview of the relevant chapter: http://erdani.com/d/thermopylae.pdf Andrei
Nov 08 2010
On Monday, November 08, 2010 16:50:46 so wrote:
> D arrays very powerful but you first need to understand what is going on. You should check the book. An inconsistency is the copy of static arrays at assignment, but necessary one. One thing i don't like about D arrays is an undefined case in dynamic array reallocation.

It's perfectly defined, just not knowable at compile time. You can even check the array's capacity if you want to try and figure out when it's going to happen. And there's not really any reasonable alternative. What would you have happen instead? Make an array reallocate _every_ time that it's resized? That would be highly inefficient and could really degrade performance. Appending becomes O(n) instead of amortized O(1). If you're not altering the actual elements of the array, then the current implementation is great. If you _are_ altering them, then simply dup the array to guarantee that it's been reallocated. - Jonathan M Davis
Nov 08 2010
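The two suggestions in the reply above (check the capacity, and dup when the elements will be altered) could be sketched roughly as follows. The helper names independentCopy and appendMoved are hypothetical, and the sketch assumes a compiler recent enough to provide the built-in capacity property; the capacity value and the reallocation outcome depend on the runtime, so no output is claimed for those lines:

```d
import std.stdio;

// Hypothetical helper: return a copy that shares no memory with src.
int[] independentCopy(int[] src)
{
    return src.dup;
}

// Hypothetical helper: append one element and report whether the
// append had to move the data to a new block.
bool appendMoved(ref int[] arr, int value)
{
    auto before = arr.ptr;
    arr ~= value;
    return arr.ptr !is before;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;                  // b shares a's elements
    int[] c = independentCopy(a); // c has its own elements

    b[0] = 10;                    // visible through a
    c[1] = 20;                    // not visible through a
    writeln(a);                   // [10, 2, 3]

    // capacity is 0 when an in-place append is impossible
    writeln("capacity before append: ", a.capacity);
    writeln(appendMoved(a, 4) ? "reallocated" : "appended in place");
}
```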
I didn't mean that one, check page 112 on http://erdani.com/d/thermopylae.pdf
Nov 08 2010
Oh yeah, you are right, I said reallocation. Should have said assignment.
Nov 08 2010
On 09/11/2010 01:43, Jonathan M Davis wrote:
> It's perfectly defined, just not knowable at compile time. You can even check the array's capacity if you want to try and figure out when it's going to happen. And there's not really any reasonable alternative. What would you have happen instead? Make an array reallocate _every_ time that it's resized? That would be highly inefficient and could really degrade performance. Appending becomes O(n) instead of amortized O(1). If you're not altering the actual elements of the array, then the current implementation is great. If you _are_ altering them, then simply dup the array to guarantee that it's been reallocated. - Jonathan M Davis

Making the array reallocate _every_ time that it's resized (to a greater length) is actually not that unreasonable. Would it be highly inefficient? Only if you write bad code. TDPL agrees with you, I quote:

"One easy way out would be to always reallocate a upon appending to it [...] Although that behavior is easiest to implement, it has serious efficiency problems. For example, oftentimes arrays are iteratively grown in a loop:

int[] a;
foreach (i; 0 .. 100) {
    a ~= i;
}
"

Hum, "oftentimes"? I wonder if such code is really that common (and what languages are we talking about here?)

But more importantly, there is a simple solution: don't write such code, don't use arrays as if they were lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily.

The only advantage of the current behavior is that it is more noob friendly, which is an advantage of debatable value. -- Bruno Medeiros - Software Engineer
Nov 26 2010
On 11/26/2010 07:22 PM, Bruno Medeiros wrote:
> But more importantly, there is a simple solution: don't write such code, don't use arrays like if they are lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily.

What about when you don't know the length before, or working with immutable elements?

> The only advantage of the current behavior is that it is more noob friendly, which is an advantage of debatable value.

I believe you will find to have that exactly backwards.
Nov 26 2010
On 26/11/2010 19:36, Pelle Månsson wrote:
> On 11/26/2010 07:22 PM, Bruno Medeiros wrote:
>> But more importantly, there is a simple solution: don't write such code, don't use arrays like if they are lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily.
> What about when you don't know the length before, or working with immutable elements?

I must recognize I made a huge blunder with that, my reasoning was indeed very wrong. :(

don't know the length: Do the exponential growth yourself. Double the array length when it gets full, and keep track of real length in a separate variable. At the end, downsize the array to real length.
- This version is significantly more complex than the code with the current behavior, significantly enough to counter my argument (the "nearly as easily" part).
- It's actually also slightly less efficient, because whenever you grow the array, half of the elements have to be default-initialized, which is not the case when you just grow the capacity (using current resize behavior). I think one might resolve this by doing some cast to void[]'s and back, but that would make this code version even more complex.

immutable: preallocate another array _whose elements are typed as tail-immutable_, fill it, at the end cast it to the array typed with immutable elements.
- Ouch, this one is even worse, especially depending on what the type of the immutable elements is, because of the need to determine the tail-immutable version of that type. If the elements are Objects, it's not even possible to do that, so you would need to type the temporary array as void*[]. And that would make the code even more complex if you'd want to access and use the elements while the array is being constructed. (Alternatively you could cast away all the immutable from the element type, but it's less safe; on a more complex loop you would risk modifying them by mistake.)

:/ -- Bruno Medeiros - Software Engineer
Nov 26 2010
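For comparison, the "do the exponential growth yourself" variant described above might look like the sketch below, next to the plain ~= loop it would replace. The function names collectManually and collectWithAppend are hypothetical; this is one possible reading of the description, not code from the thread:

```d
import std.stdio;

// Hypothetical helper: collect 0 .. count by growing the array manually.
int[] collectManually(int count)
{
    int[] buf = new int[1]; // start with room for one element
    size_t used = 0;        // real number of elements stored
    foreach (i; 0 .. count)
    {
        if (used == buf.length)
            buf.length = buf.length * 2; // new slots are default-initialized
        buf[used++] = i;
    }
    buf.length = used;      // downsize to the real length
    return buf;
}

// The same result using the built-in append operator.
int[] collectWithAppend(int count)
{
    int[] buf;
    foreach (i; 0 .. count)
        buf ~= i;
    return buf;
}

void main()
{
    writeln(collectManually(5));                              // [0, 1, 2, 3, 4]
    writeln(collectManually(100) == collectWithAppend(100));  // true
}
```

The manual version carries an extra length variable and a final trim, which illustrates the added complexity conceded in the post.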
On 11/26/10 12:22 PM, Bruno Medeiros wrote:
> Making the array reallocate _every_ time that it's resized (to a greater length) is actually not that unreasonable. Would it be highly inneficient? Only if you write bad code. TDPL agrees with you, I quote: " One easy way out would be to always reallocate a upon appending to it [...] Although that behavior is easiest to implement, it has serious efficiency problems. For example, oftentimes arrays are iteratively grown in a loop: int[] a; foreach (i; 0 .. 100) { a ~= i; } " Hum, "oftentimes"? I wonder if such code is really that common (and what languages are we talking about here?)

It would be difficult to challenge the assumption that appends in a loop are common.

> But more importantly, there is a simple solution: don't write such code, don't use arrays like if they are lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily.

I disagree. Often you don't know the length to preallocate (e.g. input is from a file etc). The fact that there's a convenient append operator only makes things more in favor of supporting such idioms. The technique (exponential capacity growth) is well known.

> The only advantage of the current behavior is that it is more noob friendly, which is an advantage of debatable value.

I don't think the current behavior favors noobs. Andrei
Nov 26 2010
On 26/11/2010 19:16, Andrei Alexandrescu wrote:
> On 11/26/10 12:22 PM, Bruno Medeiros wrote:
>> But more importantly, there is a simple solution: don't write such code, don't use arrays like if they are lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily.
> I disagree. Often you don't know the length to preallocate (e.g. input is from a file etc). The fact that there's a convenient append operator only makes things more in favor of supporting such idioms. The technique (exponential capacity growth) is well known.

You could still do exponential capacity growth by manipulating the length property, but yeah, that would create a host of complexity and other issues (see my reply to Pelle). Yeah, my reasoning was really broken. :'( (I need some R&R, lol) -- Bruno Medeiros - Software Engineer
Nov 26 2010
On Fri, 26 Nov 2010 21:59:37 +0000 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
> You could still do exponential capacity growth by manipulating the length property, but yeah, that would create a host of complexity and other issues (see my reply to Pelle). Yeah, my reasoning was really broken. :'(

What is the reason why D does not internally manage exp growth with a (private) capacity field? It's very common, and I thought the reason was precisely efficiency, as it does not require reallocating so often, esp. in cases such as feeding an array in a loop like in Andrei's example. Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 26 2010
On 26/11/2010 22:12, spir wrote:
> What is the reason why D does not internally manages exp growth with a (private) capacity field? It's very common, and I thought the reason was precisely efficiency as it does not require reallocating so often, esp in cases such as feeding an array in a loop like in Andrei's example.) Denis

But D does exactly that, there is a capacity field (internal to the GC), and array growth is managed automatically, in an exponential way. -- Bruno Medeiros - Software Engineer
Nov 26 2010
On 26/11/2010 19:16, Andrei Alexandrescu wrote:
> It would be difficult to challenge the assumption that appends in a loop are common.

Well, there was actually no assumption yet, I wanted first of all to know what languages you had in mind, because I wasn't sure I understood you correctly. C and C++ don't even have (dynamic) arrays. append operation, arrays cannot be resized. So trivially that idiom is not common in these languages. :) I don't know about Python, Ruby, Erlang, Haskell, Perl, PHP. Or perhaps you were just being very liberal in your meaning of "arrays", and were also thinking of constructs like C++'s std::vector ? -- Bruno Medeiros - Software Engineer
Nov 26 2010
On Fri, 26 Nov 2010 18:22:46 +0000 Bruno Medeiros <brunodomedeiros+spam com.gmail> wrote:
> But more importantly, there is a simple solution: don't write such code, don't use arrays like if they are lists, preallocate instead and then fill the array. So with this alternative behavior, you can still write efficient code, and nearly as easily. The only advantage of the current behavior is that it is more noob friendly, which is an advantage of debatable value.

Well, except that "noobs" usually don't care about performance. (Anybody else would preallocate, I guess, if only because it is just a few more key strokes; but the corresponding idiom is not that obvious: T[] xxx = new T[yyy.length]; ) Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Nov 26 2010
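The preallocation idiom mentioned above (T[] xxx = new T[yyy.length];) could be used as in the following sketch. The function names are hypothetical, and the second variant assumes a runtime that provides the built-in reserve function, which keeps the ~= style while still asking for the memory up front:

```d
import std.stdio;

// Hypothetical helper: allocate the result once, then fill it by index.
int[] squaresPrealloc(const int[] src)
{
    int[] result = new int[src.length]; // one allocation up front
    foreach (i, x; src)
        result[i] = x * x;
    return result;
}

// Hypothetical helper: reserve capacity, then keep using ~=.
int[] squaresReserved(const int[] src)
{
    int[] result;
    result.reserve(src.length); // request room for src.length elements
    foreach (x; src)
        result ~= x * x;
    return result;
}

void main()
{
    writeln(squaresPrealloc([1, 2, 3])); // [1, 4, 9]
    writeln(squaresReserved([1, 2, 3])); // [1, 4, 9]
}
```

Both variants need the length to be known in advance, which is exactly the case the other replies point out is often not available.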
Bruno Medeiros Wrote:
> Making the array reallocate _every_ time that it's resized (to a greater length) is actually not that unreasonable. Would it be highly inefficient? Only if you write bad code. TDPL agrees with you, I quote:
Nov 26 2010
On 26/11/2010 21:30, Kagamin wrote:
> Bruno Medeiros Wrote:
>> Making the array reallocate _every_ time that it's resized (to a greater length) is actually not that unreasonable. Would it be highly inneficient? Only if you write bad code. TDPL agrees with you, I quote:

Huh? -- Bruno Medeiros - Software Engineer
Nov 26 2010