digitalmars.D - Arrays passed by almost reference?

reply Ali Cehreli <acehreli yahoo.com> writes:
I haven't started reading Andrei's chapter on arrays yet. I hope I won't find
out that the following behavior is expected. :)

import std.cstream;

void modify(int[] a)
{
    a[0] = 1;
    a ~= 2;

    dout.writefln("During: ", a);
}

void main()
{
    int[] a = [ 0 ];

    dout.writefln("Before: ", a);
    modify(a);
    dout.writefln("After : ", a);
}

The output with dmd 2.035 is

Before: [0]
During: [1,2]
After : [1]

I don't understand arrays. :D

Ali
Nov 05 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
This is one of those areas where the low-level details of how arrays are implemented leak out. This is unfortunate, but in a close-to-the-metal language it's sometimes a necessary evil.

(Dynamic) arrays are structs that consist of a pointer to the first element and a length. Essentially, the memory being pointed to by the array is passed by reference, but the pointer to the memory and the length of the array are passed by value. While this may seem ridiculous at first, it's a tradeoff that allows the extremely convenient slicing syntax we have to be implemented efficiently.

When you do the a[0] = 1, what you're really doing is:

    *(a.ptr) = 1;

When you do the a ~= 2, what you're really doing is:

    // Make sure the block of memory pointed to by a.ptr
    // has enough capacity to be appended to.
    a.length += 1;
    *(a.ptr + 1) = 2;

The length being incremented here is modify's own copy, so main never sees the new element, although it does see the write to a[0].

Realistically, the only way to understand D arrays and use them effectively is to understand the basics of how they work under the hood. If you try to memorize a bunch of abstract rules, it will seem absurdly confusing.
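The explanation above can be checked with a small program. This is only a sketch, written against a current D compiler and std.stdio rather than the dmd 2.035 and std.cstream used in the original post; the helper name appendInCallee is made up for illustration.

```d
import std.stdio;

// Hypothetical helper (not from the thread): appends inside the callee and
// returns the length of the callee's own copy of the slice.
size_t appendInCallee(int[] a)
{
    a[0] = 1;   // writes through the pointer, which still refers to the caller's data
    a ~= 2;     // changes only the callee's copy of pointer and length
    return a.length;
}

void main()
{
    int[] a = [0];
    auto ptrBefore = a.ptr;

    auto calleeLength = appendInCallee(a);

    writeln("callee length: ", calleeLength);
    writeln("caller length: ", a.length);
    writeln("caller a[0]:   ", a[0]);

    assert(calleeLength == 2);
    assert(a.length == 1);      // the caller's length was passed by value
    assert(a[0] == 1);          // the element write was made before the append
    assert(a.ptr is ptrBefore); // the caller's pointer is never touched
}
```

Whether the append inside the callee reuses the same memory block or moves to a new one is up to the runtime; the assertions above hold either way.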
Nov 05 2009
next sibling parent Travis Boucher <boucher.travis gmail.com> writes:
main.a starts as:

    struct {
        int length = 1;
        int *data = 0x12345; // some address pointing to [ 0 ]
    }

inside of modify, a is:

    struct { // different than main.a
        int length = 2;
        int *data = 0x12345; // same as main.a; the data is now [ 1, 2 ]
    }

back in main:

    struct { // same as original main.a
        int length = 1;
        int *data = 0x12345; // the address hasn't changed, but the data has, to [ 1 ]
    }

To get the expected results, pass a by reference:

    void modify(ref int[] a);
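A minimal sketch of the ref version, assuming a current D compiler and std.stdio in place of std.cstream. With ref, the callee works on the caller's pointer/length pair directly, so the append is visible in main.

```d
import std.stdio;

// Same body as the original modify, but the slice itself is passed by reference.
void modify(ref int[] a)
{
    a[0] = 1;
    a ~= 2;   // updates the caller's pointer and length
}

void main()
{
    int[] a = [0];
    modify(a);
    writeln("After : ", a);
    assert(a == [1, 2]);
}
```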
Nov 05 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I don't think it's that bad. Bartosz tried to get me into a diatribe about how array behavior can't be defined formally. Of course it can.

The chunk and the limits of the chunk are part of D's array abstraction. The limits are passed by value. The ~= operation may nondeterministically choose to bind the limits to a different chunk. The right to modify members as they want is a fundamental right of any non-const member, so no confusion there. The decision is encapsulated. User code must be written to work according to that specification.

Could there be a better array specification? No doubt. But much as he tried, Bartosz couldn't come up with one. We couldn't come up with one. So if you could come up with one, speak up or forever use the existing one.

Andrei
Nov 05 2009
prev sibling next sibling parent reply Frank Benoit <keinfarbton googlemail.com> writes:
Given int[] a, a is a kind of pointer, one with extra length information. When passed to modify(), a itself is passed by value, while the contained data is effectively passed by reference, since a points to the data. This is why a.length was not updated in main. If you change "modify" to:

    void modify(ref int[] a){...

it should work as you expected.
Nov 05 2009
parent "Nick Sabalausky" <a a.a> writes:
Or you could force totally-by-value semantics with:

    void modify(const(int)[] a){
        int[] _a = a.dup;
        ...

(Anyone know if scope can be used on that to allocate it on the stack, or is that just for classes?)

I do agree it can sometimes be a bit weird though. But like others mentioned, it's kind of a necessary evil, and once you understand how the arrays work under the hood, it becomes a bit easier.
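The copy-on-entry idea can be sketched as follows, assuming a current D compiler. The function name modifiedCopy and the decision to return the copy are illustrative choices, not part of the suggestion above.

```d
import std.stdio;

// Hypothetical variant of modify: takes a read-only view and works on a
// private copy, so nothing the function does can reach the caller's data.
int[] modifiedCopy(const(int)[] a)
{
    int[] copy = a.dup; // freshly allocated, mutable copy of the elements
    copy[0] = 1;
    copy ~= 2;
    return copy;
}

void main()
{
    int[] a = [0];
    int[] b = modifiedCopy(a);

    writeln("caller: ", a);
    writeln("copy  : ", b);

    assert(a == [0]);     // the caller's array is untouched
    assert(b == [1, 2]);
}
```

The price is one allocation and one element copy per call, which is the performance concern raised elsewhere in the thread.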
Nov 05 2009
prev sibling next sibling parent reply Ali Cehreli <acehreli yahoo.com> writes:
Thanks for all the responses.

And yes, I know that 'ref' is what works for me here. I am trying to figure out
whether I should develop a guideline like "always pass arrays with 'ref', or
you may face surprises."

I understand it very well now and was able to figure out a way to cause some
bugs. :)

What can be said about the output of the following program? Will main.a[0] be
printed as 1 or 111?

import std.cstream;

void modify(int[] a)
{
    a[0] = 1;

    // ... more operations ...

    a[0] = 111;
}

void main()
{
    int[] a;
    a ~= 0;
    modify(a);

    dout.writefln(a[0]);
}

It depends on the operations in between the two assignments to a[0] in 'modify':

- if we leave the comment in place, main.a[0] is 111

- if we replace the comment with this code

    foreach (i; 0 .. 10) {
        a ~= 2;
    }

then main.a[0] is 1. In a sense, modify.a caused only "some" side effects in
main.a. If we shorten the foreach, then main.a[0] is again 111. To me, this is
at an unmanageable level, unless we always pass with 'ref'.

I don't think that this is easy to explain to a learner; and I think that is a
good indicator that there is a problem with these semantics.

Ali
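The dependence on the number of appends can be made visible by having the callee report whether its slice moved to a new memory block. This is a sketch for a current D compiler; the extra parameter and the bool return value are additions made for illustration. No particular number of appends is guaranteed to trigger a move, so the program only checks that the caller's view is consistent with what the callee reports.

```d
import std.stdio;

// Variant of the modify above with two illustrative additions: the number of
// appends is a parameter, and the function returns true if the appends made
// its slice point at a different memory block.
bool modify(int[] a, size_t extra)
{
    a[0] = 1;

    auto before = a.ptr;
    foreach (i; 0 .. extra)
        a ~= 2;

    a[0] = 111;   // lands in the caller's data only if the slice did not move
    return a.ptr !is before;
}

void main()
{
    foreach (extra; [0, 1, 10, 1000])
    {
        int[] a;
        a ~= 0;

        bool moved = modify(a, extra);
        writefln("extra=%s moved=%s main.a[0]=%s", extra, moved, a[0]);

        // The caller sees 111 exactly when the callee's slice stayed in place.
        assert(a[0] == (moved ? 1 : 111));
    }
}
```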
Nov 05 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
The ball is in your court to define better semantics.

Andrei
Nov 05 2009
next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Just make arrays a reference type, like classes!

-- Leandro Lucarella (AKA luca) http://llucax.com.ar/
Nov 05 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
You mean dynamic arrays, but what about static arrays? Sometimes it makes more sense to send a static array as a value rather than a reference (think of the case of small vectors). Then we'd have 2 semantics for arrays, one for static arrays and one for dynamic arrays.

I am not fully against pass-by-ref arrays, I just think that passing by reference all of the time could have some performance implications.
Nov 05 2009
parent reply Leandro Lucarella <llucax gmail.com> writes:
Travis Boucher, on November 5 at 20:44, you wrote to me:
 Just make arrays a reference value, like classes!
 You mean dynamic arrays, but what about static arrays?

I would say "make them value types", but they already are ;)

 Sometimes it makes more sense to send a static array as a value rather
 than a reference (think in the case of small vectors).

That's why they already are value types.

 Then we'd have 2 semantics for arrays, one for static arrays and one
 for dynamic arrays.

Yes. They should have different semantics because they are different.

 I am not fully against pass-by-ref arrays, I just think in passing by
 reference all of the time could have some performance implications.

OK, make 2 different types then: slices (value types, can't append, they are only a view on others' data) and dynamic arrays (reference type, can append, but a little slower to manipulate). It's a shame this idea didn't come true after all...

-- Leandro Lucarella (AKA luca) http://llucax.com.ar/
Nov 05 2009
next sibling parent Travis Boucher <boucher.travis gmail.com> writes:
I just wonder if that would be confusing. Static arrays of 2 different sizes are 2 different types.

Another example of how it is already confusing:

    int[2] a = [1, 2];
    int[] b = [11, 22, 33];
    b = a;
    a[0] = 111;
    /* Now both a and b == [111, 2], instead of the intuitive
       a == [111, 2], b == [1, 2]. They point at the same data. */
    b.length = b.length + 1; // now at different data.
    a[1] = 222;
    /* a == [111, 222], b == [111, 2, 0] as expected */

Something that is nice about dynamic arrays is how they can intermix with static arrays (an int[2] can be assigned to an int[]) in an efficient (and lazily copying) manner. It makes functions like this fast and efficient:

    int addThemAll(int[] data) {
        int rv = 0;
        foreach (i, v; data)
            rv += v;
        return rv;
    }

Since an implicit cast from a static array to a dynamic array is cheap, and slicing an array to a dynamic array is cheap (as long as you are only reading from the array), I don't see how separating them to have different call semantics solves the problem. However, making a clearer definition of each (in documentation, for example) might be helpful.

Me, being new to D, I am glad this thread exists, because I can see how I could have shot myself in the foot in the future without playing around and learning the difference.
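The claims in the example above can be turned into assertions. This is a sketch for a current D compiler: the static array is sliced explicitly with a[], and addThemAll takes const(int)[] to state that it only reads; both are choices made here, not part of the post.

```d
import std.stdio;

// Read-only summation; accepts a slice of either a static or a dynamic array.
int addThemAll(const(int)[] data)
{
    int rv = 0;
    foreach (v; data)
        rv += v;
    return rv;
}

void main()
{
    int[2] a = [1, 2];
    int[] b = a[];              // b is a view of a's storage, no copy is made

    a[0] = 111;
    assert(b == [111, 2]);      // both names see the same data

    b.length = b.length + 1;    // a's fixed storage cannot grow, so b is copied elsewhere
    assert(b.ptr !is a.ptr);

    a[1] = 222;
    assert(a[] == [111, 222]);
    assert(b == [111, 2, 0]);   // b kept the old value; the new element is int.init

    writeln(addThemAll(a[]), " ", addThemAll(b));
}
```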
Nov 05 2009
prev sibling parent reply "Bob Jones" <me not.com> writes:
That's the whole problem. Dynamic arrays and slices are not the same thing, and having a syntax that allows code to be ignorant of which it is dealing with is always going to have problems imo. Being able to resize or append to slices is fubar imo.

I'd go with slices being value types, with no concatenation, resizing, reallocating, etc.

Dynamic arrays could be a library type: a templated struct that has a pointer, a length, or whatever. They can have operator overloads for implicit conversion to slices, so any code that accepts a slice can take dynamic arrays, and side effects are prevented. Code that is going to reallocate has to take a dynamic array. So at least what's happening is more obvious/explicit.
Nov 05 2009
parent reply Yigal Chripun <yigal100 gmail.com> writes:
I agree with the above. The semantics should be:

    DynamicArray!(T)        is a dynamic array
    int[x]                  is a static array
    RandomAccessRange!(T)   is a slice
    int[] a;                // compile error

(names are not important ATM)

I don't think there's a need for a dedicated array slice type; instead, slices should be range types. It should be easy to change underlying containers with compatible range types.
Nov 06 2009
parent reply Travis Boucher <boucher.travis gmail.com> writes:
You can create DynamicArray and RandomAccessRange already now.

Currently int[] a is very intuitive in its purpose; it's just some of the implementation details that get confusing.

    int doSomething(in int[] a)

tells me doSomething is going to process an int array of any size and not modify it.

    int doSomething(int[] a)

tells me doSomething is going to process an int array of any size and possibly modify it.

An explicit 'out int[] a' would make it even more obvious what the function is going to do.

The thing is, dynamic arrays and slices are pretty much the same thing; it's just hard to track what the underlying store points to.
Nov 05 2009
parent reply gzp <galap freemail.hu> writes:
I think the problem is that dynamic arrays and slices are NOT the same. They have a common subset of interfaces (length, at, slice (maybe)), but they are just different. An array owns its elements; it can resize, remove, etc. the underlying structure. For example, a special array that stores the elements in a tree can add or remove nodes at any time, but a slice of this "array" cannot alter the tree, only the elements. (For example, an AVL tree cannot have a slice that modifies the elements, as that would require a restructuring of the tree; element modification can be performed only through the array itself.) I know int[] is a much simpler storage, but it makes my point more understandable.

So now back to int[]. According to the current implementation:

foo(ref int[]) is an array: it can add/remove elements (restructure the tree).

foo(int[]) is a mixed thing. It is a slice with an automatic copy feature. It is neither an array nor a slice. If it were a slice, then the underlying structure could not be altered, but here it can be. It does not own its elements, as it cannot resize the underlying structure without penalty. Actually it is a slice + copy_on_resize. When you modify the elements, it alters the elements of the referred array, but when you resize it, it copies (or not, depending on the DMD implementation!!!) the elements into another array and creates a new slice+copy_on_resize object for this array.

Thus foo(int[]) is worse than the dangling pointers or buffer overwrite errors from C (C++). Semantically they are correct, so bugs produced from a resized int[] cannot be detected using tricks like patterns in the memory. It is something that can cause really, really nasty bugs, especially when the copy of the original array on resizing depends on the dmd implementation (it depends on how much extra data is allocated for an array before it is really reallocated).

From my point of view it's quite rare to have an array that is temporarily extended with elements (partially allowing the old ones to be modified), with the new ones then forgotten. So please don't favor a feature in the core language that is hardly used and causes bugs that can hardly be detected.

So my proposal is (as has already been mentioned by others):

- have arrays that contain the elements + structure, e.g. RandomAccessArray!(int), AVLTree
- have slices (ranges) of arrays, where the structure cannot be altered
- decide if int[] stands for a short version of either RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meaning that can be altered/modified with const/immutable/ref qualifiers.

Some random thoughts:

The [1,2,3,4] literal is the array itself. It could alter the structure, if it were not an immutable object.

    int[] a = [1,2,3,4]

a is a slice of the array and cannot be resized.

    int[] a = new int[100];

is a slice too; new int[100] is a short version for new RandomAccessArray!(int)(100).

    int[100] b;
    int[] a = b;

is a short version for a copy: a = RandomAccessArray!(int)( b ).opRange(), or a much better solution would be a slice to a static array, if that's possible.

a.array is the referred array, thus a.array ~= 2 could resize the array:

1. the slice a itself is not modified, it still points to the original subset, but then what about the removed elements???
2. the slice a is automatically resized to point to the altered structure
3. slices of static arrays cannot be resized

const int[] cannot resize the underlying array; int[] can.

int[new] is the array itself (I'm not sure):

    int[new] a = new int[100];
    int[] slice_a = a;
    assert( slice_a.array.ptr == a.ptr );

Gzp.
Nov 12 2009
parent reply Ali Cehreli <acehreli yahoo.com> writes:
gzp Wrote:

 I think problem is that, dynamic arrays and slices are NOT the same. 
I agree with most of what you wrote, but I can't see that in the current implementation.
 They have a common subset of interfaces (length, at, slice(maybe)), but 
 they are just different. An array owns it's element, it can 
 resize/remove, etc. the underlying structure. Ex A special array that 
 stores the elements in a tree can add remove nodes at any time, but a 
 slice of this "array" cannot alter the tree - only the elements (For 
 example a AVL-tree cannot have a slice that modifies the element as it 
 would require a restructuring of the tree, element modification can be 
 performed only through the array itself)
 I know int[] is a much simpler storage, but it makes my point more 
 understandable.
 
 So now back to int[]
 According to the current implementation
 
 foo(ref int[]) is an array: It can add/remove elements (restructure the 
 tree)
I don't think so: that is a reference to a slice.
 foo(int[]) is a mixed thing. It is a slice with an automatic copy 
 feature. It is not an array nor a slice.
It is a pass-by-value slice. Now there is one more slice that provides access to what the argument has been providing access to.
 If it would be a slice, than the underling struct could not be altered, 
 but here it can be.
D2's "slice" is different than some other languages'. :)
 It does not own its elements as it cannot resize the underlying 
 structure without penalty.
Agreed.
 Actually it is a slice + copy_on_resize.
Yes.
 When you modify the elements, 
 it alters the element of the referred array, but when you resize it, it 
 copies (or not, depending on the DMD implementation!!!) the elements 
 into another array and creates a new slice+copy_on_resize object for 
 this array.
That is the "discretionary" part in the semantics. But we must find a different entity (GC? druntime?) that makes the copies, because that entity is the owner of the elements.
 Thus foo(int[]) is worse than the dangling pointers or buffer overwrite 
 errors from C(C++). Semantically they are correct, so bug produced from 
 resized int[] cannot be detected using tricks like patterns in the 
 memory. It is something that can cause really-really nasty bugs. 
 Especially when the copy of the original array on resizing depend on the 
 dmd implementation. (depends on how much extra datas are allocated for 
 an array before they are really reallocated).
 
 
  From my point of view it's quite rare to have an array that is 
 temporally extended with elements (partially enabling to modify to old 
 ones), then  forget the new ones.
I don't understand the use either. But I can see how it is important for performance. But we may not know at that time whether the new ones will not be used. The new slice may be copied to another one. I agree though that I can't see any use case.
 So please don't favor a feature in the 
 core language that is either hardly used and causes bugs, that can be 
 hardly  detected.
 
 So my proposal is (as has already been mentioned by others):
   - have arrays that contain the elements + structure,
     e.g. RandomAccessArray!(int), AVLTree
   - have slices (ranges) of arrays, where the structure cannot be altered
   - decide whether int[] is shorthand for either 
 RandomAccessArray!(int) or Range!(int), but DO NOT have a mixed meaning 
 that can be altered/modified with const/immutable/ref qualifiers.
Good proposals.

My views on the current semantics:

I recently took it as a challenge to define the semantics of the current dmd implementation of "dynamic consecutive objects" (I don't want to call them slices or dynamic arrays). I posted my views this week...

First, two objections (not to you, but to the current nomenclature):

1) I disagree that D2 provides dynamic arrays to the programmer. The dynamic nature of the elements is maintained in the background, but programmers never lay their hands on dynamic arrays.

2) D2's slices are not the same thing as in other languages. Still, I will call what the programmer receives "slices" below.

To illustrate, let's have a look at the following definition:

  int[] slice = new int[10];

- side effect: 10 objects are created
- returned value: a slice to all of those objects

It gets interesting:

  int[] slice2 = slice[1..$-1];

Now we have two entities that provide access to the underlying objects. This is a "sharing relationship." In this sense, the two share the access to those objects.

The interesting part is that either party can leave this relationship at will... As soon as they see fit, they will go elsewhere and start providing access to copies of these objects.

Neither party owns these objects. The garbage collector does. Because, if we say that 'slice' was the owner and has now gone away, then does 'slice2' own the objects? Has it been promoted to a "dynamic array?" I think not.

For that reason, I see no difference between "dynamic arrays" and "slices" in D2. Neither owns the objects; they provide access. I describe this as "discretionary sharing semantics."
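The sharing relationship described above can be sketched as follows. This is only an illustration under assumptions: the function name innerAppendDetaches is invented, and the claim that the append copies relies on the runtime refusing to extend in place a slice that ends before the end of the block's used data, which is how I understand current druntime to behave.

```d
import std.stdio;

// Hypothetical helper: returns true if appending to the inner slice made it
// stop sharing elements with the original slice.
bool innerAppendDetaches()
{
    int[] slice = new int[10];
    int[] slice2 = slice[1 .. $ - 1];   // shares elements 1 through 8

    slice2[0] = 7;                      // visible through both slices
    assert(slice[1] == 7);

    // slice2 ends one element short of the block's used data, so extending it
    // in place would overwrite slice[9]; the runtime copies instead.
    slice2 ~= 42;

    slice2[0] = 8;                      // now touches only slice2's copy
    return slice[1] == 7 && slice[9] == 0;
}

void main()
{
    writeln("slice2 left the sharing relationship: ", innerAppendDetaches());
}
```

Neither variable is marked as the owner anywhere in this code; both are just (pointer, length) pairs into memory that the garbage collector keeps alive.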
 Some random thoughts:
I am not sure whether the following are your proposals for change, but I tested them with the current implementation and they fit in the semantics as I understand.
 [1,2,3,4] literal is the array itself. It could alter the structure, if
 it would not be an immutable object.
 
 int[] a = [1,2,3,4] a is a slice of the array, cannot be resized.
It can be resized with 2.036 and fits my definition. As we append objects to 'a', it may get new copies to provide access to.
 int[] a = new int[100]; is a slice too,
Agreed: side effect is 100 element creation, return value is a slice.
 int[100] b;
 int[] a = b; is a short version for a copy: a = RandomAccessArray!(int)(
Disagreed: b is a fixed-sized array and 'a' is a slice that provides access to its objects. 'a' may terminate this sharing contract at will as it sees unfit.
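A small sketch of that point, assuming nothing beyond the built-in array operations; the function name staticSliceDetaches is invented for illustration. Since the fixed-size array lives on the stack and not in a GC block, the append cannot extend it in place, so here 'a' ends the sharing as soon as it grows.

```d
import std.stdio;

// Hypothetical helper: returns true if the slice shared the fixed-size
// array's elements until it was appended to, and not afterwards.
bool staticSliceDetaches()
{
    int[4] b = [1, 2, 3, 4];
    int[] a = b;            // no copy: 'a' provides access to b's elements

    a[0] = 10;
    bool sharedBefore = (b[0] == 10) && (a.ptr is b.ptr);

    a ~= 5;                 // b cannot grow, so 'a' moves to a GC-allocated copy
    a[0] = 20;
    bool detachedAfter = (b[0] == 10) && (a.ptr !is b.ptr);

    return sharedBefore && detachedAfter;
}

void main()
{
    writeln(staticSliceDetaches());
}
```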
 a.array is the referred array, thus a.array ~= 2 could resize the array,
 1. the slice a itself is not modified, it still points to the original
 subset - but then what about the removed elements ???
 2. the slice a is automatically resized to point to the altered structure
 3. slices of static arrays they cannot be resized
Reading those, I think you've been proposing. My attempt is to define the *current* semantics as of 2.036.
 const int[] cannot resize the underlying array
 int[] can
 
 int[new] is the array itself (i'm not sure)
 int[new] a = new int[100];
I haven't learned about T[new] yet, but I think it is discontinued. (?)

Ali
Nov 12 2009
parent gzp <galap freemail.hu> writes:
 
 D2's "slice" is different than some other languages'. :)
 
It's okay to change or create new semantics for new languages. Developing new features is a must, as long as they make sense. But think as a newbie to programming for a while. If you've just learned about arrays and slices and hardly know anything about memory layouts and pointers, would you understand these strange behaviours?

OT: Actually I wouldn't even let a newbie use a GC either, as it hides the ownership questions (as is the case here with slices). During program design, one of the most crucial questions is the role of each module/class. When you have a clear view of ownership and roles, the design gets much better. With a GC this ownership question is postponed or even omitted, and the barriers between modules may become very thin; modularity/code reuse is gone. (I'm not saying that GC is bad! But more attention must be paid during program design.)
 But, we must find a different entity (GC? druntime?) that makes the copies;
because that entity is the owner of the elements.
Exactly. And I'd prefer to distinguish more clearly between the owner and the view of the array. Don't let the view (int[]) alter the structure, and allow the programmer to have access to the actual array (which is hidden now by the GC/druntime).
 1) I disagree that D2 provides dynamic arrays to the programmer. The dynamic
nature of the elements are maintained on the background; but the programmers
never lay their hands on dynamic arrays.
Just as my comment above.
 It gets interesting:
 
   int[] slice2 = slice[1..$-1];
 
 Now we have two entities that provide access to the underlying objects. This
is a "sharing relationship." In this sense, the two share the access to those
objects.
 
 The interesting part is that, either party can leave this relationship at
will... As soon as they see unfit, they will go elsewhere and start providing
access to copies of these object.
If I want to leave the ownership, let's make it explicit and write it down in the code. Don't let the program reviewer think for hours about whether it's still the original array or some different copy and view of it.

  int[] slice2 = slice.I_really_want_to_create_a_copy_with_2_additional_elements;
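For what it's worth, an explicit copy can already be written today, since the binary ~ operator always builds a new array. A minimal sketch, where extendedCopy is a hypothetical helper name and not something from Phobos:

```d
import std.stdio;

// Hypothetical helper: returns a fresh array holding src followed by extra.
// Binary ~ always allocates, so the result never shares memory with src.
int[] extendedCopy(int[] src, int[] extra)
{
    return src ~ extra;
}

void main()
{
    int[] slice = [1, 2, 3];
    int[] slice2 = extendedCopy(slice, [4, 5]);

    slice2[0] = 99;                      // does not affect 'slice'
    assert(slice == [1, 2, 3]);
    assert(slice2 == [99, 2, 3, 4, 5]);
    writeln(slice, " ", slice2);
}
```

This does not remove the implicit case (~= on a shared slice); it only shows that a reviewer-friendly spelling exists for the deliberate one.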
 Some random thoughts:
I am not sure whether the following are your proposals for change, but I tested them with the current implementation and they fit in the semantics as I understand.
Actually they were a kind of proposal: suggestions that fit the current syntax. I'm not sure which of them have been implemented. They were just some ideas I'd like to see in D2.
 int[new] is the array itself (i'm not sure)
 int[new] a = new int[100];
I haven't learned about T[new] yet, but I think it is discontinued. (?)
I haven't learned about T[new] either; maybe it's a different thing. I don't know, I was not following that thread. It simply seemed natural after I'd seen the syntax in the newsgroup.

One final comment on why I really don't like the current slice implementation.

What is undefined behaviour of a program? It is when you cannot tell the outcome of a function knowing all the inputs (the random seed is an input too :) ). And foo(int[]) is undefined, since it depends on the state of memory fragmentation, the state of the moon, etc. The outcome depends on whether the GC can resize the array in place or not. Thus D has a built-in feature that's undefined by nature, thus D is undefined.

Or does slice resizing create a copy of the underlying array all the time? Then why can't we access this array directly? It'd be much clearer and more readable what the goal of the programmer was. E.g. foo( const C classRef ) tells us the programmer won't change the class. Then why can't we have:

  foo( slice )        to indicate I want to change the items in the array
  foo( const slice )  for reading the items only
  foo( array )        I want to alter the structure of the elements
  foo( const array )  just for completeness, as it should have the same effect as the const slice

They might also help with parallel programming, since with slice access the array structure cannot change, so read access to the array structure is sufficient (sometimes it's better to calculate something twice and write it twice from different threads). And with array use we know the structure might also change, thus it has to be guarded by critical sections as well.

Gzp
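The closest spelling of those four intents in the current language, as far as I can tell, uses the parameter qualifiers on int[] itself. This is only a sketch of that mapping, not the proposed feature; the function names setFirst, grow and sum are invented for illustration.

```d
import std.stdio;

// "foo( slice )": change items; any resize stays local to the callee.
void setFirst(int[] a) { a[0] = 1; }

// "foo( array )": structure changes are visible because the (ptr, length)
// pair itself is passed by reference.
void grow(ref int[] a) { a ~= 2; }

// "foo( const slice )": read-only access to the items.
int sum(const int[] a)
{
    int total;
    foreach (x; a)
        total += x;
    return total;
}

void main()
{
    int[] a = [0];
    setFirst(a);
    grow(a);
    assert(a == [1, 2]);   // both the element write and the append are visible
    assert(sum(a) == 3);
    writeln(a);
}
```

With ref, the result no longer depends on whether the runtime extended the block in place or copied it: the caller's variable is updated either way.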
Nov 13 2009
prev sibling parent Ali Cehreli <acehreli yahoo.com> writes:
Andrei Alexandrescu Wrote:

 Ali Cehreli wrote:
 I don't think that this is easy to explain to a learner; and I think that is a
good indicator that there is a problem with these semantics.
The ball is in your court to define better semantics. Andrei
I thought I passed the ball back to you in this thread:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=100318

It didn't attract any interest... :) Here are some points as teasers:

1) The term "dynamic array" and its distinction from "slice" is detrimental to understanding D's arrays, and is against their nature. For example, is 's2' valid below?

  void main()
  {
      int[] a = new int[11];
      int[] s = a[2..8];
      a = new int[55];
      int[] s2 = s[2..6];
  }

If so, has 's' been "promoted" to a dynamic array? What happens is, even 'a' was a slice to begin with. It was providing access to the 11 consecutive objects that are owned by the garbage collector.

My point is, even the left-hand side of the first initialization is a slice:

  int[] slice = new int[11];

The expression 'new int[11]' creates 11 elements as a side effect, and returns "a slice to all of those elements."

2) The best way I can describe the semantics of slices is "discretionary share." It is like a number of entities sharing a number of resources. Like two companies sharing a cubicle space, where both are free to change the contents of the cubicles. They can both add a new cubicle that is not shared by the other (s~=1). They can both leave the "sharing" of cubicles at any time, as soon as they see fit. This is a totally at-will arrangement between the two parties.

3) By accepting the above view, the exception of "slices are passed by reference" disappears too: slices are passed by value as well. What happens is, the slice parameter starts "sharing" (or provides access to) all of the elements that the original slice is sharing. Same with assignment: it creates a sharing contract.

I appreciate any comments.

Ali
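The question in point 1 can be checked directly. A minimal sketch, with the function name secondSliceIsValid made up for illustration: the first block stays alive because 's' still refers into it, so slicing 's' again works no matter what 'a' refers to by then.

```d
import std.stdio;

// Hypothetical helper: returns true if s2 still views the original block
// after 'a' has been rebound to a different one.
bool secondSliceIsValid()
{
    int[] a = new int[11];
    int[] s = a[2 .. 8];
    s[2] = 5;               // this is element 4 of the first block

    a = new int[55];        // 'a' now provides access to a different block

    int[] s2 = s[2 .. 6];   // slices 's', which still refers to the first block
    return s2.length == 4
        && s2.ptr is s.ptr + 2
        && s2[0] == 5
        && a[4] == 0;       // the new block is unrelated
}

void main()
{
    writeln(secondSliceIsValid());
}
```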
Nov 08 2009
prev sibling parent "Saaa" <empty needmail.com> writes:
Ali Cehreli wrote...

This helps me with the meaning of in, out & ref.
http://bayimg.com/NaeOgaaCC 
Nov 05 2009