digitalmars.D - Code security: "auto" / Reason for errors

reply Ozan <ozan.sueel gmail.com> writes:
Hi

I despair of "auto var1 = var2" for arrays. Isn't it an open door 
for errors? Example:

import std.stdio;

void main()
{
	int[] a;
	foreach(i; 0..10) a ~= i;
	auto b = a; // correct dlang coding: auto b = a.dup;
	
	a[2] = 1;
	b[2] = 5; // Overwrites assignment before
	writeln(a);
	writeln(b); // Always a == b but developer would like to have (a != b)
}

The behaviour differs from that of other, non-container data types.
At first glance it looks like a data copy, but it's only a 
pointer copy.

Regards, Ozan
Mar 02 2016
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 2 March 2016 at 19:42:02 UTC, Ozan wrote:
 I despair of "auto var1 = var2" for arrays. Isn't it an open door 
 for errors? Example:
 	int[] a;
 	foreach(i; 0..10) a ~= i;
 	auto b = a; // correct dlang coding: auto b = a.dup;
It'd do exactly the same thing if you wrote int[] b = a; so the auto changes nothing there. Generally, the slice assignment is a good thing because it gives you easy efficiency and doesn't hide the costs of a duplicate.
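
A minimal sketch of the point above, assuming nothing beyond std.stdio: `auto` only infers the type, so `auto b = a` and `int[] b = a` both share the elements, while `.dup` allocates a separate copy. The helper name sameStorage is made up for this illustration.

```d
import std.stdio;

// Hypothetical helper: true when both slices start at the same address.
bool sameStorage(const int[] x, const int[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a = [0, 1, 2, 3];

    auto b = a;     // inferred as int[]; shares a's elements
    int[] c = a;    // explicit type, same behaviour as the line above
    auto d = a.dup; // allocates new storage and copies the elements

    static assert(is(typeof(b) == int[]));

    a[2] = 42;

    writeln(b[2]);              // 42, b aliases a
    writeln(c[2]);              // 42, c aliases a
    writeln(d[2]);              // 2, d has its own data
    writeln(sameStorage(a, b)); // true
    writeln(sameStorage(a, d)); // false
}
```

The cost of the copy is visible at the call site: only the line with `.dup` allocates.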
Mar 02 2016
parent reply Ozan <ozan.sueel gmail.com> writes:
 It'd do exactly the same thing if you wrote

 int[] b = a;

 so the auto changes nothing there. Generally, the slice 
 assignment is a good thing because it gives you easy efficiency 
 and doesn't hide the costs of a duplicate.
I agree for slices, but typically variables should have their own data.
Having to add extra code to get separate data carries a lot of risk in my mind.
Behaviors should also be the same.

int a = 1;
int b = a; // data copy

int[] a;
int[] b = a; // pointer copy

is not the same and should be avoided.

Regards, Ozan
Mar 02 2016
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2016-03-02 21:01, Ozan wrote:

 I agree for slices, but typically variables should have their own data.
 Having to add extra code to get separate data carries a lot of risk in my mind.
 Behaviors should also be the same.

 int a = 1;
 int b = a;  // data copy

 int[] a;
 int[] b = a; // pointer copy

 is not the same and should be avoided.
Same thing for objects which are reference types.

--
/Jacob Carlborg
Mar 02 2016
parent reply Ozan <ozan.sueel gmail.com> writes:
On Wednesday, 2 March 2016 at 20:07:30 UTC, Jacob Carlborg wrote:
 On 2016-03-02 21:01, Ozan wrote:

 I agree for slices, but typically variables should have their 
 own data.
 int a = 1;
 int b = a;  // data copy

 int[] a;
 int[] b = a; // pointer copy

 is not the same and should be avoided.
 Same thing for objects which are reference types.
Yes, but D handles basic data types (int, char, ...) differently from objects (similar to Java). And again, an assignment like int[] b = a has its risks, which should be avoided in language design. Reading code requires some experience, but code should behave as one would expect from other languages. From a security point of view I would recommend a style like

int[] b = a;       // data copy
int[] b = a.ptr;   // pointer copy, b & a pointing to the same data. a == b / a is b
                   // Better as int* b = a.ptr; which has the same risks as in C
int[] b = a.slice; // slice "copy", same data but with mighty slices, a ?= b / a !is b
int[] b = a.dup;   // data copy, a == b / a !is b

Regards, Ozan
Mar 02 2016
parent Jacob Carlborg <doob me.com> writes:
On 2016-03-02 22:23, Ozan wrote:

 Yes, but D handles basic datatypes (int, char, ...) different to objects
 (similar to Java).
Arrays in Java and most other languages in the C family behave the same as in D. Example:

class Foo
{
    public static void main (String[] args)
    {
        int[] a = new int[5];
        a[0] = 3;
        int[] b = a;
        b[0] = 8;
        System.out.println(a[0]); // prints 8
    }
}

--
/Jacob Carlborg
Mar 02 2016
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 3/2/16 3:01 PM, Ozan wrote:
 It'd do exactly the same thing if you wrote

 int[] b = a;

 so the auto changes nothing there. Generally, the slice assignment is
 a good thing because it gives you easy efficiency and doesn't hide the
 costs of a duplicate.
 I agree for slices, but typically variables should have their own data.
 Having to add extra code to get separate data carries a lot of risk in my mind.
 Behaviors should also be the same.

 int a = 1;
 int b = a; // data copy

 int[] a;
 int[] b = a; // pointer copy

 is not the same and should be avoided.
Pointer copying is inherent in D. Everything is done at the "head", deep copies are never implicit. This is a C-like language, so one must expect this kind of behavior and plan for it.

If you want a self-duplicating dynamic array, you can write one pretty easily with postblit.

If you use static arrays, then the copying behavior is what you get (because there is no pointer).

What is the danger you are concerned about?

-Steve
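
As a rough sketch of the two options mentioned above (a self-duplicating wrapper via postblit, and static arrays): the struct name OwnedArray is invented for this illustration and is not a Phobos type. Newer compilers also offer copy constructors as an alternative to postblit; the sketch sticks to the postblit form named in the post.

```d
import std.stdio;

// Hypothetical wrapper, not part of Phobos.
struct OwnedArray
{
    int[] data;

    // Postblit: runs on the new copy after the fields were copied bitwise,
    // so replacing data with a duplicate gives the copy its own elements.
    this(this)
    {
        data = data.dup;
    }
}

void main()
{
    auto a = OwnedArray([1, 2, 3]);
    auto b = a;      // postblit duplicates the elements
    b.data[0] = 99;
    writeln(a.data); // [1, 2, 3]
    writeln(b.data); // [99, 2, 3]

    int[3] s = [1, 2, 3];
    auto t = s;      // static arrays store their elements inline, so this copies them
    t[0] = 99;
    writeln(s);      // [1, 2, 3]
    writeln(t);      // [99, 2, 3]
}
```

The trade-off is that every copy of OwnedArray now allocates, including copies made when passing it by value to a function.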
Mar 02 2016
parent reply John Nixon <john.h.nixon1 gmail.com> writes:
On Wednesday, 2 March 2016 at 21:37:56 UTC, Steven Schveighoffer 
wrote:

 Pointer copying is inherent in D. Everything is done at the 
 "head", deep copies are never implicit. This is a C-like 
 language, so one must expect this kind of behavior and plan for 
 it.
I sympathise with Ozan. What is the best reference you know that explains this fully?

Clearly from your comments, we have lost the argument as far as D is concerned. This leads me to ask whether a computer language has been considered that is similar to D, except that all variables of any type are treated the same way, as objects that own their own data. I would like to suggest the following:

1. Assignment would imply full (deep) copies
2. “dup” functions would not need to exist
3. I think the const system could be much simpler, perhaps more like it is in C++
4. Function parameters would be passed by reference by default (to avoid unnecessary copying, but with a reliable const system)

I realise that copying large objects might then happen in erroneous code when not intended, but it would be easy for the programmer to diagnose this.

I raised a similar issue with the following program that was being discussed in the Learn forum for D and works correctly. Adding another writeln statement shows that the second line of test_fun calls first CS.this then CS.opAssign. An alternative version using a .dup function also works.

import std.stdio;

struct CS {
    char[] t;
    this(const CS rhs) {
        this = rhs;
    }
    CS opAssign(const CS rhs) {
        writeln("CS.opAssign called");
        this.t = rhs.t.dup;
        return this;
    }
};

void test_fun(const ref CS rhs) {
    auto cs = CS(rhs);
    writeln("cs = ",cs);
}

void main() {
    CS rhs;
    rhs.t = "string".dup;
    test_fun(rhs);
    return;
}

I wanted to be able to write instead simply:

struct CS {
    char[] t;
};

void test_fun(const ref CS rhs) {
    auto cs = rhs;
    writeln("cs = ",cs);
}

void main() {
    CS rhs;
    rhs.t = "string".dup;
    test_fun(rhs);
    return;
}

i.e. the functions this and opAssign aren’t needed, and the simple definition/assignment “auto cs = rhs;” would carry out the full copying of the CS object (as is the case for the simplest of D’s fundamental types). This would guarantee that rhs could not be changed in test_fun and allow it to be declared const.

It seems to me there would be great advantages in the simpler syntax (especially if CS had many other members). I think shallow copying at any level should not be the default, especially as in general the depth of copying would then need to be specified, i.e. how many pointers to dereference. This matters when elements are reached through multiple indirection, e.g. an array of arrays of char, char[][], where each array is dynamic.
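
To illustrate the "how many levels" question for char[][], here is a small sketch comparing the built-in .dup (one level) with a hand-written two-level copy. The function deepDup is a hypothetical helper written for this example, not something Phobos provides under that name.

```d
import std.stdio;

// Hypothetical helper: copies the outer array and every inner array,
// i.e. exactly two levels of indirection.
char[][] deepDup(const char[][] src)
{
    char[][] result;
    result.reserve(src.length);
    foreach (row; src)
        result ~= row.dup; // row.dup yields a fresh mutable char[]
    return result;
}

void main()
{
    char[][] a = ["ab".dup, "cd".dup];

    auto shallow = a.dup;   // new outer array, inner arrays still shared
    auto deep = deepDup(a); // nothing shared with a

    a[0][0] = 'X';

    writeln(shallow[0]);    // Xb
    writeln(deep[0]);       // ab
}
```

The sketch hard-codes the depth for one type; a type with more levels of indirection would need its own copy routine or a generic recursive one.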
Jun 01 2016
next sibling parent reply Kagamin <spam here.lot> writes:
On Wednesday, 1 June 2016 at 14:52:29 UTC, John Nixon wrote:
 Clearly from your comments, we have lost the argument as far as 
 D is concerned. This leads me to question whether a computer 
 language that is similar to D except that all variables of any 
 type are considered in the same way as objects that own their 
 own data has been considered? I would like to suggest the 
 following:
 1. Assignment would imply full (deep) copies
 2. “dup” functions would not need to exist
 3. I think the const system could be much simpler, perhaps more 
 like it is in C++
 4. Function parameters would be passed by reference by default 
 (to avoid unnecessary copying, but with a reliable const system)
Value-type containers are planned for Phobos, but not done yet. You can try https://github.com/economicmodeling/containers/blob/master/src/containers/dynamicarray.d - it's not copyable currently.
Jun 01 2016
parent John Nixon <john.h.nixon1 gmail.com> writes:
On Wednesday, 1 June 2016 at 15:56:24 UTC, Kagamin wrote:

 Value-type containers are planned for phobos, but not done yet.
Thank you for this info. This is probably what I want, meanwhile I’ll try to work round it. If you have any indication of the timing it might be useful.
Jun 01 2016
prev sibling parent Alex Parrill <initrd.gz gmail.com> writes:
On Wednesday, 1 June 2016 at 14:52:29 UTC, John Nixon wrote:
 On Wednesday, 2 March 2016 at 21:37:56 UTC, Steven 
 Schveighoffer wrote:

 Pointer copying is inherent in D. Everything is done at the 
 "head", deep copies are never implicit. This is a C-like 
 language, so one must expect this kind of behavior and plan 
 for it.
I sympathise with Ozan. What is the best reference you know that explains this fully?
Slices/dynamic arrays are literally just a pointer (arr.ptr) and a length (arr.length). Assigning a slice simply copies the ptr and length fields, causing the slice to refer to the entire section of data. Slicing (arr[1..2]) returns a new slice with the ptr and length fields updated.

(This also means you can slice arbitrary pointers; ex. `(cast(ubyte*) malloc(1024))[0..1024]` to get a slice of memory backed by C malloc. Very useful.)

The only magic happens when increasing the size of the array, via appending or setting length, which usually allocates a new array from the GC heap, except when D determines that it can get away with not doing so (i.e. when the data points somewhere in a GC heap and there's no data in use after the end of the array; capacity also looks at GC metadata).
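
The pointer-plus-length description can be checked directly. The following is a small sketch; offsetIn is a hypothetical helper introduced only for this example.

```d
import std.stdio;

// Hypothetical helper: element offset of inner's start within outer.
ptrdiff_t offsetIn(const int[] outer, const int[] inner)
{
    return inner.ptr - outer.ptr;
}

void main()
{
    int[] arr = [10, 20, 30, 40];

    auto whole = arr;        // copies ptr and length only
    auto part = arr[1 .. 3]; // ptr advanced by one element, length 2

    writeln(whole.ptr is arr.ptr); // true
    writeln(offsetIn(arr, part));  // 1
    writeln(part.length);          // 2

    part[0] = 99;                  // writes through to arr[1]
    writeln(arr);                  // [10, 99, 30, 40]
}
```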
Jun 01 2016
prev sibling parent reply Jack Stouffer <jack jackstouffer.com> writes:
On Wednesday, 2 March 2016 at 20:01:01 UTC, Ozan wrote:
 I agree for slices, but typically variables should have their own 
 data.
 Having to add extra code to get separate data carries a lot of risk in 
 my mind.
 Behaviors should also be the same.

 int a = 1;
 int b = a;  // data copy

 int[] a;
 int[] b = a; // pointer copy

 is not the same and should be avoided.
It seems you are being confused by the difference between reference and value types. See this article for an explanation of the difference: http://www.albahari.com/valuevsreftypes.aspx

Dynamic arrays are reference types in D; ints are value types.
Mar 02 2016
parent reply Daniel Kozak via Digitalmars-d <digitalmars-d puremagic.com> writes:
Dne 3.3.2016 v 03:39 Jack Stouffer via Digitalmars-d napsal(a):
  Dynamic arrays are reference types in D;
No, they are value types, but mimic reference types in some cases:

import std.stdio;

// Appends to the caller's slice by value: the caller's length is unchanged.
void fun(int[] arr)
{
    arr ~= 10;
}

// Appends through a reference: the caller's slice itself is updated.
void fun2(ref int[] arr)
{
    arr ~= 10;
}

// Writes to an element: visible to the caller even when passed by value.
void mod(int[] arr)
{
    if (arr.length) arr[0] = 7;
}

void mod2(ref int[] arr)
{
    if (arr.length) arr[0] = 7;
}

void main()
{
    int[] arr;
    int[] arr2 = [1];

    fun(arr);
    fun(arr2);
    writeln(arr);  // []
    writeln(arr2); // [1]

    fun2(arr);
    fun2(arr2);
    writeln(arr);  // [10]
    writeln(arr2); // [1, 10]

    mod(arr);
    mod(arr2);
    writeln(arr);  // [7]
    writeln(arr2); // [7, 10]

    mod2(arr);
    mod2(arr2);
    writeln(arr);  // [7]
    writeln(arr2); // [7, 10]
}
Mar 02 2016
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 3/3/16 2:03 AM, Daniel Kozak via Digitalmars-d wrote:
 Dne 3.3.2016 v 03:39 Jack Stouffer via Digitalmars-d napsal(a):
  Dynamic arrays are reference types in D;
 No, they are value types, but mimic reference types in some cases
The case that you pass something by reference, and change what that reference is pointing at, is not a special quality of arrays. They are most definitely reference types.

What gets people confused is that arrays will change what they reference in surprising ways. Appending is one such case. Increasing length is another.

-Steve
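
A short sketch of the appending case, under the assumption that two slices start out sharing the same elements. Once both have been appended to, at least one of them must have moved to fresh storage, so they no longer alias. The helper sameStart is hypothetical.

```d
import std.stdio;

// Hypothetical helper: true when both slices start at the same address.
bool sameStart(const int[] x, const int[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;              // b and a refer to the same elements

    b[0] = 7;
    writeln(a);               // [7, 2, 3]: the write is visible through a

    a ~= 4;                   // may grow in place or reallocate
    b ~= 5;                   // cannot reuse the slot a may have taken

    writeln(a);               // [7, 2, 3, 4]
    writeln(b);               // [7, 2, 3, 5]
    writeln(sameStart(a, b)); // false: the slices have parted ways

    b[0] = 0;
    writeln(a[0]);            // 7: writes through b no longer reach a
}
```

Which of the two appends reallocates is up to the runtime; the sketch only relies on the fact that they cannot both end up sharing the fourth slot.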
Mar 03 2016
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2016-03-02 20:42, Ozan wrote:
 Hi

 I despair of "auto var1 = var2" for arrays. Isn't it an open door for
 errors? Example:

 import std.stdio;

 void main()
 {
      int[] a;
      foreach(i; 0..10) a ~= i;
      auto b = a; // correct dlang coding: auto b = a.dup;

      a[2] = 1;
      b[2] = 5; // Overwrites assignment before
      writeln(a);
      writeln(b); // Always a == b but developer would like to have (a != b)
 }

 The behaviour differs from that of other, non-container data types.
 At first glance it looks like a data copy, but it's only a pointer copy.
Depending on your needs you can make the array const, or immutable:

void main()
{
    const(int)[] a;
    foreach(i; 0..10) a ~= i;
    auto b = a;

    a[2] = 1; // does not compile
    b[2] = 5; // does not compile
}

--
/Jacob Carlborg
Mar 02 2016
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 2 March 2016 at 19:42:02 UTC, Ozan wrote:
 Hi

 I despair of "auto var1 = var2" for arrays. Isn't it an open door 
 for errors? Example:

 import std.stdio;

 void main()
 {
 	int[] a;
 	foreach(i; 0..10) a ~= i;
 	auto b = a; // correct dlang coding: auto b = a.dup;
 	
 	a[2] = 1;
 	b[2] = 5; // Overwrites assignment before
 	writeln(a);
 	writeln(b); // Always a == b but developer would like to have (a != b)
 }

 The behaviour differs from that of other, non-container data types.
 At first glance it looks like a data copy, but it's only a 
 pointer copy.

 Regards, Ozan
Everything behaves as designed, auto changes nothing in the example and there is no security concern. We have a bingo.
Jun 01 2016
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01.06.2016 17:34, deadalnix wrote:
 On Wednesday, 2 March 2016 at 19:42:02 UTC, Ozan wrote:
 Hi

 I despair of "auto var1 = var2" for arrays. Isn't it an open door for
 errors? Example:

 import std.stdio;

 void main()
 {
     int[] a;
     foreach(i; 0..10) a ~= i;
     auto b = a; // correct dlang coding: auto b = a.dup;

     a[2] = 1;
     b[2] = 5; // Overwrites assignment before
     writeln(a);
     writeln(b); // Always a == b but developer would like to have (a != b)
 }

 The behaviour differs from that of other, non-container data types.
 At first glance it looks like a data copy, but it's only a pointer
 copy.

 Regards, Ozan
 Everything behaves as designed, auto changes nothing in the example and there is no security concern. We have a bingo.
Mutable aliasing can be error prone if it is not what you need, because then it is essentially a form of manual memory management. Built-in slices are likely just too low-level for the OP.
Jun 01 2016