www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - "a[++i] = i" vs "a[i] = ++i"

reply "Ivan Smirnov" <i.s.smirnov gmail.com> writes:
I was wondering if this behavior is actually documented anywhere? 
Let's say we want to increment i and store the new value at the 
index equal to the new (incremented) value.

	int[int] a;
	int i = 1;
	a[++i] = i;
	writeln(a);
	a[i] = ++i;
	writeln(a);

 [2:2]
 [2:2, 3:3]
If any, I would expect it to work for either one of lines but not both? What is interesting, if you compile the same in C (I used clang), you get a "warning: unsequenced modification and access to 'i'" on both lines and the answer is (as I would initially except)
 [2:1]
 [2:1, 3:3]
Dec 20 2013
next sibling parent reply =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 12/20/2013 02:29 PM, Ivan Smirnov wrote:> I was wondering if this 
behavior is actually documented anywhere? Let's
 say we want to increment i and store the new value at the index equal to
 the new (incremented) value.

      int[int] a;
      int i = 1;
      a[++i] = i;
      writeln(a);
      a[i] = ++i;
      writeln(a);

 [2:2]
 [2:2, 3:3]
If any, I would expect it to work for either one of lines but not both? What is interesting, if you compile the same in C (I used clang), you get a "warning: unsequenced modification
That is a roundabout way of saying "the assignment operator does not define a sequence point". :p
 and access to 'i'" on both
 lines and the answer is (as I would initially except)

 [2:1]
 [2:1, 3:3]
Well, regardless of one's expectations, that is one outcome of unspecified behavior. :) Although D is less vocal on these topics it is the same as C and C++: The evaluation order is unspecified. I've read before that Walter wants to eventually define such evaluation orders but it is not specified yet. Ali
Dec 20 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ali Çehreli:

 Although D is less vocal on these topics it is the same as C 
 and C++: The evaluation order is unspecified. I've read before 
 that Walter wants to eventually define such evaluation orders 
 but it is not specified yet.
Right, currently the outcome of that kind of code is not specified in D. So writing it is a error, and the D compiler doesn't warn you. Do not write that kind of code in D. Walter wants to eventually (but when? Five years from now?) make that code specified, this means all the D compilers will know what that kind of code produce and will all give the same result reliably. But in my opinion making the code unambiguous and specified for the compiler is not enough. You have also to take in account people that write and read the code, it needs to be unambiguous for them too. For them that kind of code can become tricky, even when you know the rules used by the compiler. So I'd like that kind of D code just to be disallowed, to be an error (and this doesn't break compatibility with C, because equivalent C code is not defined). Bye, bearophile
Dec 20 2013
parent reply "aldanor" <i.s.smirnov gmail.com> writes:
On Friday, 20 December 2013 at 23:17:31 UTC, bearophile wrote:

 Right, currently the outcome of that kind of code is not 
 specified in D. So writing it is a error, and the D compiler 
 doesn't warn you. Do not write that kind of code in D.

 Walter wants to eventually (but when? Five years from now?) 
 make that code specified, this means all the D compilers will 
 know what that kind of code produce and will all give the same 
 result reliably.

 But in my opinion making the code unambiguous and specified for 
 the compiler is not enough. You have also to take in account 
 people that write and read the code, it needs to be unambiguous 
 for them too. For them that kind of code can become tricky, 
 even when you know the rules used by the compiler. So I'd like 
 that kind of D code just to be disallowed, to be an error (and 
 this doesn't break compatibility with C, because equivalent C 
 code is not defined).
I completely agree. I guess my point is, it would be nice if the compiler (at least) fired a warning about this kind of things, kinda of like clang does for both statements (and rightly so). It is nigh impossible for a D newcomer figure out if this kind of code is valid or not, and as you can see from the random example above one might think that it is indeed the way to go.
Dec 20 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
aldanor:

 it would be nice if the compiler (at least) fired a warning 
 about this kind of things,
For reasons D doesn't like warnings, they are kept at a minimum. D prefers errors :-) Bye, bearophile
Dec 20 2013
parent reply "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 01:00:48 UTC, bearophile wrote:
 aldanor:

 it would be nice if the compiler (at least) fired a warning 
 about this kind of things,
For reasons D doesn't like warnings, they are kept at a minimum. D prefers errors :-) Bye, bearophile
So.. isn't it exactly the case where a warning is most suitable? :) C/C++ compilers sure get it right in this regard.
Dec 20 2013
parent reply "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 December 2013 at 02:52:26 UTC, aldanor wrote:
 So.. isn't it exactly the case where a warning is most 
 suitable? :) C/C++ compilers sure get it right in this regard.
No. There is no such thing as "suitable compiler warning". It should have been an error.
Dec 20 2013
parent reply "aldanor" <i.s.smirnov gmail.com> writes:
On Saturday, 21 December 2013 at 02:56:57 UTC, Dicebot wrote:
 No. There is no such thing as "suitable compiler warning".
So should this considered a bug then and be filed? (if that will do any good, lol)
Dec 21 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
aldanor:

 So should this considered a bug then and be filed?
I think so. I have a related EnhancementRequest open, but I have to close it down or modify it... Bye, bearophile
Dec 21 2013
parent reply David Held <dmd wyntrmute.com> writes:
On 12/21/2013 6:21 AM, bearophile wrote:
 aldanor:

 So should this considered a bug then and be filed?
I think so. I have a related EnhancementRequest open, but I have to close it down or modify it... Bye, bearophile
int mightUpdate(int& x) { ... return x; } { ... a[mightUpdate(i)] = mightUpdate(i); ... } Is this also a bug? How would the compiler know whether to emit a diagnostic for this case? Dave
Dec 23 2013
next sibling parent reply Charles Hixson <charleshixsn earthlink.net> writes:
On 12/23/2013 12:39 PM, David Held wrote:
 On 12/21/2013 6:21 AM, bearophile wrote:
 aldanor:

 So should this considered a bug then and be filed?
I think so. I have a related EnhancementRequest open, but I have to close it down or modify it... Bye, bearophile
int mightUpdate(int& x) { ... return x; } { ... a[mightUpdate(i)] = mightUpdate(i); ... } Is this also a bug? How would the compiler know whether to emit a diagnostic for this case? Dave
Yes, that should be an error. And the compiler could notice that the function parameter values are allowed to be modified, and forbid such use. As it should. After all, who's going to know which call would modify the value of i? Of course, it could be protocol that the functions included within the statement are all called before the local values are calculated, and that if the same syntax appears twice, the final value of the syntax is the value used...but that, while reasonable for the compiler, is hell on people reading the code. Note that one can make several different plausible arguments as to the order in which the code should be evaluated. When such is possible, the code should be an error. And when it relies on rarely used rules, it should probably be an error. (E.g., if you define that the location into which values will be stored must be evaluated before the expression to be stored is evaluated, you have a case that would be well defined, but it's a "rare event" kind of rule, and unless there's good reason, it should be an error.) -- Charles Hixson
Dec 23 2013
parent reply David Held <dmd wyntrmute.com> writes:
On 12/23/2013 4:12 PM, Charles Hixson wrote:
 On 12/23/2013 12:39 PM, David Held wrote:
[...]
 int mightUpdate(int& x)
 {
     ...
     return x;
 }

 {
     ...
     a[mightUpdate(i)] = mightUpdate(i);
     ...
 }

 Is this also a bug?  How would the compiler know whether to emit a
 diagnostic for this case?
Yes, that should be an error. And the compiler could notice that the function parameter values are allowed to be modified, and forbid such use. As it should. [...]
Ok, let's make it more interesting... void undefinedBehavior(int[] a, int b) { int x = f(b); int y = g(b); a[mightUpdate(a[x])] = mightUpdate(a[y]); } Now, depending on b, f(), and g(), you might have the same as above, or you might not. Should this be illegal, or allowed? Dave
Dec 23 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Tuesday, 24 December 2013 at 07:08:49 UTC, David Held wrote:
 Ok, let's make it more interesting...
The compiler is only supposed to error out on stuff that is actually illegal according to the language. What you are doing is not *illegal*, it just produces un-specified behavior. The compiler has no obligation to error out on it, and may not even be *allowed* to. At best, it might be able to detect something wrong, and be allowed to warn you about it. AFAIK, Walter is not exactly for documenting such behavior, but just for enforcing that all compilers adhere to the same behvior. EG: the "wrong" behavior will be the same on all platforms. This way, if your code works on a machine, it *should* work just fine on another.
Dec 24 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/24/2013 02:37 PM, monarch_dodra wrote:
 On Tuesday, 24 December 2013 at 07:08:49 UTC, David Held wrote:
 Ok, let's make it more interesting...
The compiler is only supposed to error out on stuff that is actually illegal according to the language. What you are doing is not *illegal*, it just produces un-specified behavior. ...
The behaviour is specified modulo evaluation order. An implementation will just pick some evaluation order, it is not allowed to produce arbitrary behaviour. In any case, it is a given that strict left-to-right evaluation order will eventually be required, so I don't think that it makes a lot of sense to discuss whether or not to error out.
Dec 24 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 23, 2013 at 04:12:05PM -0800, Charles Hixson wrote:
 On 12/23/2013 12:39 PM, David Held wrote:
[...]
int mightUpdate(int& x)
{
    ...
    return x;
}

{
    ...
    a[mightUpdate(i)] = mightUpdate(i);
    ...
}

Is this also a bug?  How would the compiler know whether to emit a
diagnostic for this case?

Dave
Yes, that should be an error. And the compiler could notice that the function parameter values are allowed to be modified, and forbid such use. As it should. After all, who's going to know which call would modify the value of i? Of course, it could be protocol that the functions included within the statement are all called before the local values are calculated, and that if the same syntax appears twice, the final value of the syntax is the value used...but that, while reasonable for the compiler, is hell on people reading the code. Note that one can make several different plausible arguments as to the order in which the code should be evaluated. When such is possible, the code should be an error. And when it relies on rarely used rules, it should probably be an error. (E.g., if you define that the location into which values will be stored must be evaluated before the expression to be stored is evaluated, you have a case that would be well defined, but it's a "rare event" kind of rule, and unless there's good reason, it should be an error.)
[...] Agreed. Note that introducing assignment into the mix may not help matters, but complicate them even more. For example: int x=1, y=0; writeln((y = ++x) + (++y--) * (x=y)); // what does this print? Or worse yet: int func1(int x) { ... } int func2(int x) { ... } int func3(int x) { ... } int function(int)[] funcptrs = [ &func1, &func2, &func3 ]; int x=0, y=0; y += funcptrs[++x--](y = --x++); // what does this do? The only place I can see where you'd even *want* to write code like this is in an entry to the IOCCC. Make it refuse to compile, I say. T -- "Holy war is an oxymoron." -- Lazarus Long
Dec 23 2013
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/24/2013 01:48 AM, H. S. Teoh wrote:
 Agreed. Note that introducing assignment into the mix may not help
 matters, but complicate them even more. For example:

 	int x=1, y=0;
 	writeln((y = ++x) + (++y--) * (x=y)); // what does this print?
 ...
'y-- is not an lvalue'. Assuming we add parens as follows: int x=1, y=0; writeln((y = ++x) + ((++y)--) * (x=y)); We should get 8 according to strict left to right evaluation rules. Furthermore, we should have the value 2 stored in both x and y.
 Or worse yet:

 	int func1(int x) { ... }
 	int func2(int x) { ... }
 	int func3(int x) { ... }

 	int function(int)[] funcptrs = [ &func1, &func2, &func3 ];

 	int x=0, y=0;
 	y += funcptrs[++x--](y = --x++); // what does this do?
'x-- is not an lvalue' 'x++ is not an lvalue' Assuming we run the following code instead: int func1(int x) { return 1*x; } int func2(int x) { return 2*x; } int func3(int x) { return 3*x; } int function(int)[] funcptrs = [ &func1, &func2, &func3 ]; int x=0, y=0; y += funcptrs[(++x)--](y = (--x)++) We should be left in a state where x contains 0 and y contains -2 according to strict left to right evaluation rules.
 The only place I can see where you'd even*want*  to write code like this
 is in an entry to the IOCCC. Make it refuse to compile, I say.
I don't think it makes sense to add some arbitrary rules to the compiler implementation just to ban some code that nobody writes anyway and potentially some perfectly fine code as well.
Dec 24 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Dec 24, 2013 at 05:15:15PM +0100, Timon Gehr wrote:
 On 12/24/2013 01:48 AM, H. S. Teoh wrote:
Agreed. Note that introducing assignment into the mix may not help
matters, but complicate them even more. For example:

	int x=1, y=0;
	writeln((y = ++x) + (++y--) * (x=y)); // what does this print?
...
'y-- is not an lvalue'.
D'oh, should've tested my code before posting it. Next time I'm gonna write all code snippets as unittests... :-P (OK, kidding.)
 Assuming we add parens as follows:
 
     int x=1, y=0;
     writeln((y = ++x) + ((++y)--) * (x=y));
 
 We should get 8 according to strict left to right evaluation rules.
 Furthermore, we should have the value 2 stored in both x and y.
What is "strict left to right evaluation", since * has to be evaluated before +? Or you consider all side-effects to be evaluated first (from left to right), and then the expression?
Or worse yet:

	int func1(int x) { ... }
	int func2(int x) { ... }
	int func3(int x) { ... }

	int function(int)[] funcptrs = [ &func1, &func2, &func3 ];

	int x=0, y=0;
	y += funcptrs[++x--](y = --x++); // what does this do?
'x-- is not an lvalue' 'x++ is not an lvalue' Assuming we run the following code instead: int func1(int x) { return 1*x; } int func2(int x) { return 2*x; } int func3(int x) { return 3*x; } int function(int)[] funcptrs = [ &func1, &func2, &func3 ]; int x=0, y=0; y += funcptrs[(++x)--](y = (--x)++) We should be left in a state where x contains 0 and y contains -2 according to strict left to right evaluation rules.
The only place I can see where you'd even*want*  to write code like
this is in an entry to the IOCCC. Make it refuse to compile, I say.
I don't think it makes sense to add some arbitrary rules to the compiler implementation just to ban some code that nobody writes anyway and potentially some perfectly fine code as well.
A potential issue is that forcing a particular evaluation order may be detrimental to the optimizer, e.g., according to strict left-to-right evaluation (if I understand you correctly), then to evaluate this: z = (y = ++x) + ((++y)--) * (x=y) you'd have to compile it into the equivalent of: y = ++x; // i.e., ++x; y = x; tmp1 = y; tmp2 = ++y; // i.e., ++y; tmp2 = y; y--; x=y; z = tmp1 + tmp2 * x; // i.e., tmp3 = tmp2*x; tmp3 += tmp1; z = tmp3; Since the order can't be permuted without changing the semantics, the codegen would have to load x and y multiple times as well as introduce temporaries in order to evaluate the final expression, and the optimizer wouldn't be able to do anything about it. Of course, this is probably not *that* big of a deal if normal expressions without side-effects can be optimized as usual -- it'd be the user's own fault for writing unoptimizable code. Code with side-effects are already difficult to optimize anyway, so it probably doesn't make that big of a difference. T -- "No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
Dec 24 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/24/2013 06:46 PM, H. S. Teoh wrote:
 ...

 What is "strict left to right evaluation", since * has to be evaluated
 before +? Or you consider all side-effects to be evaluated first (from
 left to right), and then the expression?
It is about the order operands are evaluated in. Eg. if you have a binary expression, first evaluate the left operand and then the right operand (and finally compute the result).
...

 A potential issue is that forcing a particular evaluation order may be
 detrimental to the optimizer, e.g., according to strict left-to-right
 evaluation (if I understand you correctly), then to evaluate this:

 	z = (y = ++x) + ((++y)--) * (x=y)

 you'd have to compile it into the equivalent of:

 	y = ++x;	// i.e., ++x; y = x;
 	tmp1 = y;
 	tmp2 = ++y;	// i.e., ++y; tmp2 = y;
 	y--;
 	x=y;
 	z = tmp1 + tmp2 * x; // i.e., tmp3 = tmp2*x; tmp3 += tmp1; z = tmp3;

 Since the order can't be permuted without changing the semantics, the
 codegen would have to load x and y multiple times as well as introduce
 temporaries in order to evaluate the final expression, and the optimizer
 wouldn't be able to do anything about it.
...
Yes, but in this case we wouldn't want the optimizer to do anything other than this. The case where unspecified order helps is when the optimizer cannot prove that two evaluation orders yield the same result (even though it is the case), in which case it can freely choose the more performant of the two. I don't think this is the right trade-off for D.
 Of course, this is probably not *that* big of a deal if normal
 expressions without side-effects can be optimized as usual -- it'd be
 the user's own fault for writing unoptimizable code. Code with
 side-effects are already difficult to optimize anyway, so it probably
 doesn't make that big of a difference.


 T
Dec 24 2013
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Dec 20, 2013 at 11:29:54PM +0100, Ivan Smirnov wrote:
 I was wondering if this behavior is actually documented anywhere?
 Let's say we want to increment i and store the new value at the
 index equal to the new (incremented) value.
 
 	int[int] a;
 	int i = 1;
 	a[++i] = i;
 	writeln(a);
 	a[i] = ++i;
 	writeln(a);
[...] In C/C++, the evaluation order in this case is unspecified, so you're in the territory of undefined behaviour. Whatever you observe will be specific to the compiler (and version) you're using, and you cannot depend on getting the same result across different compilers (or even different versions of the same compiler, or even different compiler flags -- using -O may change the behaviour in some cases). Now, in D, I believe this is currently also undefined behaviour, although Walter did mention at some point that he wanted to fix evaluation order in these cases. But I don't know if that has happened yet. Either way, you're treading on dangerous ground, since the spec currently doesn't say one way or another, so different compilers could have different interpretations of what the above code means. So the runtime behaviour could change from compiler to compiler, or from version to version, or from specifying -O or not. Mixing side-effects in an expression that references the changed value multiple times should be avoided in general because of these kinds of ambiguity. For example, what should the following code do? int a = 1; int b = ++a * --a + --a * ++a; (N.B. * must be evaluated before +. So does ++/-- get evaluated before or after *, or something else?) T -- Государство делает вид, что платит нам зарплату, а мы делаем вид, что работаем.
Dec 20 2013