
digitalmars.D - why $ is need to access array [negative index]?

mw <mingwu gmail.com> writes:
In Python it's such a convenience to be able to access an array 
element from the end:

arr[-1]

in D, we can do that too, but need an extra $: arr[$-1]

I'm porting some code from Python to D:

   int[3] signs;          // sign: -1, 0, 1
   int sign = -1;         // for example
   writeln(signs[sign]);  // Range violation

// Error: array index 18446744073709551615 is out of bounds 
signs[0 .. 3]

(yes, I know I can use an AA, int[int], but it just makes things 
complicated)

Can we have a DIP to remove `$`, or make it optional, in this usage?

Thoughts?
Sep 18 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/18/20 3:53 PM, mw wrote:
 In Python it's such a convenience to be able to access array element 
 from the end:
 
 arr[-1]
 
 in D, we can do that too, but need an extra $: arr[$-1]
 
 I'm porting some code from Python to D:
 
    int[3] signs;          // sign: -1, 0, 1
    int sign = -1;         // for example
    writeln(signs[sign]);  // Range violation
 
 // Error: array index 18446744073709551615 is out of bounds signs[0 .. 3]
 
 (yes, I know I can use AA, int[int], but it just make things complicated)
 
 Can we have a DIP remove / make optional `$` in this usage?
 
 Thoughts?
 
I would say no. The indexing code lowers to a machine instruction. Making it so a negative value means something else means every single indexing operation is going to have to check whether it's negative, and if so do something completely different.

You can create your own array type if you want this behavior.

-Steve
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 20:06:01 UTC, Steven 
Schveighoffer wrote:
 I would say no. The indexing code lowers to a machine 
 instruction. Making it so a negative value means something else 
 means every single indexing operation is going to have to check 
 whether it's negative, and if so do something completely 
 different.
Currently we already have a range check on every single indexing operation, so the trade-off here is: adding one more check vs. the convenience it buys.
 You can create your own array type if you want this behavior.
Sep 18 2020
mipri <mipri minimaltype.com> writes:
On Friday, 18 September 2020 at 20:46:00 UTC, mw wrote:
 On Friday, 18 September 2020 at 20:06:01 UTC, Steven 
 Schveighoffer wrote:
 I would say no. The indexing code lowers to a machine 
 instruction. Making it so a negative value means something 
 else means every single indexing operation is going to have to 
 check whether it's negative, and if so do something completely 
 different.
Currently we have range check on every single indexing operation already; so the trade-off here is: adding one more check v.s. the convenience it buys.
Range checks that never fire because your code isn't buggy are very friendly to the branch predictor. Negative-indexing checks are not so friendly.
Sep 18 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/18/20 4:46 PM, mw wrote:
 On Friday, 18 September 2020 at 20:06:01 UTC, Steven Schveighoffer wrote:
 I would say no. The indexing code lowers to a machine instruction. 
 Making it so a negative value means something else means every single 
 indexing operation is going to have to check whether it's negative, 
 and if so do something completely different.
Currently we have range check on every single indexing operation already; so the trade-off here is: adding one more check v.s. the convenience it buys.
   -boundscheck=off

This currently disables the bounds checks. If your proposed feature went in, then it would still have to check the index for sign.

-Steve
Sep 18 2020
IGotD- <nise nise.com> writes:
On Friday, 18 September 2020 at 20:46:00 UTC, mw wrote:
 Currently we have range check on every single indexing 
 operation already; so the trade-off here is: adding one more 
 check v.s. the convenience it buys.
We want as few range checks as possible because array indexing is quite often done in loops. Adding an extra check could affect performance in this case.
Sep 18 2020
claptrap <clap trap.com> writes:
On Friday, 18 September 2020 at 21:31:53 UTC, IGotD- wrote:
 On Friday, 18 September 2020 at 20:46:00 UTC, mw wrote:
 Currently we have range check on every single indexing 
 operation already; so the trade-off here is: adding one more 
 check v.s. the convenience it buys.
We want as few range checks as possible because indexing arrays are not too seldom done it loops. Adding an extra check could affect performance in this case.
You don't need an extra check, you just do an unsigned comparison; negative values will be interpreted as larger. I.e.:

   int idx = length - i;
   assert(cast(uint) idx < length);

catches anything outside the valid range.

Yes, you have the issue that the index could wrap around and end up valid, but you already have that anyway. I mean, if (idx-1) is going to cause a problem, so will (idx+0xFFFFFFFF).
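
The unsigned-comparison idea above can be sketched as follows. The helper name `inBounds` is hypothetical and not from the thread, and `size_t` is used for the cast so the comparison matches the array length type. Note that this only rejects negative values; it does not by itself map -1 to the last element.

```d
import std.stdio : writeln;

// Hypothetical helper: a single unsigned comparison rejects both
// negative indexes and indexes that are too large.
bool inBounds(int idx, size_t length)
{
    // Casting a negative int to size_t wraps it to a huge value,
    // so it fails the same `< length` test as an oversized index.
    return cast(size_t) idx < length;
}

void main()
{
    int[3] arr = [10, 20, 30];
    writeln(inBounds(0, arr.length));   // true
    writeln(inBounds(2, arr.length));   // true
    writeln(inBounds(3, arr.length));   // false
    writeln(inBounds(-1, arr.length));  // false: -1 wraps to size_t.max
}
```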
Sep 20 2020
IGotD- <nise nise.com> writes:
On Sunday, 20 September 2020 at 14:01:58 UTC, claptrap wrote:
 You dont need an extra check, you just do an unsigned 
 comparison, negative values will interpreted as larger. IE...

 int idx = length-i;
 assert(cast(unsigned) idx < length)

 catches anything outside the valid range.

 Yes you have the issue that the index could wrap around and end 
 up valid, but you already have that anyway. I mean if (idx-1) 
 is going to cause a problem so will (idx+0xFFFFFFFF)
No, if -1 is going to be interpreted as the element just before the end of the array, you will need an extra check in addition to the range check. This is still completely irrelevant for D, as indexes are of type size_t, which is unsigned.
Sep 20 2020
claptrap <clap trap.com> writes:
On Sunday, 20 September 2020 at 14:06:10 UTC, IGotD- wrote:
 On Sunday, 20 September 2020 at 14:01:58 UTC, claptrap wrote:
 You dont need an extra check, you just do an unsigned 
 comparison, negative values will interpreted as larger. IE...

 int idx = length-i;
 assert(cast(unsigned) idx < length)

 catches anything outside the valid range.

 Yes you have the issue that the index could wrap around and 
 end up valid, but you already have that anyway. I mean if 
 (idx-1) is going to cause a problem so will (idx+0xFFFFFFFF)
No, if -1 is going to be interpreted as the element just before the end of the array you will need to have an extra check additional from the extra range check. Still completely irrelevant for D as indexes are of type size_t which is unsigned.
Never mind, I misunderstood the OP.
Sep 20 2020
IGotD- <nise nise.com> writes:
On Friday, 18 September 2020 at 19:53:41 UTC, mw wrote:
 In Python it's such a convenience to be able to access array 
 element from the end:

 arr[-1]

 in D, we can do that too, but need an extra $: arr[$-1]

 I'm porting some code from Python to D:

   int[3] signs;          // sign: -1, 0, 1
   int sign = -1;         // for example
   writeln(signs[sign]);  // Range violation

 // Error: array index 18446744073709551615 is out of bounds 
 signs[0 .. 3]

 (yes, I know I can use AA, int[int], but it just make things 
 complicated)

 Can we have a DIP remove / make optional `$` in this usage?

 Thoughts?
Array indexes in D are unsigned integers, so negative numbers don't exist. What you are suggesting isn't possible in D. $ is the length of the array, so $-1 is correct in this regard and produces a correct positive number, given that it is inside the array bounds.
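
A minimal sketch of what happens to a negative value used as an index, assuming a 3-element static array like the one in the original post (the element values are made up for illustration):

```d
import std.stdio : writeln;

void main()
{
    int[3] signs = [10, 20, 30];
    int sign = -1;

    // Array indexes are size_t, so the int is implicitly converted
    // and wraps around to the largest unsigned value.
    size_t idx = sign;
    writeln(idx == size_t.max);  // true

    // Inside the brackets `$` is the array length,
    // so `$ - 1` is the last valid index.
    writeln(signs[$ - 1]);       // 30
}
```

On a 64-bit target `size_t.max` is 18446744073709551615, which is the value shown in the range violation message from the original post.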
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 20:08:36 UTC, IGotD- wrote:
 Array indexes in D are using unsigned integers so negative 
 numbers don't exist. What you are suggesting isn't possible in
This is the *consequence* of the current design. We are talking about changing the design here.
 D. $ is the length of the array so $-1 is correct in this 
 regard and produces a correct positive number given that it is 
 inside the array bounds.
Sep 18 2020
bachmeier <no spam.net> writes:
On Friday, 18 September 2020 at 19:53:41 UTC, mw wrote:
 In Python it's such a convenience to be able to access array 
 element from the end:

 arr[-1]

 in D, we can do that too, but need an extra $: arr[$-1]

 I'm porting some code from Python to D:

   int[3] signs;          // sign: -1, 0, 1
   int sign = -1;         // for example
   writeln(signs[sign]);  // Range violation

 // Error: array index 18446744073709551615 is out of bounds 
 signs[0 .. 3]

 (yes, I know I can use AA, int[int], but it just make things 
 complicated)

 Can we have a DIP remove / make optional `$` in this usage?

 Thoughts?
I'm inclined to say typing a single character is not a hardship in exchange for extreme clarity of the code. Keep in mind that Python is not the only language with a negative array index.

For instance, C supports it: https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c That's probably the place to look if you want to start, given the relationship of C and D.

Then there's R, for which x[-1] means to drop the first element. That makes a lot of sense if x is treated as a vector of data.

Python's usage, on the other hand, is not at all intuitive. Why -4 would mean the fourth to last element is unclear. I believe it was copied from Perl. Ruby does the same thing.

Then there's PHP, which allows you to use a negative array index as an arbitrary reference to an element: x[-2] could be any of the elements.

The strange one is Javascript, which has negative indexes that are actually properties or something like that.

Bottom line is that the Python approach is one of many, it only makes sense if someone tells you what it means, and it saves you a single character in return for less clear code.

As noted, it's really easy to create a struct that operates like Python if you want. That's the beauty of D.
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 I'm inclined to say typing a single character is not a hardship 
 in exchange for extreme clarity of the code. Keep in mind that
It's not about saving "typing a single character". In the example I showed,

   signs[sign] = ...;  // the sign can be -1, 0, 1

in D, to write the same code, you have to test the sign and branch:

   if (sign >= 0) {
     signs[  sign] = ...;
   } else {
     signs[$+sign] = ...;  // remember + here
   }
Sep 18 2020
bachmeier <no spam.net> writes:
On Friday, 18 September 2020 at 20:56:30 UTC, mw wrote:
 On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 I'm inclined to say typing a single character is not a 
 hardship in exchange for extreme clarity of the code. Keep in 
 mind that
It not about saving "typing a single character". In the example I showed, signs[sign] = ...; // the sign can be -1, 0, 1 in D, to write the same code, you have to test the sign and branch: if (sign >= 0) { signs[ sign] = ...; } else { signs[$+sign] = ...; // remember + here }
I'm confused. The justification for a major language change would be convenience of porting Python code to D?
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 21:02:11 UTC, bachmeier wrote:
 in D, to write the same code, you have to test the sign and 
 branch:

 if (sign >= 0) {
   signs[  sign] = ...;
 } else {
   signs[$+sign] = ...;  // remember + here
 }
I'm confused. The justification for a major language change would be convenience of porting Python code to D?
You are indeed confused: the justification is that either the compiler writes that branching code, or the user has to write it.
Sep 18 2020
Ethan <gooberman gmail.com> writes:
On Friday, 18 September 2020 at 21:06:22 UTC, mw wrote:
 You are indeed confused: the justification is either compiler 
 writes that branching code, or user have to write it.
Good. Let the user write it every time. I don't want the compiler inserting branches into my code simply because of the essentially-never chance I might not want to put a $ in front of a -<integer> statement for array access.

Further to writing it every time: write an array template. It's simple. Here are the important parts; the rest isn't too difficult to write from here.

   struct Array( T )
   {
     ref auto opIndex( int Index )
     {
       // Negative indexes count back from the end of the slice.
       // The cast is needed because slice.length is size_t, not int.
       if( Index < 0 ) Index = cast( int )slice.length + Index;
       return slice[ Index ];
     }

     T[] slice;
   }
Sep 18 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/18/20 4:56 PM, mw wrote:
 On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 I'm inclined to say typing a single character is not a hardship in 
 exchange for extreme clarity of the code. Keep in mind that
It not about saving "typing a single character". In the example I showed, signs[sign] = ...;  // the sign can be -1, 0, 1 in D, to write the same code, you have to test the sign and branch: if (sign >= 0) {   signs[  sign] = ...; } else {   signs[$+sign] = ...;  // remember + here }
Well, if you don't care about verbosity.

   signs[(sign + $) % $] = ...;

-Steve
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 21:11:58 UTC, Steven 
Schveighoffer wrote:
 Well, if you don't care about verbosity.

 signs[(sign + $) % $] = ...;
I want the compiler to write this verbosity, instead of the user :-)
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 21:14:19 UTC, mw wrote:
 On Friday, 18 September 2020 at 21:11:58 UTC, Steven 
 Schveighoffer wrote:
 Well, if you don't care about verbosity.

 signs[(sign + $) % $] = ...;
I want compiler writes this verbosity, instead of the user :-)
And ... this [(sign + $) % $] is not right, it's defensive programming, and it will hide real bugs, e.g. when array index > array.length.

See how easily user code ends up containing bugs. The compiler should really do the hard work to make the programmer's life easier.
Sep 18 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/18/20 5:17 PM, mw wrote:
 On Friday, 18 September 2020 at 21:14:19 UTC, mw wrote:
 On Friday, 18 September 2020 at 21:11:58 UTC, Steven Schveighoffer wrote:
 Well, if you don't care about verbosity.

 signs[(sign + $) % $] = ...;
I want compiler writes this verbosity, instead of the user :-)
And ... this [(sign + $) % $] is not right, it's defensive programming: and will hide real bugs, e.g. when array index > array.length.
It depends on how you want to define indexing. Let me write that a different way:

"And ... this processing of negative indexes is not right, it's defensive programming: and will hide real bugs, e.g. when array index < 0"

D array indexing is set in stone. Like really hard, billion-year-old stone. Again, if you want a different indexing scheme, write a type. D makes it really easy!

-Steve
Sep 18 2020
mipri <mipri minimaltype.com> writes:
On Friday, 18 September 2020 at 21:14:19 UTC, mw wrote:
 On Friday, 18 September 2020 at 21:11:58 UTC, Steven 
 Schveighoffer wrote:
 Well, if you don't care about verbosity.

 signs[(sign + $) % $] = ...;
I want compiler writes this verbosity, instead of the user :-)
It's not free, and not always desirable. This code would silently accept sign=100 for example.
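
To make that concrete, here is a small sketch of the modulo expression from the quoted post, wrapped in a hypothetical helper named `wrapIndex` (the name and the explicit `length` parameter are assumptions for illustration):

```d
import std.stdio : writeln;

// Hypothetical helper equivalent to `(sign + $) % $` for an array of `length` elements.
// The addition is done in size_t, so a small negative `sign` wraps back into range.
size_t wrapIndex(int sign, size_t length)
{
    return (sign + length) % length;
}

void main()
{
    writeln(wrapIndex(-1, 3));   // 2: the last element, as in Python
    writeln(wrapIndex(1, 3));    // 1
    writeln(wrapIndex(100, 3));  // 1: an out-of-range index is silently accepted
}
```

The last call shows the trade-off: the modulo always produces an index inside the array, so the bounds check can no longer catch a bad index.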
Sep 18 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/18/20 5:14 PM, mw wrote:
 On Friday, 18 September 2020 at 21:11:58 UTC, Steven Schveighoffer wrote:
 Well, if you don't care about verbosity.

 signs[(sign + $) % $] = ...;
I want compiler writes this verbosity, instead of the user :-)
A custom type will do this for you. Just write one. Would be as easy as:

   arr.pyIdx[-1]; // uses python indexing.

-Steve
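
One way such a `pyIdx` helper might look is sketched below. The thread only gives the call syntax, so the `PyIndexed` type and the body of `pyIdx` are assumptions, not an existing library API.

```d
import std.stdio : writeln;

// Hypothetical view type: wraps a slice and applies Python-style indexing.
struct PyIndexed(T)
{
    T[] data;

    ref T opIndex(ptrdiff_t idx)
    {
        // Negative indexes count from the end. Anything still out of range
        // is caught by the normal bounds check on `data`.
        return idx < 0 ? data[$ + idx] : data[idx];
    }
}

// Hypothetical helper so the view can be requested via UFCS: arr.pyIdx[-1]
PyIndexed!T pyIdx(T)(T[] arr)
{
    return PyIndexed!T(arr);
}

void main()
{
    int[] arr = [1, 2, 3];
    writeln(arr.pyIdx[-1]);  // 3
    writeln(arr.pyIdx[0]);   // 1
}
```

The sign test is paid only where `pyIdx` is used, while ordinary `arr[i]` indexing keeps its current cost.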
Sep 18 2020
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Fri, Sep 18, 2020 at 09:14:19PM +0000, mw via Digitalmars-d wrote:
 On Friday, 18 September 2020 at 21:11:58 UTC, Steven Schveighoffer wrote:
 Well, if you don't care about verbosity.
 
 signs[(sign + $) % $] = ...;
I want compiler writes this verbosity, instead of the user :-)
Yawn.

   struct MyArray(T) {
       T[] impl;
       alias impl this;

       this(T[] data) { impl = data; }

       ref T opIndex(ptrdiff_t idx) {
           return (idx < 0) ? impl[$ + idx] : impl[idx];
       }
   }

   MyArray!int x = [ 1, 2, 3 ];
   assert(x[-1] == 3); // not verbose anymore

T

-- 
You only live once.
Sep 18 2020
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 18 September 2020 at 20:56:30 UTC, mw wrote:
 On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 I'm inclined to say typing a single character is not a 
 hardship in exchange for extreme clarity of the code. Keep in 
 mind that
It not about saving "typing a single character". In the example I showed, signs[sign] = ...; // the sign can be -1, 0, 1 in D, to write the same code, you have to test the sign and branch: if (sign >= 0) { signs[ sign] = ...; } else { signs[$+sign] = ...; // remember + here }
signs[1+sign]
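
A short sketch of this offset approach, assuming the sign is always one of -1, 0 or 1. The helper name `slot` and the counting loop are made up for illustration:

```d
import std.stdio : writeln;

// Hypothetical helper: shifts a sign in -1 .. 1 to an index in 0 .. 2.
size_t slot(int sign)
{
    return 1 + sign;
}

void main()
{
    int[3] counts;  // slot 0 holds sign -1, slot 1 holds sign 0, slot 2 holds sign 1
    foreach (sign; [-1, 0, 1, 1])
        counts[slot(sign)]++;
    writeln(counts);  // [1, 1, 2]
}
```

The storage order differs from the Python version, where -1 lands in the last slot, but that does not matter as long as every access goes through the same offset.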
Sep 18 2020
bachmeier <no spam.net> writes:
On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 On Friday, 18 September 2020 at 19:53:41 UTC, mw wrote:
 In Python it's such a convenience to be able to access array 
 element from the end:

 arr[-1]

 in D, we can do that too, but need an extra $: arr[$-1]

 I'm porting some code from Python to D:

   int[3] signs;          // sign: -1, 0, 1
   int sign = -1;         // for example
   writeln(signs[sign]);  // Range violation

 // Error: array index 18446744073709551615 is out of bounds 
 signs[0 .. 3]

 (yes, I know I can use AA, int[int], but it just make things 
 complicated)

 Can we have a DIP remove / make optional `$` in this usage?

 Thoughts?
I'm inclined to say typing a single character is not a hardship in exchange for extreme clarity of the code. Keep in mind that Python is not the only language with a negative array index. For instance, C supports it: https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c That's probably the place to look if you want to start, given the relationship of C and D. Then there's R, for which x[-1] means to drop the first element. That makes a lot of sense if x is treated as a vector of data. Python's usage, on the other hand, is not at all intuitive. Why -4 would mean the fourth to last element is unclear. I believe it was copied from Perl. Ruby does the same thing. Then there's PHP, which allows you to use a negative array index as an arbitrary reference to an element: x[-2] could be any of the elements. The strange one is Javascript, which has negative indexes that are actually properties or something like that. Bottom line is that the Python approach is one of many, it only makes sense if someone tells you what it means, and it saves you a single character in return for less clear code. As noted, it's really easy to create a struct that operates like Python if you want. That's the beauty of D.
I forgot to add when discussing C, that D already has negative index values, which mean something completely different than Python's usage:

   double[] x = [1, 2, 3, 4];
   double * y = &(x.ptr)[2];
   writeln(y[-2]); // 1
Sep 18 2020
mw <mingwu gmail.com> writes:
On Friday, 18 September 2020 at 20:57:04 UTC, bachmeier wrote:
 I forgot to add when discussing C, that D already has negative 
 index values, which mean something completely different than 
 Python's usage:

 double[] x = [1, 2, 3, 4];
 double * y = &(x.ptr)[2];
 writeln(y[-2]); // 1
This is using array index syntax on a raw pointer.
Sep 18 2020
mipri <mipri minimaltype.com> writes:
On Friday, 18 September 2020 at 20:48:47 UTC, bachmeier wrote:
 On Friday, 18 September 2020 at 19:53:41 UTC, mw wrote:
 In Python it's such a convenience to be able to access array 
 element from the end:

 arr[-1]

 in D, we can do that too, but need an extra $: arr[$-1]

 I'm porting some code from Python to D:

   int[3] signs;          // sign: -1, 0, 1
   int sign = -1;         // for example
   writeln(signs[sign]);  // Range violation

 // Error: array index 18446744073709551615 is out of bounds 
 signs[0 .. 3]

 (yes, I know I can use AA, int[int], but it just make things 
 complicated)

 Can we have a DIP remove / make optional `$` in this usage?

 Thoughts?
I'm inclined to say typing a single character is not a hardship in exchange for extreme clarity of the code.
The hardship isn't typing a single character, but requiring an explicit test to determine whether you should index with or without that single character. Consider:

   int ex(int i) {
       immutable int[3] ns = [1, 2, 3];
       if (i < 0) {
           return ns[$+i];
       } else {
           return ns[i];
       }
   }

   void main() {
       import std.stdio, std.range, std.algorithm;
       iota(3)
           .map!(n => n - 1)
           .map!ex
           .each!writeln;
   }

In Python you can calculate an index on a numeric plane that includes negative numbers and then use it; in D you have some syntax to conveniently refer to the size of the array.
Sep 18 2020