www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Array start index

reply "DLearner" <bmqazwsx123 gmail.com> writes:
Does the D language set in stone that the first element of an 
array _has_ to be index zero?
Wouldn't starting array elements at one avoid the common 
'off-by-one' logic error, it does
seem more natural to begin a count at 1.

Actually, maybe even better to allow array definitions of form
int foo[x:y];
(y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

I think the (very old) IBM PL/I language was like this.
Aug 01 2015
next sibling parent Rikki Cattermole <alphaglosined gmail.com> writes:
On 1/08/2015 9:35 p.m., DLearner wrote:
 Does the D language set in stone that the first element of an array
 _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 'off-by-one'
 logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
In c style languages (like D) the index actually defines the offset in memory. Not actually the index. So while in some languages 1 is used instead of 0, 0 maps better to the hardware. Think of this byte array: ubyte* myptr = [ |---|-------| | i | value | |---|-------| | 0 | 1 | | 1 | 255 | ]; For 0 start of index: size_t i = 0; assert(myptr[i] == 1); For 1 start of index: size_t i = 1; assert(i != 0); assert(myptr[i-1] == 1); While this is not the complete reason why 0 is chosen, it is something to think about.
Aug 01 2015
prev sibling next sibling parent "bachmeier" <no spam.net> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
Seems you could easily wrap an array in a struct, define opIndex and opSlice appropriately, and use alias this to keep the other properties of the array. The problem you'll run into is interacting with other code that assumes zero-based indexing. I thought about setting the first index at 1 for my library that embeds D inside R, since that's how it's done in R, but very quickly realized how confusing it would be.
Aug 01 2015
prev sibling next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
For the builtin slice types? Yes, set in stone.
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
See https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html As other commenters have/will point out, you can easily define a custom type in D that behaves as you describe, and - Dijkstra notwithstanding - there are valid uses for such things.
Aug 01 2015
parent reply "DLearner" <bmqazwsx123 gmail.com> writes:
On Saturday, 1 August 2015 at 17:55:06 UTC, John Colvin wrote:
 On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
For the builtin slice types? Yes, set in stone.
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], 
 foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
See https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html As other commenters have/will point out, you can easily define a custom type in D that behaves as you describe, and - Dijkstra notwithstanding - there are valid uses for such things.
D is a C derivative, so it seems a shame not to identify causes of bugs in C, and design them out in D. For example, in C difficult to write non-trivial commercial programs without using pointers. Pointer manipulation has a terrible reputation for bugs. But in D, easy to write commercial programs without using pointers. Problem has been designed away. Similarly, off-by-one array bugs are commonplace in C. We should seek to eliminate the source of those bugs, which basically reduces to the issue that programmers find it unnatural to start a count at zero. Whether they _should_ find a zero start unnatural is irrelevant - they just do as an observed fact, so let's change the language so the issue is avoided (designed away). Suggestion: if the codebase for D is considered so large that zero-basing cannot now be changed, why not extend the language to allow for array definitions like 'int[x:y] foo'? And then have a rule that 'int[:y] bar' defines a 1-based array of y elements?
Aug 01 2015
parent reply Andrej Mitrovic via Digitalmars-d-learn writes:
On 8/1/15, DLearner via Digitalmars-d-learn
<digitalmars-d-learn puremagic.com> wrote:
 D is a C derivative, so it seems a shame not to identify causes
 of bugs in C,
 and design them out in D.
This has already been done! D defines an array to be a struct with a pointer and a length. See this article: http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625 I would argue it's not "off-by-one" that's causing most issues when dealing with C "arrays", but instead it's in general out-of-bounds issues (whether it's off bye one or off by 50..) since you often don't have the length or could easily use the wrong variable as the length. Think about how much D code would actually have subtle off-by-one errors if D didn't use 0-based indexing like the majority of popular languages use. Any time you would interface with other languages you would have to double, triple-check all your uses of arrays. FWIW at the very beginning I also found it odd that languages use 0-based indexing, but that was before I had any significant programming experience under my belt. By now it's second nature to me to use 0-based indexing.
Aug 01 2015
parent reply "bachmeier" <no spam.net> writes:
On Saturday, 1 August 2015 at 19:04:10 UTC, Andrej Mitrovic wrote:
 On 8/1/15, DLearner via Digitalmars-d-learn 
 <digitalmars-d-learn puremagic.com> wrote:
 D is a C derivative, so it seems a shame not to identify causes
 of bugs in C,
 and design them out in D.
This has already been done! D defines an array to be a struct with a pointer and a length. See this article: http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625 I would argue it's not "off-by-one" that's causing most issues when dealing with C "arrays", but instead it's in general out-of-bounds issues (whether it's off bye one or off by 50..) since you often don't have the length or could easily use the wrong variable as the length. Think about how much D code would actually have subtle off-by-one errors if D didn't use 0-based indexing like the majority of popular languages use. Any time you would interface with other languages you would have to double, triple-check all your uses of arrays. FWIW at the very beginning I also found it odd that languages use 0-based indexing, but that was before I had any significant programming experience under my belt. By now it's second nature to me to use 0-based indexing.
But what type of programming are you doing? Even after decades of programming and trying out dozens of languages, zero-based indexing still gets me at times when the arrays I work with represent vectors and matrices. Especially when porting code from other languages that use one-based indexing. One of the nice things about D is that it gives you the tools to easily make the change if you want.
Aug 01 2015
parent reply "QAston" <qastonx gmail.com> writes:
On Saturday, 1 August 2015 at 23:02:51 UTC, bachmeier wrote:
 But what type of programming are you doing? Even after decades 
 of programming and trying out dozens of languages, zero-based 
 indexing still gets me at times when the arrays I work with 
 represent vectors and matrices. Especially when porting code 
 from other languages that use one-based indexing. One of the 
 nice things about D is that it gives you the tools to easily 
 make the change if you want.
Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.
Aug 02 2015
parent reply "bachmeier" <no spam.com> writes:
On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:

 Adding 1-indexed arrays to the language fixes nothing. Just 
 write your 1-indexed array type and if you enjoy using it, 
 publish it as a library. Who knows, if demand is high it may 
 even end up in phobos.
Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Aug 03 2015
parent reply "DLearner" <bmqazwsx123 gmail.com> writes:
On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:
 On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:

 Adding 1-indexed arrays to the language fixes nothing. Just 
 write your 1-indexed array type and if you enjoy using it, 
 publish it as a library. Who knows, if demand is high it may 
 even end up in phobos.
Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Looks like 0-base is fixed, to avoid problems with existing code. But nothing stops _adding_ to the language by allowing int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y]. Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y]. That way, 1-start achieved, with no conflict with existing code?
Aug 03 2015
next sibling parent Jonathan M Davis via Digitalmars-d-learn writes:
On Monday, August 03, 2015 21:32:03 DLearner via Digitalmars-d-learn wrote:
 On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:
 On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:

 Adding 1-indexed arrays to the language fixes nothing. Just
 write your 1-indexed array type and if you enjoy using it,
 publish it as a library. Who knows, if demand is high it may
 even end up in phobos.
Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
Looks like 0-base is fixed, to avoid problems with existing code. But nothing stops _adding_ to the language by allowing int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y]. Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y]. That way, 1-start achieved, with no conflict with existing code?
Almost all programming languages in heavy use at this point in time start indexing at 0. It would be highly confusing to almost all programmers out there to have 1-based indexing. In addition, having 0-based indexing actually makes checking against the end of arrays and other random-access ranges easier. You can just check against length without having to do any math. In general, I would expect 1-based indexing to _increase_ the number of off by one errors in code - both because 0-based indexing helps avoid such problems when dealing with the end of the array and more importantly, because almost everyone expects 0-based indexing. You're really barking up the wrong tree if you're trying to get any support for 1-based indexing in D. I doubt that you will see much of anyone who thinks that it's even vaguely a good idea, and there's no way that Walter or Andrei (or probably anyone in the main dev team) who is going to agree that it's even worth considering. I think that the reality of the matter is that if you're going to do much programming - especially if you're going to be professional programmer - you just need to get used to the idea that array indices start at 0. There are a few languages out there where they don't, but they are far from the norm. - Jonathan M Davis
Aug 03 2015
prev sibling parent "QAston" <qaston gmail.com> writes:
On Monday, 3 August 2015 at 21:32:05 UTC, DLearner wrote:
 Looks like 0-base is fixed, to avoid problems with existing 
 code.

 But nothing stops _adding_ to the language by allowing
 int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., 
 foo[y].
 Plus rule that int[:y] means valid symbols are foo[1], 
 foo[2],..., foo[y].

 That way, 1-start achieved, with no conflict with existing code?
There're quite a few things stopping this from being added to the language. 1. People will have to learn this new feature and it's interaction with gazillion of other D features. 2. There would be a redundancy - core language will have 2 array types while one of them can be easily implemented using the other. 3. Devs will have to maintain it - as if they don't have enough things to fix atm. Really, this is so simple to do as a library - just use opIndex, opSlice with a template struct. As a general rule - start asking for language features only when things can't be done without them.
Aug 04 2015
prev sibling next sibling parent "jmh530" <john.michael.hall gmail.com> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
I come from matlab and R which start matrices at 1. I used to think it was more natural. However after I started using numpy I now think 0 index is better. Also see http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html Give it a try. You might find you like it.
Aug 03 2015
prev sibling next sibling parent "Observer" <spurious.address yahoo.com> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].
This experiment has already been run. Perl used to support a $[ variable to set the array base. After experience with the confusion and problems that causes, it was finally deprecated and effectively removed from the language. See the end paragraphs of http://perldoc.perl.org/perlvar.html and also http://search.cpan.org/~wolfsage/perl/ext/arybase/arybase.pm for more info.
Aug 03 2015
prev sibling next sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.
I, too, don't think this is a good idea in general, but I can see a few use-cases where 1-based indices may be more natural. It's easy to define a wrapper: struct OneBasedArray(T) { T[] _payload; alias _payload this; T opIndex(size_t index) { assert(index > 0); return _payload[index-1]; } void opIndexAssign(U : T)(size_t index, auto ref U value) { assert(index > 0); _payload[index-1] = value; } } unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 4; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 4); } Test with: rdmd -main -unittest xx.d This can of course be easily extended to support other bases than one.
Aug 04 2015
next sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Tuesday, 4 August 2015 at 08:18:50 UTC, Marc Schütz wrote:
         void opIndexAssign(U : T)(size_t index, auto ref U 
 value) {
Careful here, you've got the arguments reversed. The unit test didn't detect this because it was ambiguous. This one isn't: unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 14; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 14); }
Feb 06 2017
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Monday, 6 February 2017 at 14:26:35 UTC, Bastiaan Veelo wrote:
 The unit test didn't detect this because it was ambiguous.
Sorry for that misinformation. I should have said that opIndexAssign wasn't tested. Here is a better test. unittest { OneBasedArray!int arr; arr = [1,2,3]; arr ~= 4; arr[4] = 14; assert(arr.length == 4); assert(arr[1] == 1); assert(arr[2] == 2); assert(arr[3] == 3); assert(arr[4] == 14); }
Feb 06 2017
parent reply pineapple <meapineapple gmail.com> writes:
One reason for zero-based indexes that isn't "it's what we're all 
used to" is that if you used one-based indexes, you would be able 
to represent one fewer index than zero-based, since one of the 
representable values - zero - could no longer be used to 
represent any index.

Also, it's what we're all used to, and it makes perfect sense to 
a lot of us, and the only times in recent memory I've ever made 
off-by-one errors were when I was trying to use Lua and its 
one-based indexing.
Feb 06 2017
next sibling parent Nemanja Boric <4burgos gmail.com> writes:
On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
 One reason for zero-based indexes that isn't "it's what we're 
 all used to" is that if you used one-based indexes, you would 
 be able to represent one fewer index than zero-based, since one 
 of the representable values - zero - could no longer be used to 
 represent any index.

 Also, it's what we're all used to, and it makes perfect sense 
 to a lot of us, and the only times in recent memory I've ever 
 made off-by-one errors were when I was trying to use Lua and 
 its one-based indexing.
Related: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
Feb 06 2017
prev sibling parent bachmeier <no spam.net> writes:
On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
 One reason for zero-based indexes that isn't "it's what we're 
 all used to" is that if you used one-based indexes, you would 
 be able to represent one fewer index than zero-based, since one 
 of the representable values - zero - could no longer be used to 
 represent any index.

 Also, it's what we're all used to, and it makes perfect sense 
 to a lot of us, and the only times in recent memory I've ever 
 made off-by-one errors were when I was trying to use Lua and 
 its one-based indexing.
In the scientific programming community, languages like Matlab, Fortran, and R use one-based indexing for a very good reason. Zero-based indexing is a computer science thing that makes no sense to someone working with equations.
Feb 06 2017
prev sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
(There is an honest question in the end, please read on.)

All good reasons set aside, both in favour and against 0-based 
arrays, the only reason that is relevant to me right now is that 
we are seriously looking into translating close to a million 
lines of foreign code to D, from a language that supports arrays 
over arbitrary ranges of indices (positive and negative). 
Adapting this much program logic to a different array base is out 
of the question; this is an engineering application and lives are 
at stake. So let's not bring the base-arguments back into this 
sub-thread but focus on a performant solution.

Expanding on Marc's outset, I now have:

/*****************
  * A fixed-length array with an index that runs from $(D_PARAM 
first)
  * to $(D_PARAM last) inclusive.
  *
  * Indices are converted, which involves a small overhead.
  */
struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
     T[last - first + 1] _payload;

     alias _payload this;

     // Support e = arr[5];
     ref T opIndex(ptrdiff_t index) {
         assert(index >= first);
         assert(index <= last);
         return _payload[index - first];
     }

     // Support arr[5] = e;
     void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) {
         assert(index >= first);
         assert(index <= last);
         _payload[index - first] = value;
     }

     // Support foreach(e; arr).
     int opApply(scope int delegate(ref T) dg)
     {
         int result = 0;

         for (int i = 0; i < _payload.length; i++)
         {
             result = dg(_payload[i]);
             if (result)
                 break;
         }
         return result;
     }

     // Support foreach(i, e; arr).
     int opApply(scope int delegate(ptrdiff_t index, ref T) dg)
     {
         int result = 0;

         for (int i = 0; i < _payload.length; i++)
         {
             result = dg(i + first, _payload[i]);
             if (result)
                 break;
         }
         return result;
     }

     // Write to binary file.
     void toFile(string fileName)
     {
         import std.stdio;
         auto f = File(fileName, "wb");
         if (f.tryLock)
         {
             f.rawWrite(_payload);
         }
     }
}

unittest {
     StaticArray!(int, -10, 10) arr;
     assert(arr.length == 21);

     foreach (ref e; arr)
         e = 42;
     assert(arr[-10] == 42);
     assert(arr[0]   == 42);
     assert(arr[10]  == 42);

     foreach (i, ref e; arr)
         e = i;
     assert(arr[-10] == -10);
     assert(arr[0]   ==   0);
     assert(arr[5]   ==   5);
     assert(arr[10]  ==  10);

     arr[5] = 15;
     assert(arr[5]   == 15);
}

//////////////

(The first and last indices probably don't need to be template 
arguments, they could possibly be immutable members of the 
struct; But that is not what worries me now.)
The thing is that a small overhead is paid in opIndex() and 
opIndexAssign() (the other member functions are fine). I'd like 
to get rid of that overhead.

In "Numerical Recipes in C", section 1.2, Press et al. propose an 
easy solution using an offset pointer:

float b[4], *bb;
bb = b - 1;

Thus bb[1] through bb[4] all exist, no space is wasted nor is 
there a run-time overhead.

I have tried to adapt the internals of my struct to Press' 
approach, but it seems to me it is not that easy in D -- or I'm 
just not proficient enough. Can someone here show me how that 
could be done?

Thanks!
Feb 06 2017
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 02/06/2017 03:00 PM, Bastiaan Veelo wrote:

 In "Numerical Recipes in C", section 1.2, Press et al. propose an easy
 solution using an offset pointer:

 float b[4], *bb;
 bb = b - 1;

 Thus bb[1] through bb[4] all exist, no space is wasted nor is there a
 run-time overhead.

 I have tried to adapt the internals of my struct to Press' approach, but
 it seems to me it is not that easy in D -- or I'm just not proficient
 enough. Can someone here show me how that could be done?
struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) { T[last - first + 1] _payload; If the static array has to be a member like that (as opposed to a dynamic one), you need an extra member: T* _ptr; Because structs cannot have default constructors, you need an init() function to establish the invariant: void init() { this._ptr = _payload.ptr - first; } Of course, that relies on the fact that _payload.ptr - first is a valid address to hold in a pointer. [OT]: // Commented-out as it seems dangerous because _payload does not // have the same semantics as StaticArray. // // alias _payload this; Which means, you would implement length() yourself: size_t length() { return _payload.length; } Then you use _ptr when indexing: // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return *(_ptr + index); } Ali
Feb 06 2017
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Monday, 6 February 2017 at 23:42:55 UTC, Ali Çehreli wrote:
 Then you use _ptr when indexing:

     // Support e = arr[5];
     ref T opIndex(ptrdiff_t index) {
         assert(index >= first);
         assert(index <= last);
         return *(_ptr + index);
     }

 Ali
Thank you very much for your time. Sadly this gives me an access violation. The traceback doesn't seem right though, as it seems to jump to opIndexAssign from where opIndex is used. If I'm not mistaken, I have confirmed that _ptr is a valid address, see below. We do not need to take measures against the GC? - Bastiaan. On Windows: rdmd -main -unittest -debug -g source\epcompat\array.d object.Error (0): Access Violation ---------------- 0x00402057 in void epcompat.array.__unittestL81_1() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(88) 0x0040465C in void epcompat.array.__modtest() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(39) 0x0040A455 in int core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*) 0x0040C007 in int object.ModuleInfo.opApply(scope int delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*)) 0x00406808 in _d_run_main 0x004046D4 in main at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(7) 0x004225DD in mainCRTStartup 0x779C62C4 in BaseThreadInitThunk 0x77B50FD9 in RtlSubscribeWnfStateChangeNotification 0x77B50FA4 in RtlSubscribeWnfStateChangeNotification The complete source: module epcompat.array; // Test with rdmd -main -unittest -debug -g source\epcompat\array.d alias StaticArray_offset StaticArray; /***************** * A fixed-length array with an index that runs from $(D_PARAM first) * to $(D_PARAM last) inclusive. * * Implemented by means of an offset pointer. */ struct StaticArray_offset(T, ptrdiff_t first, ptrdiff_t last) { T[last - first + 1] _payload; T* _ptr; void init() { assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; } size_t length() { return _payload.length; } // Support e = arr[5]; ref T opIndex(ptrdiff_t index) { assert(index >= first); assert(index <= last); return *(_ptr + index); } // Support arr[5] = e; void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) { assert(index >= first); assert(index <= last); *(_ptr + index) = value; } // Line 39 // Support foreach(e; arr). int opApply(scope int delegate(ref T) dg) { int result = 0; for (int i = 0; i < _payload.length; i++) { result = dg(_payload[i]); if (result) break; } return result; } // Support foreach(i, e; arr). int opApply(scope int delegate(ptrdiff_t index, ref T) dg) { int result = 0; for (int i = first; i <= last; i++) { result = dg(i, *(_ptr + i)); if (result) break; } return result; } // Write to binary file. void toFile(string fileName) { import std.stdio; auto f = File(fileName, "wb"); if (f.tryLock) { f.rawWrite(_payload); } } } unittest { StaticArray!(int, -10, 10) arr; assert(arr.length == 21); foreach (ref e; arr) e = 42; assert(arr[-10] == 42); // Line 88 assert(arr[0] == 42); assert(arr[10] == 42); foreach (i, ref e; arr) e = i; assert(arr[-10] == -10); assert(arr[0] == 0); assert(arr[5] == 5); assert(arr[10] == 10); arr[5] = 15; assert(arr[5] == 15); }
Feb 07 2017
next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

     void init() {
         assert( first < cast(size_t)_payload.ptr);              //
 Address space underrun.
         assert(-first < size_t.max - cast(size_t)_payload.ptr); //
 Address space overrun.
         this._ptr = _payload.ptr - first;
     }
You forgot to call that most important function. ;) 1) I don't understand the first assert there, which does not pass for me, so I commented it out. 2) May bad: init() is not a good name for a struct member, so it should be renamed: void initialize() { // assert( first < cast(size_t)_payload.ptr); // Address space underrun. assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun. this._ptr = _payload.ptr - first; } 3) Instead of having to remember to call it, let's introduce a function that does it for us: auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() { auto s = StaticArray!(T, first, last)(); s.initialize(); return s; } unittest { // StaticArray!(int, -10, 10) arr; auto arr = makeStaticArray!(int, -10, 10);
     foreach (i, ref e; arr)
         e = i;
Unrelated: That line passes because you're building 32-bits. Here is the error I got: Error: cannot implicitly convert expression (i) of type long to int You can cast it: e = cast(int)i; or by import std.conv : to; e = i.to!int; Ali
Feb 07 2017
parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Tuesday, 7 February 2017 at 20:28:30 UTC, Ali Çehreli wrote:
 You forgot to call that most important function. ;)
Hah of course. I assumed the name would give it some special meaning, like postblit.
 1) I don't understand the first assert there, which does not 
 pass for me, so I commented it out.
The intention is to check that (_payload.ptr - first) is larger than 0.
 2) May bad: init() is not a good name for a struct member, so 
 it should be renamed:

     void initialize() {
         // assert( first < cast(size_t)_payload.ptr);
    // Address space underrun.
         assert(-first < size_t.max - cast(size_t)_payload.ptr); 
 // Address space overrun.
         this._ptr = _payload.ptr - first;
     }

 3) Instead of having to remember to call it, let's introduce a 
 function that does it for us:

 auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() {
     auto s = StaticArray!(T, first, last)();
     s.initialize();
     return s;
 }
OK good.
 unittest {
     // StaticArray!(int, -10, 10) arr;
     auto arr = makeStaticArray!(int, -10, 10);

     foreach (i, ref e; arr)
         e = i;
Unrelated: That line passes because you're building 32-bits. Here is the error I got: Error: cannot implicitly convert expression (i) of type long to int You can cast it: e = cast(int)i; or by import std.conv : to; e = i.to!int;
Thanks a lot for your illustrative answers, including the next one! Bastiaan.
Feb 07 2017
prev sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
Pressed send too soon, before considering your GC question.

On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

 We do not need to take measures against the GC?
Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.) This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique. Ali
Feb 07 2017
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Tuesday, 7 February 2017 at 20:33:35 UTC, Ali Çehreli wrote:
 On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

 We do not need to take measures against the GC?
Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.)
Gosh am I glad I brought that up, and for people like you hanging out in the learn group. Thanks for pointing this out!
 This optimization cannot work if the array is a static array 
 inside the same struct. It would work with a dynamic array but 
 then it would probably be slower than applying the *(ptr+index) 
 technique.
You mean slower than the _payload[index - first] technique? Is that because of heap memory versus stack memory? Am I correct that the reason that a dynamic array would work is because it is allocated independently from the struct, meaning that its address stays the same even if the struct is moved? We could .reserve a dynamic array in initialize() to prevent it being resized. Thanks again. Bastiaan.
Feb 07 2017
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 02/07/2017 02:04 PM, Bastiaan Veelo wrote:

 This optimization cannot work if the array is a static array inside
 the same struct. It would work with a dynamic array but then it would
 probably be slower than applying the *(ptr+index) technique.
You mean slower than the _payload[index - first] technique? Is that because of heap memory versus stack memory?
Mandatory disclaimer: We can't be sure without testing. Not exactly because stack versus heap because the whole object might be sitting in heap memory anyway: alias S = StaticArray!(/* ... */); S[] objects; objects ~= S(/* ... */); So, all of those are on the heap. It's more about having everything at hand, near each other, close in memory. If the member is a static array, then when we have an S[], all members are there to be operated on. If the array in dynamic, then an S[] has indirections, reaching out to the heap, which may involve long-latency memory reads. foreach (s; objects) { s[42]; // May have a cache miss and read from memory } In the static array case, the entire body of s is already on the cache. On the other hand, when the objects are large, then few of those can fit in the cache. etc. One needs to profile to see what works better.
 Am I correct that the reason that a dynamic array would work is because
 it is allocated independently from the struct, meaning that its address
 stays the same even if the struct is moved?
Yes, that would be a requirement to prevent the self-reference.
 We could .reserve a dynamic
 array in initialize() to prevent it being resized.
Makes sense. Another solution that came to my mind: You can keep the self-referencing solution but make sure that the objects are not moved after initialize(): S[] objects; objects.reserve(100); objects.each!(o => o.initialize()); // The objects will not move if you're careful
 Thanks again.
 Bastiaan.
Ali
Feb 07 2017
parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
Wrapup: I am going to go for the original approach of index 
conversion, and leaving the offset-pointer approach for what it 
is.

Reasons:
1) uncertain efficiency gain/loss,
2) theoretically it may fail,
3) .sizeof does not include the payload,
4) analysis of the assembler generated by our reference compiler 
(Prospero) shows that it also uses index conversion. 
(Interestingly, it does so at compile time, when possible).

Thanks for many interesting insights.

Bastiaan.
Feb 08 2017
prev sibling parent WhatMeForget <kheaser gmail.com> writes:
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
 Does the D language set in stone that the first element of an 
 array _has_ to be index zero?
 Wouldn't starting array elements at one avoid the common 
 'off-by-one' logic error, it does
 seem more natural to begin a count at 1.

 Actually, maybe even better to allow array definitions of form
 int foo[x:y];
 (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

 I think the (very old) IBM PL/I language was like this.
There is a good metaphor in multi-story buildings. Americans the UK, their first floor is what Americans would call the 2nd floor. But then how can you trust a group of people who drive on the wrong side of the road :)
Feb 06 2017